Thursday, November 6, 2008

Middleboxes No Longer Considered Harmful

In this paper, the authors attempt to build a system that allows "middleboxes" such as NATs and firewalls to be used without violating two architectural rules that current middleboxes ignore: each entity in the internet has a unique global fixed IP address, and that network elements only process packets addressed to them (which is odd, given that routers MUST process elements not addressed to them, and one could certainly call them network entities, given the fact they have IP addresses). It seems to me that rule 1 is more interesting, in that I can see how not having globally-addressable elements makes things difficult in the internet, but I'm not sure about rule 2's importance.

Nevertheless, the authors are building an infrastructure to allow intermediaries, which are middleboxes that do not physically sit in front of a network endpoint and can be composed. Their system, called DOA (an ironic name) functions somewhat like a DNS system at a routing level: packets are addressed to EIDs which resolve either to other EIDs (the intermediaries) or to an IP address. EIDs are stored in a DHT and are self-certified to help with security; however, even the authors point out that this does not guard against MITM attacks, nor does it lessen the need for DHT security (which is already a difficult problem).

The paper goes on to describe implementations of NAT-like and firewall-like DOA boxes. To be honest, I found these sections incredibly dense and bogged down in details; each of the two types of boxes requires what seems like an immense amount of complexity. The performance isn't terrible, although the numbers reported are somewhat odd: for example, they report min and max for DNS but median for EID lookup in the DHT, which does not mean the numbers are comparable.

Overall, DOA seems like a very complex system with many security issues to deal with.

Monday, November 3, 2008

Development of the Domain Name System

This paper outlines the process of designing the Domain Name System as the Internet grew to a size that made a centralized name-to-IP address mapping impossible. As a result, designers attempted to create a distributed database that is decentralized and performant enough to scale to large numbers of names. Reflecting the direction that the network was taking, the DNS system was designed in a hierarchical manner where organizations are responsible for their own subset of the naming world.

In addition, DNS was supposed to be extensible to cover more than just name-to-IP mappings. As far as I know, the only areas where this was successful was for MX mappings and reverse DNS; as the paper points out, because creating new mappings require agreement between the community to give semantics to the new mapping. The most difficult part is community-centric work, not anything technical.

DNS has clearly been successful for email and for name resolution. However, in terms of the extensibility, I would say that the difficulties outlined in the paper still exist and have not been resolved. In addition, the caching, although it is important to obtain decent performance (or, as the paper says, "tolerable performance") I am not convinced that the caching actually helps for most cases: since DNS has become a mechanism for load balancing, the most popular sites will have low TTL values to ensure proper load balance between actual machines; in addition, these are the same names that could benefit from caching.

The paper touches slightly on the security issues, but caching necessarily can cause problems due to spoofing; if someone spoofs a name server and separately ensures that a query is cached at a local name server with a high TTL, they can nearly permanently override the proper owners; I believe similar attacks have occurred in practice. To address this, I believe security has been added through DNSSEC although I'm not familiar with how it works.

One interesting thing in the paper was that they envisioned the DNS also being used for things like having a universal address for files; I think this basically came about through the URI scheme which leverages naming from DNS without having to add another record type; names map to IP addresses and the path in the rest of the URI results in a protocol-specific path on the server. Still, this was an interesting forward-thinking item that did not happen due to the problems outlined above in defining a well-understood semantic meaning to the mapping in the DNS database records.

Wednesday, October 29, 2008

Chord

This paper presents Chord, one of the first implementations of a DHT. Using consistent hashing, where both the keys and the nodes live in the same single-dimensional space, the DHT assigns keys to nodes by looking at the closest node in the space that follows the key. Thus, there is no ambiguity in a stable DHT when it comes to which node is responsible for which key, and, because the nodes are distributed evenly in the space, with high probability each node is responsible for about the same number of keys.

Dealing with a query is done by either responsing to the query (if the current node is responsible for the queried key) or by forwarding the query to a node closer to the key in the identifier space. Using a skiplist like structure (called "finger pointers" here), Chord can use log N queries to find an item in the hash because the skiplist maintains pointers to power-of-2 distances; this means that in each forwarding operation, Chord makes the query get at least half the remaining distance closer than it was before.

The evaluation in this paper is interesting. One optimization they make is to have "virtual nodes" such that a physical node is logically several nodes at once; the one problem I see this causing (which doesn't seem to be addressed in the paper) is that this results in larger amounts of nodes going down when a single physical node disconnects or fails. An evaluation of this effect is important. The nodes that fail, then, are not in fact random as in section 6.4; nonrandom failures may cause Chord to fail.

Lastly, there is no thought given to robustness in the face of malicious nodes. This seems like an important aspect of a DHT, or any overlay network because these overlays are over the public internet.

Overall however, this is a nice paper and although it is missing some important considerations, I think it is a classic in terms of presenting a simple, elegant data structure that uses an overlay network and some performance results.

Looking Up Data in P2P Systems

This paper introduces the Distributed Hash Table data structure, which is a mapping from key to which node in an overlay network has the corresponding data to that key. DHTs are useful in creating a distributed store of key-value pairs existing in an overlay network, while remaining resilient to nodes entering and leaving the network as well as node failures.

The DHT implementations in the paper work by, when receiving a request for a lookup, either answering the lookup if the node is responsible for that key, or by forwarding the request to a node "closer" to the node responsible for the key. As a result, the paper divides the DHT implementations into those looking at distances in a single dimension (Chord, Pastry) and those looking at distances in multiple dimensions (CAN). Of these, the simplest is Chord, and the most complex routing algorithm described is that of CAN.

The open questions section is the most useful here: real-world fault tolerance under many simultaneous connections, disconnections, and failures is difficult to get right and it is unclear if any of the existing DHTs do a particularly good job. For example, the routing algorithm as described for CAN requires, after some disconnections and/or failures, a redistribution of the space which can be expensive. Although the paper does not say as much, it seems that perhaps the DHTs are not as robust (to failures as well as to malicious users) as their authors would like, nor are they as performant as users would like. These seem like the major issues to address.

Tuesday, October 28, 2008

Active Networks Vision and Reality

This paper presents lessons learned through the ANTS implementation of active networks, which are a kind of overlay network that allow users to write the routing code executed on each node within the network. The basic architecture is that each node has a runtime executing, while packets choose (by reference: an MD5 hash is a "pointer" to the code) which routing program to run. Such programs can be arbitrarily injected into the network, as long as they have been certified by some central authority that says they are not malicious.

The security issues of ANTS take much consideration in the design. They attempt to mitigate the threat of running arbitrary code by limiting the kinds of operations that can occur, the state available to read and write, and by using a central certifying authority to guarantee safety. In fact, if one is going to use a certifying authority, it may make sense to shortcut the safety guarantees of the runtime and allow more efficient machine code to run on each router instead of java bytecodes running in a VM. It seems the reason the implementation works as it does is that the designers wanted to have arbitrary non-certified code running, but could not figure out how to make global guarantees in terms of not allowing a single user or program from consuming more than a fair share of global resources.

This could be tackled by having more coupling between routing nodes, using some sort of gossip protocol or other way to exchange information. Such algorithms could guarantee that only some fraction of the routers can lie at a time, making the information somewhat reliable. However, there may just be no way to represent global resource usage in a way that reliably guarantees the information is accurate enough to make a decision whether to run a program that is not certified.

Resilient Overlay Networks

In this paper, an application-layer overlay network is built using UDP in order to build a more resilient routing infrastructure that, because of its smaller size, can respond to disruptions and slowdowns faster and more flexibly than the traditional BGP approach between ASs. The particular implementation of these application-level routers in this paper uses a single-hop intermediary between each node in order to route around failures; this gives essentiall n-2 paths where before there was as single path (as controlled by information obtained via BGP).

With this path multiplicity, applications can choose how which paths (or under which metrics paths should be chosen) depending on their requirements. In addition to this application-level choice, routers can be more powerful in terms of having much more detailed policies for routing, since the network is essentially a private one built on top of the public internet and because the networks targeted are small relative to the size of ASs. Thus routers can spend more time per packet than a gateway router in the internet can.

One thing that jumps out to me is that RON (unlike the next paper) is geared towards a network that has nodes that can be implicitly trusted. For example, the use of policy tags to bypass some of the per-packet computation is subject to trusting the policy tags you receive; there is nothing to prevent a malicious user from rewriting policy tags to ensure they get the highest level of service while others do not.

The evaluation results are very interesting. In almost every metric, RON improves the metric *in most cases*, but there are always some measurements under which RON does substantially worse. For example, in dealing with packet loss, RON helps improve path reliability in most cases, but there are always a few percentage of samples that show worse behavior during outages: that is, some packets that would be routed if using just underlying connectivity do not get routed with RON. Similarly, RON can make loss rates worse (although the authors believe this is due to using bidirectional information instead of unidirectional; it remains to be seen if that would make a difference).

Monday, October 20, 2008

First-Principles Approach to Understanding Internet Router-Level Topology

This paper also attempts to understand how routers interconnect in the Internet, but instead of using power laws, they use a "first principles" approach by taking into account both technology restrictions of routers and the economic principles guiding ASs to interconnect routers--- this means looking at hierarchies of ASs and routers as well. That is, taking into account that ASs can be level-1, level-2, or level-3 and therefore have proportional traffic to their level.

Criticizing the previous paper's methodology, the authors rightly claim that power laws hold for graphs that look much different from each other; edge rerouting schemes can take one power law graph and turn it into another with very different connectivity and radius but with the same distribution, for example, of ranks. They also point out that there is little evidence the power law means anything for the topologies since there were so few data points.

I think the methodology here is much more sound. First, they actually use subsets of the internet as well as a model of how routers can connect hierarchically in the ideal case; the internet topology is found to be in a region that, measured with their "likelihood" metric, can only have come about by using non-random strategies to connect routers. Therefore generating graphs randomly, even if they follow some distribution of ranks and other metrics, does not seem like a good way to create a representative topology.