Wednesday, October 29, 2008

Chord

This paper presents Chord, one of the first implementations of a DHT. Using consistent hashing, where both keys and nodes are mapped into the same one-dimensional circular identifier space, the DHT assigns each key to its successor: the first node whose identifier is equal to or follows the key in that space. Thus there is no ambiguity in a stable DHT about which node is responsible for which key, and, because node identifiers are spread evenly around the space, with high probability each node is responsible for about the same number of keys.
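To make the key-to-node assignment concrete, here is a minimal sketch of consistent hashing in the Chord style (my own illustration, not the paper's code; the names and the small identifier space are arbitrary). Identifiers come from a base hash such as SHA-1 truncated to m bits, and a key is stored at its successor on the circle:

    import hashlib
    from bisect import bisect_left

    M = 32                     # identifier bits; the paper allows any sufficiently large m
    RING = 2 ** M

    def ident(name):
        """Map a node address or a key onto the identifier circle (SHA-1, truncated)."""
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

    def successor(sorted_node_ids, key):
        """The first node clockwise from the key's identifier, wrapping around the circle."""
        i = bisect_left(sorted_node_ids, ident(key))
        return sorted_node_ids[i % len(sorted_node_ids)]

    nodes = sorted(ident("node-%d" % i) for i in range(8))
    print(successor(nodes, "some-key"))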

A query is handled either by answering it directly (if the current node is responsible for the queried key) or by forwarding it to a node closer to the key in the identifier space. Using a skiplist-like structure (the finger table), Chord resolves a lookup in O(log N) hops: each node keeps pointers at power-of-2 distances around the circle, so each forwarding step moves the query at least half of the remaining distance to the key.
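To see why the fingers give logarithmic lookups, here is a toy, self-contained sketch (the class and helper names are mine, not the paper's pseudocode). finger[i] of node n points to successor(n + 2^i), and each hop forwards the query to the closest finger that still precedes the key, cutting the remaining distance at least in half:

    M = 8                      # tiny identifier space for illustration
    RING = 2 ** M

    def dist(a, b):
        """Clockwise distance from a to b on the identifier circle."""
        return (b - a) % RING

    class Node:
        def __init__(self, my_id, all_ids):
            self.id = my_id
            srt = sorted(all_ids)
            def succ(k):
                return next((x for x in srt if x >= k % RING), srt[0])
            self.successor = succ(self.id + 1)                  # immediate successor
            self.finger = [succ(self.id + 2 ** i) for i in range(M)]

    def lookup(nodes, start_id, key):
        """Route a query for `key` starting at `start_id`; returns (owner id, hop count)."""
        n, hops = nodes[start_id], 0
        while not (0 < dist(n.id, key) <= dist(n.id, n.successor)):
            # closest preceding finger: the farthest finger that still precedes the key
            preceding = [f for f in n.finger if 0 < dist(n.id, f) < dist(n.id, key)]
            nxt = max(preceding, key=lambda f: dist(n.id, f)) if preceding else n.successor
            n, hops = nodes[nxt], hops + 1
        return n.successor, hops

    ids = [4, 20, 55, 87, 131, 178, 200, 240]
    nodes = {i: Node(i, ids) for i in ids}
    print(lookup(nodes, 4, 190))    # -> (200, 1)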

The evaluation in this paper is interesting. One optimization they make is to run "virtual nodes", so that a single physical node is logically several nodes at once; the one problem I see this causing (which doesn't seem to be addressed in the paper) is that a single physical node disconnecting or failing now takes down several nodes at once. An evaluation of this effect is important: the nodes that fail are then not in fact random as in Section 6.4, and non-random failures may cause Chord to fail.
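To make the concern concrete, here is a rough sketch of the virtual-node mapping (the host names and the factor of 4 are hypothetical, not from the paper): each physical host runs several virtual nodes with independent identifiers, so when one host fails, all of its virtual identifiers leave the ring at once, even though they are scattered rather than contiguous.

    import hashlib
    from bisect import bisect_left

    M = 32
    RING = 2 ** M
    VIRTUALS_PER_HOST = 4

    def ident(name):
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

    hosts = ["hostA", "hostB", "hostC"]
    # virtual identifier -> owning physical host
    ring = {ident("%s#%d" % (h, v)): h for h in hosts for v in range(VIRTUALS_PER_HOST)}
    vids = sorted(ring)

    def host_for(key):
        """Physical host responsible for a key, found via the key's virtual successor."""
        i = bisect_left(vids, ident(key))
        return ring[vids[i % len(vids)]]

    print(host_for("some-key"))
    # If "hostA" fails, all VIRTUALS_PER_HOST of its virtual nodes vanish together,
    # which is the correlated-failure effect discussed above.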

Lastly, no thought is given to robustness in the face of malicious nodes. This seems like an important aspect of a DHT, or of any overlay network, since these overlays run over the public Internet.

Overall, however, this is a nice paper, and although it is missing some important considerations, I think it is a classic in terms of presenting a simple, elegant data structure that uses an overlay network, together with some performance results.

1 comment:

Matthias Goerner said...

Regarding the "virtual nodes": I think the reason for them is 1. to ensure a more even distribution of how many keys each physical node holds (if this is desired) and 2. to allow different physical nodes to hold different numbers of keys, depending on how many "virtual nodes" they host.
The "virtual nodes" on one host do not store a contiguous block of the ID space, and they hold different keys, so there won't be two copies of one key on one machine. So one host taking several "virtual nodes" down with it does not corrupt more data than running only one "virtual node" per host would; i.e., the "virtual nodes" that fail when a host goes down are as good as picked at random.