Monday, November 17, 2008

Architecture for Internet Data Transfer

This paper advocates for and designs a service-oriented approach to data transfer, in which applications rely on a lower-layer transfer service to send data from one location to another, instead of reimplementing different transfer methods in each application. The idea is that the service would hold the only implementation of each transfer backend, so that disparate applications can benefit from a new transfer method without each application being reengineered for it.

This service, called DOT, is designed not only to perform transfers, but also to cache data in case it can be used to shortcut future transfers; combined with the right kind of hashing, this allows a reduction in redundant data transfers even when the data are not exactly the same. To me, that was the most interesting part of the paper; the parts outlining the interfaces and APIs are relatively straightforward. The service is receiver-pull, and data are "pointed to" by their hash, as in other systems we've studied this semester. Applications pass OIDs (which contain the hash and some other data) and the receiver initiates chunk-oriented transfers; the chunking makes caching more effective (for example, when an email reply quotes the previous message, a proper chunking algorithm lets the chunk cache avoid resending the previous message's text).
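
To make the chunk-caching idea concrete, here is a minimal sketch in Python. It is not DOT's actual code or API: the names (`split_chunks`, `Sender`, `Receiver`), the window size, and the boundary rule are all assumptions of mine for illustration. Chunk boundaries are picked from the content itself (a fingerprint of the last few bytes), so inserting new text in front of an old message leaves most of the old chunks, and therefore their hashes, unchanged.

```python
import hashlib

WINDOW = 16   # bytes examined when deciding on a boundary (assumed value)
MASK = 0x3F   # boundary when the low 6 bits are zero, so chunks average ~64 bytes


def split_chunks(data):
    """Split data at content-defined boundaries.

    A boundary is declared after position i whenever a fingerprint of the
    preceding WINDOW bytes matches MASK. Because the decision depends only
    on local content, the same text produces the same boundaries even when
    it is shifted by a prefix. (A real system would use a rolling hash and
    enforce minimum and maximum chunk sizes; this sketch does neither.)
    """
    chunks, start = [], 0
    for i in range(WINDOW, len(data) + 1):
        digest = hashlib.sha1(data[i - WINDOW:i]).digest()
        if int.from_bytes(digest[:4], "big") & MASK == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def chunk_id(chunk):
    """Name a chunk by the hash of its contents."""
    return hashlib.sha1(chunk).hexdigest()


class Sender:
    """Holds chunks and hands them out on request (hypothetical interface)."""

    def __init__(self):
        self.store = {}

    def put(self, data):
        """Store data as chunks and return the ordered list of chunk ids."""
        ids = []
        for chunk in split_chunks(data):
            cid = chunk_id(chunk)
            self.store[cid] = chunk
            ids.append(cid)
        return ids

    def get_chunk(self, cid):
        return self.store[cid]


class Receiver:
    """Pulls only the chunks it has not already cached."""

    def __init__(self):
        self.cache = {}

    def get(self, sender, ids):
        """Return (reassembled data, number of bytes actually transferred)."""
        transferred = 0
        for cid in ids:
            if cid not in self.cache:
                chunk = sender.get_chunk(cid)
                self.cache[cid] = chunk
                transferred += len(chunk)
        return b"".join(self.cache[cid] for cid in ids), transferred
```

In use, the sender calls `put` on a message and hands the resulting id list to the receiver, which calls `get`. If a reply that contains the earlier message is sent afterwards, the receiver already holds most of its chunks and pulls only the new ones.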

DOT also exports a multi-path plugin that allows multiple transfer methods to be used together to speed up transfers, which shows the power of the system: implementing something like that would be much more difficult with a less modular, application-specific backend in each application. Benchmark results show savings as high as 52%, which is substantial.
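
As a rough illustration of why chunking makes multi-path transfer easy to bolt on, here is a small sketch that spreads chunk fetches across several transfer plugins. The function name `multipath_fetch`, the idea of modelling a plugin as a plain callable, and the round-robin assignment are all my assumptions; the paper's plugin may well schedule chunks differently.

```python
from concurrent.futures import ThreadPoolExecutor


def multipath_fetch(chunk_ids, plugins):
    """Fetch chunks in parallel, assigning them round-robin to plugins.

    Each plugin is assumed to be a callable that takes a chunk id and
    returns that chunk's bytes (for example, one per network path).
    The result list is in the same order as chunk_ids.
    """
    def fetch(indexed):
        index, cid = indexed
        return plugins[index % len(plugins)](cid)

    with ThreadPoolExecutor(max_workers=len(plugins)) as pool:
        # pool.map yields results in input order, so reassembly is a join.
        return list(pool.map(fetch, enumerate(chunk_ids)))
```

Because the application only hands over chunk ids, it does not need to know how many paths were used or which chunk travelled over which one.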

Lastly, the authors do a case study with the Postfix email server that demonstrates how easily applications can be changed to use DOT, as well as the potential savings.

Overall, the authors have a nice idea that could be useful for reducing the amount of work needed to add new transfer methods to applications. However, it seems to me that the problem their system solves is not egregious enough to necessarily convince application writers to adopt it. I don't believe that transfer methods are seen as a major problem in the Internet, and thus the need for DOT seems less than obvious.
