Dan Ports's publications

[1] Dan R. K. Ports, Austin T. Clements, Irene Y. Zhang, Samuel Madden, and Barbara Liskov. Transactional caching of application data using recent snapshots. In Proceedings of the 22th ACM Symposium on Operating Systems Principles (SOSP '09), Big Sky, MT, USA, October 2009. ACM. Work in Progress report. [ bib | slides (.pdf) | .ps.gz | .pdf ]
Many of today's well-known websites use application data caches to reduce the bottleneck load on the database, as well as the computational load on the application servers. Distributed in-memory shared caches, exemplified by memcached, are one popular approach. These caches typically provide a get/put interface, akin to a distributed hash table; the application chooses what data to keep in the cache and keeps it up to date. By storing the cache entirely in memory and horizontally partitioning among nodes, in-memory caches provide quick response times and ease of scaling.

However, existing caches have no notion of transactional consistency: there is no way to ensure that two accesses to the cache reflect a view of the database at the same point in time. While the backing database goes to great lengths to ensure this property (serializable isolation), the caching layer violates these guarantees. The resulting inconsistencies can have unpleasant consequences if exposed to the user (e.g., attributing the latest bid to the wrong user on an auction site), or add complexity to application code by forcing it to cope with temporarily violated invariants.

We argue that transactional semantics are not incompatible with cache performance and scalability. We introduce a transactional cache, TxCache, which guarantees that all values retrieved from the cache or database during a transaction reflect a consistent snapshot of the database.

TxCache also strives to simplify application design by helping manage the cache. Instead of requiring applications to manually insert and check for values in the cache, TxCache provides a library with which programmers simply designate functions as cacheable, and the library checks the cache for previous calls with the same arguments. In particular, and unlike memcached, TxCache does not require applications to explicitly invalidate cached values; correctly identifying the values to invalidate is difficult because it requires global reasoning about the application.

[2] James Cowling, Dan R. K. Ports, Barbara Liskov, Raluca Ada Popa, and Abhijeet Gaikwad. Census: Location-aware membership management for large-scale distributed systems. In Proceedings of the 2009 USENIX Annual Technical Conference, San Diego, CA, USA, June 2009. USENIX. [ bib | slides (.pdf) | .ps.gz | .pdf ]
We present Census, a platform for building large-scale distributed applications. Census provides a membership service and a multicast mechanism. The membership service provides every node with a consistent view of the system membership, which may be global or partitioned into location-based regions. Census distributes membership updates with low overhead, propagates changes promptly, and is resilient to both crashes and Byzantine failures. We believe that Census is the first system to provide a consistent membership abstraction at very large scale, greatly simplifying the design of applications built atop large deployments such as multi-site data centers.

Census builds on a novel multicast mechanism that is closely integrated with the membership service. It organizes nodes into a reliable overlay composed of multiple distribution trees, using network coordinates to minimize latency. Unlike other multicast systems, it avoids the cost of using distributed algorithms to construct and maintain trees. Instead, each node independently produces the same trees from the consistent membership view. Census uses this multicast mechanism to distribute membership updates, along with application-provided messages.

We evaluate the platform under simulation and on a real-world deployment on PlanetLab. We find that it imposes minimal bandwidth overhead, is able to react quickly to node failures and changes in the system membership, and can scale to substantial size.

[3] Dan R. K. Ports and Tal Garfinkel. Towards application security on untrusted operating systems. In Proceedings of the 3rd Workshop on Hot Topics in Security (HotSec '08), San Jose, CA, USA, July 2008. USENIX. [ bib | slides (.pdf) | .ps.gz | .pdf ]
Complexity in commodity operating systems makes compromises inevitable. Consequently, a great deal of work has examined how to protect security-critical portions of applications from the OS through mechanisms such as microkernels, virtual machine monitors, and new processor architectures. Unfortunately, most work has focused on CPU and memory isolation and neglected OS semantics. Thus, while much is known about how to prevent OS and application processes from modifying each other, far less is understood about how different OS components can undermine application security if they turn malicious.

We consider this problem in the context of our work on Overshadow, a virtual-machine-based system for retrofitting protection in commodity operating systems. We explore how malicious behavior in each major OS subsystem can undermine application security, and present potential mitigations. While our discussion is presented in terms of Overshadow and Linux, many of the problems and solutions are applicable to other systems where trusted applications rely on untrusted, potentially malicious OS components.

[4] Xiaoxin Chen, Tal Garfinkel, E. Christopher Lewis, Pratap Subrahmanyam, Carl A. Waldspurger, Dan Boneh, Jeffrey Dwoskin, and Dan R. K. Ports. Overshadow: A virtualization-based approach to retrofitting protection in commodity operating systems. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '08), Seattle, WA, USA, March 2008. ACM. [ bib | .ps.gz | .pdf ]
Commodity operating systems entrusted with securing sensitive data are remarkably large and complex, and consequently, frequently prone to compromise. To address this limitation, we introduce a virtual-machine-based system called Overshadow that protects the privacy and integrity of application data, even in the event of a total OS compromise. Overshadow presents an application with a normal view of its resources, but the OS with an encrypted view. This allows the operating system to carry out the complex task of managing an application's resources, without allowing it to read or modify them. Thus, Overshadow offers a last line of defense for application data.

Overshadow builds on multi-shadowing, a novel mechanism that presents different views of “physical” memory, depending on the context performing the access. This primitive offers an additional dimension of protection beyond the hierarchical protection domains implemented by traditional operating systems and processor architectures.

We present the design and implementation of Overshadow and show how its new protection semantics can be integrated with existing systems. Our design has been fully implemented and used to protect a wide range of unmodified legacy applications running on an unmodified Linux operating system. We evaluate the performance of our implementation, demonstrating that this approach is practical.

[5] Dan R. K. Ports, Austin T. Clements, and Erik D. Demaine. PersiFS: A versioned file system with an efficient representation. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP '05), Brighton, United Kingdom, October 2005. ACM. Poster and extended abstract. [ bib ]
[6] Austin T. Clements, Dan R. K. Ports, and David R. Karger. Arpeggio: Metadata searching and content sharing with Chord. In Proceedings of the 4th International Workshop on Peer-to-Peer Systems (IPTPS '05), volume 3640 of Lecture Notes in Computer Science, pages 58-68, Ithaca, NY, USA, February 2005. Springer. [ bib | slides (.pdf) | .ps.gz | .pdf ]
Arpeggio is a peer-to-peer file-sharing network based on the Chord lookup primitive. Queries for data whose metadata matches a certain criterion are performed efficiently by using a distributed keyword-set index, augmented with index-side filtering. We introduce index gateways, a technique for minimizing index maintenance overhead. Because file data is large, Arpeggio employs subrings to track live source peers without the cost of inserting the data itself into the network. Finally, we introduce postfetching, a technique that uses information in the index to improve the availability of rare files. The result is a system that provides efficient query operations with the scalability and reliability advantages of full decentralization, and a content distribution system tuned to the requirements and capabilities of a peer-to-peer network.

[7] Austin T. Clements, Dan R. K. Ports, and David R. Karger. Arpeggio: Efficient metadata-based searching and file transfer with DHTs. In Proceedings of the 2nd Project IRIS Student Workshop (ISW '04), Cambridge, MA, USA, November 2004. Poster and extended abstract. [ bib ]
Arpeggio is a peer-to-peer file-sharing network based on the Chord distributed hash table. Queries for files whose metadata matches a certain criterion are performed efficiently by using a distributed keyword-set index, augmented with index-side filtering. We introduce metadata gateways, a technique for minimizing index maintenance overhead. Arpeggio also uses the DHT for indirect storage of file contents, maintaining pointers from content to the live peers that provide it. Finally, we introduce postfetching, a technique that uses information in the index to improve the availability of rare files. The result is a system that provides efficient query operations with the scalability and reliability advantages of full decentralization, and a content distribution system tuned to the requirements of a peer-to-peer file-sharing network.

[8] Dan R. K. Ports. Arpeggio: Metadata indexing in a structured peer-to-peer network. Master's thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, February 2007. [ bib | .ps.gz | .pdf ]
Peer-to-peer networks require an efficient means for performing searches for files by metadata keywords. Unfortunately, current methods usually sacrifice either scalability or recall. Arpeggio is a peer-to-peer file-sharing network that uses the Chord lookup primitive as a basis for constructing distributed keyword-set index, augmented with index-side filtering, to address this problem. We introduce index gateways, a technique for minimizing index maintenance overhead. Arpeggio also includes a content distribution system for finding source peers for a file; we present a novel system that uses Chord subrings to track live source peers without the cost of inserting the data itself into the network, and supports postfetching: using information in the index to improve the availability of rare files. The result is a system that provides efficient query operations with the scalability and reliability advantages of full decentralization. We use analysis and simulation results to show that our indexing system has reasonable storage and bandwidth costs, and improves load distribution.

[9] Dan R. K. Ports, Austin T. Clements, and Irene Y. Zhang. Optimizing distributed read-only transactions using multiversion concurrency. 6.830 (Database Systems) Project Report, December 2007. [ bib | slides (.pdf) | .ps.gz | .pdf ]
Distributed transactional systems typically achieve efficiency by abandoning true serializability for weaker forms of consistency that are difficult to reason about because they expose the concurrency in the underlying system. We explore an alternate route: weakening causality instead of consistency. Our proposed algorithm achieves global serializability by sacrificing global causality, which we argue is reasonable in many situations. This allows our algorithm to achieve efficiency by permitting read-only transactions to operate on stale but locally available cache data. We present the details of a transactional block storage protocol that implements this form of concurrency control, as well as a performance evaluation of an experimental implementation of this protocol and comparison against conventional optimistic concurrency control.

[10] Dan R. K. Ports, Austin T. Clements, and Irene Y. Zhang. Plaid: Pattern language for abstract datatypes. 6.891 (Advanced Symbolic Programming) Project Report, May 2007. [ bib | slides (.pdf) | .ps.gz | .pdf ]
The expressiveness of traditional syntactic pattern matching is severely limited by its lack of abstraction. Because syntax patterns are mired in the built-in types understood by the pattern matching system, they lack the ability to express patterns over abstract data types (ADT's). More advanced pattern matching techniques, such as semantic matching, can overcome this, but at the per-ADT cost of the complex code required to add new pattern combinators to the system.

Plaid defines a new pattern language that captures a strict subset of Scheme capable of both regular computation, as well as reverse computation. This allows it to overcome both the limitations of syntactic patterns and the cost of semantic patterns by providing a means by which programmers can write a single specification of the mapping between the abstract and concrete representations of an ADT that simultaneously serves as constructor, predicate, accessor, and pattern combinator for that ADT. This specification is written virtually identically to how a regular constructor would be written.

Furthermore, the Plaid pattern language is capable of capturing non-determinism and decisions within pattern matching, thus admitting a very broad interpretation of what can be considered an ADT constructor. This leads to variety of interesting capabilities, such as the ability to view concrete data in multiple abstract ways, the ability to canonicalize multiple concrete representations in one abstract way, and the ability to imagine more convenient representations of existing data.

[11] Dan R. K. Ports, Austin T. Clements, and Jeff Arnold. Canopy: A controlled emulation environment for network system experimentation. 6.829 (Computer Networks) Project Report, December 2005. [ bib | .ps.gz | .pdf ]
Network systems are hard to debug because they are inherently parallel and non-deterministic. Canopy assists with network debugging by putting the entire network system into a controlled emulation environment constructed from virtual machines and a simulated network. This puts all variables under the user's control and provides a coherent, omniscient viewpoint of the entire system. To aid the user in observing and manipulating the system, Canopy provides tools such as traffic visualization, packet manipulation, rollback and replay.

[12] Austin T. Clements, Dan R. K. Ports, Ben A. Schmeckpeper, and Hector Yuen. PersiFS: A continuously versioned network file system. 6.824 (Distributed Systems Engineering) Project Report, May 2005. [ bib | slides (.pdf) | .ps.gz | .pdf ]
Most file systems are ephemeral, meaning that once a change has been made, there is no way to recall the previous contents of the file system. Backups, version control systems, and user interface improvements such as "trash cans" attempt to alleviate this problem; however, these are all rough approximations of persistent file system structures, giving users restricted access to a restricted set of past states of the file system. PersiFS is a fully persistent file system, providing access to any past state of the entire file system. PersiFS achieves full persistence without sacrificing access time to either current versions or past versions, using inordinate amounts of disk space, or requiring modification to existing applications.

[13] Dan R. K. Ports and Austin T. Clements. Structures for efficient file system-scale partial persistence. 6.897 (Advanced Data Structures) Project Report, May 2005. [ bib | slides (.pdf) | .ps.gz | .pdf ]
A persistent file system stores every previous state of each file, allowing convenient access to the full state of the file system as it appeared at any point in the past. Achieving this convenient feature presents a challenging data structural problem because the amount of data involved is so large: it must use as little space as possible, and provide efficient operations for modifying the current state and accessing both current and past states. We formalize persistent file systems as a problem in data structures, and analyze it in the context of the external memory model. We begin by considering the design of our initial solution to this problem from the PersiFS1 file system, which is based on a log of metadata changes and an indirection layer for storing file data. These "systems" data structures support the desired operations, but are not asymptotically efficient. Applying more advanced data structures, we refine the design into the next version, PersiFS2. We use B+-trees for file content indexing in order to improve the space efficiency of the system, and we present a novel partially-persistent B+-tree design, which can be used to track changes to files with logarithmic modification and query cost. PersiFS2 has been implemented as a working file system with these data structures, and our measurements indicate that the new file system data structure provides dramatically improved access time for previous revisions with a small increase in cost for modifications.


This file was generated by bibtex2html 1.92.