Stratified B-Trees and Versioned Dictionaries

Andy Twigg, Andrew Byde, Grzegorz Miłoś, Tim Moreton, John Wilkes† and Tom Wilkie
Acunu, †Google
firstname@acunu.com

Abstract
External-memory versioned dictionaries are fundamental to file systems, databases and many other algorithms. The ubiquitous data structure is the copy-on-write (CoW) B-tree. Unfortunately, it doesn’t inherit the B-tree’s optimality properties; it has poor space utilization, cannot offer fast updates, and relies on random IO to scale. We describe the ‘stratified B-tree’, which is the first versioned dictionary offering fast updates and an optimal tradeoff between space, query and update costs.
1 Introduction
The (external-memory) dictionary is at the heart of any file system or database, and many other algorithms. A dictionary stores a mapping from keys to values. A versioned dictionary is a dictionary with an associated version tree, supporting the following operations:
update(k,v,x): associate value x to key k in leaf version v;
range query(k1,k2,v): return all keys (and values) in range [k1,k2] in version v;
clone(v): return a new child of version v that inherits all its keys and values.
Note that only leaf versions can be modified. If clone only works on leaf versions, we say the structure is partially-versioned; otherwise it is fully-versioned.
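The three operations and the leaf-only update rule can be illustrated with a deliberately naive in-memory sketch. This is not the stratified B-tree or the CoW B-tree, and it makes no attempt at external-memory efficiency; the class name `NaiveVersionedDict`, the use of integer version ids with root version 0, and the per-version "delta" maps are all assumptions made for illustration only.

```python
class NaiveVersionedDict:
    """Illustrative only: each version stores just the writes made in it."""

    def __init__(self):
        self.parent = {0: None}    # version id -> parent version id (0 is the root)
        self.children = {0: []}    # version id -> list of child version ids
        self.delta = {0: {}}       # version id -> {key: value} written in that version
        self._next_id = 1

    def update(self, k, v, x):
        """Associate value x with key k in leaf version v."""
        if self.children[v]:
            raise ValueError(f"version {v} is not a leaf and cannot be modified")
        self.delta[v][k] = x

    def clone(self, v):
        """Return a new child of version v that inherits all its keys and values."""
        new = self._next_id
        self._next_id += 1
        self.parent[new] = v
        self.children[new] = []
        self.delta[new] = {}
        self.children[v].append(new)   # v now has a child, so it is frozen
        return new

    def range_query(self, k1, k2, v):
        """Return sorted (key, value) pairs with k1 <= key <= k2 visible in version v."""
        path = []
        while v is not None:           # walk from v up to the root
            path.append(v)
            v = self.parent[v]
        visible = {}
        for version in reversed(path): # apply root first so descendants override
            visible.update(self.delta[version])
        return sorted((k, x) for k, x in visible.items() if k1 <= k <= k2)


d = NaiveVersionedDict()
d.update("a", 0, 1)
d.update("b", 0, 2)
v1 = d.clone(0)
v2 = d.clone(0)
d.update("a", v1, 10)                  # visible only in v1
d.update("c", v2, 3)                   # visible only in v2
print(d.range_query("a", "z", v1))     # [('a', 10), ('b', 2)]
print(d.range_query("a", "z", v2))     # [('a', 1), ('b', 2), ('c', 3)]
```

Because this sketch lets `clone` be called on any version, it corresponds to the fully-versioned case; restricting `clone` to leaf versions would give the partially-versioned variant. A range query here costs time proportional to all writes on the root-to-version path, which is the kind of cost a real versioned dictionary is designed to avoid.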
2 Related work
The B-tree was presented in 1972 [1], and it survives because it has many desirable properties; in particular, it uses optimal space and offers point queries in optimal O(log_B N) IOs¹. More details can be found in [7].
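To give a feel for the O(log_B N) bound, the sketch below counts how many levels a tree with fan-out B needs before it can address N entries, taking B as the number of entries per block and N as the number of elements. The function name `levels_needed` and the sample values of B and N are assumptions chosen for illustration; the asymptotic bound hides constant factors and fill ratios that this count ignores.

```python
def levels_needed(n, b):
    """Smallest h such that b**h >= n, computed with exact integer arithmetic."""
    height, capacity = 0, 1
    while capacity < n:
        capacity *= b      # one more level multiplies the reachable entries by b
        height += 1
    return height


# Hypothetical sizes: one billion entries.
print(levels_needed(10**9, 1024))  # 3  (1024**3 is just over 10**9)
print(levels_needed(10**9, 2))     # 30 (a binary tree needs ten times as many levels)
```

The comparison shows why a large block size matters in external memory: each level typically costs one IO, so a point query touches a handful of blocks rather than dozens.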
¹ We use the standard notation B to denote the block size, and N the total number of elements inserted. For the analysis, we assume entries (including pointers) are of equal size, so B is the number of entries per block.
A versioned B-tree is of great interest to storage and file systems. In 1986, Driscoll et al. [8] presented the ‘path-copying’ technique to make

References
1(3):173–189, 1972.
5(4):264–275, 1996.
New York, NY, USA, 2007. ACM.
[4] Jeff Bonwick and Matt Ahrens. The zettabyte file system, 2008.
In USENIX Annual Technical Conference, pages 43–60, 1992.
McGraw-Hill Higher Education, 2nd edition, 2001.
In STOC ’86, pages 109–121, New York, NY, USA, 1986
USA, 1999. IEEE Computer Society.
[11] Dave Hitz and James Lau. File system design for an NFS file server appliance, 1994.
SIGMOD Rec., 20(2):426–435, 1991.
Berkeley, CA, USA, 2003. USENIX Association.
