§ Cache oblivious B trees

Central idea: assume a memory model where computation is free, only cost is pulling data from cache into memory. Cache has total size MM, can hold blocks of size BB. So it can hold M/BM/B blocks of main memory. Memory memory has infinite size. Cost is number of transfers. We assume that the algorithm does not know M or B . We assume that the cache replacement strategy is optimal (kick out block that is going to be used farthest in the future). This is an OK assumption to make since an LRU cache using twice the memory of a "oracular" cache performs equally well (citation?) These data structures are cool since they essentially "Adapt" to varying cache hierarchies and even multiple level cache hierarchies. We study how to build cache-oblivious B-trees.

§ Building optimal cache-oblivious B trees to solve search

  • We use a balanced BST. We want to find an order to store nodes in memory such that when we search for an element, we minimize number of blocks we need to pull in.
  • All standard orders such as level order, pre-order, post-order fail.
  • Corrrect order is "VEB (Van Em De Boas) order": carve a tree at the middle level of its edges. Layout a "triangle" or smaller collection of nodes linearly. Then Recursively layout the trees, linearly in memory.
  • Supposedly if the number of nodes is NN, we wil have roughly (N)\sqrt(N)nodes on the top, and then (N)\sqrt(N) triangles at the bottom.

§ Analysis Claim: we need to pull O(logBN)O(\log_B N) blocks for any BB for any search query

NN is the number of nodes in the BST. Note that in the analysis, we know what B is , even though the algorithm does not .
  • We look at a particuar level of recursion. We will call it a "level of detail" straddling B.
  • We will have large triangles of size B\geq B, inside which there are smaller triangles of size B\leq B (reminds me of sierpinski).
  • We know that the algorithm recursively lays it out, and triangle stores everything "inside" it in a contiguous region . So we stop at the requisite size where we know that the tree's triangles themselves contain triangles which fit into the block size.
  • A little triangle of size less than B can live in at most two memory blocks by straddling a block boundary: by eg. having (B1)(B-1) bits in one block and a single bit in another block.
1 2 3 4 5 6 7 8 <- index
|     |       | <- boundary
|-xxxxxxx-----| <-  data
  • The question is that on a root-to-leaf bpath, how many such triangles do we need to visit. Since we repeatedly divide the nodes in half with respect to height until the little triangle has number of nodes less than BB, the height is going to be O(logB)O(\log B) since it's still a binary tree.
  • total height in O(logN)O(\log N).
  • so height of "chunked tree" where we view each triangle as a single node is logN/logB=logBn\log N / \log B = \log_B n.
  • insight : ou data structure construction in some sense permits us to "binary search on BB" since we divide the data structure into levels based on BB. if B=NB = N, then the full data structure fits into memory and we're good.

§ Black box: ordered file maintainince

We need a black box: ordered file maintainance (linked list for arrays)
  • Store nn elements in specified order in an array of linear size O(N)O(N). Array permits gaps.
  • updates: delete element, insert elements between 2 elements.
  • cannot do this in linear time, but we can move elements in an interval of size log2(N)\log^2(N) amortized.
  • We need O(1)O(1) scans for the data structure.

§ Next: dynamic BST (inserts and deletes): layout

we take a VEB static tree on top of an ordered file. Tree is a segtree that has max of nodes. Leaves are the members of the ordered file.

§ Updates

  • search for node.
  • update ordered file.
  • propogate updates into the tree. This will have to be done in post-order because we need the leaves to be fixed before we can update the parent max.

§ Updates: analysis.

  • look at level of detail that straddles BB.
  • Let us look at the bottom 2 levels.
  • Note that when we perform post-order inside a triangle that has 3 triangles of size B\leq B, we need to alternate between parent triangle and child triangle. Since the parent triangle is of size B\leq B and can therefore take at most 2B2B blocks of memory, similarly the child can take at most 2B2Bblocks of memory.
  • So if our cache can hold 44 blocks of memory, we're done. We won't need to kick anything out when performing the post-order traversal.
  • For levels that are above the bottom 2 levels, we're still OK. there are not many triangles! / not many nodes! ( 1:16:00 in the video)

§ References