By the end of this unit, you should be able to
- Explain the roles and functionality of a disk manager
- Explain the roles and functionality of a Buffer Pool
- Explain the LRU and Clock Replacement Policy
- Explain the different collision handling mechanisms of Hash Table
- Explain the structure of a B+ Tree
- Explain the lookup / insert / delete operations over a B+ Tree.
- Disk manager maps tables to data files
- Each data file contains a collection of pages (pages vs block)
- Each page contains a collection of tuples, header and indices.
- Buffer Pool serves as a middle man between DB operation and the physical disk, it is a cache.
- Buffer pool contains a set of frames
The Buffer Pool maintains the following information,
- Page to frame mapping
- Whether a page is pinned. A page is pinned means it is being accessed currently. When a data operation (query/insert/update/delete) is performed it will pin all the required pages. When the operation is completed, the pages are unpinned. Note that in a full scale system, we might see concurrent accesses to pages, thus, in most of the situation, the Buffer Pool maintains a pin counter per slot.
- Whether a page is dirty.
- Extra information to support the eviction policy.
The Least Recently Used (LRU) Policy works as follows,
- pinned frames should not be evicted
- keep track of the timestamp when frame's pin was set to 0.
- evict the page in the frame with the smallest (oldest) timestamp.
In details steps.
Initialize all frames to be empty with pincount 0
- when a page is being pinned
- if the page is already in the pool, increase the pincount, return the page in the frame.
- otherwise,
- find the frames with 0 pincount, find the one oldest timestamp, write it to disk if it is dirty.
- load the new page to this vacant frame, set pincount = 1
- when a page is being unpinned, decrease the pincount and update the timestamp to the current time.
You have 4 frames in the buffer pool. Given this access pattern (buffer is empty at start)
A B C D A F A D G D G E D F
- What is the hit rate if you use LRU policy? Show the final state of the buffer pool. (Hit rate is the number of page hits / total page pin requests)
- Same question, but for MRU policy.
- When would MRU be better than LRU?
The idea behind Bucket Hashing (a.k.a extensible hashing) is to
- store hashed values in buckets (relative small fixed size arrays, so that the sequential search is not expensive). All buckets have the same size.
- use a
$n$ least significant bits (LSB) of the hashed value to decide in which bucket it should be placed. - increase
$n$ and add new bucket gradually as some bucket becomes full.
The Bucket Hashing algorithm starts with a global slot array, a bucket.
It maintains a set of integer values of
To insert a value
- lookup the bucket for
$X$ based on the last$G$ bits of$X$ .- if the bucket
$i$ is not full, insert$X$ there. - otherwise
- if the bucket
$i$ having$L < G$ - add a new bucket
$j$ having same$L$ as$i$ , - redistribute data from
$i$ to$i$ and$j$ . - increase
$L$ for both buckets$i$ and$j$ . - update the slot array
- add a new bucket
- otherwise
- double the slot array
- add a new bucket
$j$ having same$L$ as$i$ - redistribute data from
$i$ to$i$ and$j$ . - increase
$L$ for both buckets$i$ and$j$ . - increase
$G$ - update the slot array
- if the bucket
- if the bucket
Consider an extendible hashing scheme (Bucket Hashing):
- Hash function is the binary representation of the key, e.g. H(3) = 000011
- Starting from an empty table, insert 15,3,7,14
- Draw the final table hash table, including the slot array
- Size of each bucket is 2
A B+ Tree is a generalization of a perfect balanced binary search tree, with the following adjustment
- Let
$d$ denotes the order. - The root node may have
$1$ to$2*d$ entries. - Each non-root node may have
$d$ to$2*d$ entries (half-full). - The each in a node consist of a key and a value, except for the first entry i.e.
$v_1, (k_2, v_2), (k_2, v_2), ... , (k_{n}, v_n)$ . - The values in non-leaf nodes are reference to the children.
- The values in the leaf nodes are reference to records in the data file/page/block.
- Given two consecutive entries
$(k_i, v_i), (k_{i+1}, v_{i+1})$ , where$v_{i}$ is a reference to a subtree, for all values in subtree, their index values must be in between$k_i$ and$k_{i+1}$ . For the first entry, its lower bound is definedby the key coming from the parent. (Similar observation applies to the last entry.) - All the leaf nodes for a doublely linked list, which enables the range search operation.
To search for a value with key
- find the value
$v$ between$k_1$ and$k_2$ such that$k_1 \leq k < k_2$ . Note that$k_1$ might not exist if$k_2$ is the first entry's key,$k_2$ might not exist if$k_1$ is the last entry's key.- if the current node is a non-leaf node, we move the current node to the node pointed by
$v$ , we repeat the process recursively - if the current node is a leaf node, we return the disk data pointed by
$v$ .
- if the current node is a non-leaf node, we move the current node to the node pointed by
It is clear that the time complexity is
To insert a value with key
- find the leaf node
$L$ with the current interval where$k$ should fit, (same as the look-up operation). - insert_to_leaf(
$L$ ,$k$ ) - def insert_to_leaf(
$L$ ,k)- if
$L$ is not full, just insert, we are done! - otherwise
- create a new node
$L'$ . - (update linked list, only for leaf node), let
$L'' = succ(L)$ , then$succ(L) = L'$ and$succ(L') = L''$ . - redistribute entries evenly between
$L$ and$L'$ . - copy up the middle key, with the value as a pointer to
$L'$ , that is to insert a new data entry in the parent node. insert_to_nonleaf(parent($L$ ), middle_key)
- create a new node
- if
- def insert_to_nonleaf(
$N$ , k)- if
$N$ is not full, just insert, we are done! - otherwise
- create a ne wnode
$N'$ . - redistribute entries evenly between
$N$ and$N'$ . - push up (note, not copy) the middle key, with the value as a pointer to
$N'$ .- if
$N$ is not a root node, insert_to_nonleaf(parent(N), middle_key) - otherwise, create an empty node as the new root node, insert_to_nonleaf(parent(N), middle_key)
- if
- create a ne wnode
- if
Given a node
To delete a value with key
-
find the leaf node
$L$ with the current interval where$k$ should fit, (same as the look-up operation). -
delete_from_leaf(
$L$ ,$k$ ) -
def delete_from_leaf(
$L$ ,$k$ )- remove the entry with
$k$ - if
$L$ is at least half-full (i.e.$|L -{k}| \geq d$ ), we are done! - otherwise
- if
$L$ has a sibling$L'$ , and$k'$ is the key from the parent that divides$L$ and$L'$ , such that$|L \cup {k'} \cup L'-{k}| \geq 2*d$ - find the new middle key in
$L \cup {k'} \cup L'-{k}$ , say$k''$ , replace$k'$ by$k''$ in$parent(L)$ - if
$|L \cup {k'} \cup L'-{k}|-1 \geq 2*d$ - re-distribute
$L \cup {k'} \cup L' - {k,k''}$ among the two leaf nodes. - otherwise, re-distribute
$L \cup {k'} \cup L'- {k}$ among the two leaf nodes, i.e.$k''$ is copied up.
- re-distribute
- find the new middle key in
- otherwise
- merge
$L$ with its sibling$L'$ into a new node as$L \cup {k'} \cup L' - {k}$ , where$k'$ is the key from the parent dividing$L$ and$L'$ . - delete_from_nonleaf(
$parent(L)$ ,$k'$ )
- merge
- if
- if
- remove the entry with
-
def delete_from_nonleaf(
$N$ ,$k$ )- remove the entry with
$k$ - if
$N$ is a root node and$|N -{k}| > 0$ , we are done! - if
$N$ is a root node and$|N -{k}| == 0$ , we remove$N$ entirely. - if
$N$ is not a root node and$N$ is at least half full, we are done. - otherwise
- if
$N$ has a sibling$N'$ , and$k'$ is the key from the parent that divides$N$ and$N'$ , such that$|N \cup N' - {k}| \geq 2*d$ ,- find the new middle key in
$N \cup {k'} \cup N'$ , say$k''$ , replace$k'$ by$k''$ in$parent(N)$ , redistribute$|N \cup N' - {k}|$ among the two nodes.
- find the new middle key in
- otherwise
- merge
$N$ with its sibling$N'$ into a new node as$N \cup {k'} \cup N' - {k}$ , where$k'$ is the key from the parent dividing$N$ and$N'$ . - delete_from_nonleaf(
$parent(N)$ ,$k'$ )
- merge
- if
- remove the entry with
Given the following B+ Tree with
- Draw the tree after inserting 13, 15, 18, 25, 4 then deleting 4, 25, 18, 15, 13
- What did you observe?