1 Introduction
The operation of B-Trees is an important aspect of database systems;
virtually every commercial system includes indexes based on
B-Trees. In the past, cs186 has included a B-Tree assignment that
involved implementing a B-Tree index for the Minibase database
system. Creating such an index was technically quite difficult,
since the index needed to be stored in pages on disk that were moved
into memory when used. Luckily, your task will be considerably
simpler: implementing the "search", "insert" and "delete" functions for
an in-memory B-Tree that uses Integer keys.
2 B-Trees
The structure and behavior of B-Trees is described in Chapter 10 of the
textbook, and also in the lecture notes for
Lecture 17, which covered
tree-structured indexes. The textbook even includes a pseudo-code
describing the process of search, insert, and delete. I recommend
that you review these materials to make sure you understand how these
operations work.
3 The B-Tree Program
In class I demonstrated a B-Tree program I wrote in Java. The UI
appeared as follows:
This program begins with an empty B-Tree, and allows the user to add
and delete values. It also keeps track of the tree at each stage,
so the user can see how the tree changes with insertion and
deletion. In the tree display, leaf nodes are drawn blue, and the
non-leaves as black. When started from the command line, this
program accepts an integer argument indicating the maximum number of
keys permitted per node. In the example above, nodes may have no
more than two keys per node.
4 The Assignment
For this assignment, I will provide you with a Java .jar file
Btree.jar containing most of the classes required
to run the above program. All that will be missing is the Node
class, in which is implemented logic for search, insertion, and
deletion. The Node class is a subclass of AbstractNode, which
contains members for keeping track of the keys and children for each
node. A javadoc description of AbstractNode is available
here. I am also providing sample
java code for
Node.java containing stubbed out
versions of the search, insert, and delete methods.
All the necessary files may be found in
~cs186/fa03/Hw4/Hw4.tar.gz. Copy this file and untar it in an
appropriate directory. 'cd' to this directory. You can
compile the Node.java file with the following command:
javac -classpath
".:./BTree.jar:$CLASSPATH" Node.java
Once you have compiled Node.java, you can run the program with the
following command:
java -classpath
".:./BTree.jar:$CLASSPATH" BTreeMain 3
To run the above program allowing 3 keys per node.
5 Implementation Details
For search and insert, you should be able to follow the algorithms as
described in the text book.
For insert, do
not do any
redistribution of keys. When a node gets too full, just split
it. As described in the book, when leaves split, you copy a key
up, and when non-leaves split, push a key up.
For delete, in order to be consistent with the grading standard, you
must follow the following
- after deletion, if a node has too few keys, always try redistribute keys from
the sibling to the left. If there is no sibling to the left, use
the sibling to the right. If there are not enough keys to
redistribute, then merge the two. For example, if I delete the
key "1" from the following tree (which permits only 2 keys per node):
it should redistribute the key "2" from the right child to the left
child, since there are enough keys in this. Remember, the default
behavior is to try to redistribute keys from a sibling to the left, if
any. In the above case, there is no sibling to the left, so the
sibling to the right was used. And by definition, every node in
the tree has at least one sibling.
- when redistributing keys from a sibling, always only redistribute
one key. The book talks about redistributing the keys evenly
between the two nodes, but I want you to only redistribute one
key.
For example, in the following tree (which allows 4 keys per node),
if I delete key "2" , it should redistribute only one key, key
"3" from the right to the left.
Note: when the number of keys per node N is odd, then splitting a
non-leaf node can result in one node with slightly more than N/2 keys,
and one with slightly less than N/2. For example, if N is 7, and
a non-leaf node is overfull, then splitting it and pushing up the
middle key results in one node with 3 keys, and one one with 4.
This is one case where non-leaf
nodes are allowed to have fewer than N/2
keys. The other case is the root, which can always have any
number of keys.
I am providing code that should work properly with Java 1.2, 1.3, or
1.4. You probably will be able to work on the assignment on any
machine running java. Your code will be compiled and tested on
Pentagon, however, so you must make sure that it works correctly there.
6 How to test your program
Your code must be general enough to work with any number of keys per
node (greater than one, of course). You should test your program
with different numbers of keys per node. Make sure that your code
correctly handles
insertions that cause splits at any level of the tree, as well as
deletions that cause redistribution from right sibling to left sibling,
left sibling to right sibling, or sibling merges. Make sure you
check each of these for both leaf cases and non-leaf cases.
The program framework should help your debugging, since you can see the
state of the tree after each step. In addition, the "Undo" and
"Redo" buttons allow you to run an operation again, which is useful if
you are running in a debugger and want to set breakpoints.
7 What to hand in
You will only need to hand in the Node.java file. To
evaluate it, we will compile it and run it with our own test program
that will create trees, inserting and deleting elements.