Lab 6: Binary Search Trees
Due on Sunday, October 30th at 11:59 PM. This is a team lab. You and your assigned lab partner(s) will complete this lab together. Make sure that you are familiar with the Partner Etiquette guidelines. You may discuss the concepts of this lab with other classmates, but you may not share your code with anyone other than course staff and your lab partner(s). Do not look at solutions written by students other than your team. If your team needs help, please post on the Piazza forum or contact the instructor or ninjas. If you have any doubts about what is okay and what is not, it’s much safer to ask than to risk violating the Academic Integrity Policy.
Overview
For this lab, you will implement binary search tree (BST) data structure. You will then write a series of unit tests to confirm that your data structure works correctly.
Starting Code
Your starting code can be found in the appropriate repository for your team. The following is a description of the repository’s initial comments; bolded files are those which you may need to change.
adts/
: As with your last lab, the ADT interface declarations are stored here. Reference implementations of previously-covered data structures (like lists) are stored in the subdirectoryimpls
.linkedBST.h
/linkedBST__definitions.h
: The BST class you will be implementing.linkedBSTNode.h
: A simple node class to use in your BST implementation.linkedBSTNodeFunctions.h
/linkedBSTNodeFunctions__definitions.h
: A set of recursive functions on trees of nodes.main.cpp
: A small example of the use of a BST. This program counts the number of occurrences of each word in a text file. It then prints the results using a traversal of the tree.tests.cpp
: The unit testing program you will use to confirm that your BST is correct.
The vector
STL Class
The vector<T>
class, found in the vector
library, is the C++ STL implementation of an array list. It is used somewhat differently from our List<T>
interface. Here’s a handy translation guide:
Operation | List code |
vector code |
---|---|---|
Insert at front | myList.insertAtHead(value) |
no simple equivalent |
Insert at back | myList.insertAtTail(value) |
myVector.push_back(value) |
Determine size | myList.getSize() |
myVector.size() |
Get element by index | myList.get(index) |
myVector[index] |
Set element by index | no equivalent | myVector[index] = value |
Remove from back | myList.removeTail() |
myVector.pop_back() |
One other difference is that the pop_back
method is void
; you must retrieve the last element yourself if you want to see that element before it is removed.
The primary reason that we introduce vector
for this lab is because vectors can also be copied just like pair
s. Consider the following code:
List<int>* listPointer1 = new STLList<int>(); // create a pointer to a new List
List<int>* listPointer2 = listPointer1; // copy the pointer
listPointer1->insertAtTail(4); // add an element
cout << listPointer2->getSize() << endl; // prints 1; they point to the same List
List<int> list1; // create a statically-allocated list
List<int> list2 = list1; // illegal! Lists doesn't know how to copy themselves!
vector<int> vector1; // create a statically-allocated
vector1.push_back(4);
vector1.push_back(6);
vector<int> vector2 = vector1; // vectors do know how to copy themselves
cout << vector2.size() << endl; // prints 2 (since both elements were copied)
vector1[0] = 2; // assigns 2 to the first position of vector 1
cout << vector2[0] << endl; // prints 4; vector 2 is a different vector
Some methods of the BST
ADT interface use vector
to prevent you from having to worry about memory management (e.g. delete
) for the values they return.
Implementing a Binary Search Tree
Most of the work of this lab is implementing the binary search tree. Your implementation is spread across two files: linkedBST__definitions.h
(which contains the implementation of the LinkedBST
class’s methods) and linkedBSTNodeFunctions__definitions.h
(which contains helper functions for those methods to use). The LinkedBST
class is a template class with two templated type names: K
(the type of the keys in the BST) and V
(the type of the values). The LinkedBST
class itself tracks only two pieces of data:
size
, which contains the number of nodes in the BSTroot
, which contains theLinkedBSTNode
representing the root node of the BST
A single node stores information about its key, its value, and its left and right children. If a node does not have a child, then the corresponding pointer variable will be set to NULL
.
Many of the methods in the binary search tree are natural to implement recursively. For this reason, you will write a collection of recursive helper functions for your LinkedBST
methods to call. For instance, the height of a node is equal to one more than the largest of the heights of its two children. (The max
function can be found in the C++ algorithm
library.) Many of the public methods of LinkedBST
will be single-line calls to these recursive functions to begin the operation at the BST’s root node.
Traversals
The LinkedBST
class must provide implementations of traversal methods: methods which allow us to see every piece of data in the tree. A traversal is given by a list of key-value pairs; we will represent such a pair as a pair
object.
The BST
interface supports four different kinds of traversal:
- A pre-order traversal puts the current node first, followed by its left subtree and then its right subtree. For the example here, the pre-order traversal would be the sequence
F
,B
,A
,D
,C
,E
,H
,G
,I
. - A post-order traversal instead puts the current node after the left and right subtrees. The post-order traversal of this example is the sequence
A
,C
,E
,D
,B
,G
,I
,H
,F
. - An in-order traversal puts the current node between the left and right subtrees. The in-order traversal of this example is the sequence
A
,B
,C
,D
,E
,F
,G
,H
,I
. (The fact that this is sorted is not a coincidence!) - A level-order traversal reads the nodes from left to right and top to bottom, the same way we read English test. For this tree, the level-order traversal is
F
,B
,H
,A
,D
,G
,I
,C
,E
.
The first three forms of traversal are quite similar and are natural to define recursively. The level-order traversal is somewhat different and requires a breadth-first search approach similar to that which you used in the previous lab.
Wikipedia has some nice pictures of these traversals for reference. Note that your program is adding the elements to a list rather than “displaying” them as described on that Wikipedia page.
Testing the Binary Search Tree
In addition to implementing the LinkedBST
class and the helper methods, you must write unit tests for them as well. To get you started, some of the tests in tests.cpp
have been completed for you. For each remaining test, a comment containing the string “TODO
” has been provided. You must write one test for each of these comments that tests the appropriate method or function in a non-trivial way. For the recursive functions, you must provide at least one test that utilizes recursion; for BST
methods, you must provide a test that operates on a tree of height greater than one. For example, consider the following test:
TEST(negativeHeight) {
LinkedBSTNode<int,string>* tree = NULL;
CHECK_EQUAL(-1, subtreeHeight(tree));
}
If you were required to write a test for subtreeHeight
, this test would not suffice; it does not trigger recursion within the subtreeHeight
function. (A test for subtreeHeight
has already been provided in your lab.) You are welcome to include the above test (or tests like it) in your lab submission, but make sure that each of the functions and methods has at least one non-trivial test.
Please remember that these tests are a development tool as well as a lab requirement. You should write these tests as soon as possible and use them to help you find bugs and verify the correctness of your code. You may even wish to write the tests before you write your implementation just so you can make certain that you know what you expect your code to do. If you use manualTests
to figure out what’s happening in your code, you might also consider copying that code into tests.cpp
once you figure out what you expect from it.
Implementation Strategy
This lab involves writing more code than previous labs, so it is useful to plan an approach to your work. Below is a strategy which should organize your efforts; you are not required to proceed in this order, but it might help.
LinkedBSTNode
functions- Begin by writing an implementation of the
subtreeFind
function. Make sure to readlinkedBSTNodeFunctions.h
for a description of exactly what that funciton should do. - Run
make tests
and./tests
to determine if yoursubtreeFind
function works correctly. You will not be able to make the test pass since you haven’t yet writtensubtreeDelete
. However, the test should fail when callingsubtreeDelete
and not produce any other errors. - Implement
subtreeDelete
. Then, run./tests
to see if thesubtreeFind
test passes. If it does not, debugsubtreeDelete
until it does. - Repeat steps 1 and 2 for
subtreeHeight
. - For each of the remaining
LinkedBSTNode
helper functions, write an implementation and a test. You may wish to write the test before the implementation; it’s okay if you write it afterwards. Complete each of these helper functions one at a time, carefully referring tolinkedBSTNodeFunctions.h
.- Pay special attention to
subtreeInsert
andsubtreeRemove
, as they are required to create ordelete
a single node (respectively) unless an exception is thrown.
- Pay special attention to
- Begin by writing an implementation of the
LinkedBST
methods- As with
LinkedBSTNode
, begin by writing implementations for the existing tests. You should start with the constructor, destructor,insert
, andget
methods. All of these routines must work correctly for the first BST test to pass. - Once this test passes, write code for the provided tests of small methods (
getSize
andisEmpty
). Do not proceed until your tests for these methods pass. - Implement the methods of
LinkedBST
as you did the helper functions above: write and test one method at a time. These methods will generally require less code than the recursive helper methods; many of them just call the helper methods withthis->root
as one of the arguments.
- As with
- Debug any memory leaks as described below.
Once you are finished, you should be able to compile and run the word counting program. For your convenience, a few text documents have been provided in the test_data
directory.
Memory
For this lab, your program is required to run without memory errors or leaks. You should use valgrind
as you proceed in your testing to track memory errors. When you have a complete first draft of your implementation:
- Commit and push what you have (in case something goes wrong).
- Run
valgrind ./tests
to make sure that the test program does not cause memory errors. - Once memory errors are cleared, run
valgrind --leak-check=full ./tests
to identify and correct memory leaks.
Coding Style Requirements
You are required to observe some good coding practices:
-
You should pick meaningful variable names.
// Good int* pixels = new int[size]; // Bad int* p = new int[size];
-
You should use correct and consistent indentation. Lines of code within a block (that is, surrounded by
{
and}
) should be indented four spaces further than the lines surrounding them.// Good if (condition) { cout << "Test" << endl; } // Bad if (condition) { cout << "Test" << endl; }
-
You should use a block whenever possible, even if it’s not necessary. This helps you avoid subtle or messy bugs in the future.
// Good if (condition) { cout << "Something" << endl; } // Bad if (condition) cout << "Something" << endl;
-
Any new methods or fields in your header files should have comments explaining their purpose and behavior. You are permitted to omit documentation for methods that are inherited from other classes; that is, if your class has a
foo
method because its superclass has afoo
method, you don’t need to document that method.// Good public: /** * Saves the image represented by this object to a file. * @param filename The name of the file to save. * @return The number of pixels in the saved image. * @throws runtime_error If the image could not be saved due to an I/O error. */ int save(std::string filename); // Bad public: int save(std::string filename);
Your method/field documentation does not have to be in the format above, but you must describe the method’s behavior, its parameters, its return value, and any exceptions that it may throw. (If you’re indifferent, the above syntax is a good one to know; it’s a de facto standard used by Javadoc, Doxygen, and other tools that automatically process source code comments into other formats like searchable webpages.)
Peer Review
As this is a team assignment, you are required to complete a peer review of your lab partner or partners. You must do this even if everything is fine. You should complete this peer review after you have turned in your assignment. The peer review comes in the form of a simple Google Form and can be accessed here; in most cases, it will likely take less than a minute to complete. If you have trouble accessing this peer review, please make sure that you are logged into Google using your Swarthmore credentials.
Your peer review will be kept private; only course staff will have access to this information. You can also update your peer review after you have submitted it. However, if you do not submit a peer review, you will lose participation grade points. Please don’t forget!
Summary of Requirements
When you are finished, you should have
- A completed implementation of
LinkedBST
and the helper methods - Appropriate tests for these classes
- No memory errors!
- No memory leaks!
- Code which conforms to the style requirements above
- A completed peer review