The final exam for this course is comprehensive: it will contain material from throughout the semester, albeit with a slight emphasis on more recent material. This study guide contains only material from after Test 3. To prepare for the final exam, please consult the following problems as well as the older study guides (listed below).
What is the difference between a queue and a priority queue?
What does it mean for a binary tree to be complete?
Consider the following lists of numbers. Write each as a complete binary tree using the representation we discussed in class.
[2, 3, 9, 3, 6, 2]
[11, 4, 4, 3, 4, 6, 9, 10, 2]
[9, 12, 3, 14, 2, 1, 6, 8, 8, 8, 9, 14, 7]
Give pseudocode for the bubbleDown heap algorithm.
Give pseudocode for the heapify heap algorithm to turn these complete binary trees into heaps. Briefly explain why this algorithm is \(O(n)\).
A priority queue has a priority type as well as an element type. When elements are enqueued into a priority queue, they are paired with a priority (in contrast to a queue, which simply requires the element). In a queue, elements are dequeued in the order that they are enqueued; in a priority queue, they are dequeued in an order based upon their associated priorities.
A binary tree is complete when all levels but the last level are full and when the nodes in the last level are left-aligned.
This algorithm is \(O(n)\) because, although the loop runs \(O(n)\) times and bubbleDown takes \(O(\log n)\) worst-case time, most bubbleDown operations are short: most nodes are closer to the leaves of the tree than to the root. As a result, the total number of swaps that actually occur is \(O(n)\).
Hash Tables
What is a hash? What is a hash collision?
We discussed two techniques for resolving hash collisions in class: linear probing and chaining. The technique used by a HashTable affects its structure.
Briefly explain the difference between these hash collision resolution techniques.
Give a C++ data type which could be used as the contents of each type of hash table. (You may use pair and other data types we discussed in class. You may also use a pair within a pair or assume the existence of a type triple for three elements.)
What is load factor? How is it used in a hash table?
What is the worst case time for inserting an element into a hash table? How does this worst case time relate to the hashing function?
Suppose you have an empty chaining hash table with three buckets and a maximum load factor of 0.75. Assume that keys and values are integers and that the hash function is the identity function; i.e., hash(5) is 5. We will insert the mappings \(4 \mapsto 8\), \(1 \mapsto 5\), \(0 \mapsto 6\), and \(6 \mapsto 2\). Draw the hash table after each addition. Assume that ensure_capacity doubles the number of buckets before rehashing.
A hash is a function which transforms data (e.g., a dictionary key) into uniform data such as an integer. These functions are often lossy (not injective), which means that two different inputs may have the same hash value; when this happens, it is called a hash collision.
In linear probing, each hash value is assigned a bucket which holds at most one value and collisions are resolved by moving on to the next bucket. In chaining, each bucket may hold many values. A linear probing hash table might store all of its buckets as a pair<pair<K,V>,bool>*, where the boolean indicates whether a bucket is in use or not. A chaining hash table might store its buckets as a vector<pair<K,V>>* where each bucket is a vector containing all keys of the same hash.
The load factor of a hash table is the number of mappings it stores divided by its number of buckets. The load factor is used to decide when to allocate more buckets so that mappings stay spread out between buckets.
In the worst case, an element insertion takes \(O(n)\) time; this occurs if all of the mappings have the same hash. The likelihood of this occurring is based on the quality of the hash function used with the table.
Empty hash table:
    0: [ ]    1: [ ]    2: [ ]
After \(4 \mapsto 8\) (bucket \(4 \bmod 3 = 1\)):
    0: [ ]    1: [4→8]    2: [ ]
After \(1 \mapsto 5\) (bucket \(1 \bmod 3 = 1\)):
    0: [ ]    1: [4→8, 1→5]    2: [ ]
After \(0 \mapsto 6\), which causes a growth (a third mapping would raise the load factor to 1.0, above 0.75), doubling to six buckets and rehashing:
    0: [0→6]    1: [1→5]    2: [ ]    3: [ ]    4: [4→8]    5: [ ]
After \(6 \mapsto 2\) (bucket \(6 \bmod 6 = 0\)):
    0: [0→6, 6→2]    1: [1→5]    2: [ ]    3: [ ]    4: [4→8]    5: [ ]
Graphs
Graph for Question 3
What is a graph?
Define the following terms in relation to graphs:
Weighted
Directed
Connected
Consider the graph on the right.
Which path would a breadth first search find from vertex A to vertex G?
Which path would Dijkstra’s algorithm find from vertex A to vertex G? Why is this path different from the BFS path above?
Ignoring the weights on this graph, perform a topological sort of it. Write in order each of the marks you perform (e.g. mark E in-progress, mark G in-progress, mark G complete, etc.) and give the resulting topological sort as a list of vertices.
The graph data structure you used in class stores the edges of the graph in a dictionary. Edges may be stored in a variety of different ways, so let us consider the adjacency matrix representation. Instead of a dictionary, we will store a two-dimensional array of weight values of size \(|V|^2\): that is, there is a row and a column for each vertex. If an edge exists from vertex A to vertex B, then the cell in row A and column B will contain its weight; otherwise, that cell contains a zero (for simplicity). Answer the following questions, using \(|V|\) for the number of vertices in the graph and \(|E|\) for the number of edges.
What is the worst-case time complexity of adding an edge to this graph?
What is the worst-case time complexity of producing a list of edges in this graph?
What is the worst-case time complexity of determining the out-degree of a vertex?
Suppose that we copy our edge data into a new matrix of size \((|V|+1)^2\) each time a new vertex is added. What is the time complexity of addVertex in this implementation?
Consider the strategy that ArrayList uses to amortize the cost of copying data; a similar approach can be used here. Describe how the addVertex algorithm can be improved by allocating more space than is immediately necessary. The amortized worst-case time complexity of adding a new vertex should be \(O(n)\). Explain why.
A graph is a pair of sets. The first set, \(V\), is a set of vertices of any type. The second set, \(E\), is a set of edges. An edge is a grouping between a source vertex, a target vertex, a weight, and a label.
A graph is weighted if its edges have weights / have meaningful weights / have non-uniform weights. A social network, for instance, is unlikely to be weighted, while a map probably is (where weight is distance, difficulty of traveling, etc.).
A graph is undirected if, for each edge heading in one direction, there is a matching edge (same endpoints and weight) heading in the opposite direction; a graph is directed otherwise.
A graph is connected if, for each vertex in the graph, there is a path to every other vertex in the graph. A graph is weakly connected if its undirected equivalent is connected.
A breadth-first search would find the path [A, D, G].
Dijkstra’s algorithm would find the path [A, C, D, E, G]. This is different from breadth-first search because the sum of the weights of this path is less than the sum of the weights of the path with the fewest edges.
Topological sort may produce the sequence [A, B, F, C, D, E, G] via the following steps. (Note that your sequence may differ.)
Mark A in-progress
Mark C in-progress
Mark E in-progress
Mark G in-progress
Mark G complete
Mark E complete
Mark D in-progress
Mark D complete
Mark C complete
Mark B in-progress
Mark F in-progress
Mark F complete
Mark B complete
Mark A complete
The worst-case time complexity for adding an edge to the graph is \(O(1)\); the only action necessary is indexing into an array.
The worst-case time complexity to find all edges in the graph is \(O(|V|^2)\), since we have to look at every cell in the matrix to find all of the edges.
The worst-case time complexity of determining a vertex's out-degree is \(O(|V|)\), since we must examine each vertex individually to determine if an edge exists to it.
The complexity of this copy is \(O(|V|^2)\), proportional to the number of elements copied from the old matrix.
The addVertex routine can instead copy into a matrix of size \((2|V|)^2\), similar to how an ArrayList doubles its own capacity. This operation still copies \(O(|V|^2)\) elements, but it happens far less often. In particular, assuming a starting capacity of 2 (and thus a 2x2 matrix), this doubling strategy yields the following numbers of copied cells for successive calls to addVertex: \(0, 0, 2^2, 0, 4^2, 0, 0, 0, 8^2, 0, 0, 0, 0, 0, 0, 0, 16^2, \ldots\). This is less than \(\sum\limits_{i=1}^{\log_2 n} (2^i)^2 = \dfrac{4}{3}(n^2-1) \in O(n^2)\) copies for \(n\) operations, or amortized \(O(n)\) worst-case time per call. Intuitively, each matrix copy takes more time than all of the previous copies combined, so \(i\) calls to addVertex result in fewer than \(2 \cdot (2^{\lceil\log_2 i\rceil})^2 \leq 8 i^2\) copies.