CS35 Homework #8:

Hash Tables and Potholes

Due 11:59pm Tuesday, 4 December 2007

This assignment consists of two unrelated parts. You may work with a partner on this assignment if you wish.

Hash maps with separate chaining
Implement the Map interface using separate chaining as described in section 9.2.5 of the text. Use the code in labs/Nov-12-Mon as a starting point. At least one of your constructors should support a floating point load factor and a initial capacity as parameters. Your implementation should double the array size and rehash the keys when the load capacity is exceeded. You implementation should include a method collisions() that returns the total number of collisions observed in the hash table. To test your implementation, build Maps on the words in /usr/share/dict/words using a load factor of 0.5, 0.8, 0.9, and 0.95. Note the total number of collisions and the average number of collisions per word for each word. How do your results compare to the HashTableMap that uses linear probing? Compute a histogram of the chain lengths in each bucket in the final hash table for each load factor. An example histogram is shown below. This histogram is completely made up and may not look at all like the actual distribution.
length     freq (%)
0          20.00
1          40.00
2          30.00
3           5.00
4           2.00
5           1.00
6           1.00
7           0.50
9           0.25
10          0.24
15          0.01
Summarize the results of your experiments in a README file. Is hashing a viable alternative to tree-based searching? Is chaining better or worse than linear probing?
Potholes
This problem appeared as a problem in the 2007 ACM Mid-Atlantic regional programming contest. Suppose you are given a rectangular parking lot filled with a number of non-overlapping circular potholes. A contest will compare the skills of two competing pothole filling contractors. For the contest, the city government wants to divide the lot into two rectangular sections (either horizontally or vertically) that divide the area of the potholes in each section as evenly as possible. The dividing line must not cross any pothole. Pothole may be tangent to other potholes, the dividing line, or the boundary of the parking lot.

Your program should read input from standard in (similar to the maze homework assignment). The input is formatted as follows:

An example is shown below:
16.0 12.0
1.0 1.0 0.8
8.0 6.0 2.0
3.0 5.0 1.0
3.0 9.0 1.0
0.0 0.0 0.0

Your output should consist of the two endpoints (x1, y1, x2, y2) of a horizontal or vertical line that divides the potholes as evenly as possible. Design and implement a solution to this problem. As a catch, your solution must run in O(n lg n) time. A O(n^2) solution is a good place to start, but think about how you may improve your solution. It may help to sort something according to some criteria, but I'll let you figure out the details.

If you want to test your code on some example data sets, I have posted a few in the hw8 directory. To use them, you should be able to run java TestPothole < ptest1. I have also included some python code that will graphically view a test file using python pothole_viewer.py < ptest1. If you want to make your own potholes, I also have a pothole_maker.py file that you can experiment with. For a few tests, I have included an image of the best solution lines in the vertical an horizontal direction. See ptest1.png, ptest2.png, or ptest5.png

Submitting
Along with your Java source code, you should hand in a README file, and your Makefile if you used one. These files will be imported automatically via handin35.