CS21 Lab09: ZIP Codes

Due 11:59pm Tuesday, November 17, 2009

For this program you may work with one partner (you may not work in groups larger than two).

If you work with a partner:

The best way to work with a partner is to work together in the lab on all parts of the program together. You can work together on a copy of the code in one partner's cs21/labs/09 subdirectory and then email the file as an attachment to the other partner so that he/she can save a copy of your joint solution in his/her cs21/labs/09 subdirectory too.

Run update21, if you haven't already, to create the cs21/labs/09 directory. Then cd into your cs21/labs/09 directory and create the python programs for lab 09 in this directory (handin21 looks for your lab 09 assignments in your cs21/labs/09 directory). You should begin by copying your assignment from lab 08 into your lab 09 directory as shown below

$ update21

$ cd cs21/labs/09
$ cp ../08/zipcodes.py .
$ vim zipcodes.py
Make sure that you are editing the copy of zipcodes.py file in your labs/09 directory and not the copy in labs/08

If you did not finish all of lab 08, you may import a sample solution that implements the functions described instead of copying your solution from last week. Add from lab08 import * to the top of an empty zipcodes.py file and proceed from there. You may want to consult the online documentation for the lab08 library to see which functions are available.

This lab has some optional components that allow you to further practice your skills in Python. Optional portions will not be graded, but may be interesting for those wanting some extra challenges.

ZIP Code Database
What US city has the ZIP code 12345? What is the ZIP code for Truth or Consequences, NM? How far is it from Fairbanks, AK to Miami, FL? What US city with a population over 100,000 is closest to Minot, ND? Your assignment is to edit the file zipcodes.py to create a program that answers these types of questions.

Instead of using the file /usr/local/doc/bigzips.txt from the previous lab, you should begin by reading the file /usr/local/doc/zipcodes.txt which contains the extra data you need to write your program. This file is identical in format to bigzips.txt, but has 41824 entries which are already presorted by zip code. Thus you do not need to sort this larger file (with selection sort it would take minutes anyways). Each line of the file contains seven fields separated by commas representing the ZIP code, latitude, longitude, city name, county name, state, and population, respectively. The entry for Swarthmore is shown below:


Your program should prompt the user for two cities. The user can lookup a city by ZIP code or by a city name and its two letter state abbreviation. For each city, display the following information:

In addition to the information above, display the distance between the two cities the user typed.

A sample run is shown below:

This program creates a database of information about cities
throughout the United States.  It allows you to enter two
cities (either by zip code or by city and state) and reports
the distance between them.
Enter a zip code or city, state> Swarthmore, PA
 City, State, ZIP:  Swarthmore, PA, 19081
              County:  Delaware
        Population:  9907
         Lat / Long:  39.90 / -75.35
Closest big city:  Philadelphia, PA (6.1 miles)

Enter a zip code or city, state> 59453
 City, State, ZIP:  Judith Gap, MT, 59453
              County:  Wheatland
        Population:  307
         Lat / Long:  46.68 / -109.64
Closest big city:  Billings, MT (77.7 miles)

Distance between Swarthmore, PA and Judith Gap, MT is 1771.9 miles

Check more locations (yes/no)? yes
Enter a zip code or city, state> San Francisco, CA
 City, State, ZIP:  San Francisco, CA, 94101
              County:  San Francisco
        Population:  776733
         Lat / Long:  37.78 / -122.73
Closest big city:  Daly City, CA (15.2 miles)

Enter a zip code or city, state> Boston, MA
 City, State, ZIP:  Boston, MA, 02101
              County:  Suffolk
        Population:  359585
         Lat / Long:  42.37 / -71.03
Closest big city:  Cambridge, MA (2.9 miles)

Distance between San Francisco, CA and Boston, MA is 2709.5 miles

Check more locations (yes/no)? no

If you cannot find an entry for a particular city or ZIP code, inform the user that you cannot find that location and prompt them to enter another location.

You should practice good top-down design, incrementally implement and test your solution, and document your code with comments. You should use the functions you wrote in lab 08, but you will need to write additional functions as well. While much of the design is up to you, the requirements below are designed to avoid some headaches in the initial design.
  1. If you read the file zipcodes.txt sequentially, the lines are sorted by increasing zipcode. You must preserve this order in you list of all locations.
  2. When searching for a location by ZIP code, you must use binary search to find the location in the list of all locations. Use your function zipSearch from the previous lab
  3. Note that when you prompt the user to enter a location, the user can enter either a zipcode or a city, state. Your program must determine if the user input looks like a zipcode, a city/state pair, or garbage and use the appropriate search. Also, a user could enter an invalid zipcode but finally enter a valid location using a city, state pair. See the sample output for some examples.
  4. Be careful when reporting the closest city to a given city. Do not report the same city. For example, the city with a population over 100,000 closest to Philadelphia, PA, should not be Philadelphia, PA, but some other city. This is further complicated by the fact that many big cities have multiple ZIP codes, when reporting the closest big city, ensure that the city names are in fact different.

In addition to the requirements above, we have included some tips and problems to watch out for below.

ZIPs as Strings
While you will be doing actual numerical computations with latitudes, longitudes, and populations, you do not need to numerically compare ZIP codes (It is OK to do lexicographic string comparisions of ZIPs in the binary search). Furthermore, ZIP codes beginning with zeroes will be truncated if interpreted as integers, e.g., 01238 will be interpreted as 1238. Store and print ZIP codes as strings instead of numbers and you will save yourself a number of headaches.

Reporting ZIP codes

Some large cities (Philadelphia, PA, for example) have multiple ZIP codes. When asked to report a ZIP code for a city, you can decide how to handle this. Possible options include reporting all ZIP codes for a particular city, reporting the first ZIP code you find for that city, reporting one example ZIP code and stating there are other possibilities, or designing a method of your own choosing. You must report at least one ZIP code for a city if it is in the list of known cities.

Computing Distances

You should use the "great circle" distance formula to compute the distance between two cities. If the first city is at a geographic location (lat1, long1), and the second city is at (lat2, long2), then the distance between the two cities is given by the formula:

D = R acos( sin(lat1)sin(lat2) + cos(lat1)cos(lat2)cos(long2-long1))

Where R is the radius of the Earth (6371 km or 3963 miles), and acos is the inverse cosine. You can get this function by importing the math module in python. Note that each lat/long must be converted from degrees to radians before computing a sine or cosine. The formula for this conversion is

rad = deg*pi/180 
or use the python function radians imported from the math library.
Missing Data

This data file we have provided you is by no means complete. Some ZIP codes were missing or did not have lat/long data and were removed. For some cities, we did not have population data, so we set the population arbitrarily to 0. If your favorite US city or hometown in the US is missing and you know all the info (ZIP, city name, county name, state, lat, long, and population), let us know and we will be happy to add a few cities, but we are not trying to maintain a comprehensive list.

Optional Components
As noted above, these questions are NOT required to receive full credit. Furthermore, do not attempt to solve these problems until the required portion of the assignment is complete. There are many extensions you could add to this program. We posted a few we thought might be interesting.

Once you are satisfied with your program, hand it in by typing handin21 in a terminal window.