CS21 Lab 9: zipcodes

Due 11:59pm Tuesday, Nov 20, 2012

Run update21, if you haven't already, to create the cs21/labs/09 directory. Then cd into your cs21/labs/09 directory and create the python program for lab 09 in this directory (handin21 looks for your lab 09 assignments in your cs21/labs/09 directory).

$ update21
$ cd cs21/labs/09
$ vim zipcodes.py
ZIP Code Database
What US city has the ZIP code 12345? What is the ZIP code for Truth or Consequences, NM? Your assignment this week is to create a program (zipcodes.py) that allows the user to explore a zipcode database.

We have a file, /usr/local/doc/zipcodes.txt, which contains zipcode data for most of the United States (don't copy this file to your 09 directory; just use "/usr/local/doc/zipcodes.txt" in your python program). Each line of the file contains seven fields separated by commas:

  1. ZIP code
  2. latitude
  3. longitude
  4. city name
  5. county name
  6. state
  7. population
The entry for Swarthmore is shown here:

19081,39.897562,-075.346584,Swarthmore,Delaware,PA,9907
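
As a rough illustration of the seven-field layout, here is one way a single line could be split and converted. This is only a sketch: the function name parse_line is made up for this example, and it assumes no field contains a comma of its own.

```python
def parse_line(line):
    """Split one comma-separated line into the seven fields."""
    fields = line.strip().split(",")
    zipcode = fields[0]             # kept as a string so a leading zero survives
    latitude = float(fields[1])
    longitude = float(fields[2])    # "-075.346584" converts to -75.346584
    city = fields[3]
    county = fields[4]
    state = fields[5]
    population = int(fields[6])
    return [zipcode, latitude, longitude, city, county, state, population]

# the Swarthmore entry shown above
entry = parse_line("19081,39.897562,-075.346584,Swarthmore,Delaware,PA,9907\n")
print(entry)
```

Reading the whole file would then amount to calling something like this once per line and appending each result to one big list.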

If you want, you can view the file with less:

$ less /usr/local/doc/zipcodes.txt

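Your program should prompt the user for either a zipcode or at least part of a city name. Depending on what the user types, your program should do one of the following:

  1. If the input is a 5-digit zipcode, display the entry for that zipcode, list the largest cities in the same state, and draw the state in a graphics window.
  2. If the input is a city name (or the start of one), display every entry whose city name starts with that text.
  3. If the input is a blank line, end the program.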

A sample run is shown here:

$ python zipcodes.py 

  Welcome to Zipcodes v1.0

  Enter a zipcode or the name of a city and I'll tell
  you all about it. Enter a blank line when you're finished...
  
zip or city: 10901

10901 Suffern, NY .... Rockland County .... Pop: 21760

---- Cities in NY with population > 200000: 
            Brooklyn ...... Pop: 2465326
            New York ...... Pop: 1529375
               Bronx ...... Pop: 1327690
             Buffalo ...... Pop: 598640
           Rochester ...... Pop: 488602
       Staten Island ...... Pop: 443728
            Syracuse ...... Pop: 231585
             Jamaica ...... Pop: 216876
            Flushing ...... Pop: 214473

[image: graphics window showing the NY boundary and zipcodes]

zip or city: london

Here are all the cities I can find that start with that...
03053 Londonderry, NH .... Rockingham County .... Pop: 23148
05148 Londonderry, VT .... Windham County .... Pop: 792
25126 London, WV .... Kanawha County .... Pop: 305
40741 London, KY .... Laurel County .... Pop: 36324
40742 London, KY .... Laurel County .... Pop: 36324
40743 London, KY .... Laurel County .... Pop: 36324
40744 London, KY .... Laurel County .... Pop: 36324
40745 London, KY .... Laurel County .... Pop: 36324
43140 London, OH .... Madison County .... Pop: 22135
45647 Londonderry, OH .... Ross County .... Pop: 1889
61544 London Mills, IL .... Fulton County .... Pop: 706
72847 London, AR .... Pope County .... Pop: 2631
76854 London, TX .... Kimble County .... Pop: 322

zip or city: 76854

76854 London, TX .... Kimble County .... Pop: 322

---- Cities in TX with population > 200000: 
             Houston ...... Pop: 2571090
         San Antonio ...... Pop: 1308200
              Dallas ...... Pop: 1261999
              Austin ...... Pop: 747080
          Fort Worth ...... Pop: 678401
             El Paso ...... Pop: 634240
           Arlington ...... Pop: 338858
      Corpus Christi ...... Pop: 278829
               Plano ...... Pop: 225287
             Lubbock ...... Pop: 219631
             Garland ...... Pop: 215555

[image: graphics window showing the TX boundary and zipcodes]

zip or city: 145555

Please enter a 5-digit zip code...

zip or city: pickles

Sorry...I can't find anything for that city name...


zip or city: vien

Here are all the cities I can find that start with that...
04360 Vienna, ME .... Kennebec County .... Pop: 515
07880 Vienna, NJ .... Warren County .... Pop: 0
21869 Vienna, MD .... Dorchester County .... Pop: 1039
22180 Vienna, VA .... Fairfax County .... Pop: 59531
22181 Vienna, VA .... Fairfax County .... Pop: 59531
22182 Vienna, VA .... Fairfax County .... Pop: 59531
22183 Vienna, VA .... Fairfax County .... Pop: 59531
22184 Vienna, VA .... Fairfax County .... Pop: 59531
22185 Vienna, VA .... Fairfax County .... Pop: 59531
26105 Vienna, WV .... Wood County .... Pop: 11923
31092 Vienna, GA .... Dooly County .... Pop: 5680
44473 Vienna, OH .... Trumbull County .... Pop: 4215
57271 Vienna, SD .... Clark County .... Pop: 463
62995 Vienna, IL .... Johnson County .... Pop: 5953
65582 Vienna, MO .... Maries County .... Pop: 2241

zip or city: 31092

31092 Vienna, GA .... Dooly County .... Pop: 5680

---- Cities in GA with population > 200000: 
             Atlanta ...... Pop: 673017
            Marietta ...... Pop: 302216
            Savannah ...... Pop: 218659

[image: graphics window showing the GA boundary and zipcodes]

zip or city: <Enter>
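
The sample run above shows several kinds of input: a 5-digit zipcode, a string of digits that is not a valid zipcode, the start of a city name, and a blank line. One possible way to tell them apart with str methods is sketched below. The function name classify and the labels it returns are invented for this example, and the rule for spaces is an assumption (it would reject names containing periods or hyphens).

```python
def classify(text):
    """Return "quit", "zip", "badzip", "city", or "garbage" for one input line."""
    text = text.strip()
    if text == "":
        return "quit"               # blank line: the user is finished
    if text.isdigit():
        if len(text) == 5:
            return "zip"
        return "badzip"             # digits, but not exactly five of them
    # assumption: letters and spaces only, so "London Mills" counts as a city
    if text.replace(" ", "").isalpha():
        return "city"
    return "garbage"

for sample in ["10901", "145555", "london", ""]:
    print(repr(sample) + " -> " + classify(sample))
```

The main loop could then branch on the returned label and print the matching message.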

Requirements and tips...
You should practice good top-down design, incrementally implement and test your solution, and document your code with comments. While much of the design is up to you, the requirements below are designed to avoid some headaches in the initial design.
  1. When you read the data from zipcodes.txt, store the data in a python list, where each item in the list is itself a list of zipcode data for one zipcode: [zip, latitude, longitude, city, county, state, population].
  2. When you prompt the user to enter a location, the user can enter either a zipcode or a city (or part of a city). Your program must determine if the user input looks like a zipcode, a city, or garbage, and do the right thing (Hint: use str methods like isdigit() and/or isalpha()).
  3. The file lists all zip codes in sorted order, so it is not necessary to modify the order of items when you read the data in.
  4. Your two searches (i.e., for zip codes or for cities) must be as efficient as possible. That is, use binary search when the data is sorted on the thing you are searching for, and linear search when it is not.
  5. If you cannot find an entry for a particular city or ZIP code, inform the user that you cannot find that location and prompt them to enter another location.
  6. For reporting the biggest cities in the state, use some threshold population (e.g., 200000) and print information about all cities in this state with populations above the threshold. You should sort this data from largest to smallest population. Most big cities have more than one zipcode, so list each big city only once, as in the examples above.
  7. Since our data is stored in a python list of lists, you will need to modify one of our sort functions from class to handle this data. You should not use the built-in python sort or sorted functions.
  8. It is best to store the zipcodes as strings. Some zipcodes begin with a zero, and converting them to integers drops the leading zero (03053 would become 3053).
  9. Some of the data (latitude, longitude, population) will need to be converted from strings to floats or integers.
  10. We encourage you to finish the text part of this program before you add the graphics. This makes testing easier and faster. Once you have the text portion working, then add the graphics.
  11. To plot the state boundaries we have written some functions to create the graphics window already scaled to appropriate dimensions for the given state. You should be able to use the code below to create the graphics window with the state boundary already drawn. Your program must then draw points for all zipcodes (using the latitude as the y coordinate and the longitude as the x coordinate) and highlight in some way the user-specified zipcode and the largest cities in the state.
    from boundaries import *
    
    state = "VA"
    w = getStateGraphWin(state)
    if w is not None:
      # plot all zipcodes for this state here...use lat/long for each zipcode
      w.getMouse()
      w.close()
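
To illustrate requirement 4, here is a rough sketch of the two searches on a list of lists. The function names find_zip and find_cities are invented for this example, and the 0.0 latitude/longitude values in the sample data are placeholders, not real coordinates. Binary search fits the zipcode lookup because the file is sorted by zipcode; the city lookup is linear because the data is not sorted by city name and several entries can match.

```python
def find_zip(data, zipcode):
    """Binary search: return the index of zipcode in data, or -1 if absent."""
    low = 0
    high = len(data) - 1
    while low <= high:
        mid = (low + high) // 2
        # zipcodes are 5-character digit strings, so string order is numeric order
        if data[mid][0] == zipcode:
            return mid
        elif data[mid][0] < zipcode:
            low = mid + 1
        else:
            high = mid - 1
    return -1

def find_cities(data, prefix):
    """Linear search: return every entry whose city name starts with prefix."""
    prefix = prefix.lower()
    matches = []
    for entry in data:
        if entry[3].lower().startswith(prefix):
            matches.append(entry)
    return matches

# a few entries from the sample run; 0.0 is a placeholder for unknown lat/long
sample = [
    ["03053", 0.0, 0.0, "Londonderry", "Rockingham", "NH", 23148],
    ["19081", 39.897562, -75.346584, "Swarthmore", "Delaware", "PA", 9907],
    ["25126", 0.0, 0.0, "London", "Kanawha", "WV", 305],
    ["76854", 0.0, 0.0, "London", "Kimble", "TX", 322],
]
print(find_zip(sample, "19081"))
print(len(find_cities(sample, "london")))
```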
    
    

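For requirements 6 and 7, the sketch below collects the big cities for one state (each city once) and then orders them with a selection sort adapted to lists of lists. Selection sort is just one choice here; any of the sorts from class could be adapted the same way. The names big_cities and sort_by_population are invented for this example, the 0.0 lat/long values are placeholders, and treating entries with the same city name in the same state as one city is an assumption.

```python
def big_cities(data, state, threshold):
    """Return [city, population] pairs for state above threshold, one per city."""
    cities = []
    seen = []
    for entry in data:
        if entry[5] == state and entry[6] > threshold and entry[3] not in seen:
            seen.append(entry[3])
            cities.append([entry[3], entry[6]])
    return cities

def sort_by_population(cities):
    """Selection sort, in place, largest population first."""
    n = len(cities)
    for i in range(n - 1):
        biggest = i
        for j in range(i + 1, n):
            if cities[j][1] > cities[biggest][1]:   # compare populations only
                biggest = j
        cities[i], cities[biggest] = cities[biggest], cities[i]

# entries from the sample run; 0.0 is a placeholder for unknown lat/long
sample = [
    ["22180", 0.0, 0.0, "Vienna", "Fairfax", "VA", 59531],
    ["22181", 0.0, 0.0, "Vienna", "Fairfax", "VA", 59531],
    ["22182", 0.0, 0.0, "Vienna", "Fairfax", "VA", 59531],
    ["31092", 0.0, 0.0, "Vienna", "Dooly", "GA", 5680],
]
print(big_cities(sample, "VA", 50000))    # Vienna appears once, not three times

pairs = [["Buffalo", 598640], ["Brooklyn", 2465326], ["Syracuse", 231585]]
sort_by_population(pairs)
print(pairs)
```
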
Missing Data

The zipcode data file we have provided you is by no means complete. Some ZIP codes were missing or did not have lat/long data and were removed. For some cities, we did not have population data, so we set the population arbitrarily to 0. If your favorite US city or hometown in the US is missing and you know all the info (ZIP, city name, county name, state, lat, long, and population), let us know and we will be happy to add a few cities, but we are not trying to maintain a comprehensive list.

Submit

Once you are satisfied with your program, hand it in by typing handin21 in a terminal window.