CS41 — Interesting Problems



For the first few weeks of the semester while we are working on some math preliminaries, the class will be divided into teams, each of which will be assigned one of the interesting problems below. Your team will design an algorithm to solve your problem, implement it, analyze it's complexity, and present your solutions both in front of the class and in a written summary. The work should be done without reading ahead, or consulting other texts teams, students, or the internet.

Groups will be assigned by either yourselves or me and will not change for the duration of the project. Each week, each group will have a leader, recorder, and presenter. The leader should schedule meetings, and move meeting discussions forward smoothly. The recorder will be responsible for keeping notes and writing any reports that are to be submitted to me. The presenter will be responsible for giving in class progress statements.

By the end of the project, groups should have working code, an analysis of their algorithm, a discussion of the approach used including test data, and a discussion of things you might have liked to do but did not have time for.

CPU scheduling
You have recently been hired by a large data center to develop a new algorithm for scheduling jobs on a single processor. Clients can submit jobs to the data center. Each job a(i) takes a known time t(i) to process on a single CPU. Clients will pay for jobs you succesfully run, but only if the job is completed by a specified deadline d(i). If you succeed in completing a job before the deadline, you are paid and amount p(i), otherwise you are paid nothing. Suppose you are given a set of n jobs a(i)=<t(i), d(i), p(i)> and you want to maximize the profit you can obtain by scheduling the jobs. It may not be possible to schedule all the jobs and have them all finish before their respective deadline. Design an algorithm that will create a job schedule and maximize the profit. You may assume that all times, deadlines, and profits are integers. How long does it take for you algorithm to run? What parameters does the run time of your algorithm depend on, e.g., number of jobs, maximum profit, latest deadline, longest running job? How do you know if it is correct? Do you think your asymptotic running time is optimal?

Sum of n dice
Most people are familiar with basic probabilities when rolling one die or two dice. For example, what is the probability of a roll of two random six sided dice summing to seven? But not much is typically known about k > 2 dice. In particular, if two players each throw n dice, what is the probability that they will roll the same sum? Design an algorithm to solve this problem. Your program should determine for n six-sided dice, the number of different ways to get a sum of k. You should consider each die unique so a roll of (1,2,3) is different than (3,1,2). This is supposed to be a programming exercise, not an exercise in advanced probability, so think more like a computer scientist. If you feel like you are bogged down in probability, please contact me and I can help you out with answering any questions you may have about the problem and guide you in better direction. How long does it take for you algorithm to run? What parameters does the run time of your algorithm depend on, e.g., number of jobs, maximum profit, latest deadline, longest running job? How do you know if it is correct? Do you think your asymptotic running time is optimal? What is the maximum number of dice for which you can compute an answer? What is the limiting factor?

Google Maps Spatial Queries
Google maps and other map search engines allow users to search using text queries that describe geographic locations. Another way to search for objects would be to allow searches to be describe by a geographic box, circle, or in the most general case, an arbitrary two dimensional simply connected planar region. How would you design a data structure to store and query two dimensional point data using such query regions. It suffices to focus on bounded rectangular axis-aligned queries. Can you design a structure to support queries and updates? Analyze the space and runtime complexity of your approach.

Keep 'em separated
Election fever is sweeping through campus and a variety of student organizations are planning to staff a variety of booths over the weekend to support their cause. Groups have place their rectangular booths across Cunningham field (the rugby teams were away that weekend) in a haphazard manner. With no football team and no ruggers on campus to break up anticipated disputes, the campus administration has decided it would be safest if they could set up a temporary partition or fence to separate the two main groups of booths: Democrats and Republicans. The partition must be a simple straight line. Depending on the configuration of booths, such a partition may not be possible. Design an algorithm that determines if two groups of booths can be separated by a single line and returns one possible line if a separation is possible, or informs users that no such separation is possible. What is the complexity of your algorithm. Will it work on all possible configurations of booths, or only under limiting assumptions? Explain.

Intersection monitoring
Traffic in Philadelphia has become a mess. City planners would like to remote traffic monitors to observe traffic patterns to eventually commission construction project to relieve the traffic. Suppose a traffic monitor can only be placed at intersections and can only observe traffic along any road meeting at that intersection until the next intersection along that road. (It is easier for me to draw a picture of this). Naturally, the city would like to monitor all roads in the city with the minimal number of monitors. Design an algorithm that can determine the minimal number of monitors needed and their location such that all roads segments can be seen by at least one monitor.