CS41 — Interesting Problems



For the next few weeks of the semester I would like you to work on one of the interesting problems below, or to present a problem that you currently do not know the answer to but are interested in investigating. You may work on this project with one or two team members, or work on a project individually. You should design a solution to your problem and analyze its complexity. This assignment will count as one homework assignment. To receive credit you must do one of the following: 1) Present the problem and your progress on its solution in front of the class for approximately 10 minutes. Be prepared to answer questions posed by me or the class. 2) Write a 2-3 page summary of the problem, a possible solution, and an analysis of your solution's complexity. 3) Implement a solution to your problem in a programming language of your choice. The code should be well documented and include a number of test cases, sample results, and a README describing the main features of your code.
Union of Rectangles
Given a collection of n axis-aligned rectangles, design an algorithm to compute the area of their union. If the rectangles are disjoint, the problem is easy. I want you to consider the case in which multiple rectangles can overlap. One application where this might be relevant is in photogrammetry and remote sensing. Suppose an aircraft or satellite takes a series of snapshots of the Earth. Figuring out the percent of the Earth covered by the snapshots is roughly equivalent to computing the area of the union of the rectangular regions of all the snapshots. (Curvature of the Earth's surface and non-axis-aligned rectangles make the problem harder. If you solve this problem quickly, think about rectangles that may be rotated.)
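A slow reference implementation can be handy for checking a faster algorithm on small inputs. The sketch below is one such baseline and is not the intended solution: it cuts the plane into cells along every rectangle edge and tests each cell against every rectangle, which takes roughly cubic time in n. The name `union_area_bruteforce` and the `(x1, y1, x2, y2)` tuple layout are assumptions made for this example.

```python
from typing import Iterable, Tuple

# Assumed layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Rect = Tuple[float, float, float, float]


def union_area_bruteforce(rects: Iterable[Rect]) -> float:
    """Area of the union of axis-aligned rectangles (slow baseline)."""
    rects = list(rects)
    # Every rectangle edge becomes a grid line; no edge crosses a cell interior.
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    area = 0.0
    for xa, xb in zip(xs, xs[1:]):
        for ya, yb in zip(ys, ys[1:]):
            # A cell is either fully inside a rectangle or fully outside it.
            covered = any(
                r[0] <= xa and xb <= r[2] and r[1] <= ya and yb <= r[3]
                for r in rects
            )
            if covered:
                area += (xb - xa) * (yb - ya)
    return area
```

Comparing a faster solution against this baseline on small random inputs is one way to build the test cases the assignment asks for.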
Range Minimum Queries
What were the high and low values of the S&P 500 in the past day, week, month, year, or decade? More generally, given two calendar days, what were the high and low values of the market between the two given days? Since the high and low cases are symmetric, we are interested in an algorithm for supporting such range minimum queries. Given an array A of size n, we wish to preprocess A such that we can report the minimum value in any subarray of A quickly. Design a data structure and a query algorithm to solve this problem. Analyze your solution in terms of preprocessing time, space utilization for the data structure, and query time. Two obvious extremes are to do zero preprocessing and simply scan the list (O(1) preproc, O(n) space, O(n) query) or to compute all possible answers ahead of time and use a tree or hash table to find the results (O(n*n) preproc, O(n*n) space, O(log n) or O(1) query). Can you improve on these bounds? The best we could hope for is (O(1) preproc, O(n) space, O(1) query). Perhaps this is impossible, as we would have no time to preprocess, but the next best hope is to scan the array a few times, build a small index, and still achieve O(1) query after linear preprocessing time.
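The two obvious extremes described above can be sketched as follows. This is only an illustration of the baselines, not an improved solution; the function names are assumptions, query ranges are taken to be inclusive index pairs i <= j, and the precomputed answers are stored in a plain 2D table rather than a tree or hash table.

```python
from typing import List, Sequence


def range_min_scan(a: Sequence[int], i: int, j: int) -> int:
    """No preprocessing: scan a[i..j] (inclusive), O(n) time per query."""
    return min(a[k] for k in range(i, j + 1))


def build_all_answers(a: Sequence[int]) -> List[List[int]]:
    """Precompute table[i][j] = min(a[i..j]) for all i <= j.

    Uses O(n*n) time and space; entries with i > j are unused and left at 0.
    """
    n = len(a)
    table = [[0] * n for _ in range(n)]
    for i in range(n):
        table[i][i] = a[i]
        for j in range(i + 1, n):
            # Extend the range by one element and reuse the previous minimum.
            table[i][j] = min(table[i][j - 1], a[j])
    return table
```

After `table = build_all_answers(a)`, a query for the range i..j is the O(1) lookup `table[i][j]`.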
Art Galleries
A local museum is exhibiting a large collection of priceless art. Naturally they would not want the art to be stolen, so they would like to set up a number of cameras to observe the room. Unfortunately, artists being creative types, the room is not a rectangle that can be observed with a single camera. Suppose the room is described by a simple but non-convex polygon. We wish to place cameras at the vertices of the polygon such that the entire interior of the polygon is visible to the cameras. Suppose each camera can see arbitrarily far in a straight line as long as no walls obstruct the view. Furthermore, assume the cameras have a viewing angle of 360 degrees. How many cameras are needed to cover a polygonal art gallery with n vertices? Can you prove both upper and lower bounds? This may sound a lot like the intersection monitoring problem, for which finding a solution turned out to be rather hard (in terms of runtime). This problem has a polynomial solution. Your upper bound should not be in terms of big-Oh. Use exact values if possible (i.e., include the constant).
Line Segment Intersection
Given a collection of n line segments, report all intersections between the line segments. A brute force quadratic solution is possible by testing all possible pairs, but can you devise a more clever solution that may depend on the actual number of intersections k? Such algorithms are called output sensitive. Is there an upper bound on k? This problem is often used to solve spatial join operations in spatial databases: given two sets of polygons, compute the union, intersection, or difference of the two sets. You only need to solve the segment intersection problem, but the second problem is interesting too.
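The brute force approach mentioned above might look like the following sketch, which tests every pair of segments with the standard orientation (cross product) test. It is a baseline to compare against, not an output-sensitive algorithm. The names and the segment representation are assumptions for this example, and segments that merely touch or overlap collinearly are counted as intersecting.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def orient(p: Point, q: Point, r: Point) -> float:
    """Cross product (q - p) x (r - p): > 0 left turn, < 0 right turn, 0 collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def on_segment(p: Point, q: Point, r: Point) -> bool:
    """Given r collinear with p and q, report whether r lies on segment pq."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(s: Segment, t: Segment) -> bool:
    (p1, p2), (p3, p4) = s, t
    d1, d2 = orient(p3, p4, p1), orient(p3, p4, p2)
    d3, d4 = orient(p1, p2, p3), orient(p1, p2, p4)
    # Proper crossing: each segment's endpoints lie strictly on opposite
    # sides of the line through the other segment.
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True
    # Touching or collinear cases: an endpoint lies on the other segment.
    return ((d1 == 0 and on_segment(p3, p4, p1))
            or (d2 == 0 and on_segment(p3, p4, p2))
            or (d3 == 0 and on_segment(p1, p2, p3))
            or (d4 == 0 and on_segment(p1, p2, p4)))


def all_intersections(segments: Sequence[Segment]) -> List[Tuple[int, int]]:
    """Index pairs (i, j), i < j, of intersecting segments; tests every pair."""
    return [(i, j)
            for (i, s), (j, t) in combinations(enumerate(segments), 2)
            if segments_intersect(s, t)]
```

With floating-point coordinates the exact comparisons against zero can misclassify nearly collinear points, so integer coordinates are the safer choice for test cases.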
Image classification with limited memory
A common task in image processing is to identify the number, size, and shape of connected regions in the image that have the same (or similar) intensity values. For example, a traffic camera might detect cars as darker blobs against a lighter highway background. An infrared camera might identify animals as brighter spots against a dark background. Suppose I am given a grid of pixels where each pixel has an integer intensity value. I would like to assign a unique label to each connected component in the image. A connected component is a region in which all pixels have the same intensity value and in which I can go from any pixel to any other pixel via a path that lies completely within the component. One easy solution is to run a breadth first search from each unvisited pixel to identify its connected component. This works great if I can load the entire image into memory. Suppose I add the constraint that only a constant number of rows fit into memory. Design an algorithm that still solves the problem, but reduces the number of times I need to load a row into memory.
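The easy in-memory solution described above can be sketched as follows. It assumes the whole image fits in memory as a list of rows, so it does not address the limited-memory constraint. The function name is an assumption, as is the choice of 4-connectivity (up, down, left, right neighbors); the problem statement does not say whether diagonal neighbors count.

```python
from collections import deque
from typing import List


def label_components(image: List[List[int]]) -> List[List[int]]:
    """Label equal-intensity connected components with integers 0, 1, 2, ..."""
    if not image:
        return []
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]  # -1 means "not yet visited"
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue
            # Breadth first search from the first unvisited pixel found.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and image[ny][nx] == image[y][x]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels
```

Each pixel is enqueued at most once, so the running time is linear in the number of pixels. The output of this version can serve as the expected result when testing a row-at-a-time algorithm, up to a renaming of labels.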

Google Maps Spatial Queries
Google Maps and other map search engines allow users to search using text queries that describe geographic locations. Another way to search for objects would be to allow searches to be described by a geographic box, circle, or, in the most general case, an arbitrary two dimensional simply connected planar region. How would you design a data structure to store and query two dimensional point data using such query regions? It suffices to focus on bounded rectangular axis-aligned queries. Can you design a structure to support queries and updates? Analyze the space and runtime complexity of your approach.
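As a point of comparison for whatever structure you design, here is a minimal baseline that stores the points in an unsorted list and answers each axis-aligned rectangular query with a linear scan. The class name `PointListIndex` and its method names are hypothetical, chosen for this sketch only; query boundaries are assumed to be inclusive.

```python
from typing import List, Tuple

Point = Tuple[float, float]


class PointListIndex:
    """Baseline point store: O(1) insert, O(n) remove, O(n) range query."""

    def __init__(self) -> None:
        self._points: List[Point] = []

    def insert(self, p: Point) -> None:
        self._points.append(p)

    def remove(self, p: Point) -> bool:
        """Remove one copy of p; return False if p was not stored."""
        try:
            self._points.remove(p)
            return True
        except ValueError:
            return False

    def query(self, x_min: float, y_min: float,
              x_max: float, y_max: float) -> List[Point]:
        """All stored points inside the closed rectangle, in insertion order."""
        return [(x, y) for x, y in self._points
                if x_min <= x <= x_max and y_min <= y <= y_max]
```

A better structure should answer queries in time that depends on the number of points reported rather than on the total number of points stored.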