CS97 Project Ideas

Bridge detection

Most of the flow modeling methods for computing flow directions and flow accumulations on gridded terrains assume that local minima, or sinks, are due to noise and should be removed. Two common approaches for sink removal are flooding and carving. The papers we have read apply one of the two methods blindly without considering what kind of feature has created the local minimum. A relatively new problem with these methods is that modern, hi-res elevation data at 10-40 foot resolution is resolving many man-made features including bridges. These features were not visible in lower resolution terrains. A potential project is to look at methods for detecting bridges in high-resolution terrains. What does it mean for a feature to look like a bridge? The primary input for this project would be hi-resolution gridded terrains, though perhaps this data could be supplemented with other sources. I can provide a number of sample data sets from North Carolina.

River Simplification

Typical flow modeling applications on gridded terrains generate a flow direction and flow accumulation output grid. The flow directions can be thought of as edges in a tree describing a river network and flow accumulations can be thought of the size of subtrees rooted at a particular pixel. By specifying a target size threshold, one can extract those edges in the tree that have a flow accumulation above a certain threshold. This forms a river network subtree. A GIS can convert this network in to a vector layer of poly-lines representing the river, but often the resulting set of lines has many more vertices than necessary and the river network can be significantly simplified. A potential project is to look at river simplification methods. Douglas-Peucker could be a good start, but how do you make sure you preserve network connectivity? You need to simplify a set of lines, not just a single line. Can you break the network into a set of lines such that the resulting individual lines have some meaning, perhaps a main river and tributaries? The mapShaper project mentions some other line simplification methods that might be interesting to investigate. I can provide flow direction/flow accumulation grids for sample data sets or show you how to run the programs to generate your own. You may also want to look at shapefile formats for statewide or national data. I can help you with this too.

Parallel Algorithms for GIS

Most GIS algorithms are sequential and single threaded, but in some cases, parallelism could be exploited. One particular application is the point cloud to grid application we discussed in class. The way the quad-tree construction arranges data on disk makes it easy to break computation into a number of parallel pieces. This phase is currently done sequentially and is the new bottleneck. Can parallelism push the bottleneck back to the quad tree? This project would involve implementing some interpolation routine (I can show you the existing C source code that you may be able to just use without modification), dispatching a number of parallel jobs and combining the result. Combining the results is basically a merge sort. This is currently done sequentially, but could possibly also be made parallel. I'm open to other parallel GIS applications too. I have a bunch of point data from North Carolina that can be used and there might be other data sources out there. The USGS CLICK site has a number of lidar data sets, NOAA's CSC site has some coastal data, and there may be some bathymetric data out there that is publicly available.

Voronoi Natural Neighbors Interpolation

One of the complaints about the point cloud to grid paper that we read is that the interpolation method we used was a rather complex regularized spline with tension. In the experimental section, we see that the interpolation phase is the most expensive phase. Numerous other interpolation methods are available and modern hi-res data sets may have dense enough point spacing so that simpler interpolation schemes can approximate the surface just as well and run much faster. One such method is a Voronoi Natural Neighbors interpolation method which works as follows. Take a set of sample points and build a Voronoi diagram out of them. Next, pretend to insert a point into the Voronoi diagram that corresponds to an output grid cell. Compute how much area the Voronoi cell "steals" from neighboring cells and compute a value for the grid cell using the values of points in neighboring cells, weighted by area stolen. Actual implementation details might be a bit tricky, but there is a lot of code out there that you can likely borrow from or modify. SciPy has something called nn_interpolator that could provide hints or code sources. I have a number of point data sources that I can provide.

Mt. Diablo Viewshed

A sign on Mt. Diablo outside of the bay area in California claims that you can see more surface area (land/water) than any other place on earth besides Mt. Kilimanjaro. Many doubt this claim as Mt. Diablo is pretty small by mountain standards. Pike's peak in CO and Mt. McKinley in Alaska may be North American peaks which have larger viewsheds. Put this claim to rest using your newfound GIS skills. Data is publicly available for all these regions (I'd have to double check on McKinley). Implement a viewshed algorithm and compare the results. The data available online may be too dense and you may need to downsample the data to speed the computation. How does this affect your results. You should adjust for the curvature of the Earth. The computation is surprisingly not that difficult, but current GIS methods just seem to ignore it. As a warm-up exercise, how far do you think you can see on level terrain under ideal circumstances?

Border/Area Patrol

How many observation posts do you need to view all (or at least a high percentage) of a particular political border (the mason-dixon line, the North Korea/South Korean border, etc.) or a geographic area (Yosemite National Park, Delaware County, etc.)? How does the answer change as you change the type of terrain and/or the height of the observers/observation towers? How should you place the observers?

Geodesic flow routing on flat surfaces

In the Soille et al. carving and adaptive drainage enforcement, the authors proposed a new flow routing algorithm for routing on flat surfaces. In modern terrain data sets, flat areas are uncommon except in terrains that have been flooded to remove minima. This flooding can be viewed as symbolically marking a problem area containing a local minimum. Flow is typically routed across this problem area using the flooded flat elevations. Can we somehow do better if we use the original terrain to model flow but use flooded areas as a mask to indicate "We may need some fancy routing tricks here"? One approach would be to use geodesic routing on the original terrain in the area marked by the flooding algorithm as having one or more local minimum. The weights in the geodesic routing algorithm are simply the unmodified elevations. I'm particularly interested in this problem, but it is a bit weird to explain in a paragraph. I can doodle much better. If you think this might be interesting, I can explain further.

Building Topology from Shapefiles

The common format for exchange vector GIS layers (points, lines, polygons) is the ESRI Shapefile format. Unfortunately, this format does not have any topology support, so given a polygon in the shapefile, it is difficult to determine what polygons are adjacent, if two polygons overlap, if there are holes in the data, etc. A doubly connected edge list can support more topological queries. Can you build a DCEL from shapefiles and can you use this as a basis for validating shapefiles, including checking for polygon itersections, detecting isolated polygon islands (polygons with no neighbors), etc.

Shapefile overlay

Overlays are a common operation in GIS. Given two or more shapefiles can you implement an algorithm that supports common operations like union, intersection, difference? Line segment intersection algorithms and Doubly connected edge lists will be useful here.

Swarthmore GPS

Collect some GPS data around campus or the greater Swat area and do something cool. One possibility: map all walkways on campus. You may need to smooth the lines to clean up noisy signals. How do you do this? Douglas Peucker can be used to simplify clean data, but simplifying noisy data may actually preserve noise instead of removing it. Once you have the data, convert it to a graph representation. Also map the entrances and exits to buildings. Then develop a tool to find the shortest path between two given buildings. You could also imagine building two graphs. The first is a "good weather graph". All edges have a weight equal to their Euclidean distance. The second graph is a "bad weather graph". Paths that are indoors (you can't map these with GPS, so just guestimate) have weight 0 while paths outdoors have a cost proportional to their Euclidean distance. This Blaha graph prefers indoor paths. Are there ways to walk inside on inclement days that may be longer than walking outside, but keep you away from the dangerous outdoors?

CS97 — Project Ideas