Most of the flow modeling methods for computing flow directions and
flow accumulations on gridded terrains assume that local minima, or
sinks, are due to noise and should be removed. Two common approaches for sink
removal are flooding and carving. The papers we have read apply one of
the two methods blindly without considering what kind of feature has
created the local minimum. A relatively new problem for these methods
is that modern high-resolution elevation data, at 10-40 foot
resolution, resolves many man-made features, including bridges, that
were not visible in lower-resolution terrains. A potential project is
to look at methods for detecting bridges in high-resolution terrains.
What does it mean for a feature to look like a bridge? The primary
input for this project would be high-resolution gridded terrains,
though perhaps this data could be supplemented with other sources. I can
provide a number of sample data sets from North Carolina.
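Since both flooding and carving start from the detected pits, a natural first building block is a pit detector. A minimal NumPy sketch (the function name and the 8-neighbor strictly-lower rule are my own choices, not from any particular paper):

```python
import numpy as np

def local_minima(dem):
    """Return a boolean mask of cells lower than all 8 neighbors.

    dem: 2-D array of elevations. For simplicity, border cells are
    never reported as minima here.
    """
    dem = np.asarray(dem, dtype=float)
    core = dem[1:-1, 1:-1]
    is_min = np.ones(core.shape, dtype=bool)
    # Compare the interior against each of the 8 shifted neighbor grids.
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            nb = dem[1 + dr : dem.shape[0] - 1 + dr,
                     1 + dc : dem.shape[1] - 1 + dc]
            is_min &= core < nb
    mask = np.zeros(dem.shape, dtype=bool)
    mask[1:-1, 1:-1] = is_min
    return mask
```

A bridge classifier would then look at the shape of the blocking feature upstream of each detected pit.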
River Network Simplification
Typical flow modeling applications on gridded terrains generate a flow
direction and a flow accumulation output grid. The flow directions can
be thought of as edges in a tree describing a river network, and flow
accumulations can be thought of as the sizes of the subtrees rooted at
each pixel. By specifying a target size threshold, one can extract
those edges in the tree whose flow accumulation exceeds the threshold;
these edges form a river network subtree. A GIS can convert this
network into a vector layer of poly-lines representing the river, but
the resulting set of lines often has many more vertices than
necessary, and the river network can be significantly simplified.
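The subtree-size view of flow accumulation can be sketched directly. Assuming a D8 flow direction grid encoded as indices into an offset table (my own encoding; real data sets use various codings), accumulation is a bottom-up count over the flow tree:

```python
from collections import deque
import numpy as np

# D8 offsets; direction code i means flow goes to cell + OFFSETS[i].
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
           (0, 1), (1, -1), (1, 0), (1, 1)]

def flow_accumulation(fdir):
    """fdir[r, c] is an index into OFFSETS, or -1 for an outlet.

    Returns the number of cells draining through each cell (counting
    the cell itself), i.e. the size of the subtree rooted there.
    """
    fdir = np.asarray(fdir)
    rows, cols = fdir.shape
    acc = np.ones((rows, cols), dtype=np.int64)
    indeg = np.zeros((rows, cols), dtype=np.int64)

    def downstream(r, c):
        d = fdir[r, c]
        if d < 0:
            return None
        nr, nc = r + OFFSETS[d][0], c + OFFSETS[d][1]
        return (nr, nc) if 0 <= nr < rows and 0 <= nc < cols else None

    for r in range(rows):
        for c in range(cols):
            ds = downstream(r, c)
            if ds:
                indeg[ds] += 1
    # Peel leaves inward (Kahn's algorithm on the flow tree).
    q = deque((r, c) for r in range(rows) for c in range(cols)
              if indeg[r, c] == 0)
    while q:
        r, c = q.popleft()
        ds = downstream(r, c)
        if ds:
            acc[ds] += acc[r, c]
            indeg[ds] -= 1
            if indeg[ds] == 0:
                q.append(ds)
    return acc
```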
A potential project is to look at river simplification methods.
Douglas-Peucker could be a good start, but how do you make sure you
preserve network connectivity? You need to simplify a set of lines,
not just a single line. Can you break the network into a set of lines
such that the resulting individual lines have some meaning, perhaps a
main river and tributaries? The mapShaper project mentions some other
line simplification methods that might be interesting to investigate.
I can provide flow direction/flow accumulation grids for sample data
sets or show you how to run the programs to generate your own. You may
also want to look at shapefile formats for statewide or national data.
I can help you with this too.
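For reference, the classic Douglas-Peucker routine for a single polyline looks like the sketch below; extending it to a whole network while pinning junction vertices is the interesting part of the project. A sketch, not tuned for production:

```python
def douglas_peucker(points, eps):
    """Recursively simplify one polyline, keeping its endpoints.

    For a network you would also need to pin junction vertices so
    connectivity is preserved.
    """
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    # Find the interior point farthest from the chord.
    dmax, idx = -1.0, 0
    for i, (x, y) in enumerate(points[1:-1], 1):
        d = abs(dx * (y - y1) - dy * (x - x1)) / norm
        if d > dmax:
            dmax, idx = d, i
    if dmax <= eps:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:idx + 1], eps)
    right = douglas_peucker(points[idx:], eps)
    return left[:-1] + right
```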
Parallel Algorithms for GIS
Most GIS algorithms are sequential and single-threaded, but in some
cases, parallelism could be exploited. One particular application is
the point cloud to grid application we discussed in class. The way the
quad-tree construction arranges data on disk makes it easy to break
computation into a number of parallel pieces. This phase is currently
done sequentially and is the new bottleneck. Can parallelism push the
bottleneck back to the quad tree? This project would involve
implementing some interpolation routine (I can show you the existing
C source code that you may be able to just use without modification),
dispatching a number of parallel jobs and combining the result.
Combining the results is basically a merge sort. This is currently
done sequentially, but could possibly also be made parallel. I'm open
to other parallel GIS applications too. I have a bunch of point data
from North Carolina that can be used and there might be other data
sources out there. The USGS CLICK site has a number of lidar data
sets, NOAA's CSC site has some coastal data, and there may be some
bathymetric data out there that is publicly available.
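The dispatch-then-merge structure might look like the sketch below. I use a thread pool and a toy sort as a stand-in for the real C interpolation routine, purely to show the shape of the fan-out and the k-way merge:

```python
import heapq
from multiprocessing.dummy import Pool  # thread pool, to keep the sketch portable

def interpolate_tile(tile):
    """Stand-in for the real interpolation routine: each 'tile' is a
    list of (key, value) records that we process and sort locally."""
    return sorted(tile)

def parallel_grid(tiles, workers=4):
    # Fan out one job per quad-tree tile...
    with Pool(workers) as pool:
        sorted_runs = pool.map(interpolate_tile, tiles)
    # ...then combine the sorted runs with a k-way merge, which is the
    # sequential merge-sort step described above, here via heapq.merge.
    return list(heapq.merge(*sorted_runs))
```

A process pool (or the merge itself done as a parallel tournament) would be the next step once the structure works.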
Voronoi Natural Neighbors Interpolation
One of the complaints about the point cloud to grid paper that we read
is that the interpolation method we used was a rather complex
regularized spline with tension. In the experimental section, we see
that the interpolation phase is the most expensive phase. Numerous
other interpolation methods are available and modern hi-res data sets
may have dense enough point spacing so that simpler interpolation
schemes can approximate the surface just as well and run much faster.
One such method is Voronoi Natural Neighbors interpolation, which
works as follows. Take a set of sample points and build a
Voronoi diagram out of them. Next, pretend to insert a point into the
Voronoi diagram that corresponds to an output grid cell. Compute how
much area the Voronoi cell "steals" from neighboring cells and compute
a value for the grid cell using the values of points in neighboring
cells, weighted by area stolen. Actual implementation details might be
a bit tricky, but there is a lot of code out there that you can likely
borrow from or modify. SciPy has something called nn_interpolator that
could provide hints or code sources. I have a number of point data
sources that I can provide.
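Computing the exact stolen areas requires careful Voronoi surgery, but a discrete approximation conveys the idea: rasterize the Voronoi diagram by nearest-sample lookup, insert the query point, and count stolen pixels. A sketch using SciPy's cKDTree (the function name and raster resolution are my own choices):

```python
import numpy as np
from scipy.spatial import cKDTree

def natural_neighbor(samples, values, query, bbox, n=400):
    """Discrete (rasterized) natural-neighbor estimate at one point.

    Approximates Voronoi areas by nearest-sample assignment on an
    n-by-n pixel raster over bbox = (xmin, xmax, ymin, ymax), then
    measures how many pixels the inserted query point 'steals' from
    each sample's cell and uses those counts as weights.
    """
    samples = np.asarray(samples, float)
    values = np.asarray(values, float)
    xmin, xmax, ymin, ymax = bbox
    xs = np.linspace(xmin, xmax, n)
    ys = np.linspace(ymin, ymax, n)
    pix = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
    # Discrete Voronoi diagram of the samples alone...
    owner_before = cKDTree(samples).query(pix)[1]
    # ...and again with the query point appended as one extra site.
    owner_after = cKDTree(np.vstack([samples, query])).query(pix)[1]
    stolen = owner_before[owner_after == len(samples)]
    if stolen.size == 0:
        # Degenerate case (query coincides with a sample): fall back.
        return float(values[owner_before[0]])
    counts = np.bincount(stolen, minlength=len(samples)).astype(float)
    return float(np.dot(counts, values) / counts.sum())
```

A real implementation would compute the polygon areas exactly rather than by pixel counting, but this version is easy to test against.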
Mt. Diablo Viewshed
A sign on Mt. Diablo, outside the Bay Area in California, claims
that you can see more surface area (land/water) from it than from any
other place on Earth besides Mt. Kilimanjaro. Many doubt this claim,
as Mt. Diablo is pretty small by mountain standards. Pikes Peak in
Colorado and Mt. McKinley in Alaska are North American peaks that may
have larger viewsheds. Put this claim to rest using your newfound GIS skills. Data
is publicly available for all these regions (I'd have to double check
on McKinley). Implement a viewshed algorithm and compare the results.
The data available online may be too dense, and you may need to
downsample it to speed up the computation. How does this affect your
results? You should also adjust for the curvature of the Earth; the
correction is surprisingly easy to compute, but current GIS methods
just seem to ignore it. As a warm-up exercise, how far do you think
you can see on level terrain under ideal circumstances?
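For the warm-up, the spherical-Earth horizon distance and the per-cell curvature correction are one-liners (refraction ignored; the eye height is whatever you assume):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius

def horizon_distance_m(eye_height_m):
    """Line-of-sight distance to the horizon on a smooth sphere,
    ignoring atmospheric refraction: d = sqrt(2*R*h + h**2)."""
    return math.sqrt(2 * EARTH_RADIUS_M * eye_height_m + eye_height_m ** 2)

def curvature_drop_m(distance_m):
    """How far the surface falls below the viewer's tangent plane at a
    given range: the per-cell correction a viewshed should subtract."""
    return distance_m ** 2 / (2 * EARTH_RADIUS_M)
```

At standing eye height (about 1.7 m) this gives roughly 4.7 km; from Mt. Diablo's roughly 1173 m summit, on the order of 120 km before terrain blocking is considered.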
How many observation posts do you need to view all (or at least a high
percentage) of a particular political border (the Mason-Dixon Line,
the North Korea/South Korea border, etc.) or a geographic area
(Yosemite National Park, Delaware County, etc.)? How does the answer
change as you change the type of terrain and/or the height of the
observers/observation towers? How should you place the observers?
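Observer placement is an instance of set cover, which is NP-hard, so a greedy heuristic is the usual starting point. A sketch assuming you have already computed a viewshed (a set of visible target cells) for each candidate site:

```python
def greedy_observers(viewsheds, n_targets, coverage=0.95):
    """Greedy set cover: repeatedly pick the candidate whose viewshed
    covers the most still-unseen target cells.

    viewsheds: dict mapping candidate site -> set of visible target
    cell ids in range(n_targets). Returns the chosen sites in order.
    """
    uncovered = set(range(n_targets))
    chosen = []
    goal = n_targets * (1 - coverage)  # stop once enough is covered
    while len(uncovered) > goal:
        best = max(viewsheds, key=lambda s: len(viewsheds[s] & uncovered))
        gain = viewsheds[best] & uncovered
        if not gain:
            break  # remaining cells are invisible from every candidate
        chosen.append(best)
        uncovered -= gain
    return chosen
```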
Geodesic Flow Routing on Flat Surfaces
In the Soille et al. paper on carving and adaptive drainage
enforcement, the authors proposed a new algorithm for routing flow
across flat surfaces. In modern terrain data sets, flat areas are uncommon except
in terrains that have been flooded to remove minima. This flooding can
be viewed as symbolically marking a problem area containing a local
minimum. Flow is typically routed across this problem area using the
flooded flat elevations. Can we somehow do better if we use the
original terrain to model flow but use flooded areas as a mask to
indicate "we may need some fancy routing tricks here"? One approach
would be to use geodesic routing on the original terrain within the
area marked by the flooding algorithm as containing one or more local minima.
The weights in the geodesic routing algorithm are simply the
unmodified elevations. I'm particularly interested in this problem,
but it is a bit weird to explain in a paragraph. I can doodle much
better. If you think this might be interesting, I can explain further.
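One way to prototype the idea: run Dijkstra inside the masked problem area, with the cost of entering a cell equal to its unmodified elevation. The weighting here is my guess at one plausible choice, not the paper's:

```python
import heapq

def geodesic_route(elev, mask, source, target):
    """Dijkstra over the masked 'problem area', using the original
    (unflooded) elevations as weights.

    elev: 2-D list of elevations; mask: set of (r, c) cells flagged by
    the flooding step; the cost of stepping into a cell is its
    unmodified elevation (an assumption for this sketch).
    """
    dist = {source: 0.0}
    prev = {}
    pq = [(0.0, source)]
    while pq:
        d, cell = heapq.heappop(pq)
        if cell == target:
            break
        if d > dist.get(cell, float('inf')):
            continue
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (nr, nc) not in mask:
                continue
            nd = d + elev[nr][nc]
            if nd < dist.get((nr, nc), float('inf')):
                dist[(nr, nc)] = nd
                prev[(nr, nc)] = cell
                heapq.heappush(pq, (nd, (nr, nc)))
    # Walk the path back from target to source.
    path, cell = [target], target
    while cell != source:
        cell = prev[cell]
        path.append(cell)
    return path[::-1]
```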
Building Topology from Shapefiles
The common format for exchanging vector GIS layers (points, lines,
polygons) is the ESRI Shapefile format. Unfortunately, this format
does not have any topology support, so given a polygon in the
shapefile, it is difficult to determine which polygons are adjacent,
whether two polygons overlap, whether there are holes in the data,
etc. A doubly connected edge list (DCEL) can support such topological
queries. Can you build a DCEL from shapefiles, and can you use it as a
basis for validating shapefiles, including checking for polygon
intersections and detecting isolated polygon islands (polygons with no
neighbors)?
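A minimal half-edge (DCEL) skeleton in Python might start like the sketch below; a full build from a shapefile would add vertex deduplication and next-pointer linking around each face:

```python
from dataclasses import dataclass

@dataclass
class HalfEdge:
    origin: tuple             # (x, y) where this half-edge starts
    twin: "HalfEdge" = None   # same segment, opposite direction
    nxt: "HalfEdge" = None    # next half-edge around the same face
    face: int = None          # id of the polygon to this edge's left

def make_edge_pair(a, b):
    """Create the two half-edges for segment a-b and link them as twins."""
    e, t = HalfEdge(a), HalfEdge(b)
    e.twin, t.twin = t, e
    return e, t

def adjacent_faces(edges):
    """Pairs of distinct faces that share an edge: the adjacency query
    a raw shapefile cannot answer without rebuilding topology."""
    pairs = set()
    for e in edges:
        if (e.face is not None and e.twin.face is not None
                and e.face != e.twin.face):
            pairs.add(tuple(sorted((e.face, e.twin.face))))
    return pairs
```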
Overlays are a common operation in GIS. Given two or more shapefiles,
can you implement an algorithm that supports common operations like
union, intersection, and difference? Line segment intersection
algorithms and doubly connected edge lists will be useful here.
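The core primitive for both validation and overlay is a segment intersection test based on orientation signs. A basic floating-point sketch (robust/exact arithmetic is a project of its own):

```python
def orient(p, q, r):
    """Sign of the cross product (q-p) x (r-p): +1 left turn,
    -1 right turn, 0 collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def segments_intersect(a, b, c, d):
    """True if closed segments ab and cd share at least one point."""
    o1, o2 = orient(a, b, c), orient(a, b, d)
    o3, o4 = orient(c, d, a), orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True  # proper crossing
    # Collinear endpoint-on-segment cases.
    def on_seg(p, q, r):
        return (orient(p, q, r) == 0 and
                min(p[0], q[0]) <= r[0] <= max(p[0], q[0]) and
                min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))
    return any((on_seg(a, b, c), on_seg(a, b, d),
                on_seg(c, d, a), on_seg(c, d, b)))
```

A sweep-line algorithm would then use this test to find all crossings among the layers' segments efficiently.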
Campus GPS Mapping
Collect some GPS data around campus or the greater Swat area and do
something cool. One possibility: map all walkways on campus. You may
need to smooth the lines to clean up noisy signals. How do you do
this? Douglas Peucker can be used to simplify clean data, but
simplifying noisy data may actually preserve noise instead of removing
it. Once you have the data, convert it to a graph representation. Also
map the entrances and exits to buildings. Then develop a tool to find
the shortest path between two given buildings. You could also imagine
building two graphs. The first is a "good weather graph". All edges
have a weight equal to their Euclidean distance. The second graph is
a "bad weather graph". Paths that are indoors (you can't map these
with GPS, so just guesstimate) have weight 0, while paths outdoors
have a cost proportional to their Euclidean distance. This Blaha
graph prefers indoor paths. Are there ways to walk inside on inclement
days that may be longer than walking outside, but keep you away from
the dangerous outdoors?
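The two-graph idea can be prototyped as one graph with two weight functions. Everything below (building names, edge lengths) is made up for illustration:

```python
import heapq

def shortest_path(edges, weight, src, dst):
    """Plain Dijkstra. edges maps node -> [(neighbor, edge_data), ...];
    weight(edge_data) picks the cost for the current scenario."""
    dist, pq = {src: 0.0}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float('inf')):
            continue
        for v, data in edges.get(u, ()):
            nd = d + weight(data)
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float('inf')

# Hypothetical campus graph; edge data is (length_m, indoors?).
edges = {
    'Sci':  [('Hall', (50.0, True)), ('Quad', (30.0, False))],
    'Hall': [('Lib', (60.0, True))],
    'Quad': [('Lib', (40.0, False))],
}
good_weather = lambda d: d[0]                   # pure distance
bad_weather = lambda d: 0.0 if d[1] else d[0]   # indoor edges are free
```

Running both weightings on the same source/destination pair shows how the preferred route flips from outdoor shortcuts to indoor corridors.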