Note that the paper links below will only work when accessed from a UC
Berkeley campus IP address, because of publisher copyright
restrictions.
And now, onto the projects ...
- Optic flow.
Optic flow is the computation of the direction and magnitude of
the movement of patterns in a 2-D image. Key papers in the 1980s
defined several approaches to computing optic flow. This project idea
is to choose one of the algorithms, and implement it as a vision
processor. The processor takes a pair of 2-D images as input, and produces a 2-D
optic flow field as output.
The variational approach to optic flow (Horn and Schunck) is described
in this paper.
Related theory is described in this
paper.
FPGA implementations of the algorithm are described in these two
papers (one,
two).
A Matlab implementation is available
here. This video shows the algorithm
in action.
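To get a concrete feel for the iteration, here is a minimal single-scale NumPy sketch of the Horn-Schunck update (a toy illustration using simple finite-difference derivatives, not a reference implementation; parameter values are illustrative):

```python
import numpy as np

def horn_schunck(im1, im2, alpha=1.0, n_iter=100):
    # Single-scale Horn-Schunck optic flow between two grayscale frames.
    # Each iteration replaces the flow with its neighbourhood average,
    # corrected along the image gradient to satisfy brightness constancy.
    im1 = im1.astype(float)
    im2 = im2.astype(float)
    # Derivatives averaged over both frames (simple finite differences).
    Ix = 0.5 * (np.gradient(im1, axis=1) + np.gradient(im2, axis=1))
    Iy = 0.5 * (np.gradient(im1, axis=0) + np.gradient(im2, axis=0))
    It = im2 - im1
    u = np.zeros_like(im1)
    v = np.zeros_like(im1)

    def neighbor_avg(f):
        # 4-neighbour average with replicated borders.
        p = np.pad(f, 1, mode="edge")
        return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

    for _ in range(n_iter):
        ubar, vbar = neighbor_avg(u), neighbor_avg(v)
        t = (Ix * ubar + Iy * vbar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = ubar - Ix * t
        v = vbar - Iy * t
    return u, v
```

The weight alpha trades off smoothness of the flow field against fidelity to the brightness-constancy constraint; the fixed per-pixel update pattern is part of what makes the algorithm attractive for hardware.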
The least-squares approach to optic flow (Lucas and Kanade) is
described in
this paper.
An FPGA implementation of the algorithm is described in this
paper.
A Matlab implementation is available
here.
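For comparison, here is a minimal NumPy sketch of the single-scale Lucas-Kanade least-squares solve (uniform window weights assumed; the paper allows weighted windows, and real systems add image pyramids to handle large motions):

```python
import numpy as np

def lucas_kanade(im1, im2, win=7):
    # Lucas-Kanade optic flow: per-pixel least squares over a square
    # window. Solves the 2x2 normal equations built from windowed sums
    # of image-derivative products.
    im1 = im1.astype(float)
    im2 = im2.astype(float)
    Ix = 0.5 * (np.gradient(im1, axis=1) + np.gradient(im2, axis=1))
    Iy = 0.5 * (np.gradient(im1, axis=0) + np.gradient(im2, axis=0))
    It = im2 - im1
    r = win // 2

    def wsum(f):
        # Window sum via shifted copies (wraps at the borders; a real
        # implementation would pad instead).
        out = np.zeros_like(f)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += np.roll(np.roll(f, dy, axis=0), dx, axis=1)
        return out

    Sxx, Sxy, Syy = wsum(Ix * Ix), wsum(Ix * Iy), wsum(Iy * Iy)
    Sxt, Syt = wsum(Ix * It), wsum(Iy * It)
    det = Sxx * Syy - Sxy ** 2
    ok = det > 1e-9                 # ill-conditioned pixels (aperture problem)
    d = np.where(ok, det, 1.0)
    u = np.where(ok, -(Syy * Sxt - Sxy * Syt) / d, 0.0)
    v = np.where(ok, -(Sxx * Syt - Sxy * Sxt) / d, 0.0)
    return u, v
```

Note the contrast with Horn-Schunck: there is no iteration and no coupling between pixels beyond the window sums, which makes the dataflow very regular.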
The correlation
approach (Little, Bulthoff, Poggio) is described in these two papers
(one,
two).
A board-level hardware implementation is described
here.
It may also be helpful to read this tutorial on optic flow, although only the 2-D
sections of the paper are applicable to us. To get a sense of what a
present-day version of an optic flow algorithm looks like, you may
want to read this recent paper by Brox and Malik.
- Histogram Filters.
Histogram filters are a general class of nonlinear image processing
algorithm. A well-known histogram filter is the median filter.
For the class project, we focus on a recent formulation of histogram
filters from Pixar (Kass and Solomon). The algorithm generalizes the concept in
an efficient way, while producing beautiful pictures (as one would
expect from Pixar). This paper
describes their algorithm. A key reference in their paper
(this paper by Deriche)
describes the approximation method that produces an efficient filter.
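As a concrete example of the classic case, here is a brute-force sliding-window median filter in NumPy (illustrative only; efficient implementations maintain running histograms rather than examining every window from scratch, and the Kass-Solomon paper generalizes well beyond the median):

```python
import numpy as np

def median_filter(img, win=3):
    # Brute-force sliding-window median filter, the classic histogram
    # filter. Borders are handled by edge replication.
    r = win // 2
    p = np.pad(img, r, mode="edge")
    # Stack every window offset along a new axis, then take the median
    # across that axis; one slice per (dy, dx) offset in the window.
    stack = np.stack([p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
                      for dy in range(win) for dx in range(win)])
    return np.median(stack, axis=0)
```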
- Face Detection.
Face detection systems take complicated real-world images
as input, and draw bounding boxes around all faces
found in the image.
The Viola-Jones algorithm is a face detector suitable for
implementation as a class project. This paper describes the algorithm, and this
paper describes
an FPGA implementation of the algorithm.
The OpenCV toolkit includes an implementation of the Viola-Jones
algorithm, including an XML file that codes the learned parameters for
the recognition cascade. It may be best to start by reading
this documentation about the algorithm in OpenCV,
followed by this
help page.
To view the
code (including the XML file), download OpenCV and
look for apps/HaarFaceDetect.
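The constant-time Haar-feature evaluation at the heart of Viola-Jones rests on the integral image (summed-area table). A minimal NumPy sketch of that data structure and one two-rectangle feature (the cascade itself is not shown):

```python
import numpy as np

def integral_image(img):
    # Summed-area table with a zero guard row/column, so box sums
    # need no bounds checks: ii[y, x] = sum of img[:y, :x].
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, y, x, h, w):
    # Sum of img[y:y+h, x:x+w] from four table lookups, as in the
    # Viola-Jones paper.
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect(ii, y, x, h, w):
    # A two-rectangle Haar feature: left half minus right half.
    return box_sum(ii, y, x, h, w // 2) - box_sum(ii, y, x + w // 2, h, w // 2)
```

Because every feature reduces to a handful of table lookups and additions, the evaluation maps naturally onto fixed-function hardware.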
- Edge Detection.
Edge detection is the automatic tracing of the contours of an image,
similar to the pencil sketch an artist makes. It has been a
classic problem since the early days of computer vision, and to
this day remains a research problem.
For the class project, we focus on a particular edge detection
algorithm by Perona and Malik, described in this
paper.
Matlab source code is available here.
We choose this paper because it casts edge detection as the
solution of a nonlinear heat (diffusion) equation.
Differential equation solvers are a good fit for parallel hardware.
For a current-day state of the art edge-detection system,
see this recent paper by Arbelaez, Maire, Fowlkes, and Malik.
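The discrete update in the Perona-Malik paper is simple enough to sketch directly. Here is a minimal NumPy version of the 4-neighbour explicit scheme with the exponential conductance (parameter values are illustrative):

```python
import numpy as np

def perona_malik(img, n_iter=20, kappa=15.0, dt=0.2):
    # Perona-Malik anisotropic diffusion: an explicit heat-equation-style
    # update whose conductance g falls off with the local gradient, so
    # smoothing stops at strong edges.
    def g(d):
        # Exponential conductance from the paper; kappa sets the
        # gradient magnitude treated as an "edge".
        return np.exp(-(d / kappa) ** 2)

    u = img.astype(float).copy()
    for _ in range(n_iter):
        p = np.pad(u, 1, mode="edge")
        dn = p[:-2, 1:-1] - u   # difference to the north neighbour
        ds = p[2:, 1:-1] - u    # south
        de = p[1:-1, 2:] - u    # east
        dw = p[1:-1, :-2] - u   # west
        u = u + dt * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u
```

The per-pixel stencil is identical at every site and every timestep, which is exactly the structure that parallel hardware exploits.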
- SIFT Features.
Image pattern recognition systems often consist of two parts: an
image processing system that converts the image to a collection
of abstract features, and a pattern recognition engine that uses
the feature representation to perform a task.
A popular feature set is SIFT (scale-invariant feature transform), described in this
paper. Your
class project would be to design a chip that takes images as
input, and produces the SIFT representation as output.
There are many open-source SIFT implementations available on
the web; use Google to find the one most useful to bootstrap your
project.
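To give a flavor of the image-processing front end, here is a minimal NumPy sketch of a difference-of-Gaussians stack, the first stage of SIFT keypoint detection (one octave only; keypoint localization, orientation assignment, and descriptor computation are not shown):

```python
import numpy as np

def gaussian_blur(img, sigma):
    # Separable Gaussian blur with a truncated, normalized kernel.
    r = int(3 * sigma + 0.5)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    p = np.pad(img, r, mode="edge")
    # Convolve rows, then columns (each 'valid' pass removes the pad).
    rows = np.apply_along_axis(lambda m: np.convolve(m, k, "valid"), 1, p)
    return np.apply_along_axis(lambda m: np.convolve(m, k, "valid"), 0, rows)

def dog_stack(img, sigma0=1.6, k=2 ** 0.5, n=4):
    # One octave of differences of Gaussians; SIFT keypoints are local
    # extrema of these layers across both space and scale.
    blurs = [gaussian_blur(img, sigma0 * k ** i) for i in range(n + 1)]
    return [b2 - b1 for b1, b2 in zip(blurs, blurs[1:])]
```

A chip implementation would stream the image through a bank of such blur/difference stages; the repeated separable convolutions dominate the arithmetic.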
- Stereopsis.
A stereo vision module takes in two images (corresponding to the left
and right eyes in biological vision) and produces a map of the
relative distances (or depths) that we perceive when we
use two eyes to view a scene (as opposed to when we close one eye).
The classic algorithm for computing stereo vision is the Marr-Poggio
stereo algorithm, as described in this short paper and this longer paper. A robot vision system based on
this algorithm is described in this
paper. An analog circuit implementation is
described in this paper.
Stereo vision algorithms whose code is available on the web tend
to be the algorithms that improved on the Marr-Poggio approach.
Google the terms "stereo algorithm matlab" to find many examples.
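Many of those later algorithms are correlation-based block matchers. Here is a minimal NumPy sketch of that idea, using sum-of-squared-differences matching over integer disparities (a generic illustration, not the Marr-Poggio cooperative algorithm):

```python
import numpy as np

def block_match_disparity(left, right, max_disp=8, win=5):
    # For each pixel in the left image, find the horizontal shift of the
    # right image that minimizes the sum of squared differences over a
    # square window. Returns an integer disparity map.
    r = win // 2
    h, w = left.shape
    best_cost = np.full((h, w), np.inf)
    disp = np.zeros((h, w), dtype=int)
    for d in range(max_disp + 1):
        # Shift the right image by d pixels (replicating the border).
        shifted = np.empty_like(right)
        shifted[:, d:] = right[:, :w - d]
        shifted[:, :d] = right[:, :1]
        err = (left - shifted) ** 2
        # Window-sum the error with padding + shifted slices.
        p = np.pad(err, r, mode="edge")
        cost = np.zeros_like(err)
        for dy in range(win):
            for dx in range(win):
                cost += p[dy:dy + h, dx:dx + w]
        better = cost < best_cost
        disp[better] = d
        best_cost[better] = cost[better]
    return disp
```

The disparity loop is embarrassingly parallel across pixels and candidate shifts, which is why block matchers appear so often in hardware stereo systems.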