CS194-26/294-26: Intro to Computer Vision and Computational Photography
INSTRUCTORS:
Alexei (Alyosha) Efros (Office hours: after lecture), Angjoo Kanazawa (Office hours: after lecture)
GSIs: Evonne Ng (Office hours: Tue. 2PM - 3PM & Thu. 11AM - 12PM), Ruilong Li (Office hours: Tue. 2PM - 3PM & Thu. 11AM - 12PM)
Tutors: Kamyar Salahi (Office hours: Wed. 1PM - 2PM), Jerry Ma (Office hours: Mon. 2PM - 3PM), Jason Ding (Office hours: Wed. 3PM - 4PM), Jeffrey Shen (Office hours: Thu. 1PM - 2PM)
UNIVERSITY UNITS: 4
SEMESTER: Fall 2022
WEB PAGE: http://inst.eecs.berkeley.edu/~cs194-26/fa22/
Google Calendar: c_mrcbcejculdl42mh9h8kk5hem4@group.calendar.google.com (Public URL)
Piazza: https://piazza.com/berkeley/fall2022/cs1942629426
Gradescope Entry Code: E7JNYB
Syllabus: here
LOCATION: Dwinelle 145
TIME: MW 5:00 PM-6:30 PM
PREREQUISITES:
This is a heavily project-oriented class, so good programming proficiency (at least CS61B) is absolutely essential.
Moreover, familiarity with linear algebra (MATH 54 or EE16A/B or Gilbert Strang's online class) and calculus
is vital. Experience with neural networks (e.g. CS182 or equivalent) is strongly recommended. Due to the open-endedness of this course, creativity is a class requirement.
COURSE DESCRIPTION:
The aim of this advanced undergraduate course is to introduce students to computing with visual data (images and video).
We will cover acquisition, representation, and manipulation of visual information from digital photographs (image processing),
image analysis and visual understanding (computer vision), and image synthesis (computational photography).
Key algorithms will be presented, ranging from classical (e.g. Gaussian and Laplacian Pyramids) to contemporary (e.g. ConvNets, GANs),
with an emphasis on using these techniques to build practical systems. This hands-on emphasis will be reflected in the programming assignments,
in which students will have the opportunity to acquire their own images and develop, largely from scratch, the image analysis and synthesis tools for solving applications.
PROGRAMMING ASSIGNMENTS:
Project 1: Images of the Russian Empire -- Colorizing the Prokudin-Gorskii Photo Collection
Class Choice Awards:
Janise Liang
Project 2: Fun with Filters and Frequencies
Class Choice Awards:
Skylar Sarabia
Project 3: Face Morphing and Modelling a Photo Collection
Class Choice Awards:
Joshua Chen
Project 4: (Auto)stitching and photo mosaics
Project 5: Facial Keypoint Detection with Neural Networks
TEXTBOOK:
We will be loosely using the new 2nd edition of Rick Szeliski's Computer Vision textbook. The latest draft is available off the textbook's website. If you find a bug or a typo, please e-mail Rick for a chance to get an acknowledgement in the finished book! The first edition is still available at the bookstore, but it's missing some important things, like discussion of Convolutional Neural Networks.
There are a number of other fine texts that you can use for general reference:
Computer Vision: A Modern Approach (2nd edition), Forsyth and Ponce (classic computer vision text)
Vision Science: Photons to Phenomenology, Stephen Palmer (great book on human visual perception)
Digital Image Processing, 2nd edition, Gonzalez and Woods (a good general image processing text)
Linear Algebra and its Applications, Gilbert Strang (a truly wonderful book on linear algebra)
CLASS NOTES:
The instructor is extremely grateful to a large number of researchers for making their slides available for use in this course. Steve Seitz and Rick Szeliski have been particularly kind in letting me use their wonderful lecture notes. In addition, I would like to thank Paul Debevec, Stephen Palmer, Paul Heckbert, David Forsyth, Steve Marschner and others, as noted in the slides. The instructor gladly gives permission to use and modify any of the slides for academic and research purposes. However, please do also acknowledge the original sources where appropriate.
CLASS SCHEDULE:
CLASS DATE | TOPICS | MATERIAL
Aug | Introduction |
Aug 29 | Capturing Light... in man and machine |
Sep 1 | Point Processing & Filtering |
| Convolution and Image Derivatives |
Sep 12 | The Frequency Domain |
Sep 14 | Pyramid Blending, Templates, NL Filters |
Sep 19 | Image Transformations |
Sep 21 | Image Warping and Morphing |
Sep 26 | Data-driven Methods: Faces |
Sep 28 | The Camera |
Oct 3 | Homographies and Mosaics |
Oct 5 | Automatic Image Alignment |
Oct 10 | Automatic Image Alignment + Optical Flow |
Oct 12 | Visual Texture (in human and machine) |
Oct 17 | Feature Learning with Neural Networks |
| Convolutional Neural Networks |
Oct 24 | Convolutional Neural Networks II |
Oct 26 | Sequence Models for words and pixels |
Oct 31 | Generative Models |
Nov 2 | 3D Vision: Calibration, Stereo |
Nov 7 | 3D Vision: Calibration, Stereo |
Nov 9 | 3D Vision: Calibration, Stereo |
Nov 14 | Multi-Perspective Panoramas |
Nov 21 | What Makes a Great Picture? |
Nov 30 | Neural Radiance Fields 1 |
Dec 7 | Neural Radiance Fields 2 |
CAMERAS:
Although it is not required, students are highly encouraged to obtain a digital
camera for use in the course.
METHOD OF EVALUATION:
Grading will be based on a set of programming and written assignments (60%), a midterm exam on Wed. 11/16 plus potentially some pop quizzes (20%), and a final project due on Fri. 12/09 (20%).
For the programming assignments, students will be allowed a total of 5
(five) late days per semester; each additional late day will incur a 10%
penalty.
Students taking CS294-26 will also be required to
submit a conference-style paper describing their final project.
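As a rough illustration of the arithmetic in the late-day policy and grading weights above, here is a minimal Python sketch. It is not the official grading script. The function names (`apply_late_policy`, `course_total`) are hypothetical, and several details are assumptions not stated on this page: that free late days are consumed in submission order, that the 10% penalty is taken off that assignment's raw score for each late day beyond the free five, that a score never drops below zero, and that each grading component is already averaged to a 0-100 scale.

```python
FREE_LATE_DAYS = 5            # free late days per semester (from the policy above)
PENALTY_PERCENT_PER_DAY = 10  # penalty per additional late day (from the policy above)


def apply_late_policy(submissions):
    """Return adjusted scores for a list of (raw_score, days_late) pairs.

    Assumptions (not stated in the syllabus): free late days are used up in
    submission order, and the penalty is a percentage of the raw score.
    """
    remaining_free = FREE_LATE_DAYS
    adjusted = []
    for raw_score, days_late in submissions:
        # Cover as many late days as possible with the remaining free days.
        covered = min(days_late, remaining_free)
        remaining_free -= covered
        penalized_days = days_late - covered
        # Cap the penalty at 100% so a score is never negative.
        penalty_percent = min(100, PENALTY_PERCENT_PER_DAY * penalized_days)
        adjusted.append(raw_score * (100 - penalty_percent) / 100)
    return adjusted


def course_total(assignments, exams, final_project):
    """Combine component averages (each 0-100) with the 60/20/20 weights."""
    return (60 * assignments + 20 * exams + 20 * final_project) / 100


if __name__ == "__main__":
    # Three projects submitted 2, 3, and 2 days late: the first two use up
    # all five free days, so the third is penalized for 2 days (20%).
    print(apply_late_policy([(100, 2), (90, 3), (80, 2)]))  # [100.0, 90.0, 64.0]
    print(course_total(90, 80, 100))                         # 90.0
```

Under these assumptions, the order in which late days are spent matters: spending them early leaves later projects exposed to the penalty.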
PROGRAMMING RESOURCES:
Students will be encouraged to use either Python (with either scikit-image or OpenCV) or MATLAB (with the Image Processing Toolbox) as their primary computing platform. Specific libraries in both languages offer many built-in image processing
functions. Here is a link to some useful MATLAB and Python resources compiled for this class.
PREVIOUS OFFERINGS OF THIS COURSE:
Previous offerings of this course can be found here.
SIMILAR COURSES IN OTHER UNIVERSITIES:
Page design courtesy of Doug James