CMSC 472/672: Computer Vision


Instructor: Tejas Gokhale (Office hours: Wednesday 2:30 PM - 3:30 PM or by appointment; ITE 214)
Teaching Assistant: TBD
Time: Monday and Wednesday 4:00 PM - 5:15 PM
Location: ITE 229



Course description

This course offers a comprehensive introduction to computer vision, a field with the broad goal of understanding visual signals (images and videos) for low-, mid-, and high-level perceptual tasks. It introduces fundamental principles and concepts for developing computer vision systems, including image formation, acquisition, and processing; stereo and 3D vision; and machine learning algorithms and neural networks for image understanding.

Prerequisites: We assume that you have a basic but solid grounding in linear algebra, geometry, probability, and Python programming. Recommended classes at UMBC are: MATH 221 (Linear Algebra), STAT 355 or CMPE 320 (Probability and Statistics), and MATH 151 (Calculus and Analytical Geometry). If you are unfamiliar with linear algebra or calculus, you should consider taking both: without these tools, you are likely to struggle with the course. Although we will provide brief math refreshers on the necessary topics, this course should not be your first introduction to them.
We understand that some students may have prior exposure to signal/image/audio processing, computer graphics, machine learning, neural networks, etc. However, none of these are prerequisites -- the class is designed to be self-contained.

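Reference Books

There are no required textbooks. The books cited under Additional Reading in the schedule below (for example Szeliski, Goodfellow, and Hartley & Zisserman) may be useful to accompany the lectures.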


Schedule

Schedule is tentative and subject to change.

Lecture | Topic | Additional Reading
1 | Introduction |
2 | Image Formation and Acquisition | Szeliski Ch 2
3 | Image Filtering I | Szeliski Ch 3
4 | Image Filtering II | Szeliski Ch 3
5 | Image Features I | Szeliski Ch 3
6 | Image Features II | Torralba Ch 3; SIFT paper
7 | Machine Learning for Computer Vision I | Szeliski Ch 5; Goodfellow Ch 5
8 | ML-for-CV II (Neural Networks) | Goodfellow Ch 6.1-6.4; Shree Nayar FPCV Playlist
9 | ML-for-CV III (Gradient Descent) | Goodfellow Ch 6.5; Roger Grosse: Optimization Notes
10 | Visual Recognition | Szeliski Ch 6; Forsyth Ch 18
11 | Image Transformations | Szeliski 3.6
12 | Homographies | Hartley & Zisserman Part 0 (Ch 2, 3, 4); SVD Tutorial
13 | Camera Models | Hartley & Zisserman Part 1 (Ch 6, 7, 8)
14 | Triangulation & Epipolar Geometry |
15 | Stereo Vision |
16 | Vision-and-Language |
17 | Image Synthesis |
18 | Robustness |
19 | Motion |

Homework

All material for homework assignments (handouts, code, data, etc.) will be available to download from Blackboard.

Grading

The class has a mix of PhD, MS, and BS students. We believe that anyone with the above prerequisites and a will to learn will do well. Work hard, engage and participate in class, learn how to read, write, and present research articles, be creative in your projects, and seek help when needed!

Projects

Projects will be judged on the basis of relative growth (from where you start to where you end). Graduate projects should have an original and unique research hypothesis with potential for publication. Undergraduate students may also propose an original and unique research hypothesis, but may instead work on an idea provided by the instructors (i.e., you get to skip “ideation”), or on innovative applications or combinations of existing work.

Late Submission Policy

Everyone in this course has 10 late days to use as needed for personal reasons and emergencies. Do not use them as an excuse to procrastinate -- start working on your assignments early. See the syllabus for details.

Academic Integrity

Please read UMBC's policy on Academic Integrity. I take academic integrity seriously. I hope that we will never have to deal with violations -- they are never pleasant for anyone involved. Please also read the policies stated in the Syllabus.