Instructor: Tejas Gokhale (OH: Wednesday 2:30 PM - 3:30 PM or by appointment; ITE 214)
Teaching Assistant: TBD
Time: Monday and Wednesday 4:00 PM - 5:15 PM
Location: ITE 229
This course offers a comprehensive introduction to computer vision, a field whose broad goal is to understand visual signals (images and videos) for low-, mid-, and high-level perceptual tasks. It introduces the fundamental principles and concepts for developing computer vision systems, including image formation, acquisition, and processing; stereo and 3D vision; and machine learning algorithms and neural networks for image understanding.
Prerequisites:
We assume that you have basic (but solid) expertise in linear algebra, geometry, probability, and Python programming.
Recommended classes at UMBC are: MATH 221 (Linear Algebra), STAT 355 or CMPE 320 (Probability and Statistics), MATH 151 (Calculus and Analytical Geometry).
If you are unfamiliar with linear algebra or calculus, you should consider taking both first; without these tools, you are likely to struggle with the course. Although we will provide brief math refreshers on these topics, CMSC 491/691 should not be your first introduction to them.
We understand that some students may have prior exposure to signal, image, or audio processing, computer graphics, machine learning, neural networks, etc. However, none of these are prerequisites; the class is designed to be self-contained.
Reference Books:
There are no required textbooks. The following books may be useful to accompany the lectures:
The schedule is tentative and subject to change.
# | Topic | Resources | Additional Reading
1 | Introduction | |
2 | Image Formation and Acquisition | Szeliski Ch 2 |
3 | Image Filtering I | Szeliski Ch 3 |
4 | Image Filtering II | Szeliski Ch 3 |
5 | Image Features I | Szeliski Ch 3 |
6 | Image Features II | Torralba Ch 3; SIFT paper |
7 | Machine Learning for Computer Vision I | Szeliski Ch 5, Goodfellow Ch 5 |
8 | ML-for-CV II (Neural Networks) | Goodfellow Ch 6.1-6.4, Shree Nayar FPCV Playlist |
9 | ML-for-CV III (Gradient Descent) | Goodfellow Ch 6.5, Roger Grosse: Optimization Notes |
10 | Visual Recognition | Szeliski Ch 6, Forsyth Ch 18 |
11 | Image Transformations | Szeliski 3.6 |
12 | Homographies | Hartley & Zisserman Part 0 (Ch 2, 3, 4), SVD Tutorial |
13 | Camera Models | Hartley & Zisserman Part 1 (Ch 6, 7, 8) |
14 | Triangulation & Epipolar Geometry | |
15 | Stereo Vision | |
16 | Vision-and-Language | |
17 | Image Synthesis | |
18 | Robustness | |
19 | Motion | |