Instructor: Tejas Gokhale (OH: WED 1430--1530 or by appointment; ITE 214);
Teaching Assistant: TBD;
Time: MON and WED 1600--1715
Location:
This class will offer a comprehensive overview of neural networks and deep learning algorithms. Deep Learning has been a highly successful research field over the last 20 years across a range of domains (vision, language, audio, robotics; ``AI'' in general). Deep learning has led to significant commercial success and exciting new directions that may previously have seemed out of reach. The class will focus on the core principles of extracting meaningful representations from high-dimensional data, which are fundamental to many applications in autonomous decision making. Class lectures will cover fundamental topics such as network design, training and optimization, and evaluation. Homework assignments will give students the opportunity to implement algorithms learned in class for applications in visual recognition, language understanding, and other domains. In the term project, students will construct a research hypothesis, propose new techniques and solutions, interpret results, and communicate key findings.
Prerequisites: We will assume that you have a basic (but solid) grounding in linear algebra, geometry, probability, and Python programming. Recommended classes at UMBC are: MATH 221 (Linear Algebra), STAT 355 or CMPE 320 (Probability and Statistics), and MATH 151 (Calculus and Analytic Geometry). If you are unfamiliar with linear algebra or calculus, you should consider taking those courses first: without these tools, you are likely to struggle with the course. Although we will provide brief refreshers of the necessary math, CMSC 475/675 should not be your first introduction to these topics.
Schedule is tentative and subject to change.
# | Topic | Resources | Optional Reading
1 | Introduction | [slides] |
2 | Machine Learning Review | [slides] |
3 | Multi-Layer Perceptron | [slides] |
4 | NN Training | [slides] |
5 | Convolutional Neural Networks | [slides] |
6 | Autoencoders | [slides] |
7 | Self-Supervised Learning | [slides] |
8 | Language Models | [slides] |
9 | Vision Transformers | [slides] |
10 | Autoencoders | [slides] |
11 | Multimodal Deep Learning | [slides] |
12 | Generative Adversarial Networks | [slides] |
13 | Diffusion Models | [slides] |
14 | Implicit Neural Representations | [slides] |
15 | Neural Radiance Fields | [slides] |
16 | Neural Operators | [slides] |
17 | Distributional Robustness | [slides] |
18 | Adversarial Robustness | [slides] |
Resources