Introduction to Machine Learning

CMSC 678

Fall 2022

Contact Information

Instructor: Tim Oates, oates@umbc.edu
Office: ITE-336

TA: Mohammad Eskandari, eskandari@umbc.edu
Office hours: Thursdays 10:00am - 1:00pm

We will rely on Slack for asynchronous communication. Please use this link to sign up for the class Slack. You can use Slack for discussions between yourselves and to ask questions of me or the TA. If you want to ensure that I see your post, please use @oates in it so that I get a notification. Most discussions tend to take place in the general channel, but feel free to ask me to create other channels.

Grading

Grades will be based on a midterm exam, a final exam, a project, and five homework assignments. The homeworks are crucial for solidifying what you learn in class.

The weights on the various items are as follows:

I will use plus/minus grading. Grades will be assigned as follows based on your class average:

Note that, for example, [80, 83) means the interval that includes 80 but not 83.
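As a small illustration of the interval notation, the check below treats the lower bound as included and the upper bound as excluded. The function name `in_half_open` is hypothetical and used only for illustration; it says nothing about which letter grade a given interval maps to.

```python
def in_half_open(average: float, low: float, high: float) -> bool:
    """Return True if average lies in [low, high): low is included, high is not."""
    return low <= average < high


# A class average of exactly 80 falls inside [80, 83); exactly 83 does not.
lower_edge_included = in_half_open(80.0, 80.0, 83.0)
upper_edge_included = in_half_open(83.0, 80.0, 83.0)
```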

Professor Oates grades the exams and projects, and the TA grades the homeworks. All assignments will be submitted to the person grading them via Slack. If you have questions about grading on homeworks, ask the TA first. If the question cannot be resolved that way, ask Professor Oates. All questions about grades on a homework must be dealt with before grades on the next homework are out.

Late Policy

All assignments (homeworks and the various components of the course project) must be turned in by 11:59PM Eastern time on the date that they are due. I understand that students have many demands on their time that vary in intensity over the course of the semester. Therefore, you will be allowed 5 late days without penalty for the entire semester. You can turn in 5 different assignments one day late each, or one assignment 5 days late, and so on. Late days cannot be used for exams.

Once the late days are used, a penalty of 33% will be imposed for each day (or fraction thereof) that an assignment is late (33% for one day, 66% for two, 100% for three or more). An assignment is late by one day if it is not turned in at the beginning of class on the day that it is due. It is late by two days if I do not have it by 2:30pm the following day, and so on. It is your responsibility to keep track of how many late days you have used.
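The arithmetic of the late policy can be sketched as follows. This is only an illustration under stated assumptions, not an official grading tool: it assumes lateness is measured in hours past the applicable deadline, that any fraction of a day counts as a full day, and that remaining free late days are applied automatically before any penalty. The names `penalty_percent` and `apply_late_policy` are hypothetical.

```python
import math
from typing import Tuple

FREE_LATE_DAYS = 5  # penalty-free late days per semester, from the policy above


def penalty_percent(penalized_days: int) -> int:
    """Penalty once free late days are gone: 33% per day, 100% at three or more."""
    if penalized_days <= 0:
        return 0
    if penalized_days >= 3:
        return 100
    return 33 * penalized_days


def apply_late_policy(hours_late: float, free_days_left: int) -> Tuple[int, int]:
    """Return (penalty percent, free late days remaining) for one assignment.

    Assumption: free late days are consumed first, one per day (or fraction) late.
    """
    days_late = math.ceil(hours_late / 24) if hours_late > 0 else 0
    free_used = min(days_late, free_days_left)
    return penalty_percent(days_late - free_used), free_days_left - free_used
```

For example, under these assumptions an assignment turned in 30 hours late counts as two days late: with all five free days available it costs two of them and no points, while with none left it incurs a 66% penalty.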

If something arises, like a serious illness, and you'll need more time with an assignment, let Professor Oates know before the assignment is due. But note that the late days are meant to be used precisely for things like minor illnesses.

Masking

It is currently UMBC policy that everyone wear a mask in classroom settings. I will bring a few unused masks to class with me. If I see a student not wearing a mask I'll ask them to put one on. If they don't have one I'll give them one of mine. If I run out of extra masks and no other students have a mask to share, students without masks will be asked to leave the classroom for that day.

Project

The project is meant to give students deeper exposure to some topic in machine learning than they would get from the lectures, readings, and discussions alone. The most successful projects often blend the student's own research with machine learning, e.g., by applying machine learning techniques to a problem in some other area, or by bringing an insight from some other area to a problem in machine learning. However, projects need not involve ties with ongoing research. Many good projects in the past have investigated the application of existing algorithms to a domain/dataset of interest to the student, such as Texas Hold'em, the stock market, sporting events, and so on. Students can come up with their own project ideas or they can come see me and we'll brainstorm project ideas.

Projects may be done by individuals or teams of two people. However, teams of two will be expected to do significantly more work than what is expected of an individual project. More information on projects can be found here.

Academic Honesty

By enrolling in this course, each student assumes the responsibilities of an active participant in UMBC’s scholarly community in which everyone’s academic work and behavior are held to the highest standards of honesty. Cheating, fabrication, plagiarism, and helping others to commit these acts are all forms of academic dishonesty, and they are wrong. Academic misconduct could result in disciplinary action that may include, but is not limited to, suspension or dismissal. To read the full Student Academic Conduct Policy, consult UMBC policies, or the Faculty Handbook (Section 14.3). For graduate courses, see the Graduate School website.

I will actively monitor all student work, as will the TA, for instances of academic misconduct. The penalty for any such misconduct on the first instance will be a zero on the assignment. The penalty for the second instance will be an F in the class. I will report all instances of academic misconduct to the graduate school.

Textbook

We will use a variety of online sources during this course.

Tools

You can use any programming language and any toolset for homeworks and your projects, but Python has (almost) become the default language for machine learning at scale. Therefore, all of the examples that I do in class where we run an actual algorithm will be done in Python using scikit-learn. A very easy way to get everything you may need is to install Anaconda. It includes Python, scikit-learn, and Jupyter notebooks for working with data and presenting results.

Syllabus

This syllabus is subject to small changes, but due dates and exam dates will not change. Note that for each topic there will be two sets of readings - some that everyone should do (marked all) and some that are optional (marked opt). You will only be held responsible for what is in the readings that everyone does, but if you want more information the optional readings are a good source.

Class  Date  Topic  Events/Readings
1 Thu  Sep 1 Course overview; What is machine learning? Read ESL Ch 1 (all)
ESL Ch 2 (opt)
2 Tue  Sep 6 Probability, loss functions, decision theory (slides) CIML Ch 2, ITILA Ch 2 (all)
UML Ch 2, UML Ch 14, ITILA Ch 36 (opt)
3 Thu  Sep 8 Linear regression, classification, perceptrons (slides) CIML Ch 7 (linear models), CIML Ch 4 (perceptrons) (all)
ESL Ch 3, UML Ch 9.2 (opt)
Homework 1 out
4 Tue  Sep 13
5 Thu  Sep 15
6 Tue  Sep 20 Logistic Regression - Slides
7 Thu  Sep 22 Decision trees (Slides, reading) Homework 1 due
Homework 2 out
8 Tue  Sep 27 Ensembles Boosting slides
9 Thu  Sep 29 Experimental setup, Multi-class vs. Multi-label, Evaluation (slides) CIML Ch 9.5-9.7, ESL Ch 4.4 (all)
UML Ch 9.3, ITILA Ch 39, 41.1-41.3 (opt)
10 Tue  Oct 4 Logistic regression, MaxEnt models (slides)
11 Thu  Oct 6 Neural networks, backpropagation (slides) Homework 2 due
CIML Ch 10; Goodfellow et al. (2016), Ch 6 (Deep Feedforward Networks) (all)
ESL Ch 11, UML Ch 20, ITILA Ch 38-39 (opt)
12 Tue  Oct 11 Homework 3 out
13 Thu  Oct 13 Recurrent neural networks (slides) Goodfellow et al. (2016), Ch 10 (RNNs)
14 Tue  Oct 18 Convolutional neural networks Goodfellow et al. (2016), Ch 9 (CNNs)
15 Thu  Oct 20 Dimensionality reduction
16 Tue  Oct 25 Midterm review Homework 3 due
17 Thu  Oct 27 Dimensionality reduction (continued) (slides) Project proposal due
18 Tue  Nov 1 Midterm Exam on content of classes 1 - 14
19 Thu  Nov 3 k-Nearest neighbors, k-Means clustering (slides) Homework 4 out
20 Tue  Nov 8 Kernel methods (slides)
21 Thu  Nov 10 Support vector machines
22 Tue  Nov 15 Expectation maximization
23 Thu  Nov 17 Probabilistic modeling slides
24 Tue  Nov 22 Graphical models (slides)
Thu  Nov 24 Thanksgiving holiday, no class
25 Tue  Nov 29 Homework 4 due
Homework 5 out
26 Thu  Dec 1
27 Tue  Dec 6 Reinforcement learning Slides: 1, 3, 4, 5, 6
28 Thu  Dec 8
29 Tue  Dec 13 Final exam review Homework 5 due
Thu  Dec 15 Final Exam 1:00PM - 3:00PM
Thu  Dec 22 by 11:59PM via Slack to Professor Oates Final project writeup due