## Syllabus

## Professor Rebecca Willett

willett@discovery.wisc.edu

3537 Engineering Hall

## TA Zeping Ren

zren43@wisc.edu

## Class logistics:

Class location: 1610 Engineering Hall

Class time: 11am-12:15pm Mondays and Wednesdays

Office hours: 4-5pm Mondays and 12-1pm Tuesdays in 3537 Engineering Hall, when classes are in session

TA Office hours: 3-4pm Mondays, 11am-noon Tuesdays, and 2-4pm Fridays in CS 6378

## Course Topics:

This course is an introduction to machine learning that focuses on matrix methods and features real-world applications ranging from classification and clustering to denoising and data analysis. Mathematical topics covered include: linear equations, regression, regularization, the singular value decomposition, and iterative algorithms. Machine learning topics include: the lasso, support vector machines, kernel methods, clustering, dictionary learning, neural networks, and deep learning. Students are expected to have taken a course in calculus and have exposure to numerical computing (e.g. Matlab, Python, Julia, R). Appropriate for graduate students or advanced undergraduates.

## Prerequisites:

(MATH 222 and (ECE 203 or CS 200, 300, 302)) or (graduate or professional standing)

## Textbook:

*Matrix Methods in Data Mining and Pattern Recognition* by Lars Eldén. The textbook is freely available to anyone on the UW-Madison network: http://epubs.siam.org/doi/book/10.1137/1.9780898718867. The textbook will be supplemented with additional notes and readings.

## Videos of lectures

## Tentative schedule:

- Vectors and Matrices in Machine Learning Models
  - Lecture 1 Notes
  - Read textbook sections 1.1, 1.2, 1.6, 2.1, 2.2, and 2.3.

- Vectors and Matrices in Machine Learning
  - Read textbook sections 2.4, 2.5, 2.6.
  - Lecture 2 Notes, Lecture 2 demo
  - Lecture 3 Notes

- Linear Systems and Least Squares (Face recognition)
- Linear Independence and Orthogonality (Classification)

**Exam 1: Oct. 11, 7:15-9:15pm in 1800 Engineering Hall**

- Singular Value Decomposition (Principal Component Analysis)
  - Lecture 10: Tikhonov and SVD
  - Lecture 11: The SVD
  - Lecture 12: SVD and PCA, lecture12.m
  - Lecture 13: More SVD
  - Lecture 14: Power Iterations and PageRank
  - Lecture 15: Matrix Completion, Deconvolution, and Random Variables

- The Lasso and Proximal Gradient Algorithms
  - Lecture 16: Iterative Solvers
  - Lecture 17: Proximal Gradient, lecture17.m
  - Lecture 18: LASSO

- Support Vector Machines and Hinge Loss
  - Lecture 19: SVM

- Stochastic Gradient Descent Algorithms

**Exam 2: Nov. 15, 7:15-9:15pm in 1800 Engineering Hall**

- Neural Networks and Deep Learning
- Kernel methods
- Unsupervised Learning and Clustering

## Learning Outcomes:

These outcomes apply to both graduate and undergraduate students enrolled in the class. Upon successful completion of this course, students will:

- Understand machine learning methods and algorithms through matrix-vector methods and optimization theory.
- Formulate a wide variety of machine learning problems as optimization models and solve them numerically. Understand practical implications of norm choice, regularization, and convexity.
- Investigate an applied machine learning topic not explicitly covered in class and produce a research project that explains, analyzes, and discusses the topic.

## Evaluation:

Graduate and undergraduate students will be expected to perform at the graduate level and will be evaluated equally. All students will be evaluated by regular homework assignments, exams, and a final project. The final grade will be allocated to the different components as follows:

- *Homework: 20%.* There are roughly weekly homework assignments (about 10 total). Homework problems include both mathematical derivations and proofs as well as more applied problems that involve writing code and working with real or synthetic data sets.
- *Exams: 40%.* Two midterm exams (20% each), to conclude Parts I and II. No final exam.
- *Final project: 40%.* Students will work in groups (up to 3 students per group) to investigate a machine learning problem or technique using tools learned in class.

Letter grades will be assigned using the following hard cutoffs:

- A: 93% or higher
- AB: 87% or higher
- B: 80% or higher
- BC: 70% or higher
- C: 60% or higher
- D: 50% or higher
- F: less than 50%

We reserve the right to curve the grades, but only in a fashion that would improve the grade earned by the stated rubric.
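As an illustration of how the weights and cutoffs above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that no curve is applied. The names `WEIGHTS`, `CUTOFFS`, `final_percentage`, and `letter_grade` are hypothetical and not part of any course tool.

```python
# Weights and cutoffs copied from this syllabus; all names are illustrative.
WEIGHTS = {"homework": 0.20, "exam1": 0.20, "exam2": 0.20, "project": 0.40}

# (minimum percentage, letter), ordered from highest cutoff to lowest.
CUTOFFS = [(93, "A"), (87, "AB"), (80, "B"), (70, "BC"), (60, "C"), (50, "D")]


def final_percentage(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Return the first letter whose cutoff the percentage meets, else F."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    # Made-up scores for a hypothetical student.
    example = {"homework": 90, "exam1": 80, "exam2": 85, "project": 95}
    pct = final_percentage(example)
    print(f"{pct:.1f}% -> {letter_grade(pct)}")
```

Because the cutoffs are hard, a score just below a threshold maps to the lower letter unless a curve raises it.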

## Academic integrity:

Students are strongly encouraged to work together on homework assignments, but each student must submit his or her own writeup. Plagiarism of material written by classmates, book or article authors, or web posters is prohibited. Students must work independently on exams. Academic integrity will be strictly enforced. http://students.wisc.edu/doso/acadintegrity.html