# Location and Time

• Location: 103 Transportation Building
• Time: MWF, 8:00 AM - 8:50 AM

# Course Staff

### Instructor

• Name: David Dalpiaz
• Office: 122B Illini Hall
• Office Hours: Wednesday, 12:30 PM - 1:30 PM; Thursday, 1:00 PM - 3:00 PM; or by appointment.

### Teaching Assistant

• Name: Wenjing Yin
• Office: 122 Illini Hall
• Office Hours: Tuesday, 3:20 PM - 4:20 PM; Thursday, 3:20 PM - 4:20 PM.

# Course Objectives

After this course, students should be able to …

• identify supervised (regression and classification) and unsupervised (clustering) learning problems.
• choose effective methods for solving various learning problems.
• understand the fundamental theory behind statistical learning methods.
• implement the statistical learning methods in practice using a statistical computing environment.

# Course Content

Tentative subjects include:

• Basics: Supervised vs Unsupervised Learning, Bias-Variance Trade-Off, Cross-Validation
• Regression: SLR, MLR, Model Selection, Penalized Regression, Smoothing, Generalized Additive Models, Regression Trees
• Classification: Logistic Regression, KNN, LDA, QDA, Naive Bayes, Support Vector Machines, Classification Trees, Bagging, Boosting, Random Forests
• Clustering: PCA, K-Means, Hierarchical Clustering, Mixture Models, EM Algorithm

If time permits, we will discuss the basics of neural networks and build toward the fundamentals of deep learning.

# Textbooks

R tutorials will be found in R for Statistical Learning, which will be updated as the semester progresses. Additional reading material will be posted on the course website.

# Prerequisites

A course which covers linear regression and uses R, such as STAT 420 or STAT 425. Basic knowledge of probability and linear algebra is also assumed.

# Email Policy

Due to the large size of this course, we follow a strict email policy. Before sending an email, please read this note.

# Homework

There will be ten homework assignments.

Please see this note for a detailed homework policy, including the directions for all assignments.

# Quizzes

There will be two in-class quizzes. The quiz dates are:

• Quiz 1: Wednesday, October 18
• Quiz 2: Wednesday, December 6

# Projects

Project due dates, assignment details, and group assignments will be announced after the midpoint of the semester.

## Group Final Project

There are five assignments associated with the group final project. They are:

• Group Selection
• Project Proposal
• Project Report
• Evaluation of Peers
• Evaluations from Peers

Graduate students will be required to complete a small additional project, which will take the form of a Kaggle-like competition. Undergraduate students will receive 100% for this project without completing it, but are still encouraged to give it a try!

# Software

R and RStudio are required software for this course. R is a freely available language and environment for statistical computing and graphics. RStudio is a free and open-source integrated development environment for R. You must have access to a computer where you are able to install the most up-to-date versions of R and RStudio, as well as install R packages.

| Type | Percentage |
|---|---|
| Homework | 40 |
| Quiz I | 15 |
| Quiz II | 15 |
| Group Selection | 1 |
| Project Proposal | 4 |
| Project Report | 15 |
| Evaluation of Peers | 2.5 |
| Evaluations from Peers | 2.5 |

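| A+ | A | A- | B+ | B | B- | C+ | C | C- | D+ | D | D- |
|---|---|---|---|---|---|---|---|---|---|---|---|
| TBD | 93% | 90% | 87% | 83% | 80% | 77% | 73% | 70% | 67% | 63% | 60% |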
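As a rough illustration of how the weights and cutoffs above combine, here is a minimal Python sketch (Python rather than R, purely for illustration). The weights and cutoffs are copied from the tables. Everything else is an assumption and not official course policy: the function names `points_earned` and `letter_grade` are hypothetical, the cutoffs are treated as inclusive minimums with no rounding, and the A+ cutoff is omitted because it is listed as TBD. The listed weights sum to 95, and the sketch makes no guess about the remaining 5 points.

```python
# Weights (percent of the course grade), copied from the grading table.
# They sum to 95; how the remaining 5 points are assigned is not stated here.
WEIGHTS = {
    "Homework": 40,
    "Quiz I": 15,
    "Quiz II": 15,
    "Group Selection": 1,
    "Project Proposal": 4,
    "Project Report": 15,
    "Evaluation of Peers": 2.5,
    "Evaluations from Peers": 2.5,
}

# Letter-grade cutoffs, copied from the table, highest first.
# The A+ cutoff is TBD, so it is left out.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"), (60, "D-"),
]


def points_earned(scores):
    """Grade points earned from the listed components.

    `scores` maps each component name to a percentage between 0 and 100.
    """
    return sum(WEIGHTS[name] * scores[name] / 100 for name in WEIGHTS)


def letter_grade(percent):
    """Map a final percentage to a letter.

    Assumption: each cutoff is an inclusive minimum and nothing is rounded.
    """
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "below D-"
```

For example, `points_earned` applied to a perfect score on every listed component returns 95.0, and `letter_grade(92.9)` returns `"A-"` under the no-rounding assumption.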

Grades are not curved or adjusted. This is not to dishearten students, but to let them know that their grade is based on individual effort and not on comparative effort.

# Attendance

You are expected to attend all lectures and discussions. Failure to do so may not have a direct effect on your course grade, but will likely have a significant indirect effect. Any known or potential extracurricular conflicts should be discussed in person with the instructor during the first week of classes, or as soon as they arise.

The official University of Illinois policy on academic integrity can be found in Article 1, Part 4 of the Student Code. Section 1-402 in particular outlines behavior that is considered an infraction of academic integrity. These sections of the Student Code will be upheld in this course. Any violations will be dealt with in a swift, fair, and strict manner. Homework assignments are meant to be learning experiences. You may discuss the exercises with other students, but you must write up the solutions on your own. In short, do not cheat; it is not worth the risk. You are more likely to get caught than you believe. If you think you may be operating in a grey area, you most likely are.