STA414S/2104S: Statistical Methods for Data Mining and Machine Learning, January - April 2010
Meets Tuesday 12-2, Thursday 12-1.
Course Information
This course will consider topics in statistics that have played a role in
the development of techniques for data mining and machine learning. We will
cover linear methods for regression and classification, nonparametric
regression and classification methods, generalized additive models, aspects
of model inference and model selection, model averaging, and tree-based
methods.
Prerequisite: Either STA 302H (regression) or CSC 411H (machine learning). CSC 108H was recently added as a prerequisite. It is not urgent, but you must be willing to use a statistical computing environment such as R or Matlab.
Office Hours: Tuesdays, 3-4; Thursdays, 2-3; or by appointment.
Textbook: Hastie, Tibshirani and Friedman. The Elements of Statistical Learning. Springer-Verlag.
Book web page
Course evaluation: Two homework sets: 40%. Midterm exam: 20%. Final project: 40%.
Computing: I will refer to, and provide explanations for, the R computing environment. You are welcome to use some other package if you prefer. There are many online resources for R, including:
Material from lectures