STA 250 Optimization for Big Data Analytics
Fall 2017

Tues/Thurs 10:00 am - 11:50 am

Instructor: Cho-Jui Hsieh
Office location: Mathematical Sciences Building (MSB) 4232
Office hours: Tuesday 6pm-7pm
TA: Puyudi Yang
Office hours: Wednesday 1:15pm-2:15pm



    This course aims to equip students with the skills to apply optimization algorithms to problems in statistics, machine learning, and data analytics. The course will begin with a quick review of several widely used optimization algorithms (gradient descent, stochastic gradient descent, coordinate descent, Newton's method). Then it will cover computational tools for implementing these optimization algorithms.
This course will cover some chapters of "Numerical Optimization" by Nocedal and Wright (N&W), and also some chapters of "Convex Optimization" by Stephen Boyd and Lieven Vandenberghe (B&V). Note that both books are available online. A high-level summary of the syllabus is as follows:
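As a small taste of the first algorithm mentioned above, here is a minimal gradient descent sketch in Python (the objective, step size, and iteration count are illustrative examples, not course material):

```python
import numpy as np

def gradient_descent(grad, x0, step=0.1, iters=100):
    """Minimize a differentiable function by repeatedly
    stepping in the direction of the negative gradient."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)
    return x

# Example: minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With a fixed step size, the iterate converges linearly to the minimizer x = 3; the course covers line search and other step-size strategies that remove the need to hand-tune this constant.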

Grading Policy

Grades will be determined as follows:


Schedule, readings, and links
Tues 9/25
Mathematical Background

Thurs 9/28
Introduction to Optimization
N&W Chapter 1, 2.1
B&V Chapter 1, 2

Tues 10/3
Gradient descent, convergence, auto-differentiation
B&V Chapter 9.1-9.3
N&W Chapter 2.2

Thurs 10/5
Line Search
N&W Chapter 3.1-3.3

Newton's method

Conjugate Gradient Method, Newton-CG

Stochastic Gradient Descent, Variance Reduction, Adagrad, Adam

Gradient descent for constrained and non-smooth problems

Quasi-Newton methods, BFGS, L-BFGS

Coordinate Descent

Primal-dual relationships, KKT condition

Dual gradient ascent, Augmented Lagrangian, ADMM

Barrier Method

Trust Region Method

Accelerated gradient descent; Discrete optimization