048716 – Large Scale Linear Algebra with Applications to Learning
The goal of this class is to develop important tools in large-scale linear algebra, with a focus on applications in machine learning. There has been increasing focus on large-scale optimization and machine learning, on both the statistical side (when is there signal in the noise?) and the algorithmic side (how can we find it?). Spectral properties of matrices, matrix factorizations, and algorithms to compute them are central to both the theory and the computation of large-scale machine learning problems. This class will explore several topics around these general themes in depth. Specifically, the plan is to cover most of the topics listed below.
Intended Audience: This class is targeted at graduate students in ECE, CS, OR, Mathematics, and related areas. A strong linear algebra background will prove very useful, as will knowledge of probability and stochastic processes and basic notions from analysis. Prior knowledge of, or some exposure to, optimization is desirable but not essential. While no extensive programming will be required, basic familiarity with Matlab and/or Python will be helpful.
The class grade will be based on problem sets (40%), a final project and
List of topics
The following is an approximate list of topics this class will cover.
1. Clustering: k-means, EM, spectral methods. Development of the basic tools in matrix perturbation theory, including several versions of the sin theta (Davis-Kahan) theorem on perturbation of invariant subspaces.
2. Low-rank matrices and low-rank matrix computation and approximation: background, the Spectral Theorem and the SVD, and fast computation through iterative algorithms (e.g., the power method and its variants, Lanczos and its connections to conjugate gradient, etc.).
3. Dimensionality reduction: the Johnson-Lindenstrauss lemma and its applications, PCA, non-linear methods (ISOMAP and LLE), and sparse PCA.
4. Random matrix theory, randomized linear algebra, and matrix concentration inequalities. Basic tools from randomized linear algebra for approximately computing matrix factorizations and spectral properties, including matrix subsampling (of columns or entries), dimensionality reduction, etc.
5. Streaming models of computation and streaming algorithms. Emphasis will be on understanding the limits of algorithms under time and memory constraints, with a small number of passes (or just a single pass, in the true streaming model) over the data.
6. Divide-and-conquer algorithms, with possible discussion of the MapReduce model of computation.
7. Various applications will be considered, including examples from matrix completion and sparse reconstruction, graphical models, and possibly other areas.
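As a small taste of topic 1, Lloyd's iteration for k-means can be sketched in a few lines of NumPy. This is an illustrative sketch only (the data, seeds, and function name are hypothetical, not course material): alternate between assigning each point to its nearest center and recomputing each center as the mean of its assigned points.

```python
import numpy as np

def lloyd(X, k, num_iters=50, seed=0):
    """Lloyd's iteration for k-means: alternate between assigning each
    point to its nearest center and recomputing centers as cluster means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(num_iters):
        # (n, k) matrix of squared distances from each point to each center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated Gaussian blobs: Lloyd's iteration should recover them.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))])
centers, labels = lloyd(X, k=2)
```

Note that Lloyd's iteration only converges to a local optimum of the k-means objective; its sensitivity to initialization is one motivation for the spectral methods listed in topic 1.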
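The power method named in topic 2 admits an equally short sketch (again only an illustration, with a hypothetical test matrix): repeatedly multiply a random vector by a symmetric matrix and normalize, so the component along the leading eigenvector dominates; the Rayleigh quotient then estimates the leading eigenvalue.

```python
import numpy as np

def power_method(A, num_iters=500, seed=0):
    """Approximate the leading eigenpair of a symmetric matrix A by
    repeated multiplication and normalization (power iteration)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    # The Rayleigh quotient estimates the corresponding eigenvalue.
    return v, v @ A @ v

# Symmetric positive semidefinite test matrix, so the leading eigenpair is real.
rng = np.random.default_rng(1)
B = rng.standard_normal((50, 50))
A = B @ B.T
v, lam = power_method(A)

# Compare against a dense eigendecomposition.
eigvals, eigvecs = np.linalg.eigh(A)
assert abs(lam - eigvals[-1]) / eigvals[-1] < 1e-6
```

The convergence rate is governed by the ratio of the second eigenvalue to the first, which is one reason the course pairs the power method with Lanczos and other Krylov-subspace accelerations.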
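For topic 3, the Johnson-Lindenstrauss lemma says a Gaussian random projection into k dimensions approximately preserves all pairwise distances among n points when k is on the order of log(n) / eps^2. A minimal empirical sketch (dimensions and seeds are arbitrary choices for illustration):

```python
import numpy as np
from itertools import combinations

def jl_project(X, k, seed=0):
    """Project the rows of X into k dimensions with a Gaussian random map,
    scaled by 1/sqrt(k) so squared distances are preserved in expectation."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], k)) / np.sqrt(k)
    return X @ R

rng = np.random.default_rng(2)
X = rng.standard_normal((30, 1000))   # 30 points in R^1000
Y = jl_project(X, k=400)

# Empirically check the distortion of every pairwise distance.
ratios = []
for i, j in combinations(range(30), 2):
    ratios.append(np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j]))
ratios = np.array(ratios)
```

With k = 400 the typical per-pair distortion is around 1/sqrt(2k), roughly 4%, so all 435 ratios should land well inside (0.75, 1.25); shrinking k makes the distortion visibly worse, which is a useful experiment.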
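Finally, a flavor of the streaming model in topic 5: when the rows of a tall matrix A arrive one at a time, the Gram matrix A^T A can be accumulated in a single pass using O(d^2) memory, independent of the number of rows; its eigendecomposition then yields the singular values and right singular vectors of A. A sketch under these assumptions (the function name is illustrative):

```python
import numpy as np

def streaming_gram(rows, d):
    """One-pass computation of A^T A from a stream of rows of A,
    using O(d^2) memory regardless of the number of rows."""
    G = np.zeros((d, d))
    for a in rows:
        G += np.outer(a, a)  # rank-one update per arriving row
    return G

rng = np.random.default_rng(3)
A = rng.standard_normal((10_000, 20))
G = streaming_gram(iter(A), d=20)  # rows arrive one at a time
```

Exact one-pass tricks like this only go so far; sketching methods that approximate the spectrum in sublinear memory are part of what the streaming unit is about.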