Towards Understanding Mixtures of Gaussians: Spectral Methods and Polynomial Time Learning with No Separation

Misha Belkin

Mixtures of Gaussian distributions are an important tool in machine learning and statistics, widely used in numerous applications such as speech recognition and computer vision. In recent years there has been considerable progress toward a theoretical understanding of the complexity of estimating mixture distributions, especially in high dimensions. However, commonly used methods such as Expectation Maximization suffer from several shortcomings, particularly their sensitivity to initialization and their inability to estimate the number of components required.

In my talk, I will discuss a class of spectral methods, based on eigenvectors of certain kernel matrices, for learning the parameters of Gaussian mixtures. I will provide theoretical analyses and experimental results showing that these methods may overcome some of the shortcomings of Expectation Maximization. In a slightly different direction, I will also discuss some recent work giving the first polynomial-time algorithm for learning mixtures of Gaussians in high dimensions with arbitrarily small separation between components. Parts of the talk are joint work with Tao Shi, Bin Yu and Kaushik Sinha.