Model Complexity Optimization

Alexey Chervonenkis

It has been shown, both theoretically and empirically, that a reliable result can be obtained only when the capacity of the class of models from which we choose stands in a certain relation to the size of the training set. There are different ways to measure the capacity of a class of models. In practice, the training set is always finite and limited. This suggests choosing a model from the narrowest class or, in other words, using the simplest model (Occam's razor). But if the class is too narrow, it may contain neither the true model nor any model close to it, which means a greater residual error, or a larger number of errors, even on the training set. So the problem of choosing model complexity arises: to find a balance between errors due to the limited amount of training data and errors due to excessive model simplicity. I shall review different approaches to the problem.
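
The trade-off described in the abstract can be illustrated with a small numerical sketch. It is not taken from the talk: the sine target, the noise level, the nested family of polynomial models (with degree as a crude stand-in for capacity), and the use of a held-out validation set are all assumptions chosen for illustration, and hold-out validation is only one of many possible approaches to choosing complexity.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_and_score(degree, x_train, y_train, x_val, y_val):
    """Least-squares polynomial fit; returns (training MSE, validation MSE)."""
    coeffs = P.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((P.polyval(x_train, coeffs) - y_train) ** 2))
    val_mse = float(np.mean((P.polyval(x_val, coeffs) - y_val) ** 2))
    return train_mse, val_mse


def select_degree(max_degree=9, n_train=20, n_val=200, noise=0.2, seed=0):
    """Pick the polynomial degree with the lowest validation error.

    All defaults are illustrative assumptions, not values from the talk.
    """
    rng = np.random.default_rng(seed)

    def true_fn(x):
        # Hypothetical "true model" used only to generate synthetic data.
        return np.sin(2.0 * np.pi * x)

    x_train = rng.uniform(0.0, 1.0, n_train)
    y_train = true_fn(x_train) + rng.normal(0.0, noise, n_train)
    x_val = rng.uniform(0.0, 1.0, n_val)
    y_val = true_fn(x_val) + rng.normal(0.0, noise, n_val)

    # Degree d models form nested classes: each contains all lower degrees.
    scores = {
        d: fit_and_score(d, x_train, y_train, x_val, y_val)
        for d in range(max_degree + 1)
    }
    best = min(scores, key=lambda d: scores[d][1])
    return best, scores


if __name__ == "__main__":
    best, scores = select_degree()
    for d, (train_mse, val_mse) in scores.items():
        print(f"degree {d}: train MSE {train_mse:.4f}, validation MSE {val_mse:.4f}")
    print("degree with lowest validation error:", best)
```

Because the classes are nested, the training error can only go down as the degree grows, so it cannot be used on its own to choose the model. The validation error typically falls at first, as the class becomes rich enough to approximate the target, and then rises once the class is too large for the available training data.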

Alexey Chervonenkis was born in Moscow, Russia, in 1938. A graduate of the Moscow Institute of Physics and Technology, he joined the Institute of Control Sciences of the Russian Academy of Sciences in Moscow in the early 1960s, where he has worked ever since, currently holding the position of Leading Researcher. He also holds a professorship at Royal Holloway, University of London, UK, and teaches at the Yandex School of Data Analysis in Moscow. He is best known as one of the main developers of the fundamental Vapnik-Chervonenkis theory, a central part of modern machine learning theory. Besides theoretical work, he has worked in a number of application areas. In 1987 he was awarded the State Prize of the Soviet Union for his work on the geostatistical analysis of spatial grade distribution in ore deposits and the development of practical mining control systems.

This event is sponsored by the NSF/DHS FODAVA program, the School of Mathematics, the Division of Computational Science and Engineering, the Algorithms and Randomness Center, and a Machine Learning and Data Mining Seminar Series grant from Yahoo.