Mathematical Foundations of Multiscale Graph Representations and Interactive Learning
The analysis of large, high-dimensional data sets and graphs is motivated by many important applications, such as the study of databases of images and documents and the modeling of complex dynamical systems (e.g., transaction data, weather patterns, molecular dynamics). This research develops novel mathematical techniques for extracting and visualizing information from large data sets. The data layout, visualization, and human interaction are centered on multiscale representations, which make it possible to access the data, the information derived from it, and the associated inference processes at multiple levels of resolution. Human interaction affects both the geometry of the data and the inference processes on it, depending on the task at hand. The successful development of these techniques will have a substantial impact on any application whose data lends itself to a graph representation, such as citation networks, social networks, correlations in transaction data, and many aspects of biological systems, including gene expression and metabolic pathways. It will also reveal new and interesting multiscale geometric structures in high-dimensional data sets and graphs, and lead to a better understanding of how to extract information from them.
This research develops novel multiscale embedding techniques and algorithms for graphs and data sets, based on diffusion processes on graphs. Such processes are used to generate embeddings of a graph at multiple scales, as well as to perform learning tasks, with and without human interaction. These multiscale embeddings have strong quantitative guarantees in terms of metric distortion. At the same time, multiscale bases are constructed that provably represent functions on the graph sparsely, making them well suited for both visualization and learning. The techniques are demonstrated on data sets ranging from gene networks to document corpora.
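To make the idea of a diffusion-based embedding at several scales concrete, here is a minimal sketch in Python with NumPy. It uses the standard diffusion-map construction (eigenvectors of the random-walk matrix, weighted by powers of the eigenvalues) as an illustration; it is not the specific method developed in this research, and it carries none of the metric-distortion guarantees mentioned above. The function name diffusion_embeddings, the choice of diffusion times, and the small example graph are all assumptions made for this sketch.

```python
import numpy as np

def diffusion_embeddings(W, times=(1, 2, 4, 8), dim=2):
    """Illustrative diffusion-map embeddings of a graph at several scales.

    W     : symmetric nonnegative weight matrix of a connected graph
    times : diffusion times t; larger t corresponds to a coarser scale
    dim   : number of embedding coordinates to keep
    Returns a dict mapping each t to an (n_nodes, dim) array.
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)

    # S = D^{-1/2} W D^{-1/2} is symmetric and has the same eigenvalues
    # as the random-walk matrix P = D^{-1} W.
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(S)

    # Sort eigenpairs from largest to smallest eigenvalue.
    order = np.argsort(-evals)
    evals, evecs = evals[order], evecs[:, order]

    # Convert to right eigenvectors of P.
    psi = evecs * d_inv_sqrt[:, None]

    # Drop the trivial constant eigenvector (eigenvalue 1) and scale the
    # remaining coordinates by eigenvalue**t for each diffusion time t.
    return {t: psi[:, 1:dim + 1] * evals[1:dim + 1] ** t for t in times}


# Hypothetical example graph: two triangles joined by a single edge.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

embeddings = diffusion_embeddings(W)
for t, coords in embeddings.items():
    print(f"t={t}: first coordinate = {np.round(coords[:, 0], 3)}")
```

In this example the first embedding coordinate takes opposite signs on the two triangles, and the coordinates shrink as t grows, which reflects finer structure being smoothed away at coarser scales. For large graphs, a sparse eigensolver would replace the dense eigendecomposition used here.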