Multi-Source Visual Analytics
Data visualization is an important part of analysis in the field of visual analytics. Analysts rely on visual tools to process massive data sets and discover meaningful patterns in the data. A common strategy in many visualization tools is to transform high-dimensional data into an intermediate lower-dimensional space and then project it to screen space using a visualization transformation. For example, a data set with 200 dimensions can be transformed to an intermediate 4D representation and then mapped to screen space by using two dimensions for location and the other two dimensions to determine shape and color. The mathematical foundations of visualization are therefore closely related to the problem of dimensionality reduction.
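The 200-dimension example above can be made concrete with a minimal sketch. It assumes plain PCA as the reduction step (a stand-in chosen for brevity, not the method developed in this research), random data in place of a real data set, and hypothetical helper names `pca_reduce` and `to_visual_channels`.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def to_visual_channels(Z, markers=("o", "s", "^", "D")):
    """Map a 4D intermediate representation to position, color, and shape."""
    assert Z.shape[1] == 4
    xy = Z[:, :2]                      # dimensions 1-2: screen location
    c = Z[:, 2]                        # dimension 3: color, rescaled to [0, 1]
    span = c.max() - c.min()
    color = (c - c.min()) / span if span > 0 else np.zeros_like(c)
    # Dimension 4: marker shape, chosen by quantile bin (one bin per marker).
    inner = np.linspace(0, 1, len(markers) + 1)[1:-1]
    edges = np.quantile(Z[:, 3], inner)
    shapes = [markers[i] for i in np.digitize(Z[:, 3], edges)]
    return xy, color, shapes

# Synthetic stand-in for a 200-dimensional data set (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 200))

Z = pca_reduce(X, 4)                   # 200D -> intermediate 4D
xy, color, shapes = to_visual_channels(Z)
```

The resulting `xy`, `color`, and `shapes` could then be handed to any plotting library. Which intermediate dimension drives which visual channel is a design choice; here the two highest-variance components are given to position because position is usually the easiest channel to read.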
While dimensionality reduction is a necessary step to visualize the data, the final goal of visual analytics is data analysis, such as searching, clustering, and the detection of outliers. Therefore, there is an urgent need to study dimensionality reduction techniques that are especially useful for data analysis.
This research involves the development and implementation of linear and nonlinear dimensionality reduction algorithms for the transformation and visualization of high-dimensional data. The novel aspect of the transformation is that dimensionality reduction and clustering are performed simultaneously in a joint framework. The research also develops and implements novel algorithms for multi-source data transformations based on multiple kernel learning (MKL), which address the problem of fusing many heterogeneous, independently collected data sources. In the past, most research on MKL has focused on supervised learning. One major contribution of this research is to extend MKL to the unsupervised case. This research presents visual analytics as a bridge between theoretical foundations in machine learning and real-world applications. It uses two testbed databases: one consisting of printed documents, such as might be used by the intelligence community, and one based on public health information.
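To illustrate the multi-source idea, the sketch below builds one kernel per data source, fuses them as a weighted sum, and embeds the result with kernel PCA. It is only a simplified illustration under stated assumptions: the kernel weights are fixed by hand, whereas an unsupervised MKL method would learn them from the data, and no clustering is performed jointly with the reduction. The two sources, the RBF kernel choice, and all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices (remains positive semidefinite)."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and w.sum() > 0
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))

def kernel_pca(K, k):
    """Embed n items in k dimensions from an n x n kernel matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    Kc = H @ K @ H                          # kernel centered in feature space
    vals, vecs = np.linalg.eigh(Kc)         # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Two hypothetical sources describing the same 100 items (assumption).
rng = np.random.default_rng(0)
n = 100
source_a = rng.normal(size=(n, 50))
source_b = rng.normal(size=(n, 8))

kernels = [rbf_kernel(source_a, gamma=1.0 / 50),
           rbf_kernel(source_b, gamma=1.0 / 8)]
K = combine_kernels(kernels, weights=[0.5, 0.5])  # fixed weights, not learned
Z = kernel_pca(K, 4)                              # 4D embedding for display
```

Working with kernels lets each source keep its own similarity measure, so sources with different dimensionality or data types can be fused as long as they describe the same set of items.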