Latest News and Events

The SAMSI-FODAVA Workshop on Interactive Visualization and Analysis of Massive Data will be held on December 10-12, 2012.
Posted: October 02, 2012
The FODAVA Annual Meeting will be held December 12-13, immediately following the SAMSI/FODAVA joint workshop, at the same location.
Posted: September 05, 2012
Many of the modern data sets such as text and image data can be represented in high-dimensional vector spaces and have benefited from computational methods that utilize advanced techniques from num
Posted: June 30, 2012


In this project, the investigators study the fundamental aspects of incorporating uncertainty with sensitivity analysis in the visual analytics process. They also aim to develop novel and scalable visual representations of sensitivity, from the visualization of the raw sensitivity coefficients to visual summaries of multivariate derivatives obtained from the analysis. Uncertainty-aware visual analytics helps enhance analysts' confidence in the insight gained from the analysis.
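As a rough illustration of what a "raw sensitivity coefficient" can look like, the sketch below estimates the partial derivative of a model output with respect to each input using central finite differences. The function name `sensitivity_coefficients`, the step size `h`, and the toy model are all assumptions made for this example; the project abstract does not say how the investigators compute their derivatives.

```python
def sensitivity_coefficients(f, x, h=1e-5):
    """Estimate df/dx_i at point x for every input i (central differences).

    f: hypothetical model, takes a list of floats and returns a float.
    x: list of input values at which sensitivity is evaluated.
    h: finite-difference step (an assumed default, tune per problem).
    """
    coeffs = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        coeffs.append((f(up) - f(down)) / (2 * h))
    return coeffs


# Toy model (illustrative only): output = x0^2 + 3 * x1
def toy_model(v):
    return v[0] ** 2 + 3 * v[1]


# The analytic derivatives at (2, 5) are 4 and 3; the estimate should be close.
print(sensitivity_coefficients(toy_model, [2.0, 5.0]))
```

A vector of coefficients like this, one per input variable, is the kind of quantity that could then be drawn alongside the data points or summarized visually.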

Data visualization forms an important aspect of analysis in the field of visual analytics. Analysts rely on visual tools to process massive data sets and discover meaningful patterns in the data. A common strategy for many visualization tools is to transform high-dimensional data to an intermediate lower-dimensional space and then project to screen space using a visualization transformation.
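The two-stage pipeline described above can be sketched as follows. This is a minimal illustration, not the project's method: a fixed linear projection stands in for whatever dimension-reduction technique a real tool would use, and the helper names `project` and `to_screen` are hypothetical.

```python
def project(points, basis):
    """Stage 1: map each d-dimensional point to 2D.

    basis holds two d-dimensional vectors; each output coordinate is the
    dot product of the point with one of them. A linear projection is an
    assumption here, chosen only because it is simple.
    """
    return [
        tuple(sum(b * v for b, v in zip(axis, pt)) for axis in basis)
        for pt in points
    ]


def to_screen(points2d, width, height):
    """Stage 2: scale 2D coordinates into integer pixel positions.

    The y axis is flipped because screen coordinates usually put the
    origin at the top-left corner.
    """
    xs = [p[0] for p in points2d]
    ys = [p[1] for p in points2d]

    def scale(value, lo, hi, size):
        if hi == lo:  # all points share this coordinate: center them
            return (size - 1) / 2
        return (value - lo) / (hi - lo) * (size - 1)

    return [
        (
            round(scale(x, min(xs), max(xs), width)),
            round(height - 1 - scale(y, min(ys), max(ys), height)),
        )
        for x, y in points2d
    ]


data = [(0, 0, 0), (1, 1, 1), (2, 2, 2)]   # three points in 3D
basis = [(1, 0, 0), (0, 1, 0)]             # keep the first two axes
pixels = to_screen(project(data, basis), width=101, height=101)
print(pixels)  # [(0, 100), (50, 50), (100, 0)]
```

Keeping the two stages as separate functions mirrors the distinction the abstract draws between the intermediate lower-dimensional space and the final screen-space transformation.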

This research project will extend the theoretical foundations of mixture modeling for statistical learning by novel mathematical tools that can probe into the precise geometry of mixture models. Based on the theoretical results, the investigators will develop new approaches to clustering, dimension reduction, variable selection, and temporal analysis. These methods will open promising paths for interactively visualizing complex data and for data summarization. A suite of statistical tools will be integrated as the technical backbone into a new visualization system.

The human eye is often capable of identifying interesting patterns and trends from a well-presented data set, whereas computational algorithms may have difficulties with such a task. Yet, there are limits to human ability, both with the scale of the data set in terms of objects and attributes and with dynamic changes over time. This project develops an analytic and computational framework to support the visual analysis of large-scale dynamic data with network structure.

Developing new algorithms, visualization tools, and mathematical models that can predict and explain patterns in data is fundamental to machine learning and statistics. These tools and models enable the predictive modeling that is fundamental to science and engineering. Visualization is critical in all phases of data analysis, from the moment the data are collected, when data checking and cleaning are needed, to the final presentation of results.


The ability to obtain insight from massive, dynamic, and likely incomplete digital data is absolutely essential to those who collect these data for time-critical decision-making. This research is developing mathematical formulations for the quantification, propagation, and aggregation of such data to support collaborative reasoning using visual means.

There is currently a major discrepancy between the dramatic improvements in hardware for sensing, communication, and storage of raw data and the capacity of humans to analyze and act on this data in a meaningful way. There is every reason to believe that this development will continue in the near future, given the revolutionary changes to hardware and software in the World Wide Web, the Sensor Web, the network of hand-held and mobile devices, and the Smart Grid.

The goal of this proposal is to transform large audio corpora into a form suitable for visualization. Specifically, this proposal addresses the type of audio anomalies that human data analysts hear instantly: angry shouting, trucks at midnight on a residential street, gunshots. The human ear detects anomalies of this type rapidly and with high accuracy. Unfortunately, a data analyst can listen to only one sound at a time. Visualization shows the analyst many sounds at once, possibly allowing him or her to detect an anomaly several orders of magnitude faster than "real time."

The proposed research exploits an idea of John Tukey that was never published. Called scagnostics (a Tukey neologism for "scatterplot diagnostics"), the original idea leads to a more general characterization of high-dimensional point sets using visually-based geometric and graph-theoretic measures. These measures comprise a canonical set of 9 features of pointwise data typically observed by experienced statisticians. Computing these measures on all possible 2D axis-parallel orthogonal projections in a p-dimensional space results in a p(p - 1)/2 × 9 matrix of measures.
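To make the shape of that matrix concrete, the sketch below enumerates every pair of variables and builds one row of 9 entries per pair. It is only a partial illustration. The measure names follow the published scagnostics literature and are not listed in this abstract. Only one measure is filled in, "monotonic", taken here to be the squared Spearman rank correlation (an assumption of this example); the other eight require geometric graph constructions and are left as NaN placeholders. All function names are hypothetical.

```python
import math
from itertools import combinations

# Names assumed from the scagnostics literature; only "monotonic" is computed.
MEASURES = ["outlying", "skewed", "clumpy", "sparse", "striated",
            "convex", "skinny", "stringy", "monotonic"]


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def monotonic(x, y):
    """Squared Spearman rank correlation of one 2D projection."""
    return pearson(ranks(x), ranks(y)) ** 2


def measure_matrix(columns):
    """columns: list of p equal-length variables.

    Returns the list of variable pairs and a p(p - 1)/2 x 9 matrix.
    """
    pairs = list(combinations(range(len(columns)), 2))
    matrix = []
    for i, j in pairs:
        row = [float("nan")] * len(MEASURES)  # placeholders for 8 measures
        row[MEASURES.index("monotonic")] = monotonic(columns[i], columns[j])
        matrix.append(row)
    return pairs, matrix


cols = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]  # p = 4
pairs, matrix = measure_matrix(cols)
print(len(matrix), len(matrix[0]))  # 6 9
```

With p = 4 variables there are 4 × 3 / 2 = 6 axis-parallel projections, so the matrix has 6 rows and 9 columns, matching the p(p - 1)/2 × 9 shape stated above.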