Extreme Scale Visual Analytics VisWeek 2010 Workshop

Time and Place

October 24th, 2010, 08:30 - 18:00.

Grand American Hotel, Salt Lake City UT USA

Organizers

David Ebert Purdue University
Guy Lebanon Georgia Institute of Technology
Patrick McCormick Los Alamos National Lab
Haesun Park Georgia Institute of Technology
Hanspeter Pfister Harvard University
Leland Wilkinson SYSTAT

 

Goals and Technical Scope

Scientific sensors and applications are continuing to produce data at ever-increasing rates. The social sciences and humanities are transitioning to quantitative disciplines with access to vast amounts of online data. And ever more sophisticated sensor networks are monitoring traffic, energy usage patterns, or ocean pollution levels around us. This explosion of data is overwhelming our capabilities to explore, analyze, hypothesize, and thus fully interpret the underlying details. These tasks will become even more challenging as we make advancements from petascale to exascale computing.

Furthermore, the computer architectures that support these efforts are undergoing a revolutionary change as manufacturers transition to building chips that use an increasing number of processor cores. In addition, graphics hardware that was once designed entirely for the rendering of polygonal primitives has rapidly evolved into a powerful general-purpose processor that could significantly influence the design of future processors. While these trends have the ability to support new capabilities and improve overall performance, they will do so in a disruptive manner – placing further strain on already complex software development efforts.

Visualization is one of the linchpins for solving these challenges. It has the potential to facilitate the analysis of very large data with a combination of effective visual human-computer interfaces and powerful computational analytics. It also provides effective visual interfaces to monitor the execution of complex parallel software on massively parallel machines. On the other hand, the profound changes in heterogeneous machine architectures and programming environments for exascale computing will also change research and application development in visualization.

The primary goal of the workshop is to bring together researchers in Computer Science, Computational Science, Mathematics, Statistics, Visualization, and related areas working on large-scale, high-dimensional data analysis with a potential impact in Data and Visual Analytics. The workshop should provide an opportunity to discuss and explore issues of scale and complexity in data and visual analytics and advanced technical developments related to the issues. Increasing amounts of data, regarding both number of data points and dimensionality, and related issues such as efficient and effective data representation and transformation, visual representation in limited screen space, and real-time visual interaction present new challenges that are fundamental to the continued success of the field. We plan to investigate these challenges and discuss promising directions.

Topics include

  • Analytics and visualization for extremely high dimensional data. Biological, security, remote sensing, Web, and other data are sometimes extremely high dimensional. Several methods for analyzing such data have been proposed, including feature selection, feature extraction, random projections, basis pursuit, and kernel methods. We will review and evaluate different existing methods as well as discuss promising new directions that are particularly relevant for visual representation of high dimensional data.
  • Real time and scalable computational methods for visual analytics of massive data. We will discuss how different techniques scale up computationally for massive data with an emphasis on identifying methods that are of linear or sub-linear complexity. For massive data, we will explore informative ways for representation and transformation in limited screen space. We will also focus on streaming data visualization, with attention to real-time display and responsive interactivity.
  • Programming Support. Programming languages, compiler technologies, and operating systems for large scale visual analytics.
  • Parallel and high performance computing in visual analytics. We will discuss how different visual analytics techniques can benefit from recent advances in parallel/high performance computing and cloud computing. We will briefly survey distributed computing models and frameworks such as MapReduce, Hadoop, and PSM (Pass/Stream/Merge).
  • Speedup vs. accuracy tradeoff in visual analytics. How can we effectively trade a reduction in the accuracy of a visual analytics procedure for a significant computational speedup? What are the practical and theoretical aspects of this issue?
  • Fundamental limits and theory. What are the important theoretical issues in visual analytics scalability? What is the impact of the curse of dimensionality on visual analytics? Are there any fundamental laws of diminishing returns?
  • Visual analytics on limited computational platforms. What strategies and algorithms can be employed when large problems are scaled to small machines such as the iPhone, iPad, and other mobile devices? What are the strengths and weaknesses of mobile operating systems such as Windows Mobile, Android, and iPhone OS? How can we design a flexible environment that scales up to megapixel displays and down to handhelds?
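
As a rough illustration of two of the topics above (random projections and the speedup vs. accuracy tradeoff), the following minimal sketch projects synthetic high dimensional points to fewer dimensions and measures how much the pairwise distances change. It assumes a dense Gaussian projection matrix and made-up Gaussian data; all function names are hypothetical and chosen only for this example, and it is not drawn from any of the workshop talks.

```python
import math
import random


def random_projection_matrix(d_in, d_out, rng):
    """Gaussian random matrix, scaled so squared lengths are preserved in expectation."""
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)] for _ in range(d_out)]


def project(matrix, x):
    """Multiply the (d_out x d_in) matrix by the vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def mean_distortion(points, d_out, rng):
    """Average relative error of pairwise distances after projecting to d_out dimensions."""
    matrix = random_projection_matrix(len(points[0]), d_out, rng)
    low = [project(matrix, p) for p in points]
    errors = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            original = distance(points[i], points[j])
            errors.append(abs(distance(low[i], low[j]) - original) / original)
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so runs are repeatable
# Synthetic stand-in data: 20 points in 500 dimensions (an assumption for the demo).
points = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(20)]

# Fewer output dimensions mean less work per point but larger distance errors.
for d_out in (2, 20, 200):
    print(d_out, round(mean_distortion(points, d_out, rng), 3))
```

The cost of projecting one point grows with the number of output dimensions, while the distance error generally shrinks, so the output dimension acts as a simple knob for exchanging accuracy against computation.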

Planned Activities

We intend to have three sessions in the workshop (one day). The morning session will contain six invited talks, each a half hour long. The second session (after an hour lunch break) will contain six more invited talks. The third session will contain panel discussions and a poster session. The panel discussion will focus on challenges and future directions. The panelists will be recruited from experts in the field, including the invited speakers. One or more of the organizers will moderate the panel. Questions will be taken from the audience and the moderator and panelists will be free to contribute additional questions.

The final part of the last session will be a poster presentation session. All posters will be based on contributed submissions. The organizers will review the submissions and select high quality submissions for poster presentation. We plan to encourage participation by graduate students and junior researchers. As an incentive to encourage more student participation, we plan to offer partial travel support for students who are the first authors of selected papers, on the condition that the advisor guarantees to cover the rest of the travel expenses. We anticipate a large number of submissions from a wide range of communities, especially since we plan to solicit submissions from the FODAVA teams as well as other related groups, such as universities affiliated with a DHS Center of Excellence.

Schedule

Extreme Scale Visual Analytics
Sunday, October 24, 2010
8:30am Welcome by Haesun Park (Georgia Tech)
 
Session 1: 8:30am - 10:10am (Chair: Hanspeter Pfister)
  • Lucy Nowell (DOE) (30 mins): Science at Scale: Batch No More
  • Kelly Gaither (TACC) (30 mins): Large Scale Remote Visualization and Visual Analytics
  • Steven Parker (NVIDIA) (30 mins)

Session 2: 10:30am - 12:10pm (Chair: Haesun Park)
  • William Cleveland (Purdue) (30 mins): Divide and Recombine for the Analysis of Large, Complex Datasets
  • Heike Hofmann (Iowa State University) (30 mins): Validating Visual Features
  • Leland Wilkinson (SYSTAT) (30 mins): Pass/Stream/Merge (PSM) analytics: A Flexible Architecture for Distributed Analytics and Visualization

Session 3: 2:00pm - 3:40pm (Chair: Guy Lebanon)
  • Alex Smola (Yahoo Research) (30 mins): Visualization: The Dependence Maximization Point of View
  • Alex Gray (Georgia Tech): Beyond RAM: Fast Visual/Statistical Analysis Using Disk-Based Data Structures
  • Mauro Maggioni (Duke) (30 mins): Novel multiscale representations of data sets for interactive learning

Session 4: 4:15pm - 5:55pm (Chair: David Ebert)
  • Hanspeter Pfister (Harvard) (30 mins): Large-Scale Visual Analytics in Biology
  • Daniel Keim (University of Konstanz) (30 mins): Scalable Visual Analytics
  • Panel Discussion: Challenges and Future Directions. Panelists: Chris Johnson (Utah), Tom Ertl (Stuttgart), Bill Pike (PNNL)
 
6:30pm Dinner: (pay your own) - location TBD.