Visualizing Audio for Anomaly Detection
The goal of this proposal is to transform large audio corpora into a form suitable for visualization. Specifically, this proposal addresses the type of audio anomalies that human data analysts hear instantly: angry shouting, trucks at midnight on a residential street, gunshots. The human ear detects anomalies of this type rapidly and with high accuracy. Unfortunately, a data analyst can listen to only one sound at a time. Visualization shows the analyst many sounds at once, possibly allowing him or her to detect an anomaly several orders of magnitude faster than "real time." This proposal aims to render large audio data sets, comprising thousands of microphones or thousands of minutes, in the form of interactive graphics that reveal important anomalies at a glance.

Data transformations will include signal processing, statistical modeling, and visualization. Signal processing will seek to characterize all of the ways in which the difference between two audio signals may be "important," including, for example, spectral differences, rhythmic differences, and differences in the impression made on the auditory cortex of a human listener. Statistical modeling will seek to characterize the range of audio events that are "normal" or easily explicable, so that we may precisely measure the degree to which a potential anomaly is abnormal or inexplicable. Visualization methods will render measures of abnormality, and information about the signal characteristics of each anomaly, in a form suitable for rapid browsing.

Two testbeds are proposed. The "multi-day audio timeline" will be a portable application, visually similar to a nonlinear audio editing suite, which will allow the analyst to rapidly zoom in on potentially anomalous periods of time. The "milliphone" will be a three-dimensional visualization tool for command and control centers. Audio recordings from one thousand security microphones scattered throughout a city or a large industrial site will be rendered in the form of brightly colored visible threads reaching skyward from a map of the secure region. The analyst will be able to listen to the audio recorded on any microphone by touching its thread; by touching the thread at different heights, the analyst will be able to audit different periods of time. The brightness, color, and thickness of each thread will display the abnormality and signal characteristics of the audio signal at each point in time. Data transformation research will map related types of abnormality to related color/brightness codes, so that important anomalies, and anomalies confirmed by multiple microphones, are immediately visible to the trained data analyst.
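To make the pipeline concrete, the following is a minimal sketch, not the proposal's actual method, of one way the "statistical modeling" and "abnormality" steps could be realized: fit a Gaussian model of log spectral-band energies on a reference recording assumed to contain only normal sound, then score each frame of a new recording by its Mahalanobis distance from that model. All function and variable names are illustrative assumptions, and the feature set is far simpler than the spectral, rhythmic, and auditory-model differences the proposal describes.

    # Minimal sketch (assumed, not the proposal's method): per-frame abnormality
    # scores from a Gaussian model of log spectral-band energies.
    import numpy as np
    from scipy.signal import stft

    def band_energies(signal, fs, n_bands=16, nperseg=1024):
        """Log energy in n_bands equal-width frequency bands for each STFT frame."""
        _, _, Z = stft(signal, fs=fs, nperseg=nperseg)
        power = np.abs(Z) ** 2                                   # (freq_bins, frames)
        bands = np.array_split(power, n_bands, axis=0)           # split frequency axis
        feats = np.stack([b.sum(axis=0) for b in bands], axis=1)  # (frames, n_bands)
        return np.log(feats + 1e-12)

    def fit_normal_model(normal_signal, fs):
        """Mean and inverse covariance of band energies over 'normal' reference audio."""
        feats = band_energies(normal_signal, fs)
        mean = feats.mean(axis=0)
        cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
        return mean, np.linalg.inv(cov)

    def abnormality(signal, fs, mean, cov_inv):
        """Per-frame Mahalanobis distance from the normal model; higher = more anomalous."""
        diff = band_energies(signal, fs) - mean
        return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

A visualization front end along the lines of the "milliphone" could then map each microphone's per-frame score to the brightness or thickness of its thread, for example by normalizing the score to [0, 1] before rendering.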
This research seeks broad impact in the area of security analysis. Video cameras are routinely used for security monitoring of industrial sites, government installations, day care centers, and nursing homes. Data analysts and guards routinely browse the video recorded by up to twenty surveillance cameras simultaneously, fast-forwarding through uninteresting periods of time. Microphones would be used in the same applications if analysts could exploit them, but there is at present no way for a data analyst to rapidly and accurately audit the signals from many different microphones. The proposed research will give guards and data analysts data transformation and visualization tools that help them rapidly identify dangerous situations signaled by anomalous, otherwise inexplicable audio signals.