Visualization Experiments

From SpeechWiki

Revision as of 20:35, 2 June 2009 by Camilleg (Talk | contribs)



Visualization experiments include at least the following components:

Visualization Interface Number 1 --- Timeline Audio ("timeliner")

This is intended to be lightweight enough for laptops or handhelds, similar to Audacity.

Visualization Interface Number 2 --- Milliphone

This is a command center interface, designed for the Cube.

Feature Computation --- Signal Features

These features rapidly give an analyst information about the signal, e.g., spectrograms.

Feature Computation --- Classification Features

These features measure the degree of match (confidence score) between the signal at any point in time and a classification label of interest. Classification labels might be defined in advance (e.g., "explosion"), or they might be defined by the analyst during a session.

Minutes of 2009 Apr 10 noon meeting

Dramatis personae

 Mark
 Camille
 Gradstudents Xi, Xiaodan, Sarah, Lae-Hoon
 Undergrad David Cohen

Tasks

Camille: keep developing timeliner

 done: pan, zoom (mouse scrollwheel)
 done: GUI using ruby-opengl, for all OSes: Ubuntu (Hardy Heron, Intrepid Ibex), Mac OS X Leopard, Windows XP, Windows Vista
 todo: scrollwheel-GLUT workaround for Windows
 todo: inline C to generate texturemaps

Camille: keep developing milliphone, hand off to gradstudents

All: run timeliner

 load a recorded sound
 load precomputed features to display
 select and play intervals

Grads: choose features, code feature generators

David: measure and model computation speed of feature generators

Camille: map features to HSV
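One possible shape for the feature-to-HSV mapping, as a minimal sketch only. The minutes do not say which feature drives which HSV channel, so the choice here (a scalar feature drives hue, from blue for low values to red for high values, at full saturation and value) is an assumption, as are the function name `feature_to_rgb` and its `lo`/`hi` range parameters.

```python
import colorsys

def feature_to_rgb(value, lo=0.0, hi=1.0):
    """Map a scalar feature to an 8-bit RGB triple via HSV.

    Assumed mapping (not from the minutes): low values are blue,
    high values are red, saturation and value are fixed at 1.
    """
    t = (value - lo) / (hi - lo) if hi > lo else 0.0
    t = min(1.0, max(0.0, t))          # clamp to [0, 1]
    hue = (1.0 - t) * 2.0 / 3.0        # 2/3 is blue, 0 is red
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return tuple(int(round(c * 255)) for c in (r, g, b))
```

With several features, one could instead drive hue, saturation, and value from three different features; which assignment reads best is presumably something the pilot study would have to show.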

Grads: design and pilot-study experiments

Grads: recruit analyst-subjects, schedule experiments

Grads: September, run 5-subject experiment. November, present at FODAVA meeting, Richland WA

Notes

How to combine features?

Feature generators read a recorded sound and write a feature file.

 format: http://labrosa.ee.columbia.edu/doc/HTKBook21/node58.html#SECTION03271000000000000000
 http://htk.eng.cam.ac.uk/
 Sequence of feature vectors.
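A feature generator's output step might look like the sketch below. It writes a sequence of feature vectors with the 12-byte HTK parameter-file header (nSamples, sampPeriod in 100 ns units, sampSize in bytes, parmKind), using the USER parameter kind (code 9) and big-endian float32 values. The function names `write_htk` and `read_htk` and the 10 ms default frame period are assumptions for illustration, not part of the minutes.

```python
import struct

HTK_USER = 9  # HTK parameter-kind code for user-defined features

def write_htk(path, vectors, frame_period_s=0.01):
    """Write equal-length float vectors as an uncompressed HTK parameter file."""
    n_frames = len(vectors)
    dim = len(vectors[0]) if n_frames else 0
    samp_period = int(round(frame_period_s * 1e7))  # period in 100 ns units
    samp_size = 4 * dim                             # bytes per frame (float32 each)
    with open(path, "wb") as f:
        # 12-byte big-endian header: nSamples, sampPeriod, sampSize, parmKind
        f.write(struct.pack(">iihh", n_frames, samp_period, samp_size, HTK_USER))
        for v in vectors:
            f.write(struct.pack(">%df" % dim, *v))

def read_htk(path):
    """Read the file back; returns (vectors, frame_period_s, parm_kind)."""
    with open(path, "rb") as f:
        n_frames, samp_period, samp_size, parm_kind = struct.unpack(">iihh", f.read(12))
        dim = samp_size // 4
        vectors = [list(struct.unpack(">%df" % dim, f.read(samp_size)))
                   for _ in range(n_frames)]
    return vectors, samp_period * 1e-7, parm_kind
```

Because every generator would write the same container, the viewer only needs one loader, whatever the feature.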

Later: stream not batch.

Experiment tasks:

 find instances of a class of sound events
 find anomalous sounds (open-ended, vague)

Recorded sounds

 AMI meeting room transcribed
 fieldrecorder/090216
 fieldrecorder aircraft + webcam for ground truth
   play freqsweep through genelec into fieldrecorder.
   ignore clock drift.
   Keep data files small enough for our tools.
 toy
   ruby script plus short audio source files generates a long target file.  Tweak script while tweaking apps.
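The toy generator could be sketched roughly as follows. The minutes call for a Ruby script; this sketch uses Python, and every name in it (`build_target`, `write_wav`, the clip dictionary) is hypothetical. It drops short clips at random offsets into a long silent track and returns the placements as ground truth, so the viewer can be checked against known event times.

```python
import random
import struct
import wave

def build_target(clips, total_len, n_events, seed=0):
    """Mix short clips into a long silent track.

    clips: dict mapping a label to a list of 16-bit integer samples.
    Returns (samples, truth), where truth is a sorted list of
    (start, end, label) tuples in sample indices.
    """
    rng = random.Random(seed)           # fixed seed gives a repeatable file
    target = [0] * total_len
    truth = []
    names = sorted(clips)
    for _ in range(n_events):
        name = rng.choice(names)
        clip = clips[name]
        start = rng.randrange(0, total_len - len(clip) + 1)
        for i, s in enumerate(clip):
            # sum overlapping clips, clamped to the 16-bit range
            target[start + i] = max(-32768, min(32767, target[start + i] + s))
        truth.append((start, start + len(clip), name))
    truth.sort()
    return target, truth

def write_wav(path, samples, rate=16000):
    """Write mono 16-bit PCM; the 16 kHz default rate is an assumption."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

Changing `n_events`, the seed, or the clip set regenerates a new target file quickly, which fits the "tweak script while tweaking apps" loop.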

Realtime server (later)

 record audio
 circular buffer, a few months long
 compute features at multiple scales
   fast approximate algorithms for caching of features.
 stream all this to a googlemaps-ish server
 when client scrolls (pans) or zooms, it requests fresh data from server
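The multi-scale caching and pan/zoom requests above could be sketched as a pyramid of progressively coarser feature tracks, much as map servers keep tiles per zoom level. This is a minimal sketch under assumptions: the (min, max) reduction, the factor-of-two levels, and the names `build_pyramid` and `query` are illustrative choices, not decisions recorded in the minutes.

```python
def build_pyramid(values):
    """Level 0 holds the raw feature track as (v, v) pairs; each higher
    level halves the length, keeping (min, max) so peaks survive zoom-out."""
    level = [(v, v) for v in values]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]       # the last pair may hold one entry
            nxt.append((min(p[0] for p in pair), max(p[1] for p in pair)))
        level = nxt
        levels.append(level)
    return levels

def query(levels, start, end, max_points):
    """Serve raw index range [start, end) from the finest level whose
    slice needs no more than max_points entries (or the coarsest level).
    Returns (level_index, entries)."""
    k = 0
    while k < len(levels) - 1 and (end - start) / (2 ** k) > max_points:
        k += 1
    step = 2 ** k
    lo = start // step
    hi = -(-end // step)                # ceiling division
    return k, levels[k][lo:hi]
```

A client that is zoomed far out would pass a wide `[start, end)` with `max_points` near its window width in pixels, and so receive a coarse level; zooming in narrows the range until level 0 is served.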
