Visualization Experiments
From SpeechWiki
Revision as of 20:47, 2 June 2009
Visualization experiments include at least the following components:
- Visualization Interface Number 1 --- Timeline Audio ("timeliner")
Lightweight, for laptops or handhelds. Similar to Audacity (http://audacity.sourceforge.net).
- Visualization Interface Number 2 --- Milliphone
A command center interface, designed for the Cube (http://isl.beckman.illinois.edu/Labs/CUBE/CUBE.html).
- Feature Computation --- Signal Features
These features rapidly give an analyst information about the signal, e.g., spectrograms.
- Feature Computation --- Classification Features
These features measure how well a given classification label is matched by the signal at a given point in time (confidence score). Labels may be defined before or during a session.
Dramatis personae
Mark Hasegawa-Johnson
Camille Goudeseune
Grads: Sarah Borys, Lae-Hoon Kim, Xi Zhou, Xiaodan Zhuang
Undergrads: David Cohen
Tasks
Camille: keep developing timeliner
  - done: pan, zoom (mouse scrollwheel)
  - done: gui using ruby-opengl, for all OSes: heron, ibex, leopard, xp, vista
  - todo: scrollwheel-glut workaround for windows
  - todo: inline C to generate texturemaps
Camille: keep developing milliphone, hand off to grad students
All: run timeliner
  - load a recorded sound
  - load precomputed features to display
  - select and play intervals
Grads: choose features, code feature generators
David: measure and model computation speed of feature generators
Camille: map features to HSV
Grads: design and pilot-study experiments
Grads: recruit analyst-subjects, schedule experiments
Grads: September, run 5-subject experiment. November, present at FODAVA meeting, Richland WA
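The "map features to HSV" task above could be sketched roughly as follows. This is only an illustration in Python, not timeliner's ruby-opengl code. The blue-to-red hue ramp, the fixed saturation and value, and the names `feature_to_rgb` and `row_to_pixels` are all assumptions.

```python
import colorsys

def feature_to_rgb(value, lo, hi):
    """Map a scalar feature value in [lo, hi] to an RGB triple via HSV."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    # Clamp out-of-range values, then normalize to [0, 1].
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    # Assumed mapping: hue sweeps from blue (2/3) down to red (0);
    # saturation and value are held at full.
    hue = (1.0 - t) * (2.0 / 3.0)
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

def row_to_pixels(values, lo, hi):
    """Convert one feature track into 8-bit RGB pixels for a texture row."""
    return [tuple(round(255 * c) for c in feature_to_rgb(v, lo, hi))
            for v in values]
```

With several features, one could instead drive hue, saturation, and value from three different features; which assignment reads best is an open question for the pilot study.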
Notes
How to combine features?
Feature generators read a recorded sound and write a feature file.
format: HTK (http://htk.eng.cam.ac.uk/), as described at http://labrosa.ee.columbia.edu/doc/HTKBook21/node58.html#SECTION03271000000000000000. A sequence of feature vectors.
Later: stream, not batch.
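For the feature-file note above, a minimal sketch of writing and reading an HTK-style parameter file. It assumes the usual HTK layout: a 12-byte big-endian header (frame count, frame period in 100 ns units, bytes per frame, parameter kind) followed by 32-bit float vectors. The function names and the choice of the USER parameter kind are illustrative, not something the project has settled on.

```python
import struct

HTK_USER = 9  # HTK parameter-kind code for user-defined features

def write_htk(path, frames, frame_period_s):
    """Write equal-length float vectors as an HTK-style parameter file."""
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all feature vectors must have the same length")
    samp_period = round(frame_period_s * 1e7)  # HTK counts in 100 ns units
    with open(path, "wb") as f:
        # Header: nSamples, sampPeriod (int32), sampSize, parmKind (int16).
        f.write(struct.pack(">iihh", len(frames), samp_period, 4 * dim, HTK_USER))
        for frame in frames:
            f.write(struct.pack(">%df" % dim, *frame))

def read_htk(path):
    """Return (frames, frame_period_s, parm_kind) from an HTK-style file."""
    with open(path, "rb") as f:
        n, samp_period, samp_size, kind = struct.unpack(">iihh", f.read(12))
        dim = samp_size // 4
        frames = [list(struct.unpack(">%df" % dim, f.read(samp_size)))
                  for _ in range(n)]
    return frames, samp_period * 1e-7, kind
```

A feature generator would then end with something like `write_htk("energy.htk", frames, 0.01)` for 10 ms frames; the file name is hypothetical.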
Experiment tasks:
  - find instances of a class of sound events
  - find anomalous sounds (open-ended, vague)
Recorded sounds
  - AMI meeting room, transcribed
  - fieldrecorder/090216
  - fieldrecorder: aircraft + webcam for ground truth
  - play freqsweep through genelec into fieldrecorder. Ignore clock drift.
  - Keep data files small enough for our tools.
  - toy ruby script plus short audio source files generates a long target file. Tweak script while tweaking apps.
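The toy-script idea above (short sources in, one long target file out) might look like this. The sketch is in Python rather than Ruby and synthesizes its short clips as sine tones so that it is self-contained. The sample rate, clip names, and schedule format are all assumptions.

```python
import math
import struct
import wave

RATE = 16000  # assumed sample rate, Hz

def tone(freq_hz, dur_s, amp=0.5):
    """Synthesize a short sine clip as 16-bit sample values."""
    n = int(RATE * dur_s)
    return [int(32767 * amp * math.sin(2 * math.pi * freq_hz * i / RATE))
            for i in range(n)]

def build_target(schedule, clips, total_s):
    """Mix clips into a silent track; schedule is (start_s, clip_name) pairs."""
    out = [0] * int(RATE * total_s)
    for start_s, name in schedule:
        begin = int(RATE * start_s)
        for i, s in enumerate(clips[name]):
            if begin + i < len(out):
                # Sum overlapping clips and clip to the 16-bit range.
                out[begin + i] = max(-32768, min(32767, out[begin + i] + s))
    return out

def write_wav(path, samples):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

For example, `write_wav("target.wav", build_target([(1.0, "beep"), (3.0, "hum")], {"beep": tone(1000, 0.1), "hum": tone(120, 0.5)}, 10.0))`. The schedule list could also serve as ground truth for where each event sits in the target file.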
Realtime server (later)
  - record audio
  - circular buffer, a few months long
  - compute features at multiple scales
  - fast approximate algorithms for caching of features
  - stream all this to a googlemaps-ish server
  - when client scrolls (pans) or zooms, it requests fresh data from server
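A rough sketch of the circular buffer, multi-scale features, and pan/zoom requests described above, ignoring networking entirely. Pairwise averaging as the coarsening step, the `max_points` budget, and the function names are assumptions. A real server would need something smarter than rebuilding the whole pyramid.

```python
from collections import deque

def build_pyramid(values, levels):
    """Level 0 is full resolution; each higher level averages adjacent pairs."""
    pyramid = [list(values)]
    for _ in range(levels):
        prev = pyramid[-1]
        if len(prev) < 2:
            break
        # An unpaired trailing value is dropped at the coarser level.
        pyramid.append([(prev[i] + prev[i + 1]) / 2
                        for i in range(0, len(prev) - 1, 2)])
    return pyramid

def fetch(pyramid, start, end, max_points):
    """Serve level-0 frames [start, end) at the finest level that fits.

    Returns (level, values); falls back to the coarsest level if none fits.
    """
    for level, data in enumerate(pyramid):
        step = 2 ** level
        lo, hi = start // step, -(-end // step)  # floor and ceiling division
        if hi - lo <= max_points or level == len(pyramid) - 1:
            return level, data[lo:hi]
```

The circular buffer itself can be a `deque(maxlen=N)`, which silently discards the oldest frame on each append once full. A zoomed-out client request then lands on a coarse level, and zooming in moves it toward level 0.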