Visualization Experiments
From SpeechWiki
Revision as of 22:16, 10 April 2009
Visualization experiments include at least the following components:
- Visualization Interface Number 1 --- Timeline Audio
This is intended to be lightweight for laptops or handhelds, similar to Audacity.
- Visualization Interface Number 2 --- Milliphone
This is a command center interface, designed for the Cube.
- Feature Computation --- Signal Features
These features rapidly give an analyst information about the signal, e.g., spectrograms.
- Feature Computation --- Classification Features
These features measure the degree of match (confidence score) between the signal at any point in time, and a classification label of interest. Classification labels might be defined in advance (e.g., "explosion,"), or they might be defined by the analyst during a session.
- Minutes of 2009 Apr 10 noon meeting
Dramatis personae:
- Mark
- Camille
- Grad students: Xi, Xiaodan, Sarah, Lae-Hoon
- Undergrad: David Cohen
Camille: write timeline editor
- pan
- zoom (mouse scrollwheel)
- GUI: no widgets, just input? Then OpenGL suffices.
- OS: Heron, Ibex, Leopard, XP, Vista (servers: Fedora 10)
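Scrollwheel zoom usually keeps the instant under the mouse cursor fixed on screen. A minimal sketch of that transform (function and parameter names are illustrative, not from the minutes):

```python
def zoom_about(offset_s, scale_s_per_px, cursor_px, factor):
    """Zoom the timeline view by `factor`, keeping the time under the
    cursor fixed on screen.

    offset_s        -- time at the left edge of the window (seconds)
    scale_s_per_px  -- seconds per pixel
    cursor_px       -- mouse x position in pixels
    factor          -- >1 zooms out, <1 zooms in
    Returns the new (offset_s, scale_s_per_px).
    """
    t_cursor = offset_s + cursor_px * scale_s_per_px   # time under cursor
    new_scale = scale_s_per_px * factor
    new_offset = t_cursor - cursor_px * new_scale      # keep t_cursor fixed
    return new_offset, new_scale
```

Panning is just adding a time delta to `offset_s`; only zoom needs the cursor anchor.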
Camille: keep developing milliphone, hand off to gradstudents
Apps read a recorded sound and write a feature file.
Format: sequence of feature vectors, HTK parameter-file format:
- http://labrosa.ee.columbia.edu/doc/HTKBook21/node58.html#SECTION03271000000000000000
- http://htk.eng.cam.ac.uk/
Later: stream not batch.
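A minimal sketch of writing such a feature file in the HTK parameter format linked above (the helper name is made up; the header layout is from the HTK Book):

```python
import struct

def write_htk_features(path, frames, samp_period_100ns=100000, parm_kind=9):
    """Write a sequence of feature vectors as an HTK parameter file.

    Header (12 bytes, big-endian): nSamples (int32), sampPeriod (int32,
    units of 100 ns), sampSize (int16, bytes per vector), parmKind (int16).
    parm_kind 9 is HTK's USER type for generic features; the data that
    follows is big-endian float32, one vector per frame.
    """
    dim = len(frames[0])
    with open(path, "wb") as f:
        f.write(struct.pack(">iihh", len(frames), samp_period_100ns,
                            dim * 4, parm_kind))
        for vec in frames:
            f.write(struct.pack(">%df" % dim, *vec))
```

The default sample period (100000 × 100 ns = 10 ms) matches the usual HTK frame rate; a feature app would pass its own.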
All: run timeline editor
- load a recorded sound
- load precomputed features to display
- select and play intervals
Grads: choose features
David: measure and model computation speed of features
Camille: map features to HSV
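One plausible feature-to-HSV mapping, assuming two normalized features drive hue and value (this specific assignment is an assumption for illustration, not a decision from the meeting):

```python
import colorsys

def feature_to_rgb(confidence, energy, hue_lo=0.66, hue_hi=0.0):
    """Map a confidence score in [0, 1] to hue (blue -> red) and an
    energy value in [0, 1] to brightness; return (r, g, b) in [0, 1].
    The hue endpoints are arbitrary choices."""
    h = hue_lo + (hue_hi - hue_lo) * confidence
    return colorsys.hsv_to_rgb(h, 1.0, energy)
```

Keeping saturation fixed leaves it free for a third feature later.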
Grads: design and pilot-study experiments
Grads: recruit analyst-subjects, schedule experiments
Grads: September, run 5-subject experiment. November, present at FODAVA meeting, Richland WA
How combine features?
Experiment tasks:
- find instances of a class of sound events
- find anomalous sounds (open-ended, vague)
Recorded sounds
- AMI meeting room, transcribed
- fieldrecorder/090216
- fieldrecorder aircraft + webcam for ground truth
- play freqsweep through Genelec into fieldrecorder; ignore clock drift
- keep data files small enough for our tools
- toy: Ruby script plus short audio source files generates a long target file; tweak script while tweaking apps
Realtime server (later)
- record audio
- circular buffer, a few months long
- compute features at multiple scales
- fast approximate algorithms for caching of features
- stream all this to a googlemaps-ish server
- when client scrolls (pans) or zooms, it requests fresh data from the server
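The circular-buffer step could be sketched like this (a tiny in-memory version for illustration; a months-long buffer would live on disk):

```python
class CircularAudioBuffer:
    """Fixed-capacity ring buffer: new samples overwrite the oldest."""

    def __init__(self, capacity):
        self.buf = [0.0] * capacity
        self.capacity = capacity
        self.write_pos = 0   # next slot to overwrite
        self.total = 0       # samples ever written

    def append(self, samples):
        for s in samples:
            self.buf[self.write_pos] = s
            self.write_pos = (self.write_pos + 1) % self.capacity
            self.total += 1

    def latest(self, n):
        """Return the most recent n samples, oldest first."""
        n = min(n, self.total, self.capacity)
        start = (self.write_pos - n) % self.capacity
        return [self.buf[(start + i) % self.capacity] for i in range(n)]
```

A tile server in the googlemaps style would then serve `latest(...)` windows (and cached feature summaries) keyed by pan offset and zoom level.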