Timeliner
From SpeechWiki
Revision as of 15:30, 30 November 2009
Ruby code
/workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/timeliner (the feature file it opens is currently hardcoded; update that line before use)
Dependencies
Ubuntu. Verified on Xiaodan's 9.04, Camille's 64-bit 8.10, and Mark's 9.10.
Get the security updates first. Then install packages with aptitude (or apt-get) and gem; expect some back and forth to discover missing dependencies:
- aptitude update
- aptitude install ruby ruby1.8-dev rubygems1.8 gem mesa-common-dev libglu1-mesa-dev freeglut3-dev imagemagick libmagickcore-dev libmagickwand-dev (the install order may need to change)
- gem update
- gem install rake mkrf rmagick RubyInline ruby-opengl rspec ZenTest
- gem install jstrait-wavefile -s http://gems.github.com
Memos
- locate: find a file on local disk
- apt-file find ruby.h: list the packages that contain ruby.h
Feature files in HTK format
/workspace/ifp-32-2/hasegawa/xzhuang2/AED2009/tmp/forCamille
Making
HCopy -C $ConfigFile AIT_20061020_AmarkIII_1.ch4.wav $ResultFile
Viewing
HList -h $ResultFile
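HList -h prints the header of the result file. As a rough cross-check from Ruby, the same 12-byte header can be read directly. The sketch below assumes the standard HTK parameter-file layout in its default big-endian byte order (32-bit frame count, 32-bit frame step in 100 ns units, 16-bit bytes per frame, 16-bit parameter kind) and uncompressed 4-byte float data. The name read_htk_header is hypothetical, and the demo header is synthetic, not taken from the files above.

```ruby
require 'stringio'

# Parse the 12-byte header of an HTK parameter file from any IO object.
# Assumption: default HTK byte order (big-endian), uncompressed float data.
def read_htk_header(io)
  raw = io.read(12)
  raise ArgumentError, 'HTK header needs 12 bytes' if raw.nil? || raw.bytesize < 12

  # N = 32-bit big-endian unsigned, n = 16-bit big-endian unsigned
  n_frames, period_100ns, bytes_per_frame, parm_kind = raw.unpack('NNnn')
  {
    n_frames: n_frames,
    step_ms: period_100ns / 10_000.0,  # 100 ns units -> milliseconds
    bytes_per_frame: bytes_per_frame,
    dims: bytes_per_frame / 4,         # 4 bytes per float component
    parm_kind: parm_kind
  }
end

# Synthetic header for illustration: 1000 frames, 10 ms step, 78 floats per frame.
# The parameter-kind value is an arbitrary placeholder for the demo.
demo_bytes = [1000, 100_000, 78 * 4, 6].pack('NNnn')
header = read_htk_header(StringIO.new(demo_bytes))
puts header.inspect
```

For a real file, something like File.open(result_file, 'rb') { |f| read_htk_header(f) } should work, where result_file is whatever $ResultFile was set to.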
Examples
ConfigFile | ResultFile | Description |
---|---|---|
HCopy_MFCC.cfg | AIT_20061020_AmarkIII_1.ch4.feaMFCC | MFCC (window 25ms, step 10ms), 78-dim |
HCopy_FB.cfg | AIT_20061020_AmarkIII_1.ch4.feaFB | Narrow-band filterbank (window 25ms, step 10ms), 78-dim |
HCopy_FB_w6ms_o2ms.cfg | AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms | Wide-band filterbank (window 6ms, step 2ms), 78-dim |
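The result file names in the table appear to follow one pattern: the audio file name without .wav, then .fea, then the config name without its HCopy_ prefix and .cfg suffix. Assuming that convention holds (it is inferred from the three rows above, not documented anywhere), the HCopy command lines can be generated as below. The helper names result_file_for and hcopy_command are hypothetical; the sketch only builds the command strings and does not run HCopy.

```ruby
WAV = 'AIT_20061020_AmarkIII_1.ch4.wav'
CONFIGS = %w[HCopy_MFCC.cfg HCopy_FB.cfg HCopy_FB_w6ms_o2ms.cfg]

# Derive the result file name from a config name and the audio file name.
# Assumption: naming convention inferred from the table rows.
def result_file_for(config, wav)
  tag = File.basename(config, '.cfg').sub(/\AHCopy_/, '')
  "#{File.basename(wav, '.wav')}.fea#{tag}"
end

# Build the HCopy command line in the form used in the Making section.
def hcopy_command(config, wav)
  "HCopy -C #{config} #{wav} #{result_file_for(config, wav)}"
end

commands = CONFIGS.map { |cfg| hcopy_command(cfg, WAV) }
puts commands
```

Each string could then be passed to system once HTK is on the PATH and the config files are in the working directory.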
Notes
- Audio:
AIT_20061020_AmarkIII_1.ch4.wav
- 92 dims (14-dim decorrelated event-classifier neural network output + 78-dim filter bank parameters):
AIT_20061020_AmarkIII_1.ch4.annhtk
- In each 78-dim filter bank (or MFCC) vector, the first 26 dims are the original parameters; the second and third groups of 26 dims are, respectively, the first- and second-order regression coefficients derived from the first 26 dims.
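The layout described in these notes can be sketched as a pair of Ruby helpers that split one frame into its parts. This assumes a frame is already available as a flat array of numbers, and, for the 92-dim file, that the 14 neural network outputs come before the 78 filter bank dims; the notes give the sizes but do not confirm that order. The helper names and hash keys are hypothetical.

```ruby
STREAM_DIMS = 26  # size of each group: original, first-order, second-order
ANN_DIMS = 14     # event-classifier neural network outputs

# Split a 78-dim filter bank (or MFCC) frame into its three 26-dim groups.
def split_fb_frame(frame)
  unless frame.length == 3 * STREAM_DIMS
    raise ArgumentError, "expected #{3 * STREAM_DIMS} dims, got #{frame.length}"
  end

  static, delta, accel = frame.each_slice(STREAM_DIMS).to_a
  { static: static, delta: delta, accel: accel }
end

# Split a 92-dim frame into the ANN outputs and the filter bank groups.
# Assumption: the ANN outputs are stored first, followed by the 78 FB dims.
def split_ann_frame(frame)
  expected = ANN_DIMS + 3 * STREAM_DIMS
  raise ArgumentError, "expected #{expected} dims, got #{frame.length}" unless frame.length == expected

  { ann: frame.first(ANN_DIMS), fb: split_fb_frame(frame.drop(ANN_DIMS)) }
end

# Demo with index values standing in for real features.
parts = split_ann_frame((0...92).to_a)
puts parts[:ann].length, parts[:fb][:static].length
```

Using indices as stand-in values makes it easy to see which positions land in which group before trying this on real frames.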