Timeliner

From SpeechWiki


Revision as of 15:30, 30 November 2009

Ruby code

/workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/timeliner (the feature file it opens is currently hardcoded; update that line before use)
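One way to avoid editing the hardcoded line each time is to take the feature file from the command line. This is only a sketch: the helper name feature_file_from and the fallback path are hypothetical, and the actual script may open the file differently.

```ruby
# Hypothetical fallback; this is NOT the path used by the real script.
DEFAULT_FEATURE_FILE = "features.feaFB"

# Return the first command-line argument if one was given,
# otherwise fall back to the default path.
def feature_file_from(argv, default = DEFAULT_FEATURE_FILE)
  path = argv.first
  (path.nil? || path.empty?) ? default : path
end
```

In the script, the hardcoded path would then be replaced by feature_file_from(ARGV).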

Dependencies

Ubuntu. Verified on Xiaodan's 9.04, Camille's 64-bit 8.10, and Mark's 9.10.

Get security updates first. Then install packages with aptitude (or apt-get) and gem; expect some back and forth to discover dependencies:

aptitude update
aptitude install ruby ruby1.8-dev rubygems1.8 gem mesa-common-dev libglu1-mesa-dev freeglut3-dev imagemagick libmagickcore-dev libmagickwand-dev   # the package order may need to change
gem update
gem install rake mkrf rmagick RubyInline ruby-opengl rspec ZenTest
gem install jstrait-wavefile -s http://gems.github.com

Memos

locate: find a file on the local disk
apt-file find ruby.h: list the packages that provide ruby.h

Feature files in HTK format

/workspace/ifp-32-2/hasegawa/xzhuang2/AED2009/tmp/forCamille

Making

HCopy -C $ConfigFile AIT_20061020_AmarkIII_1.ch4.wav $ResultFile
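The same HCopy call can be generated for every config/result pair listed under Examples. The sketch below only builds the command strings and does not run them; the helper names and the suffix mapping are illustrative, taken from the file names on this page.

```ruby
require "shellwords"

WAV = "AIT_20061020_AmarkIII_1.ch4.wav"

# Config file => suffix of the resulting feature file (from the Examples table).
CONFIGS = {
  "HCopy_MFCC.cfg"         => "feaMFCC",
  "HCopy_FB.cfg"           => "feaFB",
  "HCopy_FB_w6ms_o2ms.cfg" => "feaFB_w6ms_o2ms"
}

# Build one "HCopy -C config wav result" command line.
def hcopy_command(config, wav, suffix)
  result = wav.sub(/\.wav\z/, ".#{suffix}")
  ["HCopy", "-C", config, wav, result].shelljoin
end

# Build the command line for every known config.
def hcopy_commands(wav = WAV)
  CONFIGS.map { |config, suffix| hcopy_command(config, wav, suffix) }
end
```

Each string can be passed to system once the printed commands have been checked.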

Viewing

HList -h $ResultFile
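HList -h prints the file header. The same fields can be read directly from the first 12 bytes of an HTK parameter file: nSamples (4 bytes), sampPeriod (4 bytes, in 100 ns units), sampSize (2 bytes), and parmKind (2 bytes). The sketch below assumes HTK's default big-endian byte order and uncompressed 4-byte float samples; the function name is illustrative.

```ruby
HTK_HEADER_BYTES = 12

# Decode the 12-byte HTK header from a binary string.
def parse_htk_header(bytes)
  if bytes.nil? || bytes.bytesize < HTK_HEADER_BYTES
    raise ArgumentError, "need #{HTK_HEADER_BYTES} header bytes"
  end

  # N = 32-bit big-endian unsigned, n = 16-bit big-endian unsigned
  n_samples, samp_period, samp_size, parm_kind = bytes.unpack("NNnn")
  {
    :frames    => n_samples,               # number of feature vectors
    :step_ms   => samp_period / 10_000.0,  # sampPeriod is in 100 ns units
    :dims      => samp_size / 4,           # bytes per frame / 4 bytes per float (assumed)
    :parm_kind => parm_kind,               # raw code, including qualifier bits
    :base_kind => parm_kind & 0o77         # low 6 bits hold the basic kind
  }
end
```

Usage: File.open(result_file, "rb") { |f| parse_htk_header(f.read(12)) }. For the feature files here, :dims should come out as 78 if the assumptions hold.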

Examples

ConfigFile	ResultFile	Description
HCopy_MFCC.cfg	AIT_20061020_AmarkIII_1.ch4.feaMFCC	MFCC (window 25ms, step 10ms), 78-dim
HCopy_FB.cfg	AIT_20061020_AmarkIII_1.ch4.feaFB	Narrow-band filterbank (window 25ms, step 10ms), 78-dim
HCopy_FB_w6ms_o2ms.cfg	AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms	Wide-band filterbank (window 6ms, step 2ms), 78-dim
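HTK expresses durations in units of 100 ns, so the window and step sizes above presumably appear in the .cfg files as WINDOWSIZE and TARGETRATE values in those units (the configs themselves are not shown here). The sketch below converts milliseconds to HTK units and estimates frame counts; the frame-count formula assumes that only whole windows are kept, with no padding.

```ruby
# HTK time values are given in units of 100 ns.
HTK_UNITS_PER_MS = 10_000

def ms_to_htk_units(ms)
  (ms * HTK_UNITS_PER_MS).round
end

# Frame rate implied by the step size.
def frames_per_second(step_ms)
  1000.0 / step_ms
end

# Number of whole windows that fit in a signal (assumes no padding).
def frame_count(duration_ms, window_ms, step_ms)
  return 0 if duration_ms < window_ms
  ((duration_ms - window_ms) / step_ms).floor + 1
end
```

For example, ms_to_htk_units(25) gives 250000 and ms_to_htk_units(10) gives 100000. The 2 ms step produces five times as many frames per second as the 10 ms step, which is worth keeping in mind when loading the wide-band features.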

Notes

Audio:

AIT_20061020_AmarkIII_1.ch4.wav

92 dims (14-dim decorrelated event-classifier neural-network output + 78-dim filterbank parameters):

AIT_20061020_AmarkIII_1.ch4.annhtk

For each 78-dim filterbank (or MFCC) parameter vector, the first 26 dims are the original (static) coefficients; the second and third groups of 26 dims are the first- and second-order regression coefficients, respectively, derived from the first 26 dims.
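The layout described above can be sketched as a function that splits one frame into its parts. It assumes that, in the 92-dim .annhtk frames, the 14 neural-network outputs come before the 78 filterbank dims (the order suggested by the description, but not confirmed here); the function and key names are illustrative.

```ruby
ANN_DIMS    = 14  # neural-network outputs in the 92-dim frames
STATIC_DIMS = 26  # size of each of the three filterbank/MFCC groups

# Split one frame (an Array of numbers) into its component groups.
def split_features(frame)
  if frame.length == 3 * STATIC_DIMS
    ann = []
    fb = frame
  elsif frame.length == ANN_DIMS + 3 * STATIC_DIMS
    # Assumption: ANN outputs first, then the 78 filterbank dims.
    ann = frame[0, ANN_DIMS]
    fb = frame[ANN_DIMS, 3 * STATIC_DIMS]
  else
    raise ArgumentError, "expected 78 or 92 dims, got #{frame.length}"
  end

  {
    :ann    => ann,
    :static => fb[0, STATIC_DIMS],               # original coefficients
    :delta  => fb[STATIC_DIMS, STATIC_DIMS],     # first-order regression
    :accel  => fb[2 * STATIC_DIMS, STATIC_DIMS]  # second-order regression
  }
end
```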

