Timeliner

From SpeechWiki

Revision as of 14:11, 22 November 2009

Ruby code

/workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/timeliner (the feature file it opens is currently hardcoded; update that line before use)

set up the environment

a) Ubuntu (verified on Xiaodan's 9.04, Camille's 64-bit 8.10, and Mark's machine, possibly 8.04).

b) Install all security updates and refresh the package lists (aptitude update).

c) Install packages with aptitude (or apt-get) and gem; expect some back and forth to discover dependencies:

aptitude install ruby ruby1.8-dev rubygems1.8

aptitude install gem; gem update

aptitude install mesa-common-dev libglu1-mesa-dev freeglut3-dev imagemagick libmagickcore-dev libmagickwand-dev (the order of the packages may need to be changed)

gem install rake mkrf rmagick RubyInline ruby-opengl rspec ZenTest

Memos:

locate: finds a file on the local disk

apt-file find ruby.h: finds the packages that provide ruby.h

feature generation (to be detailed)

/workspace/ifp-32-2/hasegawa/xzhuang2/AED2009/tmp/forCamille

A. To make a feature file: HCopy -C $ConfigFile AIT_20061020_AmarkIII_1.ch4.wav $ResultFile

B. To view a feature file: HList -h $ResultFile
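The header that HList -h reports can also be read directly from Ruby. The following is a minimal sketch, not the timeliner's own reader: it assumes the standard 12-byte HTK parameter file header (number of frames, frame period in 100 ns units, bytes per frame, parameter kind code), HTK's default big-endian byte order, and uncompressed 4-byte float components. The names HtkHeader, read_htk_header and describe_htk_header are made up for this illustration.

```ruby
# Sketch: read the 12-byte header of an HTK parameter file.
# Assumption: default HTK (big-endian) byte order, uncompressed float data.
HtkHeader = Struct.new(:n_samples, :samp_period, :samp_size, :parm_kind)

def read_htk_header(path)
  File.open(path, 'rb') do |f|
    raw = f.read(12)
    if raw.nil? || raw.size < 12
      raise ArgumentError, "#{path}: shorter than a 12-byte HTK header"
    end
    # N = 32-bit big-endian unsigned, n = 16-bit big-endian unsigned
    HtkHeader.new(*raw.unpack('NNnn'))
  end
end

def describe_htk_header(h)
  dims = h.samp_size / 4              # 4 bytes per float component (assumed)
  step_ms = h.samp_period / 10_000.0  # samp_period is in 100 ns units
  "#{h.n_samples} frames, #{dims} dims, #{step_ms} ms step, " \
    "parmKind code #{h.parm_kind}"
end

# Run as: ruby this_script.rb $ResultFile
if __FILE__ == $0 && ARGV[0]
  puts describe_htk_header(read_htk_header(ARGV[0]))
end
```

For the files in the table below, the reported dimension should be 78 if the assumptions hold; a different value would suggest compressed data or a different byte order.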

HCopy-made features in HTK format

ConfigFile              ResultFile                                   Description
HCopy_FB.cfg            AIT_20061020_AmarkIII_1.ch4.feaFB            narrow-band filterbank (window 25 ms, step 10 ms), 78-dim
HCopy_MFCC.cfg          AIT_20061020_AmarkIII_1.ch4.feaMFCC          MFCC (window 25 ms, step 10 ms), 78-dim
HCopy_FB_w6ms_o2ms.cfg  AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms  wide-band filterbank (window 6 ms, step 2 ms), 78-dim
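The window and step sizes above determine how frame indices map to time on a timeline. The sketch below shows that arithmetic in Ruby. It relies on HTK expressing durations in units of 100 ns (as in the WINDOWSIZE and TARGETRATE configuration parameters); the contents of the .cfg files themselves are not shown on this page, so treat the mapping to those files as an assumption. The helper names are hypothetical.

```ruby
# Sketch: timing arithmetic for the window/step sizes in the table.
# HTK expresses durations in units of 100 ns, i.e. 10,000 units per ms.
HTK_UNITS_PER_MS = 10_000

# Convert milliseconds to HTK 100 ns units (e.g. for WINDOWSIZE/TARGETRATE).
def ms_to_htk_units(ms)
  (ms * HTK_UNITS_PER_MS).round
end

# Number of feature frames produced per second for a given step size.
def frames_per_second(step_ms)
  1000.0 / step_ms
end

# Start time in seconds of the frame with 0-based index `index`.
def frame_start_sec(index, step_ms)
  index * step_ms / 1000.0
end
```

With these helpers, the narrow-band features (25 ms window, 10 ms step) correspond to 250000 and 100000 HTK units and 100 frames per second, while the wide-band features (6 ms window, 2 ms step) correspond to 60000 and 20000 units and 500 frames per second.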


Note:

AIT_20061020_AmarkIII_1.ch4.wav: waveform

AIT_20061020_AmarkIII_1.ch4.annhtk: 92 dims (14-dim decorrelated event-classifier neural network output + 78-dim filter bank parameters)

1) In each 78-dim filter bank (or MFCC) feature vector, the first 26 dims are the original parameters; the second and third groups of 26 dims are the first-order and second-order regression coefficients, respectively, derived from the first 26 dims.
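The 26 + 26 + 26 layout described in note 1 can be sketched in Ruby as follows. This is an illustration only: split_feature_vector and the keys :static, :delta and :accel are hypothetical names, and the sketch covers the 78-dim vectors only (the position of the 14 neural network outputs inside the 92-dim .annhtk vectors is not specified here).

```ruby
# Sketch: split one 78-dim frame into its three 26-dim groups.
BLOCK = 26

def split_feature_vector(vec)
  unless vec.size == 3 * BLOCK
    raise ArgumentError, "expected #{3 * BLOCK} dims, got #{vec.size}"
  end
  {
    :static => vec[0, BLOCK],          # original parameters
    :delta  => vec[BLOCK, BLOCK],      # first-order regression coefficients
    :accel  => vec[2 * BLOCK, BLOCK]   # second-order regression coefficients
  }
end
```

For example, split_feature_vector(frame)[:static] returns the 26 original filter bank (or MFCC) values of a frame, which is often the part worth displaying first.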
