Timeliner

From SpeechWiki

Revision as of 15:30, 30 November 2009

Ruby code

/workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/timeliner (the path of the feature file to open is currently hardcoded; update that line before running)

Dependencies

Ubuntu. Verified on Xiaodan's 9.04, Camille's 64-bit 8.10, and Mark's 9.10.

Get security updates first. Then install packages with aptitude (or apt-get) and gem; expect some back-and-forth to discover missing dependencies:

  • aptitude update
  • aptitude install ruby ruby1.8-dev rubygems1.8 gem mesa-common-dev libglu1-mesa-dev freeglut3-dev imagemagick libmagickcore-dev libmagickwand-dev (the install order may need to change)
  • gem update
  • gem install rake mkrf rmagick RubyInline ruby-opengl rspec ZenTest
  • gem install jstrait-wavefile -s http://gems.github.com
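After installing, it can help to check which libraries actually load. The following is a minimal sketch, not part of timeliner itself; the require names in CANDIDATES are assumptions (a gem's require name can differ from its gem name), so adjust them to match what the code really requires.

```ruby
# Return the subset of library names that fail to load.
def missing_libraries(names)
  names.reject do |name|
    begin
      require name
      true            # loaded fine, so drop it from the result
    rescue LoadError
      false           # could not load, so keep it in the result
    end
  end
end

# Assumed (hypothetical) require names for some of the gems listed above.
CANDIDATES = %w[inline opengl wavefile]

if __FILE__ == $PROGRAM_NAME
  missing = missing_libraries(CANDIDATES)
  puts(missing.empty? ? "all libraries loaded" : "missing: #{missing.join(', ')}")
end
```

Any name reported as missing points at a gem (or one of its native dependencies) that still needs installing.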

Memos

  • locate: find a file on the local disk
  • apt-file find ruby.h: list the packages that provide ruby.h

Feature files in HTK format

/workspace/ifp-32-2/hasegawa/xzhuang2/AED2009/tmp/forCamille

Making

HCopy -C $ConfigFile AIT_20061020_AmarkIII_1.ch4.wav $ResultFile

Viewing

HList -h $ResultFile
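The header that HList -h prints can also be read directly from Ruby. This is a sketch under stated assumptions: the file uses the standard 12-byte HTK parameter header, is stored big-endian (HTK's default on-disk order), and is not compressed, so each value is a 4-byte float. The function names are illustrative only.

```ruby
# Parse the 12-byte header of an HTK parameter file:
#   nSamples   (int32)  number of frames
#   sampPeriod (int32)  frame period in 100 ns units
#   sampSize   (int16)  bytes per frame
#   parmKind   (int16)  basic kind code plus qualifier flags
def parse_htk_header(bytes)
  raise ArgumentError, "header needs 12 bytes" if bytes.nil? || bytes.bytesize < 12
  n_samples, samp_period, samp_size, parm_kind = bytes.unpack("NNnn")
  {
    :n_samples  => n_samples,
    :period_ms  => samp_period / 10_000.0, # 100 ns units -> milliseconds
    :dims       => samp_size / 4,          # assumes uncompressed 4-byte floats
    :base_kind  => parm_kind & 0x3F,       # low 6 bits: basic parameter kind
    :qualifiers => parm_kind >> 6          # remaining bits: qualifier flags
  }
end

# Read only the header from a file on disk.
def read_htk_header(path)
  File.open(path, "rb") { |f| parse_htk_header(f.read(12)) }
end
```

For example, read_htk_header(ENV["ResultFile"]) would return the frame count, frame period, and dimensionality, which can be compared against the HList -h output.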

Examples

ConfigFile             | ResultFile                                  | Description
HCopy_MFCC.cfg         | AIT_20061020_AmarkIII_1.ch4.feaMFCC         | MFCC (window 25 ms, step 10 ms), 78-dim
HCopy_FB.cfg           | AIT_20061020_AmarkIII_1.ch4.feaFB           | Narrow-band filterbank (window 25 ms, step 10 ms), 78-dim
HCopy_FB_w6ms_o2ms.cfg | AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms | Wide-band filterbank (window 6 ms, step 2 ms), 78-dim
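HTK expresses times in units of 100 ns, so the window and step sizes above translate into fairly large integers (typically given as WINDOWSIZE and TARGETRATE in an HCopy config; the contents of the .cfg files listed here are not shown, so that mapping is an assumption). The sketch below does the conversion and estimates the frame count for a signal of a given length. The helper names are illustrative, and HCopy's own count may differ by a frame because of rounding.

```ruby
# HTK time unit is 100 ns, so 1 ms = 10_000 units.
HTK_UNITS_PER_MS = 10_000

def ms_to_htk_units(ms)
  (ms * HTK_UNITS_PER_MS).round
end

# Number of full analysis windows that fit in a signal of the given duration.
def frame_count(duration_ms, window_ms, step_ms)
  return 0 if duration_ms < window_ms
  ((duration_ms - window_ms) / step_ms).floor + 1
end

if __FILE__ == $PROGRAM_NAME
  [[25, 10], [6, 2]].each do |window_ms, step_ms|
    puts "window #{window_ms} ms = #{ms_to_htk_units(window_ms)} units, " \
         "step #{step_ms} ms = #{ms_to_htk_units(step_ms)} units, " \
         "#{frame_count(1000, window_ms, step_ms)} frames per second of audio"
  end
end
```

The wide-band setting (6 ms window, 2 ms step) yields roughly five times as many frames as the narrow-band one, which matters for the size of the feature file.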

Notes

  • Audio: AIT_20061020_AmarkIII_1.ch4.wav
  • 92 dims (14-dim decorrelated event-classifier neural network output + 78-dim filterbank parameters): AIT_20061020_AmarkIII_1.ch4.annhtk

  • In each 78-dim set of filterbank parameters (or MFCCs), the first 26 dims are the original (static) coefficients; the second and third blocks of 26 dims are the first- and second-order regression coefficients (deltas and accelerations), respectively, derived from the first 26 dims.
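The 26 + 26 + 26 layout described above can be sketched as a simple slicing step. This assumes a frame has already been read into an array of 78 numbers; split_frame and BLOCK are illustrative names, not part of timeliner.

```ruby
BLOCK = 26 # number of static dims per the note above

# Split one 78-dim frame into static, first-order, and second-order blocks.
def split_frame(frame)
  unless frame.length == 3 * BLOCK
    raise ArgumentError, "expected #{3 * BLOCK} values, got #{frame.length}"
  end
  static, delta, accel = frame.each_slice(BLOCK).to_a
  { :static => static, :delta => delta, :accel => accel }
end
```

A 92-dim frame from the .annhtk file would first need its 14 neural-network outputs separated from the 78 filterbank dims; the order of those two parts within the frame is not stated here, so check it before slicing.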