Timeliner

From SpeechWiki

Revision as of 15:32, 30 November 2009 by Camilleg (Talk | contribs)

Ruby code

/workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/timeliner

The feature file names and their directory are hardcoded in the .rb file. Update those lines before running.

Dependencies

Ubuntu. Verified on Xiaodan's 9.04, Camille's 64-bit 8.10, and Mark's 9.10.

Get security updates first, then install packages with aptitude (or apt-get) and gem. Expect some back and forth to discover missing dependencies:

  • aptitude update
  • aptitude install ruby ruby1.8-dev rubygems1.8 gem mesa-common-dev libglu1-mesa-dev freeglut3-dev imagemagick libmagickcore-dev libmagickwand-dev (may need to change order)
  • gem update
  • gem install rake mkrf rmagick RubyInline ruby-opengl rspec ZenTest
  • gem install jstrait-wavefile -s http://gems.github.com
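To shorten the back and forth, a small script can report which libraries still fail to load. This is only a sketch: check_libs is a hypothetical helper, and the require names in the usage comment are guesses at what the gems above expose (they differ between gem versions), so adjust them to whatever the .rb file actually requires.

```ruby
# Try to load each library and record whether it worked.
# Returns a hash of { name => true/false }.
def check_libs(names)
  result = {}
  names.each do |name|
    begin
      require name
      result[name] = true
    rescue LoadError
      result[name] = false
    end
  end
  result
end

# Usage (ASSUMPTION: these require names match the installed gems):
#   check_libs(%w[rubygems RMagick inline opengl wavefile]).each do |name, ok|
#     puts "#{name}: #{ok ? 'ok' : 'MISSING'}"
#   end
```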

Memos

  • locate: find a file on the local disk
  • apt-file find ruby.h: list the packages that provide ruby.h

Feature files in HTK format

/workspace/ifp-32-2/hasegawa/xzhuang2/AED2009/tmp/forCamille

Making

HCopy -C $ConfigFile AIT_20061020_AmarkIII_1.ch4.wav $ResultFile
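When converting with several config files, the command line can be assembled in Ruby. A minimal sketch: result_file and hcopy_command are hypothetical helpers, and the naming rule (strip "HCopy_" from the config name and append it after ".fea") is only inferred from the example file names on this page. HTK does not enforce it.

```ruby
# Derive a result file name from a config name and a wav name,
# e.g. HCopy_MFCC.cfg + foo.wav -> foo.feaMFCC
# (ASSUMPTION: naming pattern inferred from this page's examples).
def result_file(config, wav)
  suffix = File.basename(config, '.cfg').sub(/\AHCopy_/, '')
  wav.sub(/\.wav\z/, '') + '.fea' + suffix
end

# Build the argument list for: HCopy -C $ConfigFile input.wav $ResultFile
def hcopy_command(config, wav)
  ['HCopy', '-C', config, wav, result_file(config, wav)]
end

# To actually run it (requires HTK's HCopy on the PATH):
#   system(*hcopy_command('HCopy_MFCC.cfg', 'AIT_20061020_AmarkIII_1.ch4.wav'))
```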

Viewing

HList -h $ResultFile
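The header that HList -h prints can also be read directly from Ruby. The sketch below assumes the file uses HTK's default big-endian byte order and holds uncompressed float32 parameters (no _C compression, no CRC); read_htk is a hypothetical name, not something taken from the Timeliner source.

```ruby
# Read an HTK parameter file from an IO object.
# Header layout (12 bytes): nSamples (int32), sampPeriod (int32, 100 ns units),
# sampSize (int16, bytes per frame), parmKind (int16).
# ASSUMPTION: big-endian, uncompressed float32 data.
def read_htk(io)
  header = io.read(12)
  raise ArgumentError, 'truncated HTK header' if header.nil? || header.size < 12
  n_samples, samp_period, samp_size, parm_kind = header.unpack('NNnn')

  frames = Array.new(n_samples) do
    bytes = io.read(samp_size)
    raise ArgumentError, 'truncated HTK data' if bytes.nil? || bytes.size < samp_size
    bytes.unpack('g*')                  # big-endian single-precision floats
  end

  { :n_samples => n_samples,
    :step_ms   => samp_period / 10_000.0,  # 100 ns units -> milliseconds
    :dims      => samp_size / 4,           # 4 bytes per float32 component
    :parm_kind => parm_kind,
    :frames    => frames }
end
```

Typical use would be File.open(path, 'rb') { |f| read_htk(f) }, then checking that :dims is 78 (or 92 for the .annhtk file) before going further.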

Examples

ConfigFile              ResultFile                                   Description
HCopy_MFCC.cfg          AIT_20061020_AmarkIII_1.ch4.feaMFCC          MFCC (window 25ms, step 10ms), 78-dim
HCopy_FB.cfg            AIT_20061020_AmarkIII_1.ch4.feaFB            Narrow-band filterbank (window 25ms, step 10ms), 78-dim
HCopy_FB_w6ms_o2ms.cfg  AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms  Wide-band filterbank (window 6ms, step 2ms), 78-dim
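HTK expresses window and step sizes (WINDOWSIZE, TARGETRATE, and the sampPeriod header field) in units of 100 ns, so the 25ms/10ms and 6ms/2ms settings above translate as in this sketch. The helper names are hypothetical, and frame_count is only the usual "whole windows that fit" estimate; HCopy's actual count may differ slightly at the end of the file.

```ruby
# Convert milliseconds to HTK's 100 ns units.
def htk_units(ms)
  (ms * 10_000).round
end

# Approximate number of frames: windows of win_ms, advanced by step_ms,
# that fit entirely inside a signal of duration_ms.
def frame_count(duration_ms, win_ms, step_ms)
  return 0 if duration_ms < win_ms
  ((duration_ms - win_ms) / step_ms).floor + 1
end

# For one second of audio:
#   frame_count(1000, 25, 10)  # narrow-band / MFCC settings
#   frame_count(1000, 6, 2)    # wide-band settings, roughly 5x as many frames
```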

Notes

  • Audio:

AIT_20061020_AmarkIII_1.ch4.wav

  • 92 dims (14-dim decorrelated event-classifier neural network output + 78-dim filter bank parameters):

AIT_20061020_AmarkIII_1.ch4.annhtk

  • In each 78-dim filter bank (or MFCC) vector, the first 26 dims are the original parameters; the second and third groups of 26 dims are the first- and second-order regression coefficients, respectively, derived from the first 26 dims.
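
The layout described in these notes can be unpacked as follows. A sketch only: split_streams and split_annhtk are hypothetical helpers, and the position of the 14 neural network outputs within the 92-dim vector (first, before the 78 filter bank dims) is an assumption based on the order in which the note lists them. Check it against the data before relying on it.

```ruby
BLOCK = 26  # size of each group within a 78-dim vector

# Split a 78-dim frame into original parameters and their
# first- and second-order regression coefficients.
def split_streams(vec)
  raise ArgumentError, "expected #{3 * BLOCK} dims, got #{vec.size}" unless vec.size == 3 * BLOCK
  { :static => vec[0, BLOCK],
    :delta  => vec[BLOCK, BLOCK],
    :accel  => vec[2 * BLOCK, BLOCK] }
end

# Split a 92-dim .annhtk frame.
# ASSUMPTION: the 14 neural network outputs come first, then the 78 dims.
def split_annhtk(vec)
  raise ArgumentError, "expected 92 dims, got #{vec.size}" unless vec.size == 92
  { :ann => vec[0, 14] }.merge(split_streams(vec[14, 78]))
end
```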