Timeliner

From SpeechWiki

Revision as of 15:30, 30 November 2009 by Camilleg (Talk | contribs)

Ruby code

/workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/timeliner (the feature file that it opens is currently hardcoded; update that line before using)

Dependencies

Ubuntu. Verified on Xiaodan's 9.04, Camille's 64-bit 8.10, and Mark's 9.10.

Get security updates first. Then install packages with aptitude (or apt-get) and gem; expect some back and forth to discover dependencies:

  • aptitude update
  • aptitude install ruby ruby1.8-dev rubygems1.8 gem mesa-common-dev libglu1-mesa-dev freeglut3-dev imagemagick libmagickcore-dev libmagickwand-dev (may need to change order)
  • gem update
  • gem install rake mkrf rmagick RubyInline ruby-opengl rspec ZenTest
  • gem install jstrait-wavefile -s http://gems.github.com

Memos

  • locate: find a file on local disk
  • apt-file find ruby.h: list packages that contain ruby.h

Feature files in HTK format

/workspace/ifp-32-2/hasegawa/xzhuang2/AED2009/tmp/forCamille

Making

HCopy -C $ConfigFile AIT_20061020_AmarkIII_1.ch4.wav $ResultFile

Viewing

HList -h $ResultFile
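
The same header information can also be read directly from Ruby. The following is a minimal sketch, not part of Timeliner itself: the names HtkHeader, parse_htk_header, and read_htk_header are hypothetical. It assumes the standard 12-byte HTK header (number of frames, frame period in 100 ns units, bytes per frame, parameter kind) stored big-endian, which is HTK's default byte order.

```ruby
# Hypothetical helper, not part of Timeliner: parse the 12-byte header
# of an HTK parameter file.
# Assumption: big-endian byte order (HTK's default).
HtkHeader = Struct.new(:n_samples, :samp_period, :samp_size, :parm_kind)

def parse_htk_header(bytes)
  if bytes.nil? || bytes.bytesize < 12
    raise ArgumentError, "an HTK header needs 12 bytes"
  end
  # N = 32-bit unsigned big-endian, n = 16-bit unsigned big-endian
  HtkHeader.new(*bytes.unpack("NNnn"))
end

def read_htk_header(path)
  # Binary mode, so that no newline or encoding translation is applied.
  File.open(path, "rb") { |f| parse_htk_header(f.read(12)) }
end

# Usage (the path is a placeholder):
#   h = read_htk_header("some_feature_file")
#   puts "#{h.n_samples} frames, #{h.samp_period / 10_000.0} ms step"
```

For an uncompressed file of 4-byte floats, samp_size / 4 gives the number of dimensions per frame (312 bytes for 78 dims), and a 10 ms step appears as a samp_period of 100000.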

Examples

ConfigFile              ResultFile                                   Description
HCopy_MFCC.cfg          AIT_20061020_AmarkIII_1.ch4.feaMFCC          MFCC (window 25 ms, step 10 ms), 78-dim
HCopy_FB.cfg            AIT_20061020_AmarkIII_1.ch4.feaFB            Narrow-band filterbank (window 25 ms, step 10 ms), 78-dim
HCopy_FB_w6ms_o2ms.cfg  AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms  Wide-band filterbank (window 6 ms, step 2 ms), 78-dim
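
The window and step sizes determine how many feature frames each file holds. As a rough sketch (frame_count is a hypothetical helper, and it assumes that only complete windows are kept with no padding at the end of the audio, which may differ slightly from what HCopy produces under a given configuration):

```ruby
# Hypothetical helper: approximate number of analysis frames for audio of
# the given duration. All arguments are whole milliseconds.
# Assumption: only windows that fit entirely inside the audio are counted.
def frame_count(duration_ms, window_ms, step_ms)
  return 0 if duration_ms < window_ms
  (duration_ms - window_ms) / step_ms + 1  # integer division rounds down
end

puts frame_count(1000, 25, 10)  # narrow-band settings: prints 98
puts frame_count(1000, 6, 2)    # wide-band settings: prints 498
```

Under this assumption, the wide-band file holds about five times as many frames per second of audio as the other two.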

Notes

  • Audio:

AIT_20061020_AmarkIII_1.ch4.wav

  • 92 dims (14-dim decorrelated event-classifier neural-network output + 78-dim filter bank parameters):

AIT_20061020_AmarkIII_1.ch4.annhtk

  • In each 78-dim vector of filter bank parameters (or MFCCs), the first 26 dims are the original values; the second and third groups of 26 dims are, respectively, the first- and second-order regression coefficients derived from the first 26 dims.
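
The 78-dim layout described above can be split into its three groups as follows. This is an illustrative sketch only: split_feature_frame and BASE_DIMS are hypothetical names, and a frame is assumed to be already available as a plain Ruby array of 78 numbers.

```ruby
# Hypothetical helper: split one 78-dim frame into its three 26-dim groups.
BASE_DIMS = 26

def split_feature_frame(frame)
  unless frame.length == 3 * BASE_DIMS
    raise ArgumentError, "expected #{3 * BASE_DIMS} dims, got #{frame.length}"
  end
  # each_slice yields consecutive groups of 26 values, in order.
  static, delta, accel = frame.each_slice(BASE_DIMS).to_a
  { :static => static, :delta => delta, :accel => accel }
end

# Usage with a dummy frame holding the values 0..77:
parts = split_feature_frame((0...78).to_a)
puts parts[:delta].first  # prints 26, the first first-order coefficient
```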