Timeliner

From SpeechWiki

Latest revision as of 22:55, 17 August 2010

Ruby code

/workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/timeliner

The feature files and directory are hardcoded in the .rb file. Update those lines before running.

Dependencies

Ubuntu. Verified with Xiaodan's 9.04, Camille's 64-bit 8.10, and Mark's 9.04. 9.10 in progress.

Get security updates. Install packages with aptitude (or apt-get) and gem; expect some back and forth to discover dependencies:

  • aptitude update
  • aptitude install sox audacity libaudiofile-dev ruby ruby1.8-dev rubygems1.8 mesa-common-dev libglu1-mesa-dev freeglut3-dev imagemagick libmagickcore-dev libmagickwand-dev (the package order may need to change; the *magick* packages may be replaced with libmagick++9-dev)
  • From http://rubygems.org/ , install gem from the .tgz file. Follow its instructions.
  • gem update
  • gem install rake mkrf ZenTest RubyInline rspec rice

(Consider sticking to Ruby 1.8 for the sake of http://rmagick.rubyforge.org/install-faq.html)

  • gem install rmagick --no-ri --no-rdoc (fails on 8.10; imagemagick may be too old)
  • gem install ruby-opengl (fails on Ubuntu 9.10 and 10.04; see http://rubyforge.org/tracker/index.php?func=detail&aid=27386&group_id=2103&atid=8185 )
  • Install HCopy and HList from HTK-3.4.tar.gz (register first).

Notes

  • locate: finds a file on the local disk
  • apt-file find ruby.h: lists the packages that provide ruby.h
  • alsamixer: adjusts the volume
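
Since discovering dependencies takes some back and forth, a small check for missing command-line tools may save time. The following is only a sketch: the helper name missing_tools and the tool list are assumptions made for illustration (on some Ubuntu releases the gem command may be installed under a versioned name such as gem1.8), and it checks only executables on the PATH, not development headers or gems.

```ruby
# Return the names in `tools` that are not found as executables in any
# directory of `path` (a PATH-style string). Hypothetical helper, not part
# of timeliner.
def missing_tools(tools, path = ENV.fetch("PATH", ""))
  dirs = path.split(File::PATH_SEPARATOR).reject(&:empty?)
  tools.reject do |tool|
    dirs.any? do |dir|
      candidate = File.join(dir, tool)
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end

# Tool names assumed from the install steps above; adjust as needed.
REQUIRED_TOOLS = %w[ruby gem sox audacity HCopy HList].freeze

if __FILE__ == $PROGRAM_NAME
  missing = missing_tools(REQUIRED_TOOLS)
  if missing.empty?
    puts "All required tools found."
  else
    puts "Missing: #{missing.join(', ')}"
  end
end
```

Anything reported as missing can then be looked up with locate or apt-file as described in the notes.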

Feature files in HTK format

/workspace/ifp-32-2/hasegawa/xzhuang2/AED2009/tmp/forCamille

Making

HCopy -C $ConfigFile AIT_20061020_AmarkIII_1.ch4.wav $ResultFile

Viewing

HList -h $ResultFile
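
The header that HList -h prints can also be read directly from Ruby, which may be handy when the feature file is opened inside the .rb code. The sketch below assumes the standard HTK parameter-file layout (a 12-byte big-endian header: number of samples, sample period in 100 ns units, bytes per sample, parameter kind) and uncompressed 4-byte float values. The names HtkHeader, parse_htk_header and read_htk_header are hypothetical and not part of timeliner.

```ruby
# Fields of the 12-byte HTK parameter-file header (big-endian by default).
HtkHeader = Struct.new(:n_samples, :samp_period, :samp_size, :parm_kind) do
  # Frame step in milliseconds (samp_period is in units of 100 ns).
  def step_ms
    samp_period / 10_000.0
  end

  # Values per frame, assuming uncompressed 4-byte floats.
  def dims
    samp_size / 4
  end
end

# Decode a header from a binary string of at least 12 bytes:
# two signed 32-bit integers followed by two signed 16-bit integers.
def parse_htk_header(bytes)
  raise ArgumentError, "HTK header needs 12 bytes" if bytes.bytesize < 12
  HtkHeader.new(*bytes.byteslice(0, 12).unpack("l>l>s>s>"))
end

# Read only the header of the feature file at `path`.
def read_htk_header(path)
  File.open(path, "rb") { |f| parse_htk_header(f.read(12).to_s) }
end
```

For a 78-dim file with a 10 ms step, one would expect dims to return 78 and step_ms to return 10.0, provided the assumptions above hold for the file.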

Examples

ConfigFile              ResultFile                                   Description             Dims  Window  Step
HCopy_MFCC.cfg          AIT_20061020_AmarkIII_1.ch4.feaMFCC          MFCC                    78    25 ms   10 ms
HCopy_FB.cfg            AIT_20061020_AmarkIII_1.ch4.feaFB            Narrow-band filterbank  78    25 ms   10 ms
HCopy_FB_w6ms_o2ms.cfg  AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms  Wide-band filterbank    78    6 ms    2 ms
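
To regenerate all three example files in one go, the HCopy command from the Making section can be built once per ConfigFile/ResultFile pair. This is a sketch only: the constant and method names are hypothetical, and it assumes the .cfg files and the .wav file are in the current directory. By default it just prints the commands; the line that actually runs HCopy is left commented out.

```ruby
# Source audio file named in the Making section.
EXAMPLE_WAV = "AIT_20061020_AmarkIII_1.ch4.wav"

# ConfigFile => ResultFile, copied from the Examples table.
FEATURE_JOBS = {
  "HCopy_MFCC.cfg"         => "AIT_20061020_AmarkIII_1.ch4.feaMFCC",
  "HCopy_FB.cfg"           => "AIT_20061020_AmarkIII_1.ch4.feaFB",
  "HCopy_FB_w6ms_o2ms.cfg" => "AIT_20061020_AmarkIII_1.ch4.feaFB_w6ms_o2ms"
}.freeze

# Build one argv array per job: HCopy -C <config> <wav> <result>.
def hcopy_commands(jobs, wav)
  jobs.map { |config, result| ["HCopy", "-C", config, wav, result] }
end

if __FILE__ == $PROGRAM_NAME
  hcopy_commands(FEATURE_JOBS, EXAMPLE_WAV).each do |argv|
    puts argv.join(" ")
    # Uncomment to run HCopy (requires HTK on the PATH):
    # system(*argv) or abort("HCopy failed: #{argv.join(' ')}")
  end
end
```

Passing the arguments to system as separate strings avoids shell quoting problems if a file name ever contains spaces.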

Notes

  • Audio (waveform): AIT_20061020_AmarkIII_1.ch4.wav
  • 92 dims (14-dim decorrelated event-classifier neural network output + 78-dim filter bank parameters): AIT_20061020_AmarkIII_1.ch4.annhtk
  • In each 78-dim vector of filter bank parameters (or MFCCs), the first 26 dims are the original parameters; the second and third 26 dims are, respectively, the first- and second-order regression coefficients derived from the first 26 dims.
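
The 26 + 26 + 26 layout described above can be made explicit when working with a single frame. The sketch below is illustrative only: split_streams and BLOCK_DIMS are hypothetical names, and the frame is assumed to be an already decoded array of 78 numbers in the order given in the notes.

```ruby
# Size of each block within a 78-dim frame.
BLOCK_DIMS = 26

# Split one 78-value frame into its original parameters and the
# first- and second-order regression coefficients.
def split_streams(frame)
  unless frame.length == 3 * BLOCK_DIMS
    raise ArgumentError, "expected #{3 * BLOCK_DIMS} values, got #{frame.length}"
  end
  static, delta, accel = frame.each_slice(BLOCK_DIMS).to_a
  { static: static, delta: delta, accel: accel }
end
```

For example, split_streams(frame)[:delta] returns values 27 to 52 of the frame (counting from 1), that is, the first-order regression coefficients.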