Computer Resources

From SpeechWiki

(Difference between revisions)
(Databases)
(This Wiki)
 
(42 intermediate revisions not shown)
Line 1: Line 1:
-
==LVCSR at Illinois Computer Resources==
+
=This Wiki=
-
This page has two main purposes: (1) Help local users keep track of databases. Disk management is decentralized, therefore messy; this page is intended to help. (2) Specify which directories are archival, and which are temporary.
+
==To make wiki pages visible only to the SST group==
 +
This wiki is generally readable by anyone, and editable by anyone with an account (anyone can create an account).
 +
It is possible to make wiki pages that are accessible only by people in
 +
our group (non-public pages).  To do this simply prefix the name of your
 +
page with "SST:". For example try to see my page
 +
[[SST:Units_Paper]] without being logged in or without being a member of the sst_group.
 +
You can make an existing page private or public simply by moving/renaming
 +
it.  Existing wiki links to the renamed page will remain valid.
-
Directories listed below are accessible only to ISLE/IFP registered users.
+
In short:
 +
* To check if you are a member, look [[Special:ListUsers|here]].
 +
* To become a member, email [[User:Arthur|Arthur]] or [[User:Mark Hasegawa-Johnson|Mark]].
 +
* To create a page in the SST namespace, start the title of your page with SST: e.g. [[SST:Test_Page]].
 +
* To move an existing page into or out of the SST namespace, use the `move' tab at the top of the page.
-
==Databases==
+
==To make files (pdfs, pictures) visible only to the SST group==
 +
The procedure is almost the same as for making protected pages.  When uploading the file, prefix its name with SST:
 +
E.g., try going to [[File:SST:test2.txt]] without logging in.  When it was uploaded, I named it SST:test2.txt.  To make wiki links to it, use [[File:SST:test2.txt]].  The same goes for images: [[Image:Yoonsook.jpg]] is a picture everyone can see on a wiki page, while [[Image:SST:Usflag-transbg_42.png]] will be visible only to people in the SST group.
-
These directories have been backed up.  They may or may not be backed up again unless Paritosh is specifically told to do so.  All paths are relative to the /workspace mount point.  The data below are sorted first by channel (telephone, broadband, audiovisual, biomedical), and second by language.
+
==[[Current events| Calendar]]==
 +
Can be edited through Google.  Ask [[User:Arthur|Arthur]] or Mark to share the calendar with you in Google.
-
* English Telephone Speech
+
=LVCSR at Illinois Computer Resources=
-
** {helmholtz1/Switchboard,fluffy1/switchboard-1} - Switchboard 1 audio, orthographic transcriptions, and e-speech prosody predictions.
+
-
** fluffy1/12hour - 12 hours extracted from Switchboard 1, with SPHERE and WAV audio, MFCCs, transcriptions.
+
-
** fluffy1/{train-ws96,train-ws97,misc-ws97} - The ICSI phonetically transcribed Switchboard-1 extracts
+
-
** spot0/switchboard-2 - Switchboard-2 is like Switchboard-1, but more, and with better U.S. dialect coverage
+
-
** fluffy1/{hub5,hub5_eval} - The NIST hub5 training and test data
+
-
** mickey0/english_callhome - Conversations among family and friends
+
-
** fluffy1/ntimit-{train,test} - The NTIMIT read speech corpus, passed through a telephone channel
+
-
* Chinese Telephone Speech
+
* Compute Facilities:
-
** mickey0/Mandarin_callhome
+
** [http://ifp-32.ifp.uiuc.edu/ganglia/ Cluster Status]
 +
** [https://portal.teragrid.org/gridsphere/gridsphere Teragrid Portal]
-
* Spanish Telephone Speech
+
* Data:
-
** nibbler0/speech_data/Spanish_callhome
+
** [[Data On Line | Corpora we develop and distribute]]
 +
** We are members of [http://www.ldc.upenn.edu LDC].  Most LDC data is organized as described in the [http://www.isle.uiuc.edu/data/README.txt Data Organization README].  Some useful slices of LDC data that have not been moved to ifp-32-2 include:
 +
*** /workspace/fluffy1/12hour - 12 hours extracted from Switchboard 1, with SPHERE and WAV audio, MFCCs, transcriptions.
 +
*** /workspace/fluffy1/{train-ws96,train-ws97,misc-ws97} - The ICSI phonetically transcribed Switchboard-1 extracts
 +
*** /workspace/fletcher1/bdc - The Boston Directions Corpus, two speakers have prosodic transcriptions, others don't
 +
*** /workspace/nibbler0/data/ylzheng/WS04/DATA - Tsinghua Wu-accented Mandarin (MFCC and FMT only, no waveforms)
 +
*** /workspace/fluffy1/penn_treebank
-
* Hindi Telephone Speech
+
* Time-aligned Switchboard Disfluency corpus
-
** nibbler0/speech_data/Hindi_callfriend
+
** mickey0/sw_disTime-0.9.9 - merged from the original Switchboard time transcription and the Treebank-3 disfluency transcription (TextGrid included)
 +
** mickey0/sw_disTime-1.0.0 (TextGrid NOT included)
-
* English Broadband Speech
+
=Parallel Computing=
-
** mickey1/BN97 - Broadcast News
+
Our cluster gets its own [[Parallel Computing]] page.
-
** mickey1/HUB5E_98 - HUB5 NIST competition Broadcast News data
+
-
** tidigits - Isolated Digits and Phone Numbers
+
-
** fletcher1/kidspeech - Recordings of Children 8, 9, 10 years old
+
-
** Radio_Speech_Corpus - The Boston University Radio Speech Corpus, seven speakers with prosodic transcription (partly at /workspace/fluffy1/radio_news)
+
-
** fletcher1/bdc - The Boston Directions Corpus, two speakers have prosodic transcriptions, others don't
+
-
* Chinese Broadband Speech
+
=Applications=
-
** nibbler1/MBN - Mandarin Broadcast News
+
-
** nibbler0/data/ylzheng/WS04/DATA - Tsinghua Wu-accented Mandarin (MFCC and FMT only, no waveforms)
+
-
* Audiovisual and Multimicrophone Data
+
* [[Software]] created at SST@UIUC
-
** fluffy0/data/{icsi_mr,isl_meeting}_transcr - Transcriptions of meeting room data from ISL, ICSI.  Audio and (remote camera) video available but not online
+
-
** mickey1/AVICAR_DIST - 4-camera, 8-microphone recording of read sentences, phone numbers, and isolated letters in a moving car
+
-
** rizzo1/speech_hearing - 7-microphone recordings of isolated words produced by talkers with dysarthria
+
-
* Biomedical Image Data
+
* Acoustic model training:
-
** mickey0/UW_XRAY_MICROBEAM - point tracking data, 100Hz sampling, obtained using X-ray microbeam
+
** [http://htk.eng.cam.ac.uk HTK] hidden Markov modeling toolkit: ifp-32-1/hasegawa/programs/htk-3.4
-
** {speech_web/mri,http://www.isle.uiuc.edu/mri} - much 3D MRI of vowels, a little fast 2D MRI of the alphabet, and a little high-res 3D MRI of excised tongues
+
** [http://ssli.ee.washington.edu/~bilmes/gmtk/ GMTK] Dynamic Bayesian Nets/Graphical Models: nibbler0/speech_apps/GMTK
 +
** [http://cmusphinx.sourceforge.net Sphinx] speech recognizer
 +
** [http://www-lium.univ-lemans.fr/tools/index.php?option=com_content&task=view&id=20&Itemid=38 LIUM speech tools], including speaker segmentation
-
* Text
+
* Decoding:
-
** rizzo0/treebank - The Penn Treebank syntactically parsed corpus
+
** [http://julius.sourceforge.jp/en_index.php Julius] LVCSR decoder - /workspace/ifp-32-1/hasegawa/programs/julius-4.1
 +
** [http://www.research.att.com/~fsmtools/dcd/ AT&T DCD] LVCSR decoder - nibbler0/speech_apps/dcd-2.0
-
* Time-aligned Switchboard Disfluency corpus
+
* Language model training:
-
** mickey0/sw_disTime-0.9.9 - merged from the original Switchboard time transcription and the Treebank-3 disfluency transcription (TextGrid included)
+
** [http://www.speech.sri.com/projects/srilm/ SRILM] Big N-gram counts and backoff, lattices: fluffy0/programs/srilm
-
** mickey0/sw_disTime-1.0.0 (TextGrid NOT included)
+
** [http://www.research.att.com/~fsmtools/fsm/ AT&T FSM Library]: fluffy0/programs/fsm-4.0
 +
** [http://www.openfst.org OpenFST]: fluffy0/programs/OpenFst/
 +
 
 +
* Scoring
 +
** [http://www.nist.gov/speech/tools/ NIST Speech Tools]: ifp-32-1/hasegawa/programs
 +
 
 +
* SVMs, NNs, Boosting and such
 +
** [http://www.csie.ntu.edu.tw/~cjlin/libsvm/ libSVM]: fluffy0/programs/svmlib
 +
** [http://www.cs.cornell.edu/people/tj/svm_light/ svm_light]: fluffy0/programs/svm_light
 +
** [http://www.icsi.berkeley.edu/Speech/icsi-speech-tools.html quicknet]: mickey0/quicknet
 +
** [http://www.research.att.com/sw/tools/BoosTexter/index.html Boostexter]
-
==Applications==
 
-
* Hidden Markov Models:
 
-
** HTK (Cambridge): fluffy0/programs/htk-3.3
 
-
** DCD (ATT): nibbler0/speech_apps/dcd-2.0
 
-
* Dynamic Bayesian Nets/Graphical Models: nibbler0/speech_apps/GMTK
 
-
* Language Models: fluffy0/programs/srilm
 
-
* Finite State Machines:
 
-
** FSM (ATT): fluffy0/programs/fsm-4.0
 
-
** FST (MIT): fluffy0/programs/fst-1.0-RC1 (MIT)
 
-
* Support Vector Machines:
 
-
** SVMLIB (NJTU): fluffy0/programs/svmlib
 
-
** svm_light (Joachims): fluffy0/programs/svm_light
 
-
** PVTK (UIUC): mickey0/SVM/PVTK
 
* Spectrograms and Waveform Viewing
** XKL (MIT): nibbler0/speech_apps/xkl-2.3.1
** ESPS (Entropic Systems, now Microsoft)
-
** Praat
+
** [http://www.fon.hum.uva.nl/praat/ Praat]
-
* Speech Data File Formats:  
+
-
** SPHERE (NIST): fluffy0/programs/sphere
+
-
** sox (linux): /usr/bin/sox
+
-
** HCopy (Cambridge): see HTK
+
-
==Backups==
+
== Installing / Arranging Software ==
 +
If you download linux software from the internet, and find it useful, please put it where others may also use it!  Here's how.
 +
 
 +
# Type `umask 022` or `umask 000`.  If you use 022, you are volunteering to manage the package; if you use 000, you are inviting others to help manage it.
 +
# Download the tarfile to /workspace/ifp-32-1/hasegawa/programs; untar it to create $PACKAGE_DIR; remove the tar file (important!); configure; make all.
 +
# Decide where you want the binaries.  Reasonable places for programs are /workspace/ifp-32-1/hasegawa/programs/...
 +
#* scripts = executes on any machine (e.g., perl, bash scripts)
 +
#* bin.`uname` (i.e., bin.Linux) = executes on both ifp-32 and mickey.  PLEASE CHECK: ssh mickey; execute code; see if it gives you "cannot execute binary file".
 +
#* bin.`arch` =  executes only on machines of type `arch`.  Type `arch` to see what machine you're on.
 +
#* $PACKAGE_DIR/bin.Linux = packages with many binaries should remain in $PACKAGE_DIR, to avoid over-writing similarly-named programs in ../bin.Linux.
 +
# Change the installdir variable in your Makefile, according to your decision in part (3).  Type "make install" to install, then "make clean" to remove object files and such.
 +
 
 +
=Backups=
If you have personal working directories that should be regularly backed up, outside of your own home directory, list them here.
Line 91: Line 104:
** tico0/sborys
-
* Bowon
+
* Xiaodan
-
** mickey1/AVICAR_AUDIO
+
** /workspace/tico0/AED/
-
** mickey1/AVICAR_DATA
+
-
** mickey1/AVICAR_DIST
+
-
** mickey1/AVICAR_DIST_OLD
+
-
** rizzo1/bowonlee
+
-
** mickey0/bowonlee
+
-
* Mital
+
* Camille
-
** mickey0/magandhi
+
** /workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/
-
* Xiaodan
+
= SVN =
-
** spot1/xzhuang2
+
Our server is svn://mickey.ifp.uiuc.edu
-
** c1-15/hasegawa/xzhuang2*
+
 
 +
On windows, download [http://tortoisesvn.tigris.org/ tortoisesvn].
 +
 
 +
On linux, the client is svn, and should be installed everywhere.
 +
 
 +
For linux command help see [http://artis.imag.fr/~Xavier.Decoret/resources/svn/index.html  simple tutorial]
 +
(don't worry about any of the svnadmin commands, and replace file:///home/user/svn with svn://mickey.ifp.uiuc.edu).
 +
 
 +
= Compiling =
 +
gcc is used by default, but I (Arthur) am getting good results with Intel's compiler, which is available for free for non-commercial use and is installed in /workspace/ifp-32-1/hasegawa/programs/intel (we got the Fortran and C/C++ compilers and the Intel math library).
 +
 
 +
Benchmarking quicknet in 4-thread mode with every combination of compiler (Intel or gcc) and BLAS implementation (ATLAS or Intel) gives the following:
 +
<pre>
 +
logs/smallGccCompilerIntelMath.log:    CV speed: 4351.14 MCPS, 3107.8 presentations/sec.
 +
logs/smallGccCompilerIntelMath.log:    Train speed: 2056.95 MCUPS, 1469.2 presentations/sec.
 +
logs/smallGccCompilerIntelMath.log:    CV speed: 4691.55 MCPS, 3351.0 presentations/sec.
 +
 
 +
logs/smallIntelCompilerIntelMathLib.log:CV speed: 3984.39 MCPS, 2845.9 presentations/sec.
 +
logs/smallIntelCompilerIntelMathLib.log:Train speed: 2140.31 MCUPS, 1528.7 presentations/sec.
 +
logs/smallIntelCompilerIntelMathLib.log:CV speed: 4034.74 MCPS, 2881.9 presentations/sec.
 +
 
 +
logs/smallIntelCompilerATLASMathLib.log:CV speed: 3508.69 MCPS, 2506.1 presentations/sec.
 +
logs/smallIntelCompilerATLASMathLib.log:Train speed: 1961.05 MCUPS, 1400.7 presentations/sec.
 +
logs/smallIntelCompilerATLASMathLib.log:CV speed: 3553.22 MCPS, 2537.9 presentations/sec.
 +
 
 +
logs/smallGccCompilerATLASMathLib.log:  CV speed: 4219.30 MCPS, 3013.7 presentations/sec.
 +
logs/smallGccCompilerATLASMathLib.log:  Train speed: 1954.73 MCUPS, 1396.2 presentations/sec.
 +
logs/smallGccCompilerATLASMathLib.log:  CV speed: 4133.10 MCPS, 2952.1 presentations/sec.
 +
</pre>
 +
The train speed is the interesting one because training takes the longest, and there the fastest build (Intel compiler, Intel math library) is almost 10% faster than the slowest (gcc, ATLAS).  Strangely, CV (testing) speed is best with the gcc compiler and the Intel math library.
 +
 
 +
Using the intel compiler and math library from the above setup and running on the shiny new PCs that Mark got for us:
 +
<pre>
 +
CV speed:    3828.51 MCPS, 2734.6 presentations/sec.
 +
Train speed: 2932.56 MCUPS, 2094.6 presentations/sec.
 +
CV speed:    4093.49 MCPS, 2923.8 presentations/sec.
-
*Rajiv
+
</pre>
-
**scratch/rreddy
+
Going from gcc to Intel, you have to switch tools as follows:
 +
<pre>
 +
gcc intel
 +
---
 +
gcc icc
 +
g++ icpc
 +
ar  xiar
 +
</pre>

Latest revision as of 22:31, 6 September 2010


This Wiki

To make wiki pages visible only to the SST group

This wiki is generally readable by anyone, and editable by anyone with an account (anyone can create an account). It is possible to make wiki pages that are accessible only to people in our group (non-public pages). To do this, simply prefix the name of your page with "SST:". For example, try to see my page SST:Units_Paper without being logged in or without being a member of the sst_group. You can make an existing page private or public simply by moving/renaming it. Existing wiki links to the renamed page will remain valid.

In short:

  • To check if you are a member, look here.
  • To become a member, email Arthur or Mark.
  • To create a page in the SST namespace, start the title of your page with SST: e.g. SST:Test_Page.
  • To move an existing page into or out of the SST namespace, use the `move' tab at the top of the page.

To make files (pdfs, pictures) visible only to the SST group

The procedure is almost the same as for making protected pages. When uploading the file, prefix its name with SST:. For example, try going to File:SST:test2.txt without logging in. When it was uploaded, I named it SST:test2.txt. To make wiki links to it, use File:SST:test2.txt. The same goes for images: Yoonsook.jpg is a picture everyone can see on a wiki page, while SST:Usflag-transbg 42.png will be visible only to people in the SST group.

Calendar

Can be edited through Google. Ask Arthur or Mark to share the calendar with you in Google.

LVCSR at Illinois Computer Resources

  • Data:
    • Corpora we develop and distribute
    • We are members of LDC. Most LDC data is organized as described in the Data Organization README. Some useful slices of LDC data that have not been moved to ifp-32-2 include:
      • /workspace/fluffy1/12hour - 12 hours extracted from Switchboard 1, with SPHERE and WAV audio, MFCCs, transcriptions.
      • /workspace/fluffy1/{train-ws96,train-ws97,misc-ws97} - The ICSI phonetically transcribed Switchboard-1 extracts
      • /workspace/fletcher1/bdc - The Boston Directions Corpus, two speakers have prosodic transcriptions, others don't
      • /workspace/nibbler0/data/ylzheng/WS04/DATA - Tsinghua Wu-accented Mandarin (MFCC and FMT only, no waveforms)
      • /workspace/fluffy1/penn_treebank
  • Time-aligned Switchboard Disfluency corpus
    • mickey0/sw_disTime-0.9.9 - merged from the original Switchboard time transcription and the Treebank-3 disfluency transcription (TextGrid included)
    • mickey0/sw_disTime-1.0.0 (TextGrid NOT included)

Parallel Computing

Our cluster gets its own Parallel Computing page.

Applications

  • Acoustic model training:
    • HTK hidden Markov modeling toolkit: ifp-32-1/hasegawa/programs/htk-3.4
    • GMTK Dynamic Bayesian Nets/Graphical Models: nibbler0/speech_apps/GMTK
    • Sphinx speech recognizer
    • LIUM speech tools, including speaker segmentation
  • Decoding:
    • Julius LVCSR decoder - /workspace/ifp-32-1/hasegawa/programs/julius-4.1
    • AT&T DCD LVCSR decoder - nibbler0/speech_apps/dcd-2.0
  • Language model training:
    • SRILM Big N-gram counts and backoff, lattices: fluffy0/programs/srilm
    • AT&T FSM Library: fluffy0/programs/fsm-4.0
    • OpenFST: fluffy0/programs/OpenFst/
  • Spectrograms and Waveform Viewing
    • XKL (MIT): nibbler0/speech_apps/xkl-2.3.1
    • ESPS (Entropic Systems, now Microsoft)
    • Praat

Installing / Arranging Software

If you download linux software from the internet, and find it useful, please put it where others may also use it! Here's how.

  1. Type `umask 022` or `umask 000`. If you use 022, you are volunteering to manage the package; if you use 000, you are inviting others to help manage it.
  2. Download the tarfile to /workspace/ifp-32-1/hasegawa/programs; untar it to create $PACKAGE_DIR; remove the tar file (important!); configure; make all.
  3. Decide where you want the binaries. Reasonable places for programs are /workspace/ifp-32-1/hasegawa/programs/...
    • scripts = executes on any machine (e.g., perl, bash scripts)
    • bin.`uname` (i.e., bin.Linux) = executes on both ifp-32 and mickey. PLEASE CHECK: ssh mickey; execute code; see if it gives you "cannot execute binary file".
    • bin.`arch` = executes only on machines of type `arch`. Type `arch` to see what machine you're on.
    • $PACKAGE_DIR/bin.Linux = packages with many binaries should remain in $PACKAGE_DIR, to avoid over-writing similarly-named programs in ../bin.Linux.
  4. Change the installdir variable in your Makefile, according to your decision in part (3). Type "make install" to install, then "make clean" to remove object files and such.
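The steps above can be sketched as a short shell script. This is only an illustration: PKG and TARBALL are hypothetical names, and the sketch assumes a gzipped tarball that unpacks into a directory named after the package and builds with ./configure. It prints the commands instead of running them, so nothing is touched until you pipe the output to sh yourself.

```shell
#!/bin/sh
# Shared programs directory named in the text.
PROGRAMS=/workspace/ifp-32-1/hasegawa/programs
PKG=mytool-1.0          # hypothetical package name
TARBALL="$PKG.tar.gz"   # hypothetical tarball name (assumed gzipped)

# Print the command sequence for one install.
# $1 is the umask: 022 = you manage the package, 000 = others may help.
install_plan() {
    mode=$1
    cat <<EOF
umask $mode
cd $PROGRAMS
tar xzf $TARBALL
rm $TARBALL
cd $PKG
./configure
make all
make install
make clean
EOF
}

# Directory names for binaries, following the bin.\`uname\` / bin.\`arch\` scheme.
# uname -m is used here; on Linux it prints the same string as arch.
portable_bindir() { echo "bin.$(uname)"; }
machine_bindir()  { echo "bin.$(uname -m)"; }

install_plan 022
echo "# portable binaries:         $PROGRAMS/$(portable_bindir)"
echo "# machine-specific binaries: $PROGRAMS/$(machine_bindir)"
```

Remember to edit the installdir variable in the Makefile before the "make install" step; the sketch does not do that for you, since the variable's location depends on the package.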

Backups

If you have personal working directories that should be regularly backed up, outside of your own home directory, list them here.

  • Art
    • mickey0/akantor
    • rizzo1/akantor is itself a backup of svn because it cannot be backed up in the normal way.
  • Sarah
    • nibbler0/data
    • rizzo0/sborys
    • spot1/sborys
    • tico0/sborys
  • Xiaodan
    • /workspace/tico0/AED/
  • Camille
    • /workspace/ifp-32-2/hasegawa/data/multimodal/nonspeech/FODAVA/

SVN

Our server is svn://mickey.ifp.uiuc.edu

On windows, download tortoisesvn.

On linux, the client is svn, and should be installed everywhere.

For Linux command help see the simple tutorial (don't worry about any of the svnadmin commands, and replace file:///home/user/svn with svn://mickey.ifp.uiuc.edu).
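As a rough sketch of the URL substitution described above, the helper below rewrites a command copied from the tutorial so that it points at our server. The function name to_lab_url and the repository path "project" are made up for illustration; only the server URL comes from this page.

```shell
#!/bin/sh
SVN_SERVER=svn://mickey.ifp.uiuc.edu

# Rewrite a tutorial command so it uses our server instead of the
# tutorial's local file:// repository.
to_lab_url() {
    printf '%s\n' "$1" | sed "s|file:///home/user/svn|$SVN_SERVER|"
}

# "project" is a hypothetical repository path.
to_lab_url "svn checkout file:///home/user/svn/project"
```

The same substitution applies to any other tutorial command that takes a repository URL.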

Compiling

gcc is used by default, but I (Arthur) am getting good results with Intel's compiler, which is available for free for non-commercial use and is installed in /workspace/ifp-32-1/hasegawa/programs/intel (we got the Fortran and C/C++ compilers and the Intel math library).

Benchmarking quicknet in 4-thread mode with every combination of compiler (Intel or gcc) and BLAS implementation (ATLAS or Intel) gives the following:

logs/smallGccCompilerIntelMath.log:     CV speed: 4351.14 MCPS, 3107.8 presentations/sec.
logs/smallGccCompilerIntelMath.log:     Train speed: 2056.95 MCUPS, 1469.2 presentations/sec.
logs/smallGccCompilerIntelMath.log:     CV speed: 4691.55 MCPS, 3351.0 presentations/sec.

logs/smallIntelCompilerIntelMathLib.log:CV speed: 3984.39 MCPS, 2845.9 presentations/sec.
logs/smallIntelCompilerIntelMathLib.log:Train speed: 2140.31 MCUPS, 1528.7 presentations/sec.
logs/smallIntelCompilerIntelMathLib.log:CV speed: 4034.74 MCPS, 2881.9 presentations/sec.

logs/smallIntelCompilerATLASMathLib.log:CV speed: 3508.69 MCPS, 2506.1 presentations/sec.
logs/smallIntelCompilerATLASMathLib.log:Train speed: 1961.05 MCUPS, 1400.7 presentations/sec.
logs/smallIntelCompilerATLASMathLib.log:CV speed: 3553.22 MCPS, 2537.9 presentations/sec.

logs/smallGccCompilerATLASMathLib.log:  CV speed: 4219.30 MCPS, 3013.7 presentations/sec.
logs/smallGccCompilerATLASMathLib.log:  Train speed: 1954.73 MCUPS, 1396.2 presentations/sec.
logs/smallGccCompilerATLASMathLib.log:  CV speed: 4133.10 MCPS, 2952.1 presentations/sec.

The train speed is the interesting one because training takes the longest, and there the fastest build (Intel compiler, Intel math library) is almost 10% faster than the slowest (gcc, ATLAS). Strangely, CV (testing) speed is best with the gcc compiler and the Intel math library.
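The "almost 10%" figure can be checked from the train speeds listed above. Here is a minimal sketch, assuming the comparison is between the slowest build (gcc with ATLAS, 1954.73 MCUPS) and the fastest (Intel compiler with Intel math library, 2140.31 MCUPS); the helper name speedup_percent is made up for illustration.

```shell
#!/bin/sh
# Print the speedup of a new speed over a baseline, as a percentage.
# $1 = baseline speed, $2 = new speed (same units, e.g. MCUPS).
speedup_percent() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new / base - 1) * 100 }'
}

# Train speed: gcc + ATLAS versus Intel compiler + Intel math library.
speedup_percent 1954.73 2140.31
```

This prints 9.5, which matches the "almost 10%" quoted in the text.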

Using the intel compiler and math library from the above setup and running on the shiny new PCs that Mark got for us:

CV speed:    3828.51 MCPS, 2734.6 presentations/sec.
Train speed: 2932.56 MCUPS, 2094.6 presentations/sec.
CV speed:    4093.49 MCPS, 2923.8 presentations/sec.

Going from gcc to Intel, you have to switch tools as follows:

gcc intel
---
gcc icc
g++ icpc
ar  xiar
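For packages built with make, one way to apply the table is to override the tool variables on the command line. The sketch below assumes the package's Makefile honors the conventional CC, CXX and AR variables, which not every package does; the helper name intel_tool is made up for illustration.

```shell
#!/bin/sh
# Map a GNU tool name to its Intel counterpart, per the table above.
intel_tool() {
    case "$1" in
        gcc) echo icc ;;
        g++) echo icpc ;;
        ar)  echo xiar ;;
        *)   echo "$1" ;;   # anything else is left unchanged
    esac
}

# Print a make invocation that swaps in the Intel tools.
echo "make CC=$(intel_tool gcc) CXX=$(intel_tool g++) AR=$(intel_tool ar)"
```

If a package hard-codes gcc in its Makefile, the variables have to be edited there instead.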