Software

From SpeechWiki

(Difference between revisions)
m (nested stft, local link to stft.c)
(reformatting into wiki style)
Line 1: Line 1:
-
=Statistical Speech Technology Group Software=
+
===Statistical Speech Technology Group Software===
Our policy: everything we write is free on the web. This wiki is intended to be definitive, because anybody in the group can edit it to add their own software. A spider-indexable backup is at http://www.isle.uiuc.edu/software .
Line 8: Line 8:
* On Linux, use the svn command, e.g., svn co svn://mickey.ifp.uiuc.edu/speechfileformats
-
<table border=2><tr>
+
==Learning==
-
<tr><td>Learning</td></tr>
+
-
<tr><td>Pronounce</td><td>Letters to phones using an HMM<br>
+
-
[http://mickey.ifp.uiuc.edu/speechWiki/index.php/Phonetic_Transcription_Tool Description],[http://mickey.ifp.uiuc.edu/speech/webpronounce/webpronounce.cgi Demo],
+
-
[svn://mickey.ifp.uiuc.edu/pronounce SVN archive] (Arthur Kantor, 2007)</td></tr>
+
-
<tr><td>HDK</td><td>HTK-based Explicit-duration HMM<br>
+
; Pronounce: An orthographic string to phonetic string mapping tool. (Arthur Kantor, 2007)
-
[http://www.isle.uiuc.edu/pubs/2003/chen03interspeech.pdf Description],
+
:This tool computes American English phonetic transcriptions from plaintext.  Its HMM either generates a most likely phonetic transcription, or forces alignment if a phonetic transcription is provided.  So, it gives a reasonable pronunciation for both out-of-dictionary words and partially pronounced words.
-
[http://www.isle.uiuc.edu/software/HDK4.tar.gz TGZ archive], [svn://mickey.ifp.uiuc.edu/HDK4_release SVN repository] (Ken Chen, 2003)
+
: [http://mickey.ifp.uiuc.edu/speechWiki/index.php/Phonetic_Transcription_Tool Description], [http://mickey.ifp.uiuc.edu/speech/webpronounce/webpronounce.cgi Demo],  
-
</td></tr>
+
:[svn://mickey.ifp.uiuc.edu/pronounce SVN archive]  
-
<tr><td>HTKtrain</td><td>Scripts for training HMMs using HTK<br>
+
;HTK-based Explicit-duration HMM (Ken Chen, 2003)
-
[svn://mickey.ifp.uiuc.edu/HTKtrain SVN repository] (Sarah Borys and Mark Hasegawa-Johnson, 2008)
+
:[http://www.isle.uiuc.edu/pubs/2003/chen03interspeech.pdf Description], [http://www.isle.uiuc.edu/software/HDK4.tar.gz TGZ archive]
-
</td></tr>
+
:[svn://mickey.ifp.uiuc.edu/HDK4_release SVN repository]  
-
<tr><td>Signal Processing</td></tr>
+
;HTKtrain
-
<tr><td>PVTK</td><td>Extract HTK features as training vectors for libSVM, apply trained SVMs directly to feature files<br>
+
:Scripts for training HMMs using HTK (Sarah Borys and Mark Hasegawa-Johnson, 2008)
-
[http://www.isle.uiuc.edu/software/PVTK2005May23.tgz TGZ archive], [svn://mickey.ifp.uiuc.edu/PVTK SVN repository] (Sarah Borys and MH 2005-8)
+
:[svn://mickey.ifp.uiuc.edu/HTKtrain SVN repository]  
-
</td></tr>
+
-
<tr><td>VAD</td><td>Voice activity detector with improved noise model<br>
+
-
[http://www.isle.uiuc.edu/pubs/2007/lee07dspincars.pdf Description],
+
-
[http://www.isle.uiuc.edu/software/lee_vad.m lee_vad.m], [svn://mickey.ifp.uiuc.edu/lee_vad SVN repository] (Bowon Lee, 2007)
+
-
</td></tr>
+
-
<tr><td>Nested STFTs</td><td>Efficient Simultaneous Multi-Scale Computation of FFTs<br>
+
-
[http://fodava.gatech.edu/files/reports/GT-FODAVA-09-01.pdf Description], [http://www.isle.uiuc.edu/software/stft.c stft.c] (Dave Cohen, Camille Goudeseune, Mark Hasegawa-Johnson 2009)</td></tr>
+
-
<tr><td>Improved Mistral</td><td>State of the Art Text-Independent Speaker Verification System,especially for NIST SRE<br>
 
-
Based [http://mistral.univ-avignon.fr/wiki/index.php/Main_Page Mistral Open Source package].<br>
 
-
Improved and New Features:
+
==Signal Processing==
-
* add full factor analysis(eigenchannel and eigenvoice), instead of simple factor analysis(eigenchannel)
+
;PVTK (Sarah Borys and MH 2005-8)
-
* add multi-threads for Windows as well as Linux
+
:Extract HTK features as training vectors for libSVM, apply trained SVMs directly to feature files
-
* support read HTK format feature/model
+
:[http://www.isle.uiuc.edu/software/PVTK2005May23.tgz TGZ archive]
-
* add an effective Algorithm for fast implementation of FA.
+
:[svn://mickey.ifp.uiuc.edu/PVTK SVN repository]
-
* code optimization(for FA)
+
-
* fixed some bugs
+
-
Source: /ws/ifp-32-2/hasegawa/pineking/programs/Improved_Mistral
+
;VAD (Bowon Lee, 2007)
 +
:Voice activity detector with improved noise model
 +
:[http://www.isle.uiuc.edu/pubs/2007/lee07dspincars.pdf Description], [http://www.isle.uiuc.edu/software/lee_vad.m lee_vad.m]
 +
:[svn://mickey.ifp.uiuc.edu/lee_vad SVN repository]
-
(Qingsong Liu 2009)</td></tr>
+
;Nested STFTs (Dave Cohen, Camille Goudeseune, Mark Hasegawa-Johnson 2009)
 +
:Efficient Simultaneous Multi-Scale Computation of FFTs
 +
: [http://fodava.gatech.edu/files/reports/GT-FODAVA-09-01.pdf Description], [http://www.isle.uiuc.edu/software/stft.c stft.c]
-
<tr><td>Computation</td></tr>
 
-
<tr><td>GMTK Parallel</td>
 
-
<td>Split GMTK commands into batch jobs for a cluster<br>
 
-
[http://mickey.ifp.uiuc.edu/speechWiki/index.php/GMTK_parallel_tools Description],
 
-
[svn://mickey.ifp.uiuc.edu/gmtkScripts/ SVN repository] (Arthur Kantor, 2008)</td></tr>
 
-
<tr><td>HTK Parallel
+
:Improved Mistral (Qingsong Liu 2009)
-
</td><td>
+
:State of the Art Text-Independent Speaker Verification System,especially for NIST SRE
-
Split an HTK command into batch jobs for a cluster (Bowon Lee, 2006)<br>
+
:Based on [http://mistral.univ-avignon.fr/wiki/index.php/Main_Page Mistral Open Source package]
-
[http://www.ifp.uiuc.edu/~bowonlee/research/cluster/HTK_parallel.htm Description],  
+
:Improved and New Features:
-
[http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HCopy.pl HCopy.pl],  
+
:* add full factor analysis(eigenchannel and eigenvoice), instead of simple factor analysis(eigenchannel)
 +
:* add multi-threads for Windows as well as Linux
 +
:* support read HTK format feature/model
 +
:* add an effective Algorithm for fast implementation of FA.
 +
:* code optimization(for FA)
 +
:* fixed some bugs
 +
:Source: /ws/ifp-32-2/hasegawa/pineking/programs/Improved_Mistral
 +
 
 +
==Computation==
 +
;GMTK Parallel (Arthur Kantor, 2008)
 +
:Split GMTK commands into batch jobs for a cluster
 +
:[http://mickey.ifp.uiuc.edu/speechWiki/index.php/GMTK_parallel_tools Description],
 +
:[svn://mickey.ifp.uiuc.edu/gmtkScripts/ SVN repository]
 +
 
 +
;HTK Parallel (Bowon Lee, 2006)
 +
:These Perl scripts ([http://www.ifp.uiuc.edu/~bowonlee/research/cluster/HTK_parallel.htm description]) Split an HTK command for parallel excution on a [http://www.ifp.uiuc.edu/~bowonlee/research/cluster/linux_cluster.htm SGE] cluster.
 +
:[http://www.ifp.uiuc.edu/~bowonlee/research/cluster/HTK_parallel.htm Description],  
 +
:[http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HCopy.pl HCopy.pl],  
[http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HVite.pl HVite.pl],  
[http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HERest.pl HERest.pl],  
-
[http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HResults.pl HResults.pl], [svn://mickey.ifp.uiuc.edu/HTK_parallel/ SVN repository] </td></tr>
+
[http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HResults.pl HResults.pl]
-
 
+
:[svn://mickey.ifp.uiuc.edu/HTK_parallel/ SVN repository]  
-
<tr><td>Data</td></tr>
+
-
<tr><td>dtmfseg</td><td>Segment audio files at DTMF tones<br>
+
-
[svn://mickey.ifp.uiuc.edu/dtmfseg/ SVN repository] (Bowon Lee, 2006)</td></tr>
+
-
 
+
-
<tr><td>transcription tools</td><td>Convert transcription formats<br>
+
-
[http://www.isle.uiuc.edu/software/transcription_tools2005May.tgz TGZ archive], [svn://mickey.ifp.uiuc.edu/transcription_tools/ SVN repository] (Mark Hasegawa-Johnson, 2005)</td></tr>
+
-
 
+
-
<tr><td>speechfileformats</td><td>Read and write HTK files in matlab<br>
+
-
[http://www.isle.uiuc.edu/software/speechfileformats.tgz TGZ archive], [svn://mickey.ifp.uiuc.edu/speechfileformats/ SVN repository] (Mark Hasegawa-Johnson, 2004)</td></tr>
+
-
 
+
-
<tr><td>CTMRedit</td><td>Manually and automatically segment CT and MR image stacks<br>
+
-
[http://www.isle.uiuc.edu/pubs/1990s/hasegawa-johnson99embs.pdf Description],
+
-
[svn://mickey.ifp.uiuc.edu/CTMRedit SVN repository] (Jul Cha and MH 1999)
+
-
</td></tr>
+
-
 
+
-
<tr><td>improved MVA</td><td>Perform mean and variance normalization and ARMA filtering<br>
+
-
It's essentially [http://ssli.ee.washington.edu/people/chiaping/mva.html this] version,
+
-
improved:
+
-
* better error reporting (e.g. failing to open file tells you so instead of core dumping)
+
-
* more accurate mean and variance estimation (doubles instead of floats in strategic places)
+
-
* faster computation in the case of MV (ARMA order 0)
+
-
 
+
-
source: svn://mickey.ifp.uiuc.edu/corporaNormalizationScripts/fisher/MVA.cc
+
-
 
+
-
binary: http://mickey.ifp.uiuc.edu/speech/akantor/fisher/programs/bin.Linux/MVA
+
-
</td></tr>
+
-
 
+
-
<tr><td>Scripts</td><td>miscellaneous perl, python, bash, and ruby [svn://mickey.ifp.uiuc.edu/scripts SVN archive],
+
-
[[Scripts_Documentation| Documentation]]</td></tr>
+
-
</table>
+
-
 
+
-
=Phonetic Transcription Tool=
+
-
 
+
-
This tool computes American English phonetic transcriptions from plaintext.
+
-
Its HMM either generates a most likely phonetic transcription,
+
-
or forces alignment if a phonetic transcription is provided.
+
-
 
+
-
So, it gives a reasonable pronunciation for both out-of-dictionary words and partially pronounced words.
+
-
 
+
-
[[Phonetic Transcription Tool]]. 
+
-
[http://mickey.ifp.uiuc.edu/speech/webpronounce/webpronounce.cgi Online demo].
 
-
=Scripts for parallel processing of HTK commands=
+
==Data==
 +
;dtmfseg (Bowon Lee, 2006)
 +
:Segment audio files at DTMF tones
 +
:[svn://mickey.ifp.uiuc.edu/dtmfseg/ SVN repository]
-
These Perl scripts ([http://www.ifp.uiuc.edu/~bowonlee/research/cluster/HTK_parallel.htm description])
+
;transcription tools (Mark Hasegawa-Johnson, 2005)
-
queue jobs with [http://www.ifp.uiuc.edu/~bowonlee/research/cluster/linux_cluster.htm SGE], Sun Grid Engine.
+
:Convert transcription formats
 +
:[http://www.isle.uiuc.edu/software/transcription_tools2005May.tgz TGZ archive]
 +
:[svn://mickey.ifp.uiuc.edu/transcription_tools/ SVN repository]  
-
[http://www.ifp.uiuc.edu/~bowonlee/htk-pl/HCopy.pl HCopy.pl]
+
;speechfileformats (Mark Hasegawa-Johnson, 2004)
 +
:Read and write HTK files in matlab
 +
:[http://www.isle.uiuc.edu/software/speechfileformats.tgz TGZ archive]
 +
:[svn://mickey.ifp.uiuc.edu/speechfileformats/ SVN repository]  
-
[http://www.ifp.uiuc.edu/~bowonlee/htk-pl/HERest.pl HERest.pl]
+
;CTMRedit (Jul Cha and MH 1999)
 +
:Manually and automatically segment CT and MR image stacks
 +
:[http://www.isle.uiuc.edu/pubs/1990s/hasegawa-johnson99embs.pdf Description]
 +
:[svn://mickey.ifp.uiuc.edu/CTMRedit SVN repository]  
-
[http://www.ifp.uiuc.edu/~bowonlee/htk-pl/HVite.pl HVite.pl]
+
;improved MVA (Arthur Kantor 2008)
 +
:Perform mean and variance normalization and ARMA filtering
 +
:It's essentially [http://ssli.ee.washington.edu/people/chiaping/mva.html this] version but with
 +
:* better error reporting (e.g. failing to open file tells you so instead of core dumping)
 +
:* more accurate mean and variance estimation (doubles instead of floats in strategic places)
 +
:* faster computation in the case of MV (ARMA order 0)
 +
:[http://mickey.ifp.uiuc.edu/speech/akantor/fisher/programs/bin.Linux/MVA binary] [svn://mickey.ifp.uiuc.edu/corporaNormalizationScripts/fisher/MVA.cc SVN repository]  
-
[http://www.ifp.uiuc.edu/~bowonlee/htk-pl/HResults.pl HResults.pl]
+
==Miscellaneous==
 +
Other scripts written in perl, python, bash, and ruby can be found in [svn://mickey.ifp.uiuc.edu/scripts SVN archive].
-
Bowon Lee, 02/24/2006
+
There is also [[Scripts_Documentation| auto-generated documentation]] for them.

Revision as of 04:05, 19 March 2010


Statistical Speech Technology Group Software

Our policy: everything we write is free on the web. This wiki is intended to be definitive, because anybody in the group can edit it to add their own software. A spider-indexable backup is at http://www.isle.uiuc.edu/software .

Our software is available via subversion, using login name "anon" with no password (hit "enter" when a password is requested).

Learning

Pronounce
An orthographic string to phonetic string mapping tool. (Arthur Kantor, 2007)
This tool computes American English phonetic transcriptions from plain text. Its HMM either generates the most likely phonetic transcription or, if a phonetic transcription is provided, force-aligns it to the text. It therefore gives a reasonable pronunciation for both out-of-dictionary words and partially pronounced words.
Description, Demo,
SVN archive
HTK-based Explicit-duration HMM (Ken Chen, 2003)
Description, TGZ archive
SVN repository
HTKtrain
Scripts for training HMMs using HTK (Sarah Borys and Mark Hasegawa-Johnson, 2008)
SVN repository


Signal Processing

PVTK (Sarah Borys and Mark Hasegawa-Johnson, 2005-2008)
Extract HTK features as training vectors for libSVM, and apply trained SVMs directly to feature files
TGZ archive
SVN repository
VAD (Bowon Lee, 2007)
Voice activity detector with improved noise model
Description, lee_vad.m
SVN repository
Nested STFTs (Dave Cohen, Camille Goudeseune, Mark Hasegawa-Johnson 2009)
Efficient Simultaneous Multi-Scale Computation of FFTs
Description, stft.c


Improved Mistral (Qingsong Liu 2009)
State-of-the-art text-independent speaker verification system, especially for NIST SRE
Based on the Mistral Open Source package
Improved and new features:
  • adds full factor analysis (eigenchannel and eigenvoice) instead of simple factor analysis (eigenchannel only)
  • adds multithreading on Windows as well as Linux
  • supports reading HTK-format features and models
  • adds an effective algorithm for a fast implementation of factor analysis (FA)
  • optimizes the FA code
  • fixes some bugs
Source: /ws/ifp-32-2/hasegawa/pineking/programs/Improved_Mistral

Computation

GMTK Parallel (Arthur Kantor, 2008)
Split GMTK commands into batch jobs for a cluster
Description,
SVN repository
HTK Parallel (Bowon Lee, 2006)
These Perl scripts (description) split an HTK command for parallel execution on an SGE (Sun Grid Engine) cluster.
Description, HCopy.pl, HVite.pl, HERest.pl, HResults.pl

SVN repository
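
The idea shared by the two tools above, splitting one command over a long file list into several independent batch jobs, can be sketched roughly as follows. This is a minimal Python illustration, not the actual Perl or GMTK scripts: the function names, the contiguous chunking policy, the `jobNNN.scp` script names, and the command template are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def split_into_chunks(items: Sequence[str], n_jobs: int) -> List[List[str]]:
    """Split items into at most n_jobs contiguous, nearly equal chunks."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    if not items:
        return []
    n_jobs = min(n_jobs, len(items))  # never create empty jobs
    base, extra = divmod(len(items), n_jobs)
    chunks, start = [], 0
    for job in range(n_jobs):
        # The first `extra` jobs take one additional item each.
        size = base + (1 if job < extra else 0)
        chunks.append(list(items[start:start + size]))
        start += size
    return chunks


def build_job_commands(
    files: Sequence[str],
    n_jobs: int,
    template: str = "HCopy -C config -S {script}",  # hypothetical command line
) -> List[Tuple[str, List[str], str]]:
    """Return (script_name, script_lines, command) for each batch job."""
    jobs = []
    for index, chunk in enumerate(split_into_chunks(files, n_jobs)):
        script = f"job{index:03d}.scp"  # hypothetical per-job file list
        jobs.append((script, chunk, template.format(script=script)))
    return jobs


if __name__ == "__main__":
    files = [f"utt{i:02d}.wav utt{i:02d}.mfc" for i in range(10)]
    for script, lines, command in build_job_commands(files, 3):
        print(f"{command}  # {len(lines)} files listed in {script}")
```

Each returned command could then be written into a small shell script and handed to the cluster's job submission tool; that step depends on the local cluster setup and is left out here.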


Data

dtmfseg (Bowon Lee, 2006)
Segment audio files at DTMF tones
SVN repository
transcription tools (Mark Hasegawa-Johnson, 2005)
Convert transcription formats
TGZ archive
SVN repository
speechfileformats (Mark Hasegawa-Johnson, 2004)
Read and write HTK files in MATLAB
TGZ archive
SVN repository
CTMRedit (Jul Cha and Mark Hasegawa-Johnson, 1999)
Manually and automatically segment CT and MR image stacks
Description
SVN repository
improved MVA (Arthur Kantor 2008)
Perform mean and variance normalization and ARMA filtering
It is essentially this version (http://ssli.ee.washington.edu/people/chiaping/mva.html), but with:
  • better error reporting (e.g. a failure to open a file is reported instead of causing a core dump)
  • more accurate mean and variance estimation (doubles instead of floats in strategic places)
  • faster computation in the case of MV (ARMA order 0)
binary, SVN repository
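
For readers unfamiliar with MVA, the processing described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code in MVA.cc: the function names are hypothetical, the ARMA smoother uses one common formulation (each output frame averages the previous `order` outputs with the current and next `order` inputs), and copying the boundary frames unchanged is an assumption. Python floats are double precision, and `math.fsum` is used to keep the mean and variance sums accurate.

```python
import math
from typing import List

Frames = List[List[float]]  # frames[t][d]: feature d at time t


def mean_variance_normalize(frames: Frames) -> Frames:
    """Normalize each feature dimension to zero mean and unit variance."""
    if not frames:
        return []
    n_frames, n_dims = len(frames), len(frames[0])
    normalized = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        column = [frame[d] for frame in frames]
        mean = math.fsum(column) / n_frames
        variance = math.fsum((x - mean) ** 2 for x in column) / n_frames
        scale = math.sqrt(variance) or 1.0  # constant dimensions map to zero
        for t in range(n_frames):
            normalized[t][d] = (column[t] - mean) / scale
    return normalized


def arma_filter(frames: Frames, order: int) -> Frames:
    """Smooth each dimension over time; order 0 returns a plain copy."""
    if order < 0:
        raise ValueError("order must be non-negative")
    out = [list(frame) for frame in frames]
    if order == 0:
        return out  # fast path: MV only, no filtering needed
    for t in range(order, len(frames) - order):
        for d in range(len(frames[t])):
            past = math.fsum(out[t - i][d] for i in range(1, order + 1))
            future = math.fsum(frames[t + j][d] for j in range(order + 1))
            out[t][d] = (past + future) / (2 * order + 1)
    return out


def mva(frames: Frames, order: int) -> Frames:
    """Mean and variance normalization followed by ARMA smoothing."""
    return arma_filter(mean_variance_normalize(frames), order)


if __name__ == "__main__":
    features = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0], [5.0, 10.0]]
    for frame in mva(features, order=1):
        print(frame)
```

With `order=0` the filter step is skipped entirely, which mirrors the MV-only fast path mentioned in the list above.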

Miscellaneous

Other scripts, written in Perl, Python, Bash, and Ruby, can be found in the SVN archive.

There is also auto-generated documentation for them.
