Software
From SpeechWiki
Revision as of 04:05, 19 March 2010
Statistical Speech Technology Group Software
Our policy: everything we write is free on the web. This wiki is intended to be definitive, because anybody in the group can edit it to add their own software. A spider-indexable backup is at http://www.isle.uiuc.edu/software .
Our software is available via subversion, using login name "anon" with no password (hit "enter" when a password is requested).
- On Windows, download TortoiseSVN
- On Linux, use the svn command, e.g., svn co svn://mickey.ifp.uiuc.edu/speechfileformats
Learning
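- Pronounce: an orthographic-string-to-phonetic-string mapping tool (Arthur Kantor, 2007)
  - Computes American English phonetic transcriptions from plain text. Its HMM either generates the most likely phonetic transcription or performs forced alignment when a phonetic transcription is provided, so it gives a reasonable pronunciation for both out-of-dictionary words and partially pronounced words.
  - Description: http://mickey.ifp.uiuc.edu/speechWiki/index.php/Phonetic_Transcription_Tool
  - Demo: http://mickey.ifp.uiuc.edu/speech/webpronounce/webpronounce.cgi
  - SVN archive: svn://mickey.ifp.uiuc.edu/pronounce
- HTK-based Explicit-duration HMM (Ken Chen, 2003)
  - Description: http://www.isle.uiuc.edu/pubs/2003/chen03interspeech.pdf
  - TGZ archive: http://www.isle.uiuc.edu/software/HDK4.tar.gz
  - SVN repository: svn://mickey.ifp.uiuc.edu/HDK4_release
- HTKtrain (Sarah Borys and Mark Hasegawa-Johnson, 2008)
  - Scripts for training HMMs using HTK
  - SVN repository: svn://mickey.ifp.uiuc.edu/HTKtrain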
Signal Processing
- PVTK (Sarah Borys and Mark Hasegawa-Johnson, 2005-2008)
  - Extracts HTK features as training vectors for libSVM and applies trained SVMs directly to feature files
  - TGZ archive: http://www.isle.uiuc.edu/software/PVTK2005May23.tgz
  - SVN repository: svn://mickey.ifp.uiuc.edu/PVTK
- VAD (Bowon Lee, 2007)
  - Voice activity detector with an improved noise model
  - Description: http://www.isle.uiuc.edu/pubs/2007/lee07dspincars.pdf
  - lee_vad.m: http://www.isle.uiuc.edu/software/lee_vad.m
  - SVN repository: svn://mickey.ifp.uiuc.edu/lee_vad
- Nested STFTs (Dave Cohen, Camille Goudeseune, and Mark Hasegawa-Johnson, 2009)
  - Efficient simultaneous multi-scale computation of FFTs
  - Description: http://fodava.gatech.edu/files/reports/GT-FODAVA-09-01.pdf
  - stft.c: http://www.isle.uiuc.edu/software/stft.c
- Improved Mistral (Qingsong Liu, 2009)
  - State-of-the-art text-independent speaker verification system, aimed especially at NIST SRE
  - Based on the Mistral open-source package: http://mistral.univ-avignon.fr/wiki/index.php/Main_Page
  - Improved and new features:
    - full factor analysis (eigenchannel and eigenvoice) instead of simple factor analysis (eigenchannel only)
    - multithreading on Windows as well as Linux
    - support for reading HTK-format feature and model files
    - an effective algorithm for a fast implementation of factor analysis (FA)
    - code optimization (for FA)
    - bug fixes
  - Source: /ws/ifp-32-2/hasegawa/pineking/programs/Improved_Mistral
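The Nested STFTs entry above describes computing FFTs at several scales at once. One identity that makes this kind of sharing possible is sketched below in Python: for rectangular (unwindowed) frames, the even-numbered bins of a 2N-point DFT equal the sum of the N-point DFTs of the frame's two halves. This is only an illustration of reuse between scales; it is an assumption, not a description of the algorithm in stft.c, and the function names are hypothetical.

```python
import cmath


def dft(samples):
    """Naive O(N^2) discrete Fourier transform, for illustration only."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def even_bins_from_halves(first_half_dft, second_half_dft):
    """Even-numbered bins of a 2N-point DFT, given the N-point DFTs of its halves."""
    return [a + b for a, b in zip(first_half_dft, second_half_dft)]


# One long frame of 8 samples, which is also two adjacent short frames of 4 samples.
signal = [0.0, 1.0, 0.5, -0.25, 2.0, -1.0, 0.75, 0.125]
short_a = dft(signal[:4])
short_b = dft(signal[4:])
long_dft = dft(signal)

# Bins 0, 2, 4, 6 of the long transform come for free from the short ones.
shared = even_bins_from_halves(short_a, short_b)
max_error = max(abs(shared[k] - long_dft[2 * k]) for k in range(4))
```

The odd-numbered bins still need extra work, and a tapered analysis window breaks the identity, so a practical multi-scale STFT needs more care than this sketch shows.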
Computation
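- GMTK Parallel (Arthur Kantor, 2008)
  - Splits GMTK commands into batch jobs for a cluster
  - Description: http://mickey.ifp.uiuc.edu/speechWiki/index.php/GMTK_parallel_tools
  - SVN repository: svn://mickey.ifp.uiuc.edu/gmtkScripts/
- HTK Parallel (Bowon Lee, 2006)
  - These Perl scripts split an HTK command for parallel execution on an SGE cluster (http://www.ifp.uiuc.edu/~bowonlee/research/cluster/linux_cluster.htm).
  - Description: http://www.ifp.uiuc.edu/~bowonlee/research/cluster/HTK_parallel.htm
  - HCopy.pl: http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HCopy.pl
  - HVite.pl: http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HVite.pl
  - HERest.pl: http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HERest.pl
  - HResults.pl: http://www.ifp.uiuc.edu/~bowonlee/research/htk-pl/HResults.pl
  - SVN repository: svn://mickey.ifp.uiuc.edu/HTK_parallel/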
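The idea behind both parallel toolkits, splitting one large command into independent batch jobs, can be sketched as follows. This is a minimal Python illustration and not the logic of the Perl or GMTK scripts themselves: it assumes the work is described by a list of script-file lines (one input per line), splits that list into near-equal contiguous chunks, and builds one argument list per chunk using HTK's standard `-S` script-file option. The chunk file names and the function names are hypothetical, and cluster submission is left out.

```python
def split_script(lines, num_jobs):
    """Split script-file lines into at most num_jobs contiguous, near-equal chunks."""
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    base, extra = divmod(len(lines), num_jobs)
    chunks = []
    start = 0
    for job in range(num_jobs):
        # The first `extra` jobs each take one additional line.
        size = base + (1 if job < extra else 0)
        if size == 0:
            break  # fewer lines than jobs: stop creating empty chunks
        chunks.append(lines[start:start + size])
        start += size
    return chunks


def build_commands(tool_args, num_chunks, prefix="chunk"):
    """Return one argument list per chunk, each pointing at its own script file."""
    return [tool_args + ["-S", f"{prefix}{i}.scp"] for i in range(num_chunks)]


# Example: seven utterances spread over three jobs.
utterances = [f"utt{i}.wav utt{i}.mfc" for i in range(7)]
chunks = split_script(utterances, 3)
commands = build_commands(["HCopy", "-C", "config"], len(chunks))
```

Each chunk would be written to its own script file and each command submitted as a separate cluster job. Contiguous chunks keep the original ordering, which makes it straightforward to concatenate per-job outputs afterwards.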
Data
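- dtmfseg (Bowon Lee, 2006)
  - Segments audio files at DTMF tones
  - SVN repository: svn://mickey.ifp.uiuc.edu/dtmfseg/
- transcription tools (Mark Hasegawa-Johnson, 2005)
  - Convert between transcription formats
  - TGZ archive: http://www.isle.uiuc.edu/software/transcription_tools2005May.tgz
  - SVN repository: svn://mickey.ifp.uiuc.edu/transcription_tools/
- speechfileformats (Mark Hasegawa-Johnson, 2004)
  - Reads and writes HTK files in MATLAB
  - TGZ archive: http://www.isle.uiuc.edu/software/speechfileformats.tgz
  - SVN repository: svn://mickey.ifp.uiuc.edu/speechfileformats/
- CTMRedit (Jul Cha and Mark Hasegawa-Johnson, 1999)
  - Manually and automatically segments CT and MR image stacks
  - Description: http://www.isle.uiuc.edu/pubs/1990s/hasegawa-johnson99embs.pdf
  - SVN repository: svn://mickey.ifp.uiuc.edu/CTMRedit
- improved MVA (Arthur Kantor, 2008)
  - Performs mean and variance normalization and ARMA filtering
  - Essentially the version at http://ssli.ee.washington.edu/people/chiaping/mva.html, but with:
    - better error reporting (e.g., a failure to open a file is reported instead of causing a core dump)
    - more accurate mean and variance estimation (doubles instead of floats in strategic places)
    - faster computation in the MV-only case (ARMA order 0)
  - Binary: http://mickey.ifp.uiuc.edu/speech/akantor/fisher/programs/bin.Linux/MVA
  - SVN repository: svn://mickey.ifp.uiuc.edu/corporaNormalizationScripts/fisher/MVA.cc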
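The improved MVA entry above combines mean and variance normalization with ARMA filtering. A minimal Python sketch of that pipeline follows. It assumes the ARMA form commonly given for MVA post-processing (each output frame averages the previous `order` outputs with the current and next `order` inputs) and leaves frames near the edges unfiltered; both choices are assumptions, not a transcription of MVA.cc, and the function names are hypothetical.

```python
import math


def mean_variance_normalize(frames):
    """Scale each feature dimension to zero mean and unit variance over the utterance."""
    if not frames:
        return []
    n, dims = len(frames), len(frames[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        column = [frame[d] for frame in frames]
        # Python floats are doubles; math.fsum further limits accumulated rounding error.
        mean = math.fsum(column) / n
        var = math.fsum((x - mean) ** 2 for x in column) / n
        std = math.sqrt(var) if var > 0 else 1.0
        for t in range(n):
            out[t][d] = (column[t] - mean) / std
    return out


def arma_filter(frames, order):
    """Smooth each dimension over time; order 0 returns an unfiltered copy."""
    out = [list(frame) for frame in frames]
    if order == 0 or not frames:
        return out  # MV-only case: no filtering work at all
    n, dims = len(frames), len(frames[0])
    for t in range(order, n - order):
        for d in range(dims):
            past = sum(out[t - i][d] for i in range(1, order + 1))   # earlier outputs
            future = sum(frames[t + j][d] for j in range(order + 1))  # current and later inputs
            out[t][d] = (past + future) / (2 * order + 1)
    return out


def mva(frames, order=2):
    """Mean and variance normalization followed by ARMA filtering."""
    return arma_filter(mean_variance_normalize(frames), order)
```

With `order=0` the filter is skipped entirely, which mirrors the MV-only shortcut mentioned in the entry.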
Miscellaneous
Other scripts, written in Perl, Python, Bash, and Ruby, can be found in the SVN archive at svn://mickey.ifp.uiuc.edu/scripts.
Auto-generated documentation for them is on the wiki page Scripts_Documentation.