Statistical Speech Technology Group Software

Our policy: everything we write is free on the web. This wiki is intended to be definitive, because anybody in the group can edit it to add their own software. A spider-indexable backup is at http://www.isle.uiuc.edu/software .

Our software is available via Subversion, using the login name "anon" with no password (press Enter when a password is requested).

On Windows, download TortoiseSVN
On Linux, use the svn command, e.g., svn co svn://mickey.ifp.uiuc.edu/speechfileformats

Learning
Pronounce	Letters to phones using an HMM Description, Demo, SVN archive (Arthur Kantor, 2007)
HDK	HTK-based Explicit-duration HMM Description, TGZ archive, SVN repository (Ken Chen, 2003)
HTKtrain	Scripts for training HMMs using HTK SVN repository (Sarah Borys and Mark Hasegawa-Johnson, 2008)
Signal Processing
PVTK	Extract HTK features as training vectors for libSVM, apply trained SVMs directly to feature files TGZ archive, SVN repository (Sarah Borys and MH 2005-8)
VAD	Voice activity detector with improved noise model Description, lee_vad.m, SVN repository (Bowon Lee, 2007)
Nested STFTs	Efficient Simultaneous Multi-Scale Computation of FFTs Description, stft.c (Dave Cohen, Camille Goudeseune, Mark Hasegawa-Johnson 2009)
Improved Mistral	State-of-the-art text-independent speaker verification system, especially for NIST SRE. Based on the Mistral open-source package. Improved and new features:
	full factor analysis (eigenchannel and eigenvoice) instead of simple factor analysis (eigenchannel)
	multithreading on Windows as well as Linux
	reading and writing of HTK-format features and models
	an effective algorithm for a fast implementation of FA
	code optimization (for FA)
	bug fixes
	Source: /ws/ifp-32-2/hasegawa/pineking/programs/Improved_Mistral
	Scripts (perl): /ws/ifp-32-2/hasegawa/pineking/scripts/Improved_Mistral
	(Qingsong Liu, 2009)
Computation
GMTK Parallel	Split GMTK commands into batch jobs for a cluster Description, SVN repository (Arthur Kantor, 2008)
HTK Parallel	Split an HTK command into batch jobs for a cluster Description, HCopy.pl, HVite.pl, HERest.pl, HResults.pl, SVN repository (Bowon Lee, 2006)
Data
dtmfseg	Segment audio files at DTMF tones SVN repository (Bowon Lee, 2006)
transcription tools	Convert transcription formats TGZ archive, SVN repository (Mark Hasegawa-Johnson, 2005)
speechfileformats	Read and write HTK files in matlab TGZ archive, SVN repository (Mark Hasegawa-Johnson, 2004)
CTMRedit	Manually and automatically segment CT and MR image stacks Description, SVN repository (Jul Cha and MH 1999)
improved MVA	Perform mean and variance normalization and ARMA filtering. It is essentially this version, with the following improvements:
	better error reporting (e.g., failing to open a file is reported instead of causing a core dump)
	more accurate mean and variance estimation (doubles instead of floats in strategic places)
	faster computation in the MV-only case (ARMA order 0)
	Source: svn://mickey.ifp.uiuc.edu/corporaNormalizationScripts/fisher/MVA.cc
	Binary: http://mickey.ifp.uiuc.edu/speech/akantor/fisher/programs/bin.Linux/MVA
Scripts	Miscellaneous Perl, Python, Bash, and Ruby scripts SVN archive, Documentation
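
For readers unfamiliar with the processing named in the "improved MVA" entry, the following is a minimal Python sketch of mean and variance normalization followed by ARMA smoothing. It is not taken from MVA.cc. It assumes one common form of the ARMA filter, in which each output is the average of the previous `order` outputs and the current and next `order` inputs, with edge frames left unfiltered. The function names and the default order of 2 are illustrative assumptions.

```python
import math


def mean_variance_normalize(track):
    """Shift one feature track to zero mean and scale it to unit variance."""
    n = len(track)
    mean = sum(track) / n
    var = sum((x - mean) ** 2 for x in track) / n
    # A constant track has zero variance; divide by 1 so it stays at zero.
    std = math.sqrt(var) or 1.0
    return [(x - mean) / std for x in track]


def arma_smooth(track, order):
    """Average `order` past outputs with the current and `order` future inputs.

    Assumed filter form (not confirmed against MVA.cc). Order 0 is the
    MV-only case and returns the track unchanged.
    """
    if order == 0:
        return list(track)
    out = list(track)  # frames without full context keep their input value
    for t in range(order, len(track) - order):
        past = sum(out[t - order:t])            # already-filtered outputs
        future = sum(track[t:t + order + 1])    # current and upcoming inputs
        out[t] = (past + future) / (2 * order + 1)
    return out


def mva(frames, order=2):
    """Apply normalization, then ARMA smoothing, to each feature dimension.

    `frames` is a list of equal-length feature vectors, one per time step.
    """
    tracks = [list(column) for column in zip(*frames)]
    processed = [arma_smooth(mean_variance_normalize(tr), order) for tr in tracks]
    return [list(frame) for frame in zip(*processed)]


if __name__ == "__main__":
    toy_frames = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [4.0, 13.0], [5.0, 9.0]]
    for frame in mva(toy_frames, order=1):
        print(frame)
```

With `order=0` the smoothing step is skipped entirely, which corresponds to the MV-only case mentioned in the entry.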

Phonetic Transcription Tool

This tool computes American English phonetic transcriptions from plain text. Its HMM either generates the most likely phonetic transcription or, if a phonetic transcription is provided, performs a forced alignment to it.

It therefore gives a reasonable pronunciation for both out-of-dictionary words and partially pronounced words.
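
To illustrate what "most likely phonetic transcription" means for an HMM, here is a minimal Python sketch of Viterbi decoding with letters as observations and phones as hidden states. It is a toy, not the tool's actual model: the three-phone inventory and all probabilities are invented for illustration, and it assumes exactly one phone per letter, which a real letter-to-phone system would not. It covers only the decoding case, not forced alignment.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s]: best log-probability of any path ending in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy model: three phones, three letters, made-up probabilities.
PHONES = ["k", "ae", "t"]
LOG_START = {p: math.log(1.0 / 3.0) for p in PHONES}
LOG_TRANS = to_log({p: {q: 1.0 / 3.0 for q in PHONES} for p in PHONES})
LOG_EMIT = to_log({
    "k":  {"c": 0.8, "a": 0.1, "t": 0.1},
    "ae": {"c": 0.1, "a": 0.8, "t": 0.1},
    "t":  {"c": 0.1, "a": 0.1, "t": 0.8},
})

if __name__ == "__main__":
    print(viterbi(list("cat"), PHONES, LOG_START, LOG_TRANS, LOG_EMIT))
```

Because the decoder scores every letter sequence rather than looking words up, it can still propose phones for a word that is missing from a dictionary.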

Phonetic Transcription Tool.

Online demo.