Statistical Speech Technology Group Software

Our policy: everything we write is free on the web. This wiki page is intended to be definitive, because anybody in the group can edit it to add their own software. [1] provides a spider-indexable backup copy.

All of our software is available via <a href="http://subversion.tigris.org/">subversion</a>, using login name "anon" with no password (hit "enter" when a password is requested).

If you are using Windows, download TortoiseSVN.
If you are using Linux, use the svn command-line client, e.g., svn co svn://mickey.ifp.uiuc.edu/speechfileformats

Learning
Pronounce	Letters to phones using an HMM	Description, Demo, SVN archive	(Arthur Kantor, 2007)
HDK	HTK-based explicit-duration HMM	Description, TGZ archive, SVN repository	(Ken Chen, 2003)
Signal Processing
PVTK	Extract HTK features as training vectors for libSVM; apply trained SVMs directly to feature files	TGZ archive, SVN repository	(Sarah Borys, 2008; Mark Hasegawa-Johnson, 2005)
VAD	Voice activity detector with improved noise model	Description, lee_vad.m, SVN repository	(Bowon Lee, 2007)
Computation
GMTK Parallel	Split GMTK commands into batch jobs for a cluster	Description, SVN repository	(Arthur Kantor, 2008)
HTK Parallel	Split an HTK command into batch jobs for a cluster	Description, HCopy.pl, HVite.pl, HERest.pl, HResults.pl, SVN repository	(Bowon Lee, 2006)
Data
dtmfseg	Segment audio files at DTMF tones	SVN repository	(Bowon Lee, 2006)
transcription tools	Convert transcription formats	TGZ archive, SVN repository	(Mark Hasegawa-Johnson, 2005)
speechfileformats	Read and write HTK files in MATLAB	TGZ archive, SVN repository	(Mark Hasegawa-Johnson, 2004)
CTMRedit	Manually and automatically segment CT and MR image stacks	Description, SVN repository	(Mark Hasegawa-Johnson and Jul Cha, 1999)
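
Several entries in the listing above work with HTK feature files. For readers unfamiliar with that format, the following is a minimal Python sketch of parsing one. It is not the group's code. It assumes the standard 12-byte big-endian HTK header (nSamples, sampPeriod, sampSize, parmKind) followed by uncompressed 4-byte float samples. The function name parse_htk and the dictionary keys are hypothetical, chosen only for illustration.

```python
import struct

# HTK header: nSamples (int32), sampPeriod (int32, 100 ns units),
# sampSize (int16, bytes per frame), parmKind (int16); big-endian.
HEADER = struct.Struct(">iihh")


def parse_htk(data):
    """Parse the bytes of an uncompressed HTK parameter file.

    Returns (header, frames): header is a dict of the four header fields,
    frames is a list of per-frame lists of floats.
    """
    n_samples, samp_period, samp_size, parm_kind = HEADER.unpack_from(data, 0)
    dim = samp_size // 4  # assumption: every component is a 4-byte float
    frames = []
    offset = HEADER.size
    for _ in range(n_samples):
        frames.append(list(struct.unpack_from(">%df" % dim, data, offset)))
        offset += samp_size
    header = {
        "n_samples": n_samples,
        "samp_period": samp_period,
        "samp_size": samp_size,
        "parm_kind": parm_kind,
    }
    return header, frames
```

To use it on a file, read the file in binary mode and pass the bytes, e.g. parse_htk(open(path, "rb").read()). Compressed or CRC-checked HTK files are outside the scope of this sketch.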

Phonetic Transcription Tool

This tool gives the American English phonetic transcription of any string. It uses an HMM either to generate the most likely phonetic transcription or, if a phonetic transcription is provided, to perform forced alignment. As a result, it gives a reasonable pronunciation for out-of-dictionary words and for partially pronounced words.
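
To illustrate the idea of finding a most likely transcription with an HMM, here is a toy Python sketch of Viterbi decoding in which the hidden states are phones and the observations are letters. It is not the tool's actual model: the three-phone inventory, all probabilities, and the name viterbi are made up for illustration, and forced alignment is not covered.

```python
import math

# Hypothetical toy model: hidden states are phones, observations are letters.
STATES = ["k", "ae", "t"]
START = {"k": 0.5, "ae": 0.25, "t": 0.25}
# Uniform transitions keep the example small; a trained model would not be uniform.
TRANS = {p: {s: 1.0 / len(STATES) for s in STATES} for p in STATES}
EMIT = {
    "k": {"c": 0.8, "a": 0.1, "t": 0.1},
    "ae": {"c": 0.1, "a": 0.8, "t": 0.1},
    "t": {"c": 0.1, "a": 0.1, "t": 0.8},
}


def viterbi(letters):
    """Return the most likely phone sequence for a non-empty letter sequence."""
    # score[s] is the log-probability of the best path ending in phone s.
    score = {s: math.log(START[s]) + math.log(EMIT[s][letters[0]]) for s in STATES}
    backpointers = []
    for letter in letters[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor phone for arriving in phone s.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][letter])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final phone.
    phone = max(STATES, key=lambda s: score[s])
    path = [phone]
    for pointers in reversed(backpointers):
        phone = pointers[phone]
        path.append(phone)
    return path[::-1]


print(viterbi("cat"))  # ['k', 'ae', 't']
```

This toy emits exactly one phone per letter, which real letter-to-phone models do not require. It is meant only to show the decoding step.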

Phonetic Transcription Tool: you can try out the demo here.