What is it?

This tool maps strings of letters (words) to their phonetic transcriptions using a Hidden Markov Model. It can also produce phonetic transcriptions for partial words and for words that are not in a dictionary. If a transcription dictionary is provided, the tool can align letters with their corresponding phones. It has been trained on American English pronunciations, but models for other languages can also be created.
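
To give a feel for the idea, here is a minimal sketch of decoding letters into phones with a Hidden Markov Model. It is not the tool's actual implementation (which is built on HTK). It assumes that phones are the hidden states, that letters are the observations, and that each letter maps to exactly one phone, which is a simplification. The phone set, the probabilities and the function name viterbi are all invented for illustration.

```python
import math

# Toy model. Every number below is made up for illustration only.
STATES = ["k", "ae", "t"]                      # hypothetical phone set
START = {"k": 0.6, "ae": 0.2, "t": 0.2}        # P(first phone)
TRANS = {                                      # P(next phone | current phone)
    "k":  {"k": 0.1, "ae": 0.8, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.1, "t": 0.8},
    "t":  {"k": 0.3, "ae": 0.4, "t": 0.3},
}
EMIT = {                                       # P(letter | phone)
    "k":  {"c": 0.7, "a": 0.1, "t": 0.2},
    "ae": {"c": 0.1, "a": 0.8, "t": 0.1},
    "t":  {"c": 0.1, "a": 0.1, "t": 0.8},
}


def viterbi(letters):
    """Return the most likely phone sequence for a string of letters."""
    if not letters:
        return []
    # Log-probability of the best path ending in each phone at position 0.
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][letters[0]])
               for s in STATES}]
    back = []  # back[i][s] = best predecessor of phone s at position i + 1
    for letter in letters[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in STATES:
            best_prev = max(STATES,
                            key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = (prev[best_prev] + math.log(TRANS[best_prev][s])
                      + math.log(EMIT[s][letter]))
            ptr[s] = best_prev
        scores.append(cur)
        back.append(ptr)
    # Trace back from the best final phone.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


print(viterbi("cat"))  # ['k', 'ae', 't']
```

Because the decoder only needs per-letter emission and phone-to-phone transition probabilities, it can score any letter string, which is why this kind of model can also handle partial words and words missing from the dictionary.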


You can try out the demo here.

What's it good for?

  • Improving automatic spelling correction.
  • Training a speech recognizer on transcriptions that contain words not in the phoneme dictionary, or words marked as partially pronounced (e.g. the Switchboard corpus).
  • A toy for exploring pronunciations of words (e.g. how would an American pronounce a German word?).

Where can I get it?

How do I install it?

You will need Perl and HTK (the Hidden Markov Model Toolkit).


How do I train my own models?

