GMTK parallel tools

From SpeechWiki



Scripts and Modules

The * scripts parse the command-line arguments and call the routines in the corresponding * modules, which do the actual work.

  • and Does a Viterbi decoding in parallel and then optionally runs sclite to report recognition accuracy.
  • and Does a single iteration of EM training in parallel.
  • and Does a sequence of EM training iterations to convergence, followed by splits/vanishes of Gaussians, followed by more iterations to convergence, according to a convergence and split/vanish schedule.
  • and Runs a list of commands in parallel on an SGE cluster.


  • All of the code is packaged in modules, so the tools can be used by calling a Perl function instead of starting another script in a new process.
  • All tools are restartable. If they are interrupted for any reason (e.g. a cluster glitch, or the user pressing Ctrl-C), rerunning the command does only the minimum work required to complete the job. Successfully completed sub-tasks are not rerun.
  • A fast sanity check is performed before any parallel jobs are fired off. This way, the user gets fast feedback on simple mistakes.
  • Killing the main script (with Ctrl-C, for example) stops all execution on all the compute nodes.
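
The restart behaviour described above can be illustrated with a marker-file scheme. This is only a minimal sketch of the general technique, assuming each sub-task leaves a marker file once it succeeds; it is not taken from the tools themselves, and the names run_once and STATE_DIR are made up for the example.

```shell
#!/bin/sh
# Hypothetical sketch of restartable sub-tasks (not the tools' own code).
# A task that succeeds leaves a marker file; a rerun skips tasks that
# already have a marker, so only unfinished work is repeated.

# For the demo a temporary directory is used; a real job would use a
# fixed directory that survives between runs.
STATE_DIR=$(mktemp -d)

run_once() {
    name=$1
    shift
    marker="$STATE_DIR/$name.done"
    if [ -e "$marker" ]; then
        echo "skip $name"
        return 0
    fi
    # Create the marker only if the command exits successfully.
    "$@" && : > "$marker"
}

# First call runs the task; second call finds the marker and skips it.
first=$(run_once task1 true)
second=$(run_once task1 true)

# A failing task leaves no marker, so it will run again on the next attempt.
run_once task2 false || echo "task2 failed and will be retried on rerun"
```

Writing the marker only after the command succeeds is what makes an interrupted or failed sub-task eligible to run again.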

Installation and Environment

The easiest way is to have the following in your path:

Perl must also be able to find the *.pm modules. You can arrange that by adding their directory to the PERL5LIB environment variable.
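
As a rough example, the environment could be set up as follows in a Bourne-style shell. The directory name is a placeholder assumed for illustration; substitute wherever the scripts and *.pm files actually live on your system.

```shell
# Hypothetical install location (an assumption, not the real path).
GMTK_TOOLS_DIR="$HOME/gmtk_parallel_tools"

# The shell finds the scripts through PATH.
export PATH="$GMTK_TOOLS_DIR:$PATH"

# Perl finds the *.pm modules through PERL5LIB. The ${VAR:+...} form
# appends the old value only if PERL5LIB was already set and non-empty,
# which avoids a stray trailing colon.
export PERL5LIB="$GMTK_TOOLS_DIR${PERL5LIB:+:$PERL5LIB}"
```

Running perl -V afterwards lists the directories in @INC, which should now include the one added above.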


There is not much documentation yet. Some very rough overview slides are here. However, the *.pm modules are reasonably well documented, hopefully enough to be useful.

Additional Resources

Bowon's parallel HTK tools

Bowon's SGE basics

--Arthur 17:33, 19 September 2006 (CDT)
