:Arthur's Thesis

Revision as of 00:51, 30 September 2009 by Arthur (Talk | contribs)

Experiment Directions

Variants tested (Standard Baseline in parens):

Acoustic Model
- observations: (MVAed PLP), MVAed PLP+MLP
- units: monophone, (triphone), error-driven units
- 1-512 Gaussians per mixture (64)
- Mixture pruning strategies
- Timeshrinking
Language Model (all combinations of the following, evaluated by LM cross-entropy)
- order: 2, 3, 4-gram (3-gram)
- vocab: 500, 1000, 5k, 10k, 20k, ~70k (10k)
- smoothing: KN, (GT)
- pruning: none, (entropy based)
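
As a rough illustration of what "all combinations" of the language model factors amounts to, the sketch below enumerates the grid and defines the usual per-token cross-entropy. The factor levels are copied from the list above; the names `LM_FACTORS`, `BASELINE`, `lm_configurations` and `cross_entropy` are hypothetical, and the page does not say which toolkit or evaluation text was used, so treat this as a sketch under those assumptions rather than the actual experiment scripts.

```python
import itertools

# Factor levels copied from the list above; the dictionary keys are illustrative.
LM_FACTORS = {
    "order": [2, 3, 4],
    "vocab": ["500", "1000", "5k", "10k", "20k", "~70k"],
    "smoothing": ["KN", "GT"],
    "pruning": ["none", "entropy"],
}

# Standard baseline: the parenthesized value of each factor.
BASELINE = {"order": 3, "vocab": "10k", "smoothing": "GT", "pruning": "entropy"}


def lm_configurations(factors=LM_FACTORS):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in itertools.product(*factors.values())]


def cross_entropy(log2_probs):
    """Average negative log2 probability per token, in bits.

    Assumes the caller supplies log2 p(token | history) for each token
    of the evaluation text; how those are obtained is not specified here.
    """
    if not log2_probs:
        raise ValueError("need at least one token probability")
    return -sum(log2_probs) / len(log2_probs)


configs = lm_configurations()
print(len(configs))  # 3 * 6 * 2 * 2 combinations
```

Each configuration would then be trained and scored with `cross_entropy` on held-out text, with `BASELINE` serving as the reference point.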
Pronunciation Model
- Multi-Pronunciation, Single-Pronunciation CMUdict, (Single-Pronunciation CMUdict augmented with auto-generated word fragments and missing words)
Corpora: 500-word SVitchboard, 80% Fisher corpus, (20% Fisher corpus)