Arthur's Thesis
From SpeechWiki
Experiment Directions
Variants tested (Standard Baseline in parens):
- Acoustic Model
  - observations: (MVAed PLP), MVAed PLP+MLP
  - units: monophone, (triphone), error-driven units
  - Gaussians per mixture: 1-512 (64)
  - Mixture pruning strategies
  - Timeshrinking
- Language Model (all combinations of the following, evaluated by LM cross-entropy)
  - order: 2, 3, 4-gram (3-gram)
  - vocab: 500, 1000, 5k, 10k, 20k, ~70k (10k)
  - smoothing: KN, (GT)
  - pruning: none, (entropy-based)
- Pronunciation Model
  - Multi-Pronunciation, Single-Pronunciation CMUdict, (Single-Pronunciation CMUdict augmented with auto-generated word fragments and missing words)
- Corpora: 500-word SVitchboard, 80% Fisher corpus, (20% Fisher corpus)
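
The Language Model entry above calls for every combination of order, vocabulary size, smoothing, and pruning. As a rough sketch of what that grid looks like, the Python below enumerates the combinations and defines cross-entropy as the average negative log2 probability per token. The names `LM_GRID`, `BASELINE`, `lm_configs`, and `cross_entropy_bits` are hypothetical and exist only for illustration; the per-token log probabilities are assumed to come from whatever LM toolkit is actually used, which this page does not specify.

```python
from itertools import product

# Option values copied from the Language Model list above.
# The dictionary layout and key names are illustrative assumptions.
LM_GRID = {
    "order": [2, 3, 4],
    "vocab": ["500", "1000", "5k", "10k", "20k", "~70k"],
    "smoothing": ["KN", "GT"],
    "pruning": ["none", "entropy"],
}

# The standard baseline, i.e. the parenthesised choice on each line.
BASELINE = {"order": 3, "vocab": "10k", "smoothing": "GT", "pruning": "entropy"}


def lm_configs(grid=LM_GRID):
    """Yield one dict per combination of the grid's option values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def cross_entropy_bits(log2_probs):
    """Average negative log2 probability per token (bits per token).

    `log2_probs` is assumed to hold one log2 probability per token of a
    held-out text, as scored by the language model being evaluated.
    """
    return -sum(log2_probs) / len(log2_probs)


configs = list(lm_configs())
print(len(configs))  # 3 orders * 6 vocabs * 2 smoothings * 2 prunings = 72
```

Under these assumptions the grid holds 72 language models, of which the baseline is one; each would be scored with `cross_entropy_bits` on the same held-out text so that the numbers are comparable within a vocabulary size.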