Fisher Corpus

From SpeechWiki

Revision as of 18:04, 2 October 2008 by Arthur (Talk | contribs)

This page links to the various things I've done with the Fisher corpus. It may be helpful for quickly building a basic speech recognizer.


Train/Devel/Test partitions

For all the models and experiments, the entire Fisher corpus was split into Train/Devel/Test partitions of 80/10/10 percent. The utterance IDs are listed in uttIds.txt, and the splits are as follows:

Set        Conversation Sides    Lines in uttIds.txt
Training   00001A to 09360B      1 to 1775831
Devel      09361A to 10530B      1775832 to 1991965
Test       10531A to 11699B      1991965 to 2223159
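
As a rough illustration of the split, the conversation-side ranges in the table can be turned into a small lookup. This is only a sketch: the function name partition_of and the assumption that every side ID is a five-digit conversation number followed by A or B are mine, inferred from the table, and are not part of the original experiment scripts.

```python
import re

# Inclusive conversation-number ranges, copied from the table above.
PARTITIONS = [
    ("Training", 1, 9360),
    ("Devel", 9361, 10530),
    ("Test", 10531, 11699),
]

# Assumed ID format: five digits plus a side letter, e.g. "09361A".
_SIDE_RE = re.compile(r"^(\d{5})([AB])$")


def partition_of(side_id):
    """Return the partition name for a conversation side ID (hypothetical helper)."""
    match = _SIDE_RE.match(side_id)
    if match is None:
        raise ValueError("unexpected conversation side ID: %r" % side_id)
    conversation = int(match.group(1))
    for name, first, last in PARTITIONS:
        if first <= conversation <= last:
            return name
    raise ValueError("conversation %d is outside the corpus range" % conversation)


def partition_shares():
    """Fraction of conversations in each partition, for checking the 80/10/10 split."""
    total = sum(last - first + 1 for _, first, last in PARTITIONS)
    return {name: (last - first + 1) / total for name, first, last in PARTITIONS}
```

Keying the lookup on the conversation number keeps both sides of a conversation in the same partition, which matches how the ranges in the table are stated.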


The experiment infrastructure needs its own page.

The experiments

The phonetic and mixed-unit dictionaries, the language models and the front end used in my pronunciation experiments all have their own pages.

The experiments themselves are described on the Fisher Baseline Experiments and Mixed Unit Experiments pages.
