Fisher Corpus

From SpeechWiki


Revision as of 18:08, 2 October 2008

This page links to the various things I've done with the Fisher corpus. It may be helpful for quickly building a basic speech recognizer.


Train/Devel/Test partitions

For all the models and experiments, the entire Fisher corpus is split 80/10/10 percent into Train/Devel/Test partitions. The utterance IDs are listed in uttIds.txt, and the splits are as follows:

Set       Conversation Sides  Lines in uttIds.txt
Training  00001A to 09360B    1 to 1775831
Devel     09361A to 10530B    1775832 to 1991965
Test      10531A to 11699B    1991965 to 2223159
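
As a rough illustration of how the table above could be applied, here is a minimal Python sketch that maps a conversation side to its partition. The function name partition_of and the assumption that a side ID is exactly five digits followed by A or B (as in the table) are mine; the actual line format of uttIds.txt is not described on this page, so extracting the side ID from each line is left out.

```python
import re

# Inclusive conversation-number ranges, copied from the table above.
PARTITIONS = {
    "Training": (1, 9360),
    "Devel": (9361, 10530),
    "Test": (10531, 11699),
}

# Assumed side-ID shape: five digits plus channel letter, e.g. "09361A".
SIDE_ID = re.compile(r"(\d{5})([AB])")


def partition_of(side_id):
    """Return the partition name for a conversation side such as '09361A'."""
    match = SIDE_ID.fullmatch(side_id)
    if match is None:
        raise ValueError(f"not a conversation side id: {side_id!r}")
    conversation = int(match.group(1))
    for name, (first, last) in PARTITIONS.items():
        if first <= conversation <= last:
            return name
    raise ValueError(f"conversation {conversation} is outside 00001-11699")
```

For example, partition_of("09361A") returns "Devel". Keying on the conversation number rather than on line numbers in uttIds.txt keeps both sides of a conversation in the same partition.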


The experiment infrastructure needs its own page.

The experiments

The goal of these experiments is to explore the utility of using mixed units (phones, syllables and whole words) for large vocabulary speech recognition. These experiments are performed on the Fisher Corpus.

The phonetic and mixed-unit dictionaries, the language models and the front end used in my pronunciation experiments all have their own pages.

The experiments themselves are described on the Fisher Baseline Experiments and Mixed Unit Experiments pages.
