Fisher Corpus
The Fisher corpus is still relatively new and rough, and this page is meant to help people quickly build a basic speech recognizer with it.
Train/Devel/Test partition
I've split the entire Fisher corpus into Train/Devel/Test partitions of 80/10/10 percent.
The utterance ID file is filelists/uttIds.txt, and the splits are as follows:
Set | Conversation Sides | Lines in uttIds.txt |
---|---|---|
Training | 00001A to 09360B | 1 to 1775831 |
Devel | 09361A to 10530B | 1775832 to 1991965 |
Test | 10531A to 11699B | 1991966 to 2223159 |
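
As a rough illustration, the boundaries in the table above can be turned into a small lookup. This is only a sketch: the function and constant names are made up for this page, it assumes each partition begins right after the previous one ends, and it assumes a conversation side ID is five digits followed by A or B (as in 09361A).

```python
# Last conversation number and last line of uttIds.txt (1-indexed, inclusive)
# for each partition, copied from the table above. Names are hypothetical.
CONVERSATION_ENDS = [("train", 9360), ("devel", 10530), ("test", 11699)]
LINE_ENDS = [("train", 1775831), ("devel", 1991965), ("test", 2223159)]


def _lookup(value, ends):
    """Return the first partition whose inclusive upper bound covers value."""
    if value < 1:
        raise ValueError(f"value must be >= 1, got {value}")
    for name, last in ends:
        if value <= last:
            return name
    raise ValueError(f"{value} is past the end of the last partition")


def partition_for_line(line_number, ends=LINE_ENDS):
    """Map a 1-indexed line number of uttIds.txt to train/devel/test."""
    return _lookup(line_number, ends)


def partition_for_side(side_id, ends=CONVERSATION_ENDS):
    """Map a conversation side ID such as '09361A' to train/devel/test.

    Assumes the first five characters are the conversation number.
    """
    return _lookup(int(side_id[:5]), ends)


def split_lines(lines, ends=LINE_ENDS):
    """Group an iterable of lines into {partition_name: [lines]} by position."""
    parts = {name: [] for name, _ in ends}
    for number, line in enumerate(lines, start=1):
        parts[partition_for_line(number, ends)].append(line)
    return parts
```

To split the real file, pass an open handle on filelists/uttIds.txt to split_lines; the whole file is held in memory, so for anything larger it may be better to write each line out as it is classified.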
Dictionaries
Language Model
There is a lot to say about the Fisher language models, so they get their own page.