Computer Resources
From SpeechWiki
Revision as of 00:18, 16 June 2008
LVCSR at Illinois Computer Resources
- Data Organization README: http://www.isle.uiuc.edu/data/README.txt
- Data not yet moved to ifp-32-2:
  - Switchboard 1: /workspace/{fluffy1,helmholtz1}/switchboard-1/ contain some data transformations not on ifp-32-2
  - fluffy1/12hour - 12 hours extracted from Switchboard 1, with SPHERE and WAV audio, MFCCs, and transcriptions
  - fluffy1/{train-ws96,train-ws97,misc-ws97} - The ICSI phonetically transcribed Switchboard-1 extracts
  - fletcher1/bdc - The Boston Directions Corpus; two speakers have prosodic transcriptions, the others do not
  - Mandarin Broadcast News: nibbler1/MBN contains some transformations not on ifp-32-2
  - nibbler0/data/ylzheng/WS04/DATA - Tsinghua Wu-accented Mandarin (MFCC and FMT only, no waveforms)
  - AVICAR: mickey1/AVICAR_DIST, also available via sftp to ifp-31.ifp.uiuc.edu
  - Universal Access Corpus (talkers with cerebral palsy): rizzo1/speech_hearing, also available via sftp to ifp-31.ifp.uiuc.edu
  - {speech_web/mri,http://www.isle.uiuc.edu/mri} - much 3D MRI of vowels, a little fast 2D MRI of the alphabet, and a little high-res 3D MRI of excised tongues
- Time-aligned Switchboard Disfluency corpus
  - mickey0/sw_disTime-0.9.9 - merged from the original Switchboard time transcription and the Treebank-3 disfluency transcription (TextGrid included)
  - mickey0/sw_disTime-1.0.0 (TextGrid NOT included)
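Several entries above use shell-style brace notation, such as /workspace/{fluffy1,helmholtz1}/switchboard-1/, to name more than one location at once. The sketch below shows one way such an entry could be expanded into its individual paths. The function name expand_braces is hypothetical, and the code assumes the braces are never nested, which holds for every entry on this page.

```python
import itertools
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand each {a,b,c} group in pattern, shell-style (nesting not supported)."""
    # Splitting on a capturing group keeps the brace contents at the odd indices.
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [
        part.split(",") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    # The Cartesian product yields one path per combination of alternatives.
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for path in expand_braces("/workspace/{fluffy1,helmholtz1}/switchboard-1/"):
        print(path)
```

An entry with no braces is returned unchanged as a single-element list, so the same helper can be applied to every line of the list.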
Applications
- Hidden Markov Models:
  - HTK (Cambridge): fluffy0/programs/htk-3.3
  - DCD (ATT): nibbler0/speech_apps/dcd-2.0
- Dynamic Bayesian Nets/Graphical Models: nibbler0/speech_apps/GMTK
- Language Models: fluffy0/programs/srilm
- Finite State Machines:
  - FSM (ATT): fluffy0/programs/fsm-4.0
  - FST (MIT): fluffy0/programs/fst-1.0-RC1
  - OpenFst: fluffy0/programs/OpenFst/
- Support Vector Machines:
  - SVMLIB (NJTU): fluffy0/programs/svmlib
  - svm_light (Joachims): fluffy0/programs/svm_light
  - PVTK (UIUC): mickey0/SVM/PVTK
- Neural Nets:
  - quicknet (ICSI): mickey0/quicknet
- Spectrograms and Waveform Viewing:
  - XKL (MIT): nibbler0/speech_apps/xkl-2.3.1
  - ESPS (Entropic Systems, now Microsoft)
  - Praat
- Speech Data File Formats:
  - SPHERE (NIST): fluffy0/programs/sphere
  - sox (linux): /usr/bin/sox
  - HCopy (Cambridge): see HTK
Backups
If you have personal working directories outside your home directory that should be backed up regularly, list them here.
- Art
  - mickey0/akantor
  - rizzo1/akantor is itself a backup of svn because it cannot be backed up in the normal way.
- Sarah
  - nibbler0/data
  - rizzo0/sborys
  - spot1/sborys
  - tico0/sborys
- Bowon
  - mickey1/AVICAR_AUDIO
  - mickey1/AVICAR_DATA
  - mickey1/AVICAR_DIST
  - mickey1/AVICAR_DIST_OLD
  - rizzo1/bowonlee
  - mickey0/bowonlee
- Mital
  - mickey0/magandhi
- Xiaodan
  - spot1/xzhuang2/newbaseline
  - spot1/xzhuang2/workshop
  - c1-15/hasegawa/xzhuang2*
  - /workspace/tico0/AED/
- Rajiv
  - scratch/rreddy