Computer Resources

From SpeechWiki

Revision as of 21:05, 26 May 2008


LVCSR at Illinois Computer Resources

This page has two main purposes: (1) to help local users keep track of databases, since disk management is decentralized and therefore messy; and (2) to specify which directories are archival and which are temporary.

Directories listed below are accessible only to ISLE/IFP registered users.

Databases

UPDATE MAY 26, 2008: Almost all databases have been reorganized on the RAID. There are about 190 distinct databases, in about twenty languages, organized under directories of the form /workspace/ifp-32-2/hasegawa/data/MODALITY/LANG/DBNAME. MODALITY is one of text, telephone, wideband, multimodal, or biomedical. LANG is the three-letter ISO language code; for a list of codes, see /workspace/ifp-32-2/hasegawa/data/README.txt. DBNAME is an abbreviated name for the database; for the full database name and LDC code (if it is LDC data), see the file .../DBNAME/README.txt. The LDC catalog page is often stored in .../DBNAME/CATALOG/*.html.
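
The MODALITY/LANG/DBNAME layout described above can be sketched in a few lines of Python. This is only an illustration, not a tool installed on the lab machines: the function names are made up for this example, and `EXAMPLE_DB` is a placeholder rather than a real database name (check the README.txt files for the actual abbreviations).

```python
from pathlib import PurePosixPath

# Root of the reorganized data tree, as given on this page.
DATA_ROOT = PurePosixPath("/workspace/ifp-32-2/hasegawa/data")

# The five modalities listed on this page.
MODALITIES = {"text", "telephone", "wideband", "multimodal", "biomedical"}


def database_dir(modality, lang, dbname):
    """Build DATA_ROOT/MODALITY/LANG/DBNAME (hypothetical helper)."""
    if modality not in MODALITIES:
        raise ValueError(f"unknown modality: {modality!r}")
    # LANG is a three-letter ISO code, e.g. "eng".
    if len(lang) != 3 or not lang.isalpha():
        raise ValueError(f"expected a three-letter language code, got {lang!r}")
    return DATA_ROOT / modality / lang / dbname


def readme_path(modality, lang, dbname):
    """Path of the per-database README.txt (full name and LDC code)."""
    return database_dir(modality, lang, dbname) / "README.txt"


def catalog_dir(modality, lang, dbname):
    """Directory where the LDC catalog page is often stored."""
    return database_dir(modality, lang, dbname) / "CATALOG"


if __name__ == "__main__":
    # EXAMPLE_DB is a placeholder, not a real database name.
    print(readme_path("telephone", "eng", "EXAMPLE_DB"))
```

Only path strings are built here; nothing is read from disk, so the sketch runs anywhere, but whether a given path exists still has to be checked on the RAID itself.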

A few of the existing databases include:

  • English Telephone Speech
    • Switchboard 1: ifp-32-2, and /workspace/{fluffy1,helmholtz1}/switchboard-1/
    • fluffy1/12hour - 12 hours extracted from Switchboard 1, with SPHERE and WAV audio, MFCCs, transcriptions.
    • fluffy1/{train-ws96,train-ws97,misc-ws97} - The ICSI phonetically transcribed Switchboard-1 extracts
    • Switchboard 2: ifp-32-2, and spot0/switchboard-2
    • Hub 5 eval: ifp-32-2, and fluffy1/{hub5,hub5_eval}
    • Callhome: ifp-32-2, and mickey0/english_callhome
    • NTIMIT: ifp-32-2, and fluffy1/ntimit-{train,test}
  • Chinese Telephone Speech
    • ifp-32-2, and mickey0/Mandarin_callhome
  • Spanish Telephone Speech
    • ifp-32-2, and nibbler0/speech_data/Spanish_callhome
  • Hindi Telephone Speech
    • ifp-32-2, and nibbler0/speech_data/Hindi_callfriend
  • English Broadband Speech
    • Broadcast News (Hub 4): ifp-32-2, and {mickey1,tico0,nibbler1}/BN97
    • mickey1/HUB5E_98 - HUB5 NIST competition Broadcast News data
    • TIDIGITS: ifp-32-2, and /workspace/tidigits
    • Kidspeech: ifp-32-2, and fletcher1/kidspeech
    • Radio Speech Corpus: ifp-32-2
    • fletcher1/bdc - The Boston Directions Corpus; two speakers have prosodic transcriptions, the others do not
    • tico0/TED and ifp-32-2 - Translanguage English Database, presentations at EUROSPEECH given by native and non-native English speakers
  • Chinese Broadband Speech
    • Mandarin Broadcast News: ifp-32-2 and nibbler1/MBN
    • nibbler0/data/ylzheng/WS04/DATA - Tsinghua Wu-accented Mandarin (MFCC and FMT only, no waveforms)
  • Audiovisual and Multimicrophone Data
    • Meeting room data: ifp-32-2/multimodal/eng, and fluffy0/data/{icsi_mr,isl_meeting}_transcr
    • AVICAR: mickey1/AVICAR_DIST, also available via sftp to ifp-31.ifp.uiuc.edu
    • Universal Access Corpus (talkers with cerebral palsy): rizzo1/speech_hearing, also available via sftp to ifp-31.ifp.uiuc.edu
  • Biomedical Image Data
    • X-Ray Microbeam: mickey0/UW_XRAY_MICROBEAM and ifp-32-2
    • {speech_web/mri,http://www.isle.uiuc.edu/mri} - much 3D MRI of vowels, a little fast 2D MRI of the alphabet, and a little high-res 3D MRI of excised tongues
  • Time-aligned Switchboard Disfluency corpus
    • mickey0/sw_disTime-0.9.9 - merged from the original Switchboard time transcription and the Treebank-3 disfluency transcription (TextGrid included)
    • mickey0/sw_disTime-1.0.0 (TextGrid NOT included)
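
Several entries above use shell-style brace shorthand, such as fluffy1/ntimit-{train,test}. The sketch below shows one way to expand such an entry into concrete paths. It assumes that entries without a leading slash are relative to the /workspace mount point, as the full paths in the list suggest; the function names are hypothetical, and only non-nested brace groups are handled.

```python
import re


def expand_braces(pattern):
    """Expand non-nested {a,b,c} groups, left to right (illustrative helper)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    results = []
    for option in match.group(1).split(","):
        # Recurse so that any later brace groups in the tail are expanded too.
        results.extend(expand_braces(head + option + tail))
    return results


def resolve(entry, mount="/workspace"):
    """Turn a list entry into absolute paths.

    Assumption: relative entries live under `mount`. Absolute paths and
    URLs are passed through unchanged.
    """
    paths = []
    for path in expand_braces(entry):
        if path.startswith("/") or "://" in path:
            paths.append(path)
        else:
            paths.append(f"{mount}/{path}")
    return paths


if __name__ == "__main__":
    for p in resolve("fluffy1/ntimit-{train,test}"):
        print(p)
```

In an interactive shell, bash performs the same expansion on its own, so the helper is mainly useful inside scripts that read entries from this page.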

Applications

  • Hidden Markov Models:
    • HTK (Cambridge): fluffy0/programs/htk-3.3
    • DCD (ATT): nibbler0/speech_apps/dcd-2.0
  • Dynamic Bayesian Nets/Graphical Models: nibbler0/speech_apps/GMTK
  • Language Models: fluffy0/programs/srilm
  • Finite State Machines:
    • FSM (ATT): fluffy0/programs/fsm-4.0
    • FST (MIT): fluffy0/programs/fst-1.0-RC1
    • OpenFst: fluffy0/programs/OpenFst/
  • Support Vector Machines:
    • SVMLIB (NJTU): fluffy0/programs/svmlib
    • svm_light (Joachims): fluffy0/programs/svm_light
    • PVTK (UIUC): mickey0/SVM/PVTK
  • Neural Nets
    • quicknet (ICSI): mickey0/quicknet
  • Spectrograms and Waveform Viewing
    • XKL (MIT): nibbler0/speech_apps/xkl-2.3.1
    • ESPS (Entropic Systems, now Microsoft)
    • Praat
  • Speech Data File Formats:
    • SPHERE (NIST): fluffy0/programs/sphere
    • sox (linux): /usr/bin/sox
    • HCopy (Cambridge): see HTK

Backups

If you have personal working directories outside your home directory that should be backed up regularly, list them here.

  • Art
    • mickey0/akantor
    • rizzo1/akantor is itself a backup of svn because it cannot be backed up in the normal way.
  • Sarah
    • nibbler0/data
    • rizzo0/sborys
    • spot1/sborys
    • tico0/sborys
  • Bowon
    • mickey1/AVICAR_AUDIO
    • mickey1/AVICAR_DATA
    • mickey1/AVICAR_DIST
    • mickey1/AVICAR_DIST_OLD
    • rizzo1/bowonlee
    • mickey0/bowonlee
  • Mital
    • mickey0/magandhi
  • Xiaodan
    • spot1/xzhuang2/newbaseline
    • spot1/xzhuang2/workshop
    • c1-15/hasegawa/xzhuang2*
    • /workspace/tico0/AED/
  • Rajiv
    • scratch/rreddy