Computer Resources

From SpeechWiki

Revision as of 17:02, 22 March 2006

LVCSR at Illinois Computer Resources

This page has two main purposes: (1) to help local users keep track of databases, since disk management is decentralized and therefore messy; and (2) to specify which directories are archival and which are temporary.

Directories listed below are accessible only to ISLE/IFP registered users.

Databases and Applications

These directories have been backed up. They will not necessarily be backed up again unless Paritosh is specifically asked to do so. All paths are relative to the /workspace mount point.

  • English Telephone Speech
    • {helmholtz1/Switchboard,fluffy1/switchboard-1} - Switchboard 1 audio, orthographic transcriptions, and e-speech prosody predictions.
    • fluffy1/12hour - 12 hours extracted from Switchboard 1, with SPHERE and WAV audio, MFCCs, transcriptions.
    • fluffy1/{hub5,hub5_eval} - The NIST hub5 training and test data
    • fluffy1/{train-ws96,train-ws97,misc-ws97} - The ICSI phonetically transcribed Switchboard data
    • mickey0/english_callhome - Conversations among family and friends
    • fluffy1/ntimit-{train,test} - The NTIMIT read speech corpus, passed through a telephone channel
  • Chinese Telephone Speech
    • mickey0/Mandarin_callhome
  • English Broadband Speech
    • mickey1/BN97 - Broadcast News
    • tidigits - Isolated Digits and Phone Numbers
  • Chinese Broadband Speech
    • nibbler1/MBN - Mandarin Broadcast News
    • nibbler0/data/ylzheng/WS04/DATA - Tsinghua Wu-accented Mandarin (MFCC and FMT only, no waveforms)
  • Audiovisual and Multimicrophone Data
    • fluffy0/data/{icsi_mr,isl_meeting}_transcr - Transcriptions of meeting room data from ICSI and ISL. Audio and (remote camera) video are available but not online
    • mickey1/AVICAR_DIST - 4-camera, 8-microphone recording of read sentences, phone numbers, and isolated letters in a moving car
  • Children
    • fletcher1/kidspeech
  • Prosodically Transcribed Speech
    • Radio_Speech_Corpus - The Boston University Radio Speech Corpus, seven speakers with ToBI transcription
    • Boston Directions Corpus: fletcher1/bdc
  • Audiovisual
  • Biomedical Image Data
    • mickey0/UW_XRAY_MICROBEAM - point tracking data, 100Hz sampling, obtained using X-ray microbeam
    • {speech_web/mri,http://www.isle.uiuc.edu/mri} - a large amount of 3D MRI of vowels, a small amount of fast 2D MRI of the alphabet, and a small amount of high-res 3D MRI of excised tongues
  • Programs
    • Finite State Machines: {fluffy0/programs,nibbler0/speech_apps}/fsm-4.0 (ATT), {fluffy0/programs,nibbler0/speech_apps}/fst-1.0-RC1 (MIT)
    • HTK: fluffy0/programs/htk-3.3
    • SPHERE C/C++ headers: fluffy0/programs/sphere
    • SRILM language modeling toolkit: fluffy0/programs/srilm
    • SVM tools: {fluffy0/programs,nibbler0/speech_apps}/svmlib, {fluffy0/programs,nibbler0/speech_apps}/svm_light, mickey0/SVM/PVTK
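
The listings above write several directories at once in a form that looks like shell brace expansion, for example fluffy1/ntimit-{train,test}. The following is a minimal Python sketch of how such an entry could be turned into absolute paths under the /workspace mount point. The function names expand_braces and workspace_paths are hypothetical, and the sketch assumes that braces are never nested and that an entry containing "://" is a URL rather than a directory.

```python
import posixpath
import re

# Mount point stated on this page; all listed paths are relative to it.
WORKSPACE = "/workspace"

# Matches one {a,b,...} group that contains no nested braces (an assumption).
_BRACES = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern):
    """Expand shell-style {a,b} alternatives into a list of strings."""
    match = _BRACES.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    results = []
    for option in match.group(1).split(","):
        # Recurse so that entries with more than one brace group also expand.
        results.extend(expand_braces(head + option + tail))
    return results


def workspace_paths(entry):
    """Return absolute paths for one listing entry; URLs are passed through."""
    paths = []
    for item in expand_braces(entry):
        if "://" in item:
            paths.append(item)  # assumed to be a URL, not a directory
        else:
            paths.append(posixpath.join(WORKSPACE, item))
    return paths


if __name__ == "__main__":
    for path in workspace_paths("fluffy1/ntimit-{train,test}"):
        print(path)
```

Run on the NTIMIT entry, this prints the two directories that the brace form abbreviates, each prefixed with /workspace.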

Backups

If you have personal working directories that should be regularly backed up, outside of your own home directory, list them here.

  • Art
    • mickey0/akantor
    • rizzo1/akantor - itself a backup of svn, kept because svn cannot be backed up in the normal way
  • Sarah
    • nibbler0/data
    • rizzo0/sborys
    • spot1/sborys
    • tico0/sborys
  • Bowon
    • mickey1/AVICAR_AUDIO
    • mickey1/AVICAR_DATA
    • mickey1/AVICAR_DIST
    • mickey1/AVICAR_DIST_OLD
    • rizzo1/bowonlee
    • mickey0/bowonlee
  • Mital
    • mickey0/magandhi