Computer Resources

From SpeechWiki

  • Data Organization README: http://www.isle.uiuc.edu/data/README.txt

Revision as of 00:18, 16 June 2008

LVCSR at Illinois Computer Resources

  • Data not yet moved to ifp-32-2:
    • Switchboard 1: /workspace/{fluffy1,helmholtz1}/switchboard-1/ contain some data transformations not on ifp-32-2
    • fluffy1/12hour - 12 hours extracted from Switchboard 1, with SPHERE and WAV audio, MFCCs, transcriptions.
    • fluffy1/{train-ws96,train-ws97,misc-ws97} - The ICSI phonetically transcribed Switchboard-1 extracts
    • fletcher1/bdc - The Boston Directions Corpus; two speakers have prosodic transcriptions, the others do not
    • Mandarin Broadcast News: nibbler1/MBN contains some transformations not on ifp-32-2
    • nibbler0/data/ylzheng/WS04/DATA - Tsinghua Wu-accented Mandarin (MFCC and FMT only, no waveforms)
    • AVICAR: mickey1/AVICAR_DIST, also available via sftp to ifp-31.ifp.uiuc.edu
    • Universal Access Corpus (talkers with cerebral palsy): rizzo1/speech_hearing, also available via sftp to ifp-31.ifp.uiuc.edu
    • {speech_web/mri,http://www.isle.uiuc.edu/mri} - mostly 3D MRI of vowels, plus a little fast 2D MRI of the alphabet and a little high-res 3D MRI of excised tongues
  • Time-aligned Switchboard Disfluency corpus
    • mickey0/sw_disTime-0.9.9 - merged from the original Switchboard time transcription and the Treebank-3 disfluency transcription (TextGrid included)
    • mickey0/sw_disTime-1.0.0 (TextGrid NOT included)
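
Several entries above use shell-style brace shorthand, where one line stands for several directories. As a minimal sketch, the helper below expands that shorthand into explicit paths. The function name `expand_braces` is hypothetical, and it only handles non-nested brace groups.

```python
import itertools
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand non-nested {a,b,c} groups into every combination of paths."""
    # re.split with one capture group alternates literal text (even indices)
    # and the comma-separated contents of each brace group (odd indices).
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [p.split(",") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


for path in expand_braces("/workspace/{fluffy1,helmholtz1}/switchboard-1/"):
    print(path)
```

The same call works for entries such as `fluffy1/{train-ws96,train-ws97,misc-ws97}`, which expands to three directories.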

Applications

  • Hidden Markov Models:
    • HTK (Cambridge): fluffy0/programs/htk-3.3
    • DCD (AT&T): nibbler0/speech_apps/dcd-2.0
  • Dynamic Bayesian Nets/Graphical Models: nibbler0/speech_apps/GMTK
  • Language Models: fluffy0/programs/srilm
  • Finite State Machines:
    • FSM (AT&T): fluffy0/programs/fsm-4.0
    • FST (MIT): fluffy0/programs/fst-1.0-RC1
    • OpenFst: fluffy0/programs/OpenFst/
  • Support Vector Machines:
    • SVMLIB (NJTU): fluffy0/programs/svmlib
    • svm_light (Joachims): fluffy0/programs/svm_light
    • PVTK (UIUC): mickey0/SVM/PVTK
  • Neural Nets:
    • quicknet (ICSI): mickey0/quicknet
  • Spectrograms and Waveform Viewing:
    • XKL (MIT): nibbler0/speech_apps/xkl-2.3.1
    • ESPS (Entropic Systems, now Microsoft)
    • Praat
  • Speech Data File Formats:
    • SPHERE (NIST): fluffy0/programs/sphere
    • sox (Linux): /usr/bin/sox
    • HCopy (Cambridge): see HTK
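
As a rough sketch of how the file-format tools above might be used, the snippet below builds a sox command that converts one SPHERE file to WAV. It assumes the installed sox build can read NIST SPHERE input and that the file is not compressed with shorten, which sox may not decode. The file name `sw02001.sph`, the output directory `wav`, and both function names are hypothetical.

```python
import subprocess
from pathlib import Path

SOX = "/usr/bin/sox"  # location given in the list above


def sox_command(sphere_file: str, wav_dir: str) -> list[str]:
    """Build the argument list for converting one SPHERE file to WAV."""
    src = Path(sphere_file)
    dst = Path(wav_dir) / (src.stem + ".wav")
    # sox infers the input and output formats from the file extensions.
    return [SOX, str(src), str(dst)]


def convert(sphere_file: str, wav_dir: str) -> None:
    """Run the conversion; raises CalledProcessError if sox fails."""
    subprocess.run(sox_command(sphere_file, wav_dir), check=True)


# Hypothetical example file; only the command is printed, nothing is run.
print(" ".join(sox_command("data/sw02001.sph", "wav")))
```

Keeping the command construction separate from `convert` makes it possible to inspect or log the command before running it on a whole corpus.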

Backups

If you have personal working directories outside your home directory that should be backed up regularly, list them here.

  • Art
    • mickey0/akantor
    • rizzo1/akantor - itself a backup of the svn repository, which cannot be backed up in the normal way
  • Sarah
    • nibbler0/data
    • rizzo0/sborys
    • spot1/sborys
    • tico0/sborys
  • Bowon
    • mickey1/AVICAR_AUDIO
    • mickey1/AVICAR_DATA
    • mickey1/AVICAR_DIST
    • mickey1/AVICAR_DIST_OLD
    • rizzo1/bowonlee
    • mickey0/bowonlee
  • Mital
    • mickey0/magandhi
  • Xiaodan
    • spot1/xzhuang2/newbaseline
    • spot1/xzhuang2/workshop
    • c1-15/hasegawa/xzhuang2*
    • /workspace/tico0/AED/
  • Rajiv
    • scratch/rreddy
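
Some entries in this list are wildcards (for example `c1-15/hasegawa/xzhuang2*`) and some are absolute paths. As a minimal sketch, the helper below resolves each entry to the directories it currently matches. It assumes relative entries live under `/workspace`, which the list does not state explicitly, and the function name `resolve_backup_entries` is hypothetical.

```python
import glob
import os


def resolve_backup_entries(entries, root="/workspace"):
    """Map each listed entry to the existing directories it matches."""
    resolved = {}
    for entry in entries:
        # Absolute entries are used as-is; relative ones are assumed
        # (not confirmed by the list) to live under `root`.
        pattern = entry if os.path.isabs(entry) else os.path.join(root, entry)
        resolved[entry] = sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
    return resolved


if __name__ == "__main__":
    listed = ["spot1/xzhuang2/workshop", "c1-15/hasegawa/xzhuang2*", "/workspace/tico0/AED/"]
    for entry, matches in resolve_backup_entries(listed).items():
        print(entry, "->", matches if matches else "no matching directory")
```

An entry that resolves to an empty list is a hint that the directory was moved or renamed since it was added here.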