Projects

Here are some projects that [[SST People]] are working on.  For another view, see our [http://www.isle.uiuc.edu/pubs Publications].

===SST Group Meetings===

* [[SST Group Meetings]]

===Phonetics, Phonology, Semantics===

; Prosody and Phonology in Automatic Speech Recognition (Landmark-Based Speech Recognition)
: [[landmarks09F| Group Meeting Schedules and Slides]]
: [http://www.isle.uiuc.edu/research/landmarks.html Landmark-Based Speech Recognition]
: [http://www.isle.uiuc.edu/research/prosody_of_disfluency.html Prosody of Disfluency]
; Very Large Corpus ASR/ Mixed-Units ASR
: [[:Category:Fisher_Experiments|Large Vocabulary speech recognition using mixed units on fisher corpus]]
; [[articulatory_feature_transcription|Articulatory Feature Transcription]]
: [[Transcription_Guidelines|Transcription Guidelines]]
: [[Phone-to-Feature_Mapping|Phone-to-Feature Mapping]]
: [[Meeting_Summaries|Meeting Summaries]]
: [[Resources|Resources]]

===Group Dynamics and Discourse===

; GroupScope --- Dynamics of Medium-Sized Groups
: [[GroupScope]]

===Language Acquisition, Language Contact, Variability, and Disability===

; Multi-Dialect Speech Recognition and Machine Translation for Qatari Broadcast TV
: [[Multi Dialect Arabic]]

; Cross-Language Transfer Learning
: [[Linguistic Diversity References]]
: [http://hlt.i2r.a-star.edu.sg/starchallenge Star Challenge competition]

; Dynamics of Second Language Fluency
: [http://serrano.ai.uiuc.edu/CRI/ Group Meeting Schedules and Slides]
: [http://www.isle.uiuc.edu/research/fluency.html Description]
: [[Dynamics of Second Language Fluency Data Description|Data Description]]
; Universal Access
: [[dysarthria09|Group Meeting Schedules and Slides]]
: [http://www.isle.uiuc.edu/ua/index.html Description]
: [http://www.isle.uiuc.edu/UASpeech UASpeech Database]

===Multimodal Fusion, Speech and Non-Speech===

; Audiovisual Event Detection and Visualization
: [[compaudition09| Group Meeting Schedules and Slides]]
: [[acoustic_events_papers| Papers]]
: [[Visualization Experiments]]

; Mobile Platform Acoustic-Frequency Environmental Tomography (was Dereverberation)
: [[compaudition09| Group Meeting Schedules]]
: [[Dereverberation Project| Project Status and Working Notes]]
; Audiovisual Speech Recognition
: [http://www.isle.uiuc.edu/research/audiovisual.html Description]
: [http://www.isle.uiuc.edu/AVICAR/ AVICAR Database]
; Smaragdis collaboration
: [[Image:Smaragdis-130218.jpg]]
: [[Image:Smaragdis-130311.jpg]]

Pseudocode spec for the sound input class (and also output later, but not read-and-write):

<pre>
class input_t{

    // Definition of stream characteristics
    class specs_t{
        size_t channels;
        double sample_rate;
        enum sample_format;
    };

    //
    // Constructors
    //

    input_t( ??? stream, bool in_or_out, size_t ch, double sr, enum frm)
    {
        switch( stream){
            case "file":    // use ffmpeg
            case "socket":  // use homebrew code?
            case "url":     // use VLC?
            case "adc":     // use PortAudio
            case "dac":     // use PortAudio
        }
    }

    input_t( ??? stream, input_t example);          // copy stream attributes
    input_t( ??? stream, input_t::specs_t example); // copy stream attributes

    // Assignment/copy operators

    //
    // Destructor
    //

    ~input_t();  // bookkeeping with closing file/net/etc.

    //
    // Utilities
    //

    double sample_rate();
    size_t channels();
    enum sample_format();
    bool eof();
    bool();

    //
    // Seeking
    //

    seek( size_t s);  // move to sample frame s
    seek( double t);  // move to second t

    //
    // Reading
    // output should be channels by sample frames
    //

    array<T> &read( size_t n, size_t offset, int channel_mask);  // sample frames
    array<T> &read( double n, double offset, int channel_mask);  // seconds

    //
    // Writing
    //

    write( array<T> &x, size_t offset, int channel_mask);  // sample frames
    write( array<T> &x, double offset, int channel_mask);  // seconds

    write_add( array<T> &x, size_t offset, int channel_mask);  // sample frames
    write_add( array<T> &x, double offset, int channel_mask);  // seconds
};
</pre>
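Since standard C++ cannot switch on a string, the constructor's stream-type dispatch will probably end up as a comparison chain or a small lookup table. Below is a minimal sketch of that dispatch under made-up names (backend_t and pick_backend are placeholders, and the backend notes are only comments, not actual ffmpeg / VLC / PortAudio calls):

<pre>
#include <stdexcept>
#include <string>

// Hypothetical backend tag; the real input_t would hold a handle per backend.
enum class backend_t { file, socket, url, adc, dac };

// Sketch of the constructor's stream-type dispatch.
backend_t pick_backend( const std::string &stream)
{
    if( stream == "file")   return backend_t::file;    // use ffmpeg
    if( stream == "socket") return backend_t::socket;  // homebrew code?
    if( stream == "url")    return backend_t::url;     // use VLC?
    if( stream == "adc")    return backend_t::adc;     // use PortAudio
    if( stream == "dac")    return backend_t::dac;     // use PortAudio
    throw std::invalid_argument( "unknown stream type: " + stream);
}

int main()
{
    return pick_backend( "file") == backend_t::file ? 0 : 1;
}
</pre>

A std::unordered_map<std::string, backend_t> would work just as well; with only five stream types the comparison chain is simpler.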

We are going for a blocking interface instead of cumbersome callbacks for now. The stream parameters when reading can be used to perform on-the-fly resampling and channel remapping. I'm attaching the board doodling in case I missed something.
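To make the channel_mask idea concrete, here is a small sketch (not part of the spec) that pulls the masked channels out of an interleaved buffer and lays them out channels by sample frames, the layout read() is supposed to return:

<pre>
#include <cstddef>
#include <vector>

// Extract the channels selected by 'mask' (bit i = channel i) from an
// interleaved buffer and return them as channels-by-sample-frames.
std::vector<std::vector<float>> remap_channels(
    const std::vector<float> &interleaved, size_t channels, int mask)
{
    size_t frames = interleaved.size() / channels;
    std::vector<std::vector<float>> out;
    for( size_t c = 0; c < channels; c++){
        if( !( mask & ( 1 << c)))
            continue;
        std::vector<float> chan( frames);
        for( size_t t = 0; t < frames; t++)
            chan[t] = interleaved[t*channels + c];
        out.push_back( chan);
    }
    return out;
}

int main()
{
    // Two interleaved stereo frames: L0 R0 L1 R1; keep only the right channel.
    std::vector<float> x = { 0.1f, 0.2f, 0.3f, 0.4f};
    auto y = remap_channels( x, 2, 0x2);  // y[0] == { 0.2f, 0.4f}
    return ( y.size() == 1 && y[0].size() == 2) ? 0 : 1;
}
</pre>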

We are currently working on getting the code to work for the simple case:

<pre>
main()
{
    input_t in( ...);

    while( in){
        x = in.read( ...);
        y = feature( x);
        plot( y);
    }
}
</pre>
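As a placeholder while input_t and the feature object are being written, here is a self-contained stand-in for this loop. It assumes a raw, mono, 16-bit little-endian PCM file (an assumption for the sketch, not what input_t will do), uses block RMS energy as the stand-in feature, and prints one value per block instead of plotting:

<pre>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

// Stand-in feature: RMS energy of one block of samples.
static double feature( const std::vector<int16_t> &x)
{
    double acc = 0.0;
    for( int16_t s : x)
        acc += double( s) * double( s);
    return x.empty() ? 0.0 : std::sqrt( acc / x.size());
}

int main( int argc, char **argv)
{
    if( argc < 2){
        std::fprintf( stderr, "usage: %s file.pcm\n", argv[0]);
        return 1;
    }

    // Assumed format: raw mono 16-bit little-endian PCM (not the real input_t).
    std::FILE *in = std::fopen( argv[1], "rb");
    if( !in)
        return 1;

    const size_t block = 1024;                 // sample frames per read
    std::vector<int16_t> x( block);
    size_t n;
    while( ( n = std::fread( x.data(), sizeof( int16_t), block, in)) > 0){
        x.resize( n);
        std::printf( "%f\n", feature( x));     // "plot" by printing one value per block
        x.resize( block);
    }

    std::fclose( in);
    return 0;
}
</pre>

Once input_t and the feature object exist, this collapses back to the four-line loop above.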

I'm working on the feature object, and Camille is working on the input object.

==See also==
* [http://www.isle.illinois.edu/sst/pubs/ SST publications]
* [http://www.isle.illinois.edu/sst/ SST group web page]
* [[Special:Upload]]
