Background
From SpeechWiki
Revision as of 22:50, 9 February 2010
There are several motivations for generating a set of articulatory feature-level transcriptions:
- To serve as a reference for measuring feature classifier accuracy
- To train pronunciation models separately from acoustic models
- To study asynchrony and reduction effects
In the past, classifier accuracy has been measured by comparison against a reference phonetic transcription, assuming some mapping from phones to feature values. However, especially for conversational speech, we cannot assume that such a mapping would give us accurate reference feature values; there is too much coarticulation and reduction.
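The phone-based evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the project's evaluation code: the single binary nasality feature, the `PHONE_TO_NASALITY` table, the function names, and the toy frame sequences are all assumptions made for the example.

```python
# Hypothetical phone-to-feature table for one binary feature (nasality).
# Illustrative only; this is not the feature set used in the project.
PHONE_TO_NASALITY = {"m": "+", "n": "+", "b": "-", "d": "-", "ae": "-"}


def phones_to_feature_reference(phone_frames):
    """Derive per-frame reference feature values from per-frame phone labels."""
    return [PHONE_TO_NASALITY[phone] for phone in phone_frames]


def frame_accuracy(hypothesis, reference):
    """Fraction of frames where the classifier output equals the reference."""
    if len(hypothesis) != len(reference):
        raise ValueError("hypothesis and reference must have the same length")
    if not reference:
        raise ValueError("reference must contain at least one frame")
    correct = sum(h == r for h, r in zip(hypothesis, reference))
    return correct / len(reference)


# Toy frame-level phone labels for the word "man" (made-up alignment).
phone_frames = ["m", "m", "ae", "ae", "n", "n"]
reference = phones_to_feature_reference(phone_frames)

# Made-up classifier output in which nasality starts during the vowel,
# as can happen when the vowel is nasalized before the following /n/.
classifier_output = ["+", "+", "-", "+", "+", "+"]

accuracy = frame_accuracy(classifier_output, reference)
print(f"frame accuracy against phone-derived reference: {accuracy:.3f}")
```

In this toy case the fourth frame is counted as an error, even though a nasalized vowel may be exactly what was produced. The phone-derived reference cannot represent that, which is the limitation a feature-level transcription is meant to address.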
We are not aware of any data set that has been labeled at the feature level. There are, of course, some corpora of measured articulation, such as MOCHA or the Wisconsin X-ray microbeam database. These could also be used, but the mapping from measurements to feature values is non-trivial, and often the measurements do not include some important information, such as nasality. This motivates us to generate this new data set.
Our earlier work is published in "Manual transcription of conversational speech at the articulatory feature level" (Karen Livescu et al., ICASSP, 2007), which reports the data collection and analysis for a small set of data from Switchboard (78 SVitchboard and 9 STP utterances).
Currently, we are extending our previous work, aiming to achieve the following goals:
- Transcribing additional data (Switchboard and/or Buckeye)
- Updating the transcription protocol and interface
- Making a version of the interface that can be used online across sites
- Analyzing and statistically modeling the data