Meeting Summaries

From SpeechWiki

May 28, 2010: TTI

May 27, 2010: UIUC

May 24, 2010: UIUC

  • Finished two utterances: sw3070_0035, sw3083_0043
  • Vowel identification was tricky for 'eh' vs. 'ih'. We also had an example of the reduced vowel 'ax' lasting for a substantially long portion (longer than 100 ms).

May 21, 2010: UIUC

  • Finished three utterances: sw3047_0046, sw3052_0002, sw3068_0023
  • A good (or challenging) example of reduction in sw3068_0023: "I [see what you're] saying"

May 18, 2010: UIUC

  • Finished three utterances: sw3021_0044, sw3034_0087, sw3040_0058

May 14, 2010: TTI

May 10, 2010: UIUC

  • Finished two utterances: sw3015_0011, sw3015_0078.

May 3, 2010: UIUC

  • Official transcription at the UIUC site has started. Two utterances were discussed in this meeting: sw3002_0046, sw3011_0090.

April 23, 2010: UIUC

  • Finished two utterances: sw2065_0095, sw2064_0060
  • Encountered examples of word-final [t] that sounded glottalized.

April 21, 2010: UIUC

  • Karen and Heejin had a 2nd round meeting for the MP's practice utterances.

April 19, 2010: UIUC

  • Heejin and MP had a practice meeting. There were several cases where we decided to mark RHO on pl2, instead of having a separate [r] segment.

April 16, 2010: UIUC

  • We went over 0088, 0029 and 0041.
  • Some of the challenging segments were:
    • 0088
      • For the last portion of the word SHAME, there is no clear [m] in the spectrogram, but we hear a closing action of the lips. We decided to put +nas and LIP-APP on the vowel segment, with no separate [m] segment.
    • 0041
      • For [f] in 'FINDING', there is a strong burst toward the end of the segment; we decided not to mark it separately.
      • In PROPER, there is some kind of hesitation before the word. Do we have to label it?
      • An example of LAB-FRIC
    • 0029
      • In I (the first one), the quality of the offset sounds non-canonical (i.e., ix). So we label it as aa-ix (instead of ay1-ix or ay1-ay2).

April 16, 2010: TTI

  • Two transcribers finished comparing 2040_0023: a major difference was about rhoticism on some of the vowels.

April 9, 2010: TTI

  • finished sw2041_0015, sw2051_0024, sw2057_0088
  • Guidelines for BUR+ASP marking were reviewed again since we were still encountering errors. We also reviewed some examples of different types of stop bursts in English; the data came from Heejin's lab recording of a female native speaker of English.
  • We discussed if it would be necessary to add LAB-APP on pl2-dg2 for +rnd sounds: e.g., 'deali[ng w]ith' in 0015. If there is evidence of even narrower lip constriction, we would, but the evidence could be very subtle. We will test this issue as we label more utterances.
  • We dealt with a lot of [l]s, and discussed if they were CLO vs. APP, and if they were velarized (if so, VEL-APP should be added on pl2-dg2).

April 2, 2010: UIUC

  • We have formed a new transcription group at the UIUC site: transcriber 1 (MP) and transcriber 2 (HK).
  • In this meeting, we had a practice session for a new transcriber (MP).
  • MP got the transcription interface ready on her laptop, and we went over basic features and transcription procedures together.
  • Practiced labeling sw2019A-ws96-i-0088 and sw2006A-ws96-i-0037

March 26, 2010: TTI

  • finished sw2039_0075 and sw2040_0011, and half-way through with sw2040_0023

March 12, 2010: TTI

  • This was our first official transcription meeting.
  • We finished two utterances: sw2024_0003, sw2038_0023
  • In this meeting, we encountered several examples of APP for /v/ and /b/.
  • For a canonical [w], we decided to add VER and APP for pl2 and dg2, respectively.
  • When a segment has short alternating intervals of voicing features such as IRR - VOI - IRR - VOI, we decided to label the whole area as IRR.
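
The IRR/VOI decision above could be applied mechanically along the following lines. This is only a sketch, not one of the project's scripts: the `Interval` class, the function name `collapse_irr_voi`, the 30 ms cutoff for a "short" interval, and the minimum run length of three are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float  # seconds
    end: float    # seconds
    label: str    # glo value, e.g. "VOI", "IRR", "VL"


def collapse_irr_voi(intervals, max_dur=0.03):
    """Merge runs of short, alternating IRR/VOI intervals into one IRR interval.

    max_dur (hypothetical cutoff) is the longest duration still counted as "short".
    """
    out = []
    i = 0
    while i < len(intervals):
        j = i
        # Extend the run while labels are IRR/VOI, short, and alternating.
        while (j < len(intervals)
               and intervals[j].label in ("IRR", "VOI")
               and intervals[j].end - intervals[j].start <= max_dur
               and (j == i or intervals[j].label != intervals[j - 1].label)):
            j += 1
        run = intervals[i:j]
        if len(run) >= 3:
            # Three or more alternating pieces: label the whole area IRR.
            out.append(Interval(run[0].start, run[-1].end, "IRR"))
            i = j
        else:
            out.append(intervals[i])
            i += 1
    return out
```

Longer stretches of VOI or IRR are left untouched, so only the rapidly alternating regions are relabeled.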

February 26, 2010: Practice week 6

Assigned utterances

  • sw2012A-ws96-i-0012
  • sw2020B-ws96-i-0029 ("I I really I have I have strong objections to that")
  • sw2015A-ws96-i-0026 ("peddling products if I wanted their products I would...")

Discussed utterances

  • finished sw2012A-ws96-i-0012
  • finished up to 'strong' in sw2020B-ws96-i-0029 ("I I really I have I have strong objections to that")

Discussion

  • This was our last practice meeting! (We will finish 0029 in the next meeting, though, because this utterance contains interesting segments such as 'strong'.)
  • We will begin our official transcription with Set 1 (14 STP utterances). Utterances were picked from the STP train data by Heejin: all utterances are 5 to 14 words long, as in the WS06 study.
  • Transcribers will transcribe utterances in ascending order of file name (i.e., start with 'sw2024_0003.wav'). Transcribers are encouraged to do as many utterances as they can before the next meeting (March 12).
  • Issues discussed in this meeting (mostly for clarification before moving to our official transcription):
    • Diphthongs
      • When diphthongs are realized as monophthongs, they should be labeled as monophthongs; e.g., "aw1" should be labeled as 'ae' rather than 'aw1' when there is no offset.
      • When a diphthong is realized as a diphthong but parts of it are non-canonical, diphthong labels should be used for the canonical portion; e.g., when /aɪ/ is realized as something like [ai], transcribe it as [ay1 iy] instead of [aa iy].
      • If a diphthong is entirely non-canonical, mark it as it should be in terms of phonetic correspondence.
      • If it is difficult to tell where the offglide of a diphthong begins, then mark it in the middle of the transition.
    • Convention for BUR followed by ASP
      • For aspirated obstruents like in the word "invi[tʰ]ation" (0012), do labeling in the following sequence:
        • Label the burst portion as 't' in the tier 2 and do phone-to-gesture conversion.
        • If you see a clear distinction between burst and aspiration, mark the aspirated portion as APP in the tier 4 (dg1) and ASP in the tier 8 (glo). You don't have to change other features such as place.
        • If voicing extends into the ASP section, then label it A+VO (aspirated with voice) as necessary.
    • A reminder about when to use A+VO
      • voiced [h]
      • aspirated vowels/liquids/glides
      • aspirated part of stop burst when voicing for the following vowel starts prior to the onset of vowel (This is newly added from the above discussion).
    • Creaky voice vs. IRR
      • We have noticed that low-pitched voices have creaky voice almost always, and we want to clarify how to deal with creaky voices.
      • Creaky voice should be labeled as VOI as long as intervals between pulses are regular.
      • Creaky voice receives the IRR label ONLY if a segment shows irregular spacing. That is, it is NOT the case that creaky voices are always marked distinctively from others; this is mostly based on the fact that creaky voicing isn't phonemically significant in English.
    • H#
      • H# in the beginning and end of file: Label as SIL
      • For H# in the middle of file, decide whether it is SIL or xxx (non-speech), and mark it accordingly.
    • [l]
      • The default dg1 is CLO, but make sure to check whether it has full closure or is just an approximant (in the latter case, transcribers should manually change CLO to APP).
    • [hh]
      • Remember that [hh] has features just like a vowel, except ASP for the glo feature. Rounding and vowel quality are contextual, so transcribers should fill in manually.
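
The "Creaky voice vs. IRR" rule above depends only on whether the intervals between glottal pulses are regular. A minimal sketch of that test follows, assuming pulse times (in seconds) have already been marked by hand or by some pitch tool. The function name `glo_label` and the 0.2 threshold on the coefficient of variation are hypothetical choices, not values agreed in the meeting.

```python
from statistics import mean, pstdev


def glo_label(pulse_times, max_cv=0.2):
    """Return "VOI" for regularly spaced pulses and "IRR" for irregular spacing.

    max_cv is an assumed threshold on the coefficient of variation
    (standard deviation / mean) of the inter-pulse intervals.
    """
    periods = [b - a for a, b in zip(pulse_times, pulse_times[1:])]
    if len(periods) < 2:
        # Too few pulses to judge regularity; keep the default voiced label.
        return "VOI"
    cv = pstdev(periods) / mean(periods)
    return "IRR" if cv > max_cv else "VOI"
```

Under this test, a low-pitched creaky stretch with evenly spaced pulses still comes out as VOI, which matches the convention that creak alone does not trigger IRR.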

February 19, 2010: Practice week 5

Assigned utterances (will discuss in the following order)

  • will finish sw2019A-ws96-i-0071 ("but she is a great comfort to me")
  • sw2012A-ws96-i-0012 ("right it turned out to be an invitation")
  • sw2019A-ws96-i-0088 ("aw what a shame")
  • sw2020B-ws96-i-0029 ("I I really I have I have strong objections to that")
  • sw2015A-ws96-i-0026 ("peddling products if I wanted their products I would...")

Discussed utterances

  • finished sw2019A-ws96-i-0071 ("but she is a great comfort to me")
  • finished sw2019A-ws96-i-0088 ("aw what a shame")
  • half way through sw2012A-ws96-i-0012 ("right it turned out to be an invitation")

Discussion

  • If transcribers want to flag a non-canonical segment for discussion, use '!' (instead of '*') in the phone tier.

February 5, 2010: Practice week 4

Assigned utterances

  • no new utterances assigned

Discussed utterances

  • finished sw2006A-ws96-i-0037 ("everybody has their home phone number type of job")
  • finished up to the word 'comfort' in sw2019A-ws96-i-0071

Discussion

  • pl1/dg1 vs. pl2/dg2
    • We decided to keep the more forward constriction in pl1/dg1 unless doing so breaks up the intervals: e.g., a segment has RHO in pl1, while the following segment has the simultaneous constrictions LAB and RHO -> In this case, we put LAB in the pl2 so that RHO can be put in one interval (i.e., pl1).
    • We decided to revive the distinction between pl1/dg1 vs. pl2/dg2 because we find it harder to compare the two transcribers' tiers when they put the simultaneous constrictions in different orders. (We considered an option in which we do not force the transcribers to do this themselves but rather sort these features later. However, this could be a problem when the end boundaries of the intervals to be sorted are not perfectly time-aligned.)
  • Our transcription procedures will be streamlined as follows:
    • 1) Heejin assigns utterances through our wiki page: Heejin will distribute pairs of .wav and .textgrid.
    • 2) L1 and L2 transcribe an utterance, using the Praat interface and remove scripts.
    • 3) When transcribers finish the whole utterance of a file, they run the script check_correctness.
      • This script prints out three kinds of mistakes: 1) * (unspecified feature), 2) empty intervals, and 3) violations of the vowel rule.
      • Transcribers fix the mistakes and run the script again until no mistakes remain.
      • This process is necessary to avoid spending our discussion time on correcting mistakes.
    • 4) Transcribers send the finished textgrid file to Heejin.
      • Assuming that we will have biweekly meetings on Fridays, transcribers send finished utterances to Heejin by Wednesday night, two days before the meeting.
    • 5) Heejin will run the 'merge_labels' and 'merge_files' scripts.
      • This process is meant to make our discussion more efficient: e.g., a comparison window is more readable.
    • 6) Meeting: Transcribers revise their original textgrid files, if any change is made during discussion.
    • 7) Transcribers send the revised version to Heejin.
    • 8) Heejin runs the 'merge_labels' and 'merge_textgrids' scripts again to produce finalized transcriptions for the 1st round.
    • 9) 2nd round?
    • 10) Heejin and Karen do inter-transcriber agreement analysis.
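
To illustrate the kind of check described in step 3, here is a rough sketch of the first two tests (unspecified features and empty intervals). It is not the actual check_correctness script: the function name `check_textgrid` and the data layout (a dict mapping tier names to lists of `(start, end, label)` tuples) are assumptions, and the vowel-rule check is left out because the rule is not spelled out in these notes.

```python
def check_textgrid(tiers):
    """Return a list of (tier, start, end, problem) tuples.

    tiers: dict mapping a tier name to a list of (start, end, label) tuples
    (an assumed in-memory layout, not the Praat TextGrid format itself).
    """
    problems = []
    for name, intervals in tiers.items():
        for start, end, label in intervals:
            text = label.strip()
            if not text:
                problems.append((name, start, end, "empty interval"))
            elif "*" in text:
                problems.append((name, start, end, "unspecified feature (*)"))
    return problems


if __name__ == "__main__":
    example = {
        "pl1": [(0.0, 0.1, "LAB"), (0.1, 0.2, "*")],
        "glo": [(0.0, 0.2, "")],
    }
    for problem in check_textgrid(example):
        print(problem)
```

A transcriber would rerun a check like this after each round of fixes until the returned list is empty.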


January 29, 2010: Practice week 3

Assigned utterances

  • sw2015A-ws96-i-0026
  • sw2019A-ws96-i-0088
  • sw2020B-ws96-i-0029

Discussed utterances

  • finished sw2005A-ws96-i-0041

Discussion

  • Add a new symbol 'xxx' for marking any non-speech regions (e.g., inhalation, laughter)
    • Update the two scripts ('gesture.man', and 'convert.praat') accordingly.
  • Additional revisions to the Praat interface script.
    • We have noticed that 'directory.txt' gets a new line inserted on some machines or OSes (we cannot predict when it happens). Thus we decided to remove 'directory.txt'. Instead, transcribers will update the directory directly in the two scripts, i.e., 'gesture-init-button.praat' and 'load-gesture-manual.praat'.
  • Transcribers find that when they change the location of a boundary and have new boundaries inserted using the conversion function, they need to remove previous boundaries in multiple tiers. It might be helpful to have a script that removes such boundaries all at once. Heejin will write a script and transcribers will evaluate its efficiency.

January 22, 2010: Practice week 2

Assigned utterances

  • sw2010B-ws96-i-0034
  • sw2012A-ws96-i-0012
  • sw2019A-ws96-i-0071

Discussed utterances

  • finished sw2005B-ws96-i-0015
  • finished sw2010B-ws96-i-0034

Discussion

  • Distinguish non-canonical segments from canonical ones in the phone tier.
    • Whenever a segment is non-canonical, use a '*' in the phone tier. Feel free to use other symbols too, e.g. 'uw/ux*', as long as there's a '*' in there somewhere to flag the segment for later.
  • Where to put the boundary between a glide and a vowel.
    • When there is a smooth transition between a glide and a vowel, put a boundary in the middle of the transition region.
  • Several changes to stops.
    • A new value BUR is added to the degree feature tiers for stop bursts. Any canonical p, t, k, b, d, g will have dg1 = BUR. This is to avoid the situation where canonical stop bursts have the same features as fricatives.
    • All stop bursts will be canonically voiceless (glo = VL). Voiced vs. voiceless stops will only be distinguished by their VOTs (as it should be).
    • Canonical stop closures pcl, tcl, kcl, bcl, dcl, gcl will stay as before: canonically, glo = VOI for voiced stops, and glo = VL for voiceless stops. However, if you don't see a voice bar, that should of course be changed to glo = VL.
    • The vowel after a stop burst starts when the upper formant structure starts, not at the beginning of voicing. The onset of glo = VOI is still to be marked at the actual onset of voicing, i.e. at the first sign of periodicity in the signal.
  • Make new utterances available to transcribers by Sunday night.
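
The changes to stops listed above amount to a small lookup from phone label to default feature values. The sketch below is one way to write that down, covering only the dg1 and glo values stated in these notes. The function name `canonical_stop_features` and the `voice_bar_visible` flag are hypothetical, and the actual phone-to-gesture conversion script fills in more tiers than shown here.

```python
BURSTS = {"p", "t", "k", "b", "d", "g"}
VOICELESS_CLOSURES = {"pcl", "tcl", "kcl"}
VOICED_CLOSURES = {"bcl", "dcl", "gcl"}


def canonical_stop_features(phone, voice_bar_visible=True):
    """Return the default dg1/glo values for a stop burst or closure label."""
    if phone in BURSTS:
        # All bursts are canonically voiceless; voicing is carried by VOT.
        return {"dg1": "BUR", "glo": "VL"}
    if phone in VOICELESS_CLOSURES:
        return {"glo": "VL"}
    if phone in VOICED_CLOSURES:
        # Canonically voiced, but changed to VL when no voice bar is seen.
        return {"glo": "VOI" if voice_bar_visible else "VL"}
    raise KeyError(f"not a stop burst or closure label: {phone}")
```

For example, `canonical_stop_features("b")` gives the same values as `canonical_stop_features("p")`, reflecting the decision that voiced and voiceless stops differ only in their closures and VOTs.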

January 15, 2010: Practice week 1

Assigned utterances

  • sw2005A-ws96-i-0041
  • sw2005B-ws96-i-0015
  • sw2006A-ws96-i-0037

Discussed utterances

  • half way through sw2005B-ws96-i-0015

Discussion

  • We tried the new Praat interface; major components are 1) automatic creation of TextGrid files, 2) a function of label insertion (instead of typing), 3) automatic phone-to-gesture conversion for canonical phones. We think that this interface works better than the previous WaveSurfer interface.
  • We will continue going through the STP utterances that were used for the WS06 transcriptions for practice. Then once things are set, we will do additional utterances from the STP "TRAIN" set, which will be our first set of "official" new transcriptions.

January 6, 2010: Planning meeting

  • New transcribers tried to do articulatory transcriptions for two utterances (i.e., everybody_KL.wav; everybody_2006.wav) independently, using the existing WaveSurfer interface, and shared their experiences at the meeting.
  • Our goal will be to transcribe roughly 25 minutes (equivalent to ~300 Switchboard-sized utterances) in the winter and spring quarters. This assumes that we will have at least 4 transcribers, that each utterance will be transcribed by at least 2 transcribers, and that each transcriber can do ~30 seconds of speech per week. (With some variation among transcribers, as they may be available different numbers of hours per week.)
  • We will use utterances from Switchboard and/or Buckeye. Buckeye is cleaner audio, has more contiguous data per speaker, and possibly has fewer non-speech artifacts. On the other hand, Switchboard is larger and more often used in speech recognition experiments. We'll stick to Switchboard for the next batch of practice utterances, then decide.
  • We will try switching to Praat. Heejin will put together the Praat configuration and send us instructions.
  • We think we can get all of the functionality of the WaveSurfer interface in Praat, except possibly the colors in different tiers and the multiple spectrograms. We think one spectrogram should be fine. The colors are mainly useful when comparing multiple people's transcriptions side by side. Heejin will try to figure out if this can be done in Praat, or else if the tiers can be separated in some other way, e.g. with thick dividing lines or different label font colors.
  • We agreed to make the following changes to the transcription protocol (at least for the next batch of practice utterances; we may revise a bit further after that).
    • For now, we will start with the word transcription given, but not the phone transcription. Everyone seemed to agree that they find it useful to come up with their own phone transcription first (at least mentally), then do the articulatory tiers.
    • We will not do the phone-feature hybrid version, just the all-feature version, of the protocol, for now. However, if possible in Praat, we will allow transcribers to mark a segment as corresponding to a canonical phone, which will be automatically entered as the corresponding features in the multiple tiers. Heejin will try to figure out if this is possible in Praat.
    • The two constrictions in dg1/pl1, dg2/pl2 can be ordered arbitrarily (as opposed to constriction 1 always being the more forward one as before). In order to compare multiple transcribers' work, we'll post-process these tiers to sort them into a consistent order.
  • Karen will change the guidelines to reflect the above, and will add word examples to the phone symbol table.
  • We discussed the possibility of having a web interface for submitting and viewing transcriptions at different sites. In principle, we'd like to allow anyone anywhere to submit transcriptions and even suggest new/modified tiers. We agreed that it would be too big a project to make the entire interface web-based, but that it would be doable to have a site where people submit transcriptions and view a static waveform and spectrogram, along with any transcriptions submitted so far, and can play back segments. Mark volunteered to script this up (in one day, where the one day will occur at an unspecified future time :)
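
As a quick check of the transcription goal stated earlier in this meeting (about 25 minutes, about 300 utterances, 4 transcribers, 2 transcribers per utterance, about 30 seconds of speech per transcriber per week), the arithmetic can be written out as below. The variable names are just for illustration, and the figures are the minimum values quoted in the notes, so the result is a rough estimate rather than a schedule.

```python
total_speech_s = 25 * 60          # goal: roughly 25 minutes of speech
n_utterances = 300                # roughly 300 Switchboard-sized utterances
passes_per_utterance = 2          # each utterance done by at least 2 transcribers
n_transcribers = 4                # at least 4 transcribers
rate_s_per_week = 30              # ~30 seconds of speech per transcriber per week

# Average utterance length implied by the goal.
avg_utterance_s = total_speech_s / n_utterances

# Total transcriber effort, in seconds of speech, and the team's weekly capacity.
effort_s = total_speech_s * passes_per_utterance
capacity_s_per_week = n_transcribers * rate_s_per_week

weeks_needed = effort_s / capacity_s_per_week

print(f"average utterance: {avg_utterance_s:.1f} s")
print(f"weeks needed: {weeks_needed:.1f}")
```

With these numbers the average utterance is 5 seconds long and the goal takes 25 weeks at the minimum staffing and rate.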