Meeting Summaries

From SpeechWiki

Revision as of 16:56, 22 February 2010 by Hkim17 (Talk | contribs)

March 12, 2010

March 26, 2010

February 26, 2010: Practice week 6

Assigned utterances

  • will finish sw2012A-ws96-i-0012
  • sw2020B-ws96-i-0029 ("I I really I have I have strong objections to that")
  • sw2015A-ws96-i-0026 ("peddling products if I wanted their products I would...")

February 19, 2010: Practice week 5

Assigned utterances (will discuss in the following order)

  • will finish sw2019A-ws96-i-0071 ("but she is a great comfort to me")
  • sw2012A-ws96-i-0012 ("right it turned out to be an invitation")
  • sw2019A-ws96-i-0088 ("aw what a shame")
  • sw2020B-ws96-i-0029 ("I I really I have I have strong objections to that")
  • sw2015A-ws96-i-0026 ("peddling products if I wanted their products I would...")

Discussed utterances

  • finished sw2019A-ws96-i-0071 ("but she is a great comfort to me")
  • finished sw2019A-ws96-i-0088 ("aw what a shame")
  • halfway through sw2012A-ws96-i-0012 ("right it turned out to be an invitation")


February 5, 2010: Practice week 4

Assigned utterances

  • no new utterances assigned

Discussed utterances

  • finished sw2006A-ws96-i-0037 ("everybody has their home phone number type of job")
  • finished up to the word 'comfort' in sw2019A-ws96-i-0071

Discussion

  • pl1/dg1 vs. pl2/dg2
    • We decided to keep the more forward constriction in pl1/dg1 unless doing so breaks up the intervals. For example, if a segment has RHO in pl1 and the following segment has simultaneous LAB and RHO constrictions, we put LAB in pl2 so that RHO stays in a single interval (i.e., in pl1).
    • We decided to revive the distinction between pl1/dg1 and pl2/dg2 because it is harder to compare the two transcribers' tiers when they put simultaneous constrictions in different orders. (We considered not requiring the transcribers to do this themselves and instead sorting these features later. However, this could be a problem when the end boundaries of the intervals to be sorted are not perfectly time-aligned.)
  • Our transcription procedures will be streamlined as follows:
    • 1) Heejin assigns utterances through our wiki page: Heejin will distribute pairs of .wav and .textgrid files.
    • 2) L1 and L2 transcribe an utterance, using the Praat interface and the remove scripts.
    • 3) When transcribers finish the whole utterance of a file, they run the script check_correctness.
      • This script prints out three kinds of mistakes: 1) * (unspecified feature), 2) empty intervals, and 3) violations of the vowel rule.
      • Transcribers fix the mistakes and run the script again until no mistakes remain.
      • This process is necessary to avoid spending our discussion time on correcting mistakes.
    • 4) Transcribers send the finished textgrid file to Heejin.
      • Assuming that we will have biweekly meetings on Fridays, transcribers send finished utterances to Heejin by Wednesday night, two days before the meeting.
    • 5) Heejin will run the 'merge_labels' and 'merge_files' scripts.
      • This process is meant to make our discussion more efficient: e.g., a comparison window is more readable.
    • 6) Meeting: Transcribers revise their original textgrid files, if any change is made during discussion.
    • 7) Transcribers send the revised version to Heejin.
    • 8) Heejin runs the 'merge_labels' and 'merge_textgrids' scripts again to produce finalized transcriptions for the 1st round.
    • 9) 2nd round?
    • 10) Heejin and Karen do inter-transcriber agreement analysis.
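
The checks in step 3 can be illustrated with a short sketch. This is not the actual check_correctness script (which runs in Praat); it is a Python illustration under stated assumptions: tiers are held as lists of (start, end, label) tuples, the function name find_mistakes and the tier name "phone" are hypothetical, and '*' is only treated as a mistake outside the phone tier, where it is a deliberate flag for non-canonical segments. The vowel-rule check is left out because the rule is not spelled out on this page.

```python
from typing import Dict, List, Tuple

# One labelled interval: (start_sec, end_sec, label).
Interval = Tuple[float, float, str]


def find_mistakes(tiers: Dict[str, List[Interval]],
                  star_ok_tiers: Tuple[str, ...] = ("phone",)) -> List[str]:
    """List empty intervals and bare '*' labels (hypothetical helper)."""
    problems: List[str] = []
    for name, intervals in tiers.items():
        for start, end, label in intervals:
            text = label.strip()
            where = f"{name} [{start:.3f}, {end:.3f}]"
            if text == "":
                problems.append(f"{where}: empty interval")
            elif text == "*" and name not in star_ok_tiers:
                # Assumption: an unspecified feature is a label that is exactly '*'.
                problems.append(f"{where}: unspecified feature")
    return problems


example = {
    "phone": [(0.0, 0.2, "uw/ux*")],
    "pl1": [(0.0, 0.1, "LAB"), (0.1, 0.2, "*")],
    "glo": [(0.0, 0.2, "")],
}
for line in find_mistakes(example):
    print(line)
```

A transcriber would fix each reported interval and rerun the check until the list is empty, which matches the fix-and-rerun loop described in step 3.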


January 29, 2010: Practice week 3

Assigned utterances

  • sw2015A-ws96-i-0026
  • sw2019A-ws96-i-0088
  • sw2020B-ws96-i-0029

Discussed utterances

  • finished sw2005A-ws96-i-0041

Discussion

  • Add a new symbol 'xxx' for marking any non-speech regions (e.g., inhalation, laughter)
    • Update the two scripts ('gesture.man' and 'convert.praat') accordingly.
  • Additional revisions to the Praat interface script.
    • We have noticed that 'directory.txt' gets a new line inserted on some machines or operating systems (we cannot predict when this happens). Thus we decided to remove 'directory.txt'. Instead, transcribers will update the directory directly in the two scripts, i.e., 'gesture-init-button.praat' and 'load-gesture-manual.praat'.
  • Transcribers find that when they change the location of a boundary and have new boundaries inserted using the conversion function, they need to remove previous boundaries in multiple tiers. It might be helpful to have a script that removes such boundaries all at once. Heejin will write a script and transcribers will evaluate its efficiency.
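
The idea behind the proposed boundary-removal script can be sketched as follows. This is only an illustration in Python, not the Praat script Heejin will write. It assumes each tier is reduced to a list of boundary times in seconds, and the function name remove_boundary and the tolerance value are hypothetical.

```python
from typing import Dict, List


def remove_boundary(tiers: Dict[str, List[float]],
                    time: float,
                    tol: float = 0.001) -> Dict[str, List[float]]:
    """Return a copy of the tiers without boundaries within tol seconds of time."""
    return {
        name: [b for b in boundaries if abs(b - time) > tol]
        for name, boundaries in tiers.items()
    }


tiers = {
    "pl1": [0.10, 0.25, 0.40],
    "dg1": [0.10, 0.2504, 0.40],  # almost, but not exactly, aligned with pl1
    "glo": [0.10, 0.40],          # no boundary near 0.25 in this tier
}
cleaned = remove_boundary(tiers, 0.25)
print(cleaned)
```

The tolerance is there because boundaries that are meant to coincide across tiers may differ by a fraction of a millisecond. The input is left unmodified, so a transcriber can compare the result with the original before saving.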

January 22, 2010: Practice week 2

Assigned utterances

  • sw2010B-ws96-i-0034
  • sw2012A-ws96-i-0012
  • sw2019A-ws96-i-0071

Discussed utterances

  • finished sw2005B-ws96-i-0015
  • finished sw2010B-ws96-i-0034

Discussion

  • Distinguish non-canonical segments from canonical ones in the phone tier.
    • Whenever a segment is non-canonical, use a '*' in the phone tier. Feel free to use other symbols too, e.g. 'uw/ux*', as long as there's a '*' in there somewhere to flag the segment for later.
  • Where to put the boundary between a glide and a vowel.
    • When there is a smooth transition between a glide and a vowel, put a boundary in the middle of the transition region.
  • Several changes to stops.
    • A new value BUR is added to the degree feature tiers for stop bursts. Any canonical p, t, k, b, d, g will have dg1 = BUR. This is to avoid the situation where canonical stop bursts have the same features as fricatives.
    • All stop bursts will be canonically voiceless (glo = VL). Voiced vs. voiceless stops will only be distinguished by their VOTs (as it should be).
    • Canonical stop closures pcl, tcl, kcl, bcl, dcl, gcl will stay as before: canonically, glo = VOI for voiced stops, and glo = VL for voiceless stops. However, if you don't see a voice bar, that should of course be changed to glo = VL.
    • The vowel after a stop burst starts when the upper formant structure starts, not at the beginning of voicing. The onset of glo = VOI is still to be marked at the actual onset of voicing, i.e. at the first sign of periodicity in the signal.
  • Make new utterances available to transcribers by Sunday night.
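
The revised canonical values for stops can be written down as a small lookup. This is a sketch only, in Python rather than the Praat conversion script, and the function name canonical_stop_features is hypothetical. It covers just the dg1 and glo values stated above; other tiers, and dg1 for closures, are omitted because they are not specified in these notes.

```python
VOICED_STOPS = {"b", "d", "g"}
VOICELESS_STOPS = {"p", "t", "k"}
ALL_STOPS = VOICED_STOPS | VOICELESS_STOPS


def canonical_stop_features(phone: str) -> dict:
    """Canonical dg1/glo values for stop bursts and closures (hypothetical helper)."""
    if phone in ALL_STOPS:
        # Bursts: dg1 = BUR, and always voiceless regardless of the stop.
        return {"dg1": "BUR", "glo": "VL"}
    if phone.endswith("cl") and phone[:-2] in ALL_STOPS:
        # Closures: voiced stops are canonically VOI, voiceless stops VL.
        return {"glo": "VOI" if phone[:-2] in VOICED_STOPS else "VL"}
    raise ValueError(f"not a stop burst or closure: {phone!r}")


for phone in ["b", "p", "bcl", "pcl"]:
    print(phone, canonical_stop_features(phone))
```

These are canonical defaults only. As noted above, a voiced closure without a visible voice bar should still be changed to glo = VL by hand.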


January 15, 2010: Practice week 1

Assigned utterances

  • sw2005A-ws96-i-0041
  • sw2005B-ws96-i-0015
  • sw2006A-ws96-i-0037

Discussed utterances

  • halfway through sw2005B-ws96-i-0015

Discussion

  • We tried the new Praat interface; its major components are 1) automatic creation of TextGrid files, 2) a label-insertion function (instead of typing), and 3) automatic phone-to-gesture conversion for canonical phones. We think that this interface works better than the previous WaveSurfer interface.
  • We will continue going through the STP utterances that were used for the WS06 transcriptions for practice. Then once things are set, we will do additional utterances from the STP "TRAIN" set, which will be our first set of "official" new transcriptions.

January 6, 2010: Planning meeting

  • New transcribers tried to do articulatory transcriptions for two utterances (i.e., everybody_KL.wav; everybody_2006.wav) independently, using the existing WaveSurfer interface, and shared their experiences at the meeting.
  • Our goal will be to transcribe roughly 25 minutes (equivalent to ~300 Switchboard-sized utterances) in the winter and spring quarters. This assumes that we will have at least 4 transcribers, that each utterance will be transcribed by at least 2 transcribers, and that each transcriber can do ~30 seconds of speech per week. (There will be some variation among transcribers, as they may be available for different numbers of hours per week.)
  • We will use utterances from Switchboard and/or Buckeye. Buckeye is cleaner audio, has more contiguous data per speaker, and possibly has fewer non-speech artifacts. On the other hand, Switchboard is larger and more often used in speech recognition experiments. We'll stick to Switchboard for the next batch of practice utterances, then decide.
  • We will try switching to Praat. Heejin will put together the Praat configuration and send us instructions.
  • We think we can get all of the functionality of the WaveSurfer interface in Praat, except possibly the colors in different tiers and the multiple spectrograms. We think one spectrogram should be fine. The colors are mainly useful when comparing multiple people's transcriptions side by side. Heejin will try to figure out if this can be done in Praat, or else if the tiers can be separated in some other way, e.g. with thick dividing lines or different label font colors.
  • We agreed to make the following changes to the transcription protocol (at least for the next batch of practice utterances; we may revise a bit further after that).
    • For now, we will start with the word transcription given, but not the phone transcription. Everyone seemed to agree that they find it useful to come up with their own phone transcription first (at least mentally), then do the articulatory tiers.
    • We will not do the phone-feature hybrid version, just the all-feature version, of the protocol, for now. However, if possible in Praat, we will allow transcribers to mark a segment as corresponding to a canonical phone, which will be automatically entered as the corresponding features in the multiple tiers. Heejin will try to figure out if this is possible in Praat.
    • The two constrictions in dg1/pl1, dg2/pl2 can be ordered arbitrarily (as opposed to constriction 1 always being the more forward one as before). In order to compare multiple transcribers' work, we'll post-process these tiers to sort them into a consistent order.
  • Karen will change the guidelines to reflect the above, and will add word examples to the phone symbol table.
  • We discussed the possibility of having a web interface for submitting and viewing transcriptions at different sites. In principle, we'd like to allow anyone anywhere to submit transcriptions and even suggest new/modified tiers. We agreed that it would be too big a project to make the entire interface web-based, but that it would be doable to have a site where people submit transcriptions and view a static waveform and spectrogram, along with any transcriptions submitted so far, and can play back segments. Mark volunteered to script this up (in one day, where the one day will occur at an unspecified future time :)
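
The arithmetic behind the 25-minute goal above can be checked with a few lines. This sketch uses only the figures stated in the goal, taken at their minimum values (exactly 4 transcribers, exactly 2 transcriptions per utterance, 30 seconds per transcriber per week). The variable names are ours.

```python
TOTAL_SPEECH_SEC = 25 * 60          # roughly 25 minutes of speech
N_UTTERANCES = 300                  # ~300 Switchboard-sized utterances
N_TRANSCRIBERS = 4                  # "at least 4 transcribers"
PASSES_PER_UTTERANCE = 2            # each utterance done by at least 2 people
SEC_PER_TRANSCRIBER_PER_WEEK = 30   # ~30 seconds of speech per week each

avg_utterance_sec = TOTAL_SPEECH_SEC / N_UTTERANCES
total_work_sec = TOTAL_SPEECH_SEC * PASSES_PER_UTTERANCE
weekly_capacity_sec = N_TRANSCRIBERS * SEC_PER_TRANSCRIBER_PER_WEEK
weeks_needed = total_work_sec / weekly_capacity_sec

print(f"average utterance length: {avg_utterance_sec:.1f} s")
print(f"speech to transcribe, counting both passes: {total_work_sec} s")
print(f"team capacity: {weekly_capacity_sec} s per week")
print(f"weeks needed: {weeks_needed:.1f}")
```

Under these minimum assumptions the goal works out to about 25 weeks of transcription, at about 5 seconds per utterance. More transcribers or a higher weekly rate would shorten this proportionally.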