Timeshrinking

From SpeechWiki

(Difference between revisions)
Line 19: Line 19:
|-
|-
| 1 || 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1/LATEST.log 53.7%] || [{{FisherPath}}/exp/timeshrink/test/triphoneSingleGausian/unit.tri.timeshrink.1.onSingleGaussian/LATEST.log 78.2%] || baseline rerun exactly as timeshrinking to really make sure it's not getting an unfair advantage  
| 1 || 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1/LATEST.log 53.7%] || [{{FisherPath}}/exp/timeshrink/test/triphoneSingleGausian/unit.tri.timeshrink.1.onSingleGaussian/LATEST.log 78.2%] || baseline rerun exactly as timeshrinking to really make sure it's not getting an unfair advantage  
 +
|-
 +
| 1 || 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1/LATEST.log 54.5%] || || baseline rerun exactly as timeshrinking LM_scale 16 to double check it's tuned.  Should be worse, and it is.
|-
|-
| .6 || .6 || 69.3%  
| .6 || .6 || 69.3%  
Line 34: Line 36:
| 1 || 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1.mlp.lmAdj/LATEST.log 50.6%] ||  || PLP+MLP tandem, LM scale 16
| 1 || 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1.mlp.lmAdj/LATEST.log 50.6%] ||  || PLP+MLP tandem, LM scale 16
|-
|-
-
| .9 || .9 || [{{FisherPath}}/exp/timeshrink/test/.../LATEST.log TR] || [{{FisherPath}}/exp/timeshrink/test/triphoneSingleGausian/unit.tri.timeshrink.point9.mlp.onSingleGaussian/LATEST.log 73.1% ] || PLP+MLP tandem
+
| 1 || 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1.mlp.lmSc20/LATEST.log 49.7%] ||  || PLP+MLP tandem, LM scale 20
 +
|-
 +
| 1 || 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1.mlp.lmSc25/LATEST.log 50.0%] ||  || PLP+MLP tandem, LM scale 25
 +
|-
 +
| .9 || .9 || [{{FisherPath}}/exp/timeshrink/test/triphoneSingleGausian/unit.tri.timeshrink.point9.mlp.lmSc16/LATEST.log 49.9% ] || || PLP+MLP tandem lm_scale 16
 +
|-
 +
| .9 || .9 || [{{FisherPath}}/exp/timeshrink/test/triphoneSingleGausian/unit.tri.timeshrink.point9.mlp.lmSc20/LATEST.log 49.3% ] || || PLP+MLP tandem lm_scale 20
 +
|-
 +
 
|-|}
|-|}
Line 42: Line 52:
* Test svitchboard with fisher-trained model to see if we still get good results
* Test svitchboard with fisher-trained model to see if we still get good results
-
* Train and test on plp+mlp, like svitchboard timeshrinking was done.
+
* Train and test on plp+mlp, like svitchboard timeshrinking was done (done, improved baseline and test by 5% WER!).
* Do baseline train+test to see if something changed in going from baseline to timeshrink structure files. (done, helped)
* Do baseline train+test to see if something changed in going from baseline to timeshrink structure files. (done, helped)
Line 50: Line 60:
==final test==
==final test==
20k utterances, at tau=.9, 6.02% of the frames are dropped, 158839 segments and 3.5 frames per segment.
20k utterances, at tau=.9, 6.02% of the frames are dropped, 158839 segments and 3.5 frames per segment.
 +
 +
lm_scale was roughly tuned on the baseline, and the same value was used on the test, although tuning for the test would help because there are 5% fewer frames per word on average.
 +
 +
{| class="wikitable"
 +
|+ Final test Timeshrinking results on fisher
 +
! <math>\tau</math> !! test 20k  utt WER !! comments
 +
|-
 +
| 1 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.1.mlp.lmSc20.final/LATEST.log TE] || PLP+MLP tandem, LM scale 20 (tuned)
 +
|-
 +
| .9 || [{{FisherPath}}/exp/timeshrink/test/unit.tri.timeshrink.point9.mlp.lmSc20.final/LATEST.log TE]  || everything except <math>\tau</math> is same as baseline.
 +
|-|}
 +
==Future Directions==
==Future Directions==
* Can be viewed as a two-mode special case of best-first Viterbi search, so the next step is to build a real best-first lattice search.  Mark mentioned some attempts at this in the 1980s.
* Can be viewed as a two-mode special case of best-first Viterbi search, so the next step is to build a real best-first lattice search.  Mark mentioned some attempts at this in the 1980s.
[[Category:Fisher Experiments]]
[[Category:Fisher Experiments]]

Revision as of 23:54, 11 October 2009


Fisher experiments

{| class="wikitable"
|+ Number of frames dropped on Fisher corpus
! <math>\tau</math> !! frames dropped
|-
| 1 || 0%
|-
| .9 || ~5%
|-
| .6 || ~35%
|}
I still have to check for bugs: the threshold may be too low, or something else may be wrong. We should probably also rerun the baseline, just to make sure I didn't optimize it unfairly.

Things to try

  • Test svitchboard with fisher-trained model to see if we still get good results
  • Train and test on plp+mlp, like svitchboard timeshrinking was done (done, improved baseline and test by 5% WER!).
  • Do baseline train+test to see if something changed in going from baseline to timeshrink structure files. (done, helped)

LM penalty and scale

Since we now have 62 PLP+MLP features instead of 39 PLP features, we should probably change the LM scale by a factor of 62/39 ≈ 1.59. The original (not carefully tuned) PLP LM scale was 10. Perhaps it would make sense to multiply the LM penalty (-1 for PLP) by the same factor of 1.59.
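The arithmetic behind this suggestion can be sketched in a few lines of Python. The helper name `rescale_lm_params` is hypothetical and not part of any toolkit used here; it only multiplies the old PLP settings (LM scale 10, LM penalty -1) by the ratio of feature dimensions, 62/39.

```python
def rescale_lm_params(lm_scale, lm_penalty, old_dim, new_dim):
    """Scale LM scale and penalty by the ratio of feature dimensions.

    Hypothetical helper for illustration only; it assumes both
    parameters should grow linearly with the feature count.
    """
    factor = new_dim / old_dim
    return lm_scale * factor, lm_penalty * factor, factor


# Values from the text: 39 PLP features, 62 PLP+MLP features,
# original LM scale 10 and LM penalty -1.
new_scale, new_penalty, factor = rescale_lm_params(10.0, -1.0, 39, 62)
print(f"factor={factor:.2f} lm_scale={new_scale:.1f} lm_penalty={new_penalty:.2f}")
# prints: factor=1.59 lm_scale=15.9 lm_penalty=-1.59
```

The rescaled LM scale of about 15.9 is close to the LM scale 16 runs listed in the results table; the larger scales tried there (20 and 25) go beyond this rule of thumb.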

final test

20k utterances, at tau=.9, 6.02% of the frames are dropped, 158839 segments and 3.5 frames per segment.
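As a rough consistency check on these figures, the sketch below back-computes frame counts. It assumes, without confirmation from the experiment logs, that the 158839 segments are the dropped segments and that 3.5 is their mean length in frames; all variable names are illustrative.

```python
# Figures quoted in the text for the 20k-utterance test set at tau = .9.
num_segments = 158839          # assumed to be dropped segments
frames_per_segment = 3.5       # assumed mean length of a dropped segment
dropped_fraction = 0.0602      # 6.02% of frames dropped

# Under the assumptions above, dropped frames = segments * mean length.
dropped_frames = num_segments * frames_per_segment

# The total frame count then follows from the dropped fraction.
total_frames = dropped_frames / dropped_fraction
kept_frames = total_frames - dropped_frames

print(f"dropped frames: {dropped_frames:,.0f}")
print(f"total frames:   {total_frames:,.0f}")
print(f"kept frames:    {kept_frames:,.0f}")
```

If the implied total (on the order of nine million frames) does not match the actual test-set size, the reading of "segments" assumed here is wrong.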

lm_scale was roughly tuned on the baseline, and the same value was used on the test, although tuning for the test would help because there are 5% fewer frames per word on average.

{| class="wikitable"
|+ Timeshrinking results on Fisher
! train <math>\tau</math> !! test <math>\tau</math> !! dev 2000 utt WER !! dev 2000 utt on triphone single-gaussian model WER !! comments
|-
| 1 || 1 || 51.6% || || old baseline
|-
| 1 || 1 || 53.7% || 78.2% || baseline rerun exactly as timeshrinking to really make sure it's not getting an unfair advantage
|-
| 1 || 1 || 54.5% || || baseline rerun exactly as timeshrinking LM_scale 16 to double check it's tuned. Should be worse, and it is.
|-
| .6 || .6 || 69.3% || ||
|-
| .9 || .9 || 56.3% || 80.4% ||
|-
| 1 || .9 || 53.9% || ||
|-
| .9 || .9 || 57.2% || 80.7% || using the non-timeshrinking str file for test
|-
| .9 || 1 || 54.6% || ||
|-
| 1 || 1 || 55.4% || 72.8% || PLP+MLP tandem
|-
| 1 || 1 || 50.6% || || PLP+MLP tandem, LM scale 16
|-
| 1 || 1 || 49.7% || || PLP+MLP tandem, LM scale 20
|-
| 1 || 1 || 50.0% || || PLP+MLP tandem, LM scale 25
|-
| .9 || .9 || 49.9% || || PLP+MLP tandem, LM scale 16
|-
| .9 || .9 || 49.3% || || PLP+MLP tandem, LM scale 20
|}
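
To read the PLP+MLP tandem rows above at matched LM scales, the following sketch computes the absolute and relative WER change from the baseline (tau = 1) to timeshrinking (tau = .9). The WER values are copied from the table; the function and variable names are illustrative, and no significance test is implied for these 2000-utterance dev results.

```python
# Dev-set WER (percent) for PLP+MLP tandem runs, keyed by (tau, LM scale).
wer = {
    (1.0, 16): 50.6,
    (1.0, 20): 49.7,
    (0.9, 16): 49.9,
    (0.9, 20): 49.3,
}


def wer_change(baseline, test):
    """Return (absolute, relative) WER change in percentage points and percent."""
    absolute = test - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# Compare tau = .9 against tau = 1 at each LM scale that both were run with.
changes = {}
for lm_scale in (16, 20):
    changes[lm_scale] = wer_change(wer[(1.0, lm_scale)], wer[(0.9, lm_scale)])
    absolute, relative = changes[lm_scale]
    print(f"LM scale {lm_scale}: {absolute:+.1f} points absolute, {relative:+.1f}% relative")
```

Negative values mean timeshrinking had the lower WER at that LM scale.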

Future Directions

  • Can be viewed as a two-mode special case of best-first Viterbi search, so the next step is to build a real best-first lattice search. Mark mentioned some attempts at this in the 1980s.
{| class="wikitable"
|+ Final test Timeshrinking results on Fisher
! <math>\tau</math> !! test 20k utt WER !! comments
|-
| 1 || TE || PLP+MLP tandem, LM scale 20 (tuned)
|-
| .9 || TE || everything except <math>\tau</math> is same as baseline.
|}