Units Paper

From SpeechWiki


Revision as of 15:05, 11 April 2009

Outline

  1. Intro
  2. Unit Selection
    • Mistake instance
      Unit
      Replacement
    • Multiwords
  3. Baseline Description
    • Vocab: the single most frequent pronunciation of each word, taken from a multi-pronunciation dictionary (this works better than keeping multiple pronunciations)
  4. Results
    • What to emphasize? Ideally, units+DTs will beat DTs alone at every number of components. Even if we cannot grow the number of components until the improvement bottoms out, there should at least be a visible trend.
  5. Conclusion
    • Future work: consider context during unit selection. Right now the unit is context-free: the same unit appears in every context where a replacement took place.
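
The baseline vocabulary in item 3 keeps one pronunciation per word. A minimal sketch of that selection is below. It assumes pronunciation frequency is available as a list of observed (word, pronunciation) pairs; the outline does not say how frequency was measured, and the function name and toy data are hypothetical.

```python
from collections import Counter


def most_frequent_pronunciations(observations):
    """Map each word to its most frequently observed pronunciation.

    observations: iterable of (word, pronunciation) pairs (assumed input format).
    Ties go to the pronunciation seen first, because Counter.most_common
    keeps insertion order among equal counts.
    """
    counts = {}
    for word, pron in observations:
        counts.setdefault(word, Counter())[pron] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


# Toy data, not taken from the experiments.
observed = [
    ("the", "dh ah"),
    ("the", "dh iy"),
    ("the", "dh ah"),
    ("a", "ah"),
]
lexicon = most_frequent_pronunciations(observed)
print(lexicon)
```

The result is a single-pronunciation lexicon that can replace the multi-pronunciation dictionary in the baseline.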

Tests for units paper

tests to run
compPer: units: monophone states Mix: totalComp: WER Test WER Important
512 1 503 256k TR
256 1 1000 256k TR
64 1 3854 256k 49.3
64 2 2000 256k  ?
32 4 2000 256k  ?
32 2 4000 256k  ?
alternatively
256 48 137 503 127971 53.0
128 48 137 1033 131185 50.9
32 48 137 3845 122907 51.4
64 112 615 2024 ~128k TR
16 4 2000 128k  ?
16 112 615 4000 128k  ?
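
The component budgets in the rows above can be sanity-checked with simple arithmetic. The sketch below rests on an assumption, since the flattened header does not state it: compPer is read as the maximum number of components per mixture, Mix as the number of mixtures, and totalComp as at most their product. The optional factor covers rows such as `64 2 2000`, whose second value is read here as a plain multiplier. The function name `component_budget` is hypothetical.

```python
def component_budget(comp_per, mix, factor=1):
    """Upper bound on total components under the assumed relation
    comp_per * factor * mix."""
    return comp_per * factor * mix


# (compPer, Mix, reported totalComp) for the three rows that list an exact total.
ROWS = [
    (256, 503, 127971),
    (128, 1033, 131185),
    (32, 3845, 122907),
]

for comp_per, mix, reported in ROWS:
    bound = component_budget(comp_per, mix)
    share = reported / bound
    print(f"{comp_per:>4} x {mix:>5} = {bound:>7}   reported {reported:>7}   ({share:.1%} of bound)")
```

Under this reading the reported totals sit slightly below the product, which would be consistent with some mixtures ending up with fewer than compPer components.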

The units make it worse

with LM_PENALTY=0

DETAILED OVERALL REPORT FOR THE SYSTEM: test/config16/test0/accuracy/out.nosil.trn

SENTENCE RECOGNITION PERFORMANCE

sentences                                         500
with errors                             89.2%   ( 446)
  with substitions                      72.4%   ( 362)
  with deletions                        26.2%   ( 131)
  with insertions                       74.4%   ( 372)


WORD RECOGNITION PERFORMANCE

Percent Total Error       =   99.1%   (5107)

Percent Correct           =   28.6%   (1476)

Percent Substitution      =   66.2%   (3411)
Percent Deletions         =    5.1%   ( 265)
Percent Insertions        =   27.8%   (1431)
Percent Word Accuracy     =    0.9%


Ref. words                =           (5152)
Hyp. words                =           (6318)
Aligned words             =           (6583)

CONFUSION PAIRS                  Total                 (2790)
                                 With >=  1 occurances (2790)


with LM_PENALTY=-1

test2kUtt/config16Disaster/test0/accuracy/out.nosil.trn.dtl

DETAILED OVERALL REPORT FOR THE SYSTEM: test2kUtt/config16/test0/accuracy/out.nosil.trn

SENTENCE RECOGNITION PERFORMANCE

sentences                                         500
with errors                             88.2%   ( 441)
  with substitions                      72.8%   ( 364)
  with deletions                        32.8%   ( 164)
  with insertions                       68.6%   ( 343)


WORD RECOGNITION PERFORMANCE

Percent Total Error       =   94.6%   (4857)

Percent Correct           =   27.6%   (1417)

Percent Substitution      =   65.4%   (3358)
Percent Deletions         =    7.0%   ( 357)
Percent Insertions        =   22.3%   (1142)
Percent Word Accuracy     =    5.4%


Ref. words                =           (5132)
Hyp. words                =           (5917)
Aligned words             =           (6274)
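
The figures in the two reports are tied together by the usual word-level scoring identities, so they can be re-derived from the raw counts. The sketch below is only a consistency check on the numbers quoted above, not the scoring tool itself; the function and field names are illustrative.

```python
def score(ref_words, correct, substitutions, deletions, insertions):
    """Recompute summary figures from raw alignment counts."""
    # Every reference word is either correct, substituted, or deleted.
    assert ref_words == correct + substitutions + deletions
    errors = substitutions + deletions + insertions
    return {
        "errors": errors,
        "total_error_pct": 100.0 * errors / ref_words,
        "correct_pct": 100.0 * correct / ref_words,
        "word_accuracy_pct": 100.0 * (ref_words - errors) / ref_words,
        # Hypothesis words are everything the recognizer emitted.
        "hyp_words": correct + substitutions + insertions,
        # Aligned words count each reference word plus each insertion.
        "aligned_words": ref_words + insertions,
    }


penalty_0 = score(5152, 1476, 3411, 265, 1431)        # LM_PENALTY=0
penalty_minus_1 = score(5132, 1417, 3358, 357, 1142)  # LM_PENALTY=-1

for name, run in (("LM_PENALTY=0", penalty_0), ("LM_PENALTY=-1", penalty_minus_1)):
    print(f"{name}: error {run['total_error_pct']:.1f}%, "
          f"accuracy {run['word_accuracy_pct']:.1f}%, "
          f"hyp words {run['hyp_words']}")
```

Because insertions count as errors, word accuracy is 100% minus the total error rate, which is why roughly 28% of words can be correct while accuracy is near zero.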
