Units Paper
From SpeechWiki
Revision as of 16:36, 11 April 2009
Outline
- Intro
- Unit Selection
- Mistake instance
- Unit
- Replacement
- Multiwords
- Baseline Description
- Vocab: the single most frequent pronunciation of each word, taken from a multi-pronunciation dictionary (this works better than keeping multiple pronunciations)
- Results
- What to emphasize? Ideally, units+DTs will beat DTs alone for every number of components. Even if we cannot grow the number of components until the improvement bottoms out, there should at least be a trend.
- Conclusion
- Future work: consider context during unit selection (right now the unit is context-free: the same unit appears in every context where a replacement took place).
Tests for units paper
compPer: | units: | monophone states | Mix: | totalComp: | WER | Test WER | Important |
---|---|---|---|---|---|---|---|
512 | 1 | 503 | 256k | TR | |||
256 | 1 | 1000 | 256k | TR | |||
64 | 1 | 3854 | 256k | 49.3 | |||
64 | 2 | 2000 | 256k | ? | |||
32 | 4 | 2000 | 256k | ? | |||
32 | 2 | 4000 | 256k | ? | |||
alternatively | |||||||
256 | 48 | 137 | 503 | 127971 | 53.0 | ||
128 | 48 | 137 | 1033 | 131185 | 50.9 | ||
32 | 48 | 137 | 3845 | 122907 | 51.4 | ||
64 | 112 | 615 | 2024 | ~128k | TR | ||
16 | 4 | 2000 | 128k | ? | |||
16 | 112 | 615 | 4000 | 128k | ? |
The units make it worse
with LM_PENALTY=0
DETAILED OVERALL REPORT FOR THE SYSTEM: test/config16/test0/accuracy/out.nosil.trn

SENTENCE RECOGNITION PERFORMANCE

sentences 500
with errors 89.2% ( 446)

with substitions 72.4% ( 362)
with deletions 26.2% ( 131)
with insertions 74.4% ( 372)

WORD RECOGNITION PERFORMANCE

Percent Total Error = 99.1% (5107)

Percent Correct = 28.6% (1476)

Percent Substitution = 66.2% (3411)
Percent Deletions = 5.1% ( 265)
Percent Insertions = 27.8% (1431)
Percent Word Accuracy = 0.9%

Ref. words = (5152)
Hyp. words = (6318)
Aligned words = (6583)

CONFUSION PAIRS Total (2790)
With >= 1 occurances (2790)
with LM_PENALTY=-1
test2kUtt/config16Disaster/test0/accuracy/out.nosil.trn.dtl

DETAILED OVERALL REPORT FOR THE SYSTEM: test2kUtt/config16/test0/accuracy/out.nosil.trn

SENTENCE RECOGNITION PERFORMANCE

sentences 500
with errors 88.2% ( 441)

with substitions 72.8% ( 364)
with deletions 32.8% ( 164)
with insertions 68.6% ( 343)

WORD RECOGNITION PERFORMANCE

Percent Total Error = 94.6% (4857)

Percent Correct = 27.6% (1417)

Percent Substitution = 65.4% (3358)
Percent Deletions = 7.0% ( 357)
Percent Insertions = 22.3% (1142)
Percent Word Accuracy = 5.4%

Ref. words = (5132)
Hyp. words = (5917)
Aligned words = (6274)
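As a sanity check on the two reports, the word-level figures can be recomputed from the raw substitution, deletion, and insertion counts. The sketch below assumes the usual scoring identities (total error = S + D + I, correct = N - S - D, hyp. words = N - D + I, aligned words = N + I, word accuracy = 100% minus the total error rate). The class and variable names (`WordCounts`, `lm0`, `lm_minus1`) are made up for illustration and are not part of any scoring tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordCounts:
    """Raw word-level counts copied from one scoring report (hypothetical helper)."""
    ref_words: int      # N: words in the reference transcripts
    substitutions: int  # S
    deletions: int      # D
    insertions: int     # I

    @property
    def total_errors(self) -> int:
        return self.substitutions + self.deletions + self.insertions

    @property
    def correct(self) -> int:
        return self.ref_words - self.substitutions - self.deletions

    @property
    def hyp_words(self) -> int:
        return self.ref_words - self.deletions + self.insertions

    @property
    def aligned_words(self) -> int:
        return self.ref_words + self.insertions

    def percent(self, count: int) -> float:
        """Express a count as a percentage of the reference words."""
        return 100.0 * count / self.ref_words

    @property
    def word_accuracy(self) -> float:
        # Insertions count against accuracy, so it can be far below percent correct.
        return 100.0 - self.percent(self.total_errors)


# Counts taken from the two reports above.
lm0 = WordCounts(ref_words=5152, substitutions=3411, deletions=265, insertions=1431)
lm_minus1 = WordCounts(ref_words=5132, substitutions=3358, deletions=357, insertions=1142)

for label, c in (("LM_PENALTY=0", lm0), ("LM_PENALTY=-1", lm_minus1)):
    print(label)
    print(f"  total error   = {c.percent(c.total_errors):.1f}% ({c.total_errors})")
    print(f"  correct       = {c.percent(c.correct):.1f}% ({c.correct})")
    print(f"  word accuracy = {c.word_accuracy:.1f}%")
    print(f"  hyp. words    = {c.hyp_words}, aligned words = {c.aligned_words}")
```

Both reports are internally consistent under these identities. The percent correct barely moves between the two runs (28.6% vs 27.6%), so the gain in word accuracy with LM_PENALTY=-1 comes almost entirely from the drop in insertions (1431 to 1142). Note that the two runs report different reference word counts (5152 vs 5132), so they were not scored against an identical reference.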