2014-01-17

From cslt Wiki

Revision as of 12:35, 20 January 2014 (Mon) by Cslt

Contents

1 AM development
2 LM development
- 2.1 NN LM
3 Embedded development
4 Speech QA

AM development

Sparse DNN

Optimal Brain Damage (OBD).

Online OBD is on hold.
Investigation of OBD + L1 norm has started.
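
For reference, a minimal sketch of OBD-style pruning. It assumes the standard OBD saliency under a diagonal Hessian approximation, s_i = 0.5 * h_ii * w_i^2, and prunes the lowest-saliency weights. The function name, the toy weights and the Hessian values are all hypothetical; this is not the code used in the experiments above.

```python
import numpy as np

def obd_prune(weights, hessian_diag, prune_fraction):
    """Zero out the fraction of weights with the lowest OBD saliency."""
    # OBD saliency with a diagonal Hessian: s_i = 0.5 * h_ii * w_i^2
    saliency = 0.5 * hessian_diag * weights ** 2
    n_prune = int(prune_fraction * weights.size)
    # Flattened indices of the n_prune smallest saliencies.
    prune_idx = np.argsort(saliency, axis=None)[:n_prune]
    mask = np.ones(weights.size, dtype=bool)
    mask[prune_idx] = False
    mask = mask.reshape(weights.shape)
    return np.where(mask, weights, 0.0), mask

# Toy values (hypothetical): the large weight -2.0 has a small curvature,
# so OBD removes it even though magnitude-based pruning would keep it.
w = np.array([[0.5, -2.0], [0.1, 1.0]])
h = np.array([[2.0, 0.1], [4.0, 1.0]])
pruned, mask = obd_prune(w, h, prune_fraction=0.5)
print(pruned)
```

The toy case shows why saliency and magnitude can disagree: the decision depends on the curvature estimate as well as on the weight value.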

Efficient computing

Rearranging the matrix structure so that zeros are grouped into zero blocks, which should lead to better computing speed.
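
A minimal sketch of why zero blocks help, assuming the weight matrix has already been rearranged so that zeros fall into aligned square blocks. The rearrangement itself is not described in these notes and is not shown. The function names and the block size are hypothetical.

```python
import numpy as np

def nonzero_blocks(W, block):
    """List (row, col) offsets of the blocks that contain at least one non-zero."""
    rows, cols = W.shape
    return [(r, c)
            for r in range(0, rows, block)
            for c in range(0, cols, block)
            if np.any(W[r:r + block, c:c + block])]

def block_sparse_matvec(W, x, block, blocks):
    """Compute W @ x while visiting only the listed non-zero blocks."""
    y = np.zeros(W.shape[0])
    for r, c in blocks:
        y[r:r + block] += W[r:r + block, c:c + block] @ x[c:c + block]
    return y

# Toy 4x4 matrix with two non-zero 2x2 blocks on the diagonal.
W = np.zeros((4, 4))
W[0:2, 0:2] = [[1.0, 2.0], [3.0, 4.0]]
W[2:4, 2:4] = [[5.0, 0.0], [0.0, 6.0]]
x = np.ones(4)

blocks = nonzero_blocks(W, block=2)      # computed once, reused for every input
y = block_sparse_matvec(W, x, 2, blocks)
print(blocks, y)
```

Only half of the blocks are multiplied here; the saving grows with the share of blocks that are entirely zero.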

Efficient DNN training

Asymmetric window: great improvement on the training set (WER 34% to 24%); however, the improvement is lost on the test set. Overfitting?
Fbank features used to train GMM+DNN lead to very high training accuracy, but reduce accuracy on the test set.
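
A minimal sketch of frame splicing with an asymmetric context, assuming that "asymmetric window" refers to a DNN input window with more past frames than future frames (the notes do not spell this out). The context sizes and feature dimensions below are hypothetical.

```python
import numpy as np

def splice_frames(feats, left, right):
    """Stack each frame with `left` past and `right` future frames.

    Edge frames are repeated so that every frame gets a full window.
    """
    n_frames = feats.shape[0]
    padded = np.concatenate([
        np.repeat(feats[:1], left, axis=0),    # repeat the first frame
        feats,
        np.repeat(feats[-1:], right, axis=0),  # repeat the last frame
    ])
    width = left + right + 1
    return np.stack([padded[t:t + width].reshape(-1) for t in range(n_frames)])

# Toy input: 5 frames of 2-dimensional features.
feats = np.arange(10, dtype=float).reshape(5, 2)
spliced = splice_frames(feats, left=3, right=1)
print(spliced.shape)
```

With `left=3, right=1` each input vector covers five frames, and the current frame sits at position `left` inside the window.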

Optimal phoneset

Ch/En training with the concatenated phone set is completed.
Initial test seems reasonable on Chinese, though a bit worse than the original test.
Need to compare the two systems, both on Fbank.
Need to increase the number of states.

Engine optimization

Investigating LOUDS FST. In progress.
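
For background, a minimal sketch of the LOUDS (level-order unary degree sequence) encoding of a tree: nodes are visited breadth-first and each node contributes its number of children in unary (that many 1 bits followed by a 0), after a "10" prefix for a virtual super-root. Applying this to a general FST needs extra handling for arcs that do not form a tree, which is not shown. The tree below is hypothetical.

```python
from collections import deque

def louds_encode(children, root):
    """Return the LOUDS bit sequence and the breadth-first node order of a tree."""
    bits = [1, 0]            # virtual super-root with a single child: the root
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        kids = children.get(node, [])
        bits.extend([1] * len(kids) + [0])   # degree in unary
        queue.extend(kids)
    return bits, order

# Toy tree: a has children b and c; b has child d.
tree = {"a": ["b", "c"], "b": ["d"]}
bits, order = louds_encode(tree, "a")
print("".join(map(str, bits)), order)
```

A tree with n nodes takes 2n + 1 bits in this form, which is what makes the layout attractive for compact graph storage.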

LM development

NN LM

Training a character-based NN LM with 12134 Chinese characters.
Preparing data for training word2vec on Gigawords CHS 4.0.

Embedded development

CLG embedded decoder is almost done. Graph compilation is very fast.
Working on layer-by-layer DNN training; the initial model is incorrect.

Speech QA

Use N-best to expand match in QA. Better performance was obtained.

1-best matches 96/121
10-best matches 102/121
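
A minimal sketch of N-best expansion, assuming that a "match" means that a recognition hypothesis equals one of the known questions. The matching rule, the function name and the example strings are hypothetical; only the 96/121 and 102/121 counts come from the results above.

```python
def match_with_nbest(nbest, known_questions, n):
    """Return the first of the top-n hypotheses that is a known question, else None."""
    for hyp in nbest[:n]:
        if hyp in known_questions:
            return hyp
    return None

# Hypothetical example: the 1-best has a recognition error, the 2nd best is correct.
known = {"what is the weather in beijing"}
nbest = ["what is the whether in beijing", "what is the weather in beijing"]
hit_1best = match_with_nbest(nbest, known, n=1)
hit_10best = match_with_nbest(nbest, known, n=10)

# Match rates for the counts reported above.
rate_1best = 96 / 121
rate_10best = 102 / 121
print(f"1-best {rate_1best:.1%}, 10-best {rate_10best:.1%}")
```

On the reported counts this is a gain of 6 matched questions out of 121.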

Use N-best to recover errors in entity check.

Design a non-entity pattern to discover the possible place of an entity
By this position range, search entities within the N-best result
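
The two steps above can be sketched as follows, assuming the non-entity pattern is a regular expression over the fixed words around the entity slot. The pattern, the entity list and the hypotheses are hypothetical, and for simplicity the pattern is applied to each hypothesis in turn.

```python
import re

# Hypothetical non-entity pattern: the fixed words around the entity slot.
NON_ENTITY_PATTERN = re.compile(r"^how do i get to (?P<slot>.+)$")

def find_entity_in_nbest(nbest, entities, pattern=NON_ENTITY_PATTERN):
    """Return the first known entity found inside the slot of any N-best hypothesis."""
    for hyp in nbest:
        m = pattern.match(hyp)
        if m is None:
            continue                      # the hypothesis does not fit the pattern
        slot_text = m.group("slot")       # position range where an entity is expected
        for entity in entities:
            if entity in slot_text:
                return entity
    return None

entities = ["tsinghua university", "peking university"]
nbest = [
    "how do i get to tsing hua universe",     # 1-best, entity misrecognized
    "how do i get to tsinghua university",    # 2nd best, entity correct
]
found_1best = find_entity_in_nbest(nbest[:1], entities)
found_nbest = find_entity_in_nbest(nbest, entities)
print(found_1best, found_nbest)
```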

Use Pinyin to recover errors in entity check. Future work.

Design a non-entity pattern to discover the possible place of an entity (as above)
Convert all the entities to Pinyin strings, and then match the Pinyin string of the recognized text against the entity Pinyin
Keep the best-matching entity based on Pinyin, subject to a threshold
A bit worse than the original test.
A possible problem is that the LM is too strong, which leads to Pinyin strings that do not match in the acoustic space
Liu rong will provide a weak LM to support the research.
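
A minimal sketch of the Pinyin matching step, assuming similarity is measured over syllable sequences with the standard-library `difflib.SequenceMatcher`. The similarity measure, the threshold value and the hand-written Pinyin table are assumptions for illustration; the notes do not say which measure was used.

```python
from difflib import SequenceMatcher

def best_pinyin_match(slot_pinyin, entity_pinyin, threshold):
    """Return (entity, score) for the closest entity, or (None, score) below the threshold."""
    best, best_score = None, 0.0
    for entity, syllables in entity_pinyin.items():
        # ratio() is 2 * matched syllables / total syllables in both sequences.
        score = SequenceMatcher(None, slot_pinyin, syllables).ratio()
        if score > best_score:
            best, best_score = entity, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score

# Hypothetical entity list with hand-written Pinyin.
entity_pinyin = {
    "清华大学": ["qing", "hua", "da", "xue"],
    "北京大学": ["bei", "jing", "da", "xue"],
}
# Pinyin of the recognized slot, with one wrong syllable ("xie" for "xue").
slot = ["qing", "hua", "da", "xie"]

entity, score = best_pinyin_match(slot, entity_pinyin, threshold=0.7)
rejected, _ = best_pinyin_match(slot, entity_pinyin, threshold=0.8)
print(entity, score, rejected)
```

The threshold trades recall against false entity matches, so it would need tuning on held-out queries.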

Investigate some errors in entity-based LM.

Some errors still exist
Running the entity-based LM with a small entity list

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=2014-01-17&oldid=9114"