2014-01-03

AM development

Sparse DNN

Optimal Brain Damage (OBD).

Online OBD is on hold.
Started investigating OBD combined with the L1 norm.
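
For reference, a minimal sketch of the OBD pruning criterion, assuming the standard saliency s_k = 0.5 * h_kk * w_k^2 computed from a diagonal Hessian approximation. The function name `obd_prune`, the `prune_fraction` parameter and the toy values are illustrative assumptions, not the code used in these experiments.

```python
import numpy as np

def obd_prune(weights, hessian_diag, prune_fraction):
    """Zero the weights with the lowest OBD saliency 0.5 * h_kk * w_k**2."""
    saliency = 0.5 * hessian_diag * weights ** 2
    n_prune = int(prune_fraction * weights.size)
    # Flattened indices of the n_prune least salient weights.
    lowest = np.argsort(saliency, axis=None)[:n_prune]
    mask = np.ones(weights.size, dtype=bool)
    mask[lowest] = False
    mask = mask.reshape(weights.shape)
    # Pruned weights are set to exactly 0.0; kept weights are unchanged.
    return np.where(mask, weights, 0.0), mask

# Toy example (hypothetical values): prune half of a 2x2 weight matrix.
w = np.array([[0.5, -0.01], [2.0, 0.1]])
h = np.array([[1.0, 4.0], [0.1, 1.0]])
pruned, mask = obd_prune(w, h, 0.5)
```

The mask can be kept and reapplied after each retraining update so that pruned connections stay at zero.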

Efficient computing

Rearranging the matrix structure to group zeros into blocks, which should lead to faster computation.

Efficient DNN training

L1-L2 grid search: L1/L2 (< 1e-6) seems good for record1900 but is worse on the other test sets.

link here

Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
Frame-skipping: skipping 1 frame consistently speeds up decoding while largely retaining accuracy. Skipping more frames leads to unacceptable performance degradation.
Interpolation does not provide performance gain.
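
A minimal sketch of frame-skipping, assuming the network is evaluated only on every (skip+1)-th frame and each output is copied to the skipped frames that follow it. The notes do not say how skipped frames are filled, so the copy strategy, `forward_with_frame_skip` and `toy_score_fn` are illustrative assumptions.

```python
import numpy as np

def forward_with_frame_skip(features, score_fn, skip=1):
    """Score every (skip+1)-th frame and reuse each score for the skipped frames."""
    step = skip + 1
    kept_scores = score_fn(features[::step])
    # Repeat each computed row `step` times, then trim to the original length.
    return np.repeat(kept_scores, step, axis=0)[: len(features)]

def toy_score_fn(x):
    """Hypothetical stand-in for the DNN forward pass (one score per frame)."""
    return x.sum(axis=1, keepdims=True)

feats = np.arange(10, dtype=float).reshape(5, 2)  # 5 frames, 2 dims
out = forward_with_frame_skip(feats, toy_score_fn, skip=1)
```

With skip=1 the forward pass runs on roughly half of the frames, which is where the decoding speed-up would come from.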

link here

Optimal phoneset

Analyzed the Tencent English phone set. Found some errors in CH/EN phone sharing.
Developed a new sharing scheme and started training the new system.
Started training with all-separated phones.
Started training a mixed system with Chinglish data.

Engine optimization

Investigating LOUDS FST. In progress.

LM development

NN LM

Collecting a bigger lexicon: 40k words related to music, 56k words from an official dictionary.
Working on NN LM based on word2vector.

Embedded development

Liuchao's cellphone, Qualcomm Snapdragon Krait MSM8960 @ 1.5GHz, using 1 core

small nnet 100/600/600/600/600/1264 with MFCC input

4500 words:

construct LG: 0.41s
compose HCLG with det: 13.70s, 5.318 MB
compose HCLG without det: 6.61s, 5.488 MB

950 words:

construct LG: 0.15s
compose HCLG with det: 2.63s, 0.947 MB, decode RT 0.649
compose HCLG without det: 1.74s, 0.998 MB, decode RT 0.548

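For word lists or simple grammars, skipping determinization speeds up HCLG compilation dramatically (13.70s to 6.61s for 4500 words) and also lowers decoding RT slightly (0.649 to 0.548 for 950 words), at the cost of a slightly larger graph. This matters particularly on embedded devices.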
The accuracy does not change with/without determinization.
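
The ratios behind this comparison can be recomputed from the figures reported above. The snippet is only arithmetic on those numbers; the dictionary layout and variable names are illustrative.

```python
# Reported measurements, keyed by vocabulary size.
results = {
    4500: {"det_s": 13.70, "no_det_s": 6.61, "det_mb": 5.318, "no_det_mb": 5.488},
    950: {"det_s": 2.63, "no_det_s": 1.74, "det_mb": 0.947, "no_det_mb": 0.998},
}

# How many times longer HCLG composition takes with determinization.
compile_ratio = {
    n: round(r["det_s"] / r["no_det_s"], 2) for n, r in results.items()
}

# Relative graph size increase (%) when determinization is skipped.
size_increase_pct = {
    n: round((r["no_det_mb"] - r["det_mb"]) / r["det_mb"] * 100, 1)
    for n, r in results.items()
}

# Relative decoding RT increase (%) with determinization, 950-word task only.
rt_increase_pct = round((0.649 - 0.548) / 0.548 * 100, 1)

print(compile_ratio, size_increase_pct, rt_increase_pct)
```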

Speech QA

Use N-best to expand matching in QA. Better performance was obtained.

1-best matches 96/121
10-best matches 102/121
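
A minimal sketch of N-best matching, assuming a question counts as matched when any of the top-N recognition hypotheses is found verbatim in the set of known questions. The exact-match rule, `match_nbest` and the toy strings are illustrative assumptions; the matching criterion actually used is not described here.

```python
def match_nbest(hypotheses, known_questions, n=10):
    """Return the first of the top-n hypotheses that is a known question, else None."""
    for hyp in hypotheses[:n]:
        if hyp in known_questions:
            return hyp
    return None

# Toy example (hypothetical data): the 1-best is wrong, the 2nd hypothesis matches.
known = {"what is the weather today", "play some music"}
nbest = ["what is the whether today", "what is the weather today"]
one_best_hit = match_nbest(nbest, known, n=1)
ten_best_hit = match_nbest(nbest, known, n=10)

# Match rates implied by the reported counts.
rate_1best = round(96 / 121 * 100, 1)
rate_10best = round(102 / 121 * 100, 1)
```

On the reported counts, moving from 1-best to 10-best raises the match rate from about 79.3% to about 84.3%.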

Use N-best to recover errors in entity check. In progress.
Use Pinyin to recover errors in entity check. Future work.
