Difference between revisions of "2013-11-29"
From cslt Wiki
Latest revision as of 03:51, 29 November 2013 (Fri)
AM development
Sparse DNN
- Optimal Brain Damage (OBD).
- Online OBD is on hold.
- Started investigating OBD + L1 norm.
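For reference, the core pruning step of OBD can be sketched as follows. This is a minimal illustration of the standard saliency formula, s_i = 0.5 * h_ii * w_i^2, and not the setup used in these experiments: the function names, the flat weight list, and the assumption that the diagonal Hessian terms are already available are all hypothetical.

```python
def obd_saliencies(weights, hessian_diag):
    """OBD saliency of each weight: 0.5 * h_ii * w_i ** 2."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def obd_prune(weights, hessian_diag, prune_fraction):
    """Return a copy of `weights` with the lowest-saliency fraction set to zero.

    `hessian_diag` holds the diagonal second derivatives of the loss with
    respect to each weight; computing them is outside this sketch.
    """
    saliency = obd_saliencies(weights, hessian_diag)
    n_prune = int(prune_fraction * len(weights))
    # Indices ordered from least to most salient.
    order = sorted(range(len(weights)), key=lambda i: saliency[i])
    to_prune = set(order[:n_prune])
    return [0.0 if i in to_prune else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    w = [0.1, -2.0, 0.5, 1.0]
    h = [1.0, 1.0, 4.0, 0.1]
    print(obd_prune(w, h, 0.5))
```

Note that the last weight is pruned despite its large magnitude because its Hessian term is small, which is the point where OBD differs from plain magnitude pruning.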
Engine optimization
- Constant FST applied. Original RT (GCC + vector FST): 0.35. Optimized (GCC + const FST + cache): 0.29.
- Started investigating LOUDS FST.
- Investigating sparse computation.
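The size of the gain from the two RT figures above can be worked out with a short calculation. The sketch assumes RT denotes a real-time factor (lower is better); the function names are illustrative only.

```python
def relative_reduction(baseline, optimized):
    """Fraction by which `optimized` is lower than `baseline`."""
    return (baseline - optimized) / baseline


def speedup(baseline, optimized):
    """How many times faster the optimized configuration runs."""
    return baseline / optimized


if __name__ == "__main__":
    rt_vector_fst = 0.35  # GCC + vector FST
    rt_const_fst = 0.29   # GCC + const FST + cache
    print(f"relative reduction: {relative_reduction(rt_vector_fst, rt_const_fst):.1%}")
    print(f"speedup: {speedup(rt_vector_fst, rt_const_fst):.2f}x")
```

Under that assumption, the reduction is roughly 17% relative, or a speedup of about 1.2x.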
English NN training
- WSJ + Chinglish data training done.
- Graph for 300 English songs built. Looks fine.
- Preparing test.
LM development
NN LM
- 3 iterations of 500 M training done, 24 hours per iteration.
- PPL 189 after 3 iterations.
- NN-based CSLM merge done (10240*100*10240). Both PPL and WER are worse than with the original 10 network outputs.
- Need to investigate why the merge is not accurate.
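When comparing the merged network against the original ones, it helps to compute PPL the same way for both. Below is a minimal sketch of the standard definition, PPL = exp(-(1/N) * sum(log p_i)), assuming natural-log word probabilities are available from the model; the function name and input format are hypothetical and not taken from the CSLM toolkit.

```python
import math


def perplexity(log_probs):
    """Perplexity from a list of natural-log word probabilities."""
    if not log_probs:
        raise ValueError("need at least one word probability")
    avg_neg_log_prob = -sum(log_probs) / len(log_probs)
    return math.exp(avg_neg_log_prob)


if __name__ == "__main__":
    # A model that assigns every word probability 1/189 has perplexity 189.
    uniform = [math.log(1.0 / 189.0)] * 5
    print(round(perplexity(uniform), 6))
```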
Embedded development
- Embedded stream mode in progress.
Speech QA
- Designed a QA recording client on mobiles.
- Recordings from 15 speakers done, 199 utterances per person.
- Ready to do recognition.