Difference between revisions of "2013-11-29"

From cslt Wiki
Line 39 (Speech QA):
- * CSLT QA recording done. 10 people, 199 utterances per person.
+ * Recording done for 15 people, 199 utterances per person.
Latest revision as of 03:51, 29 November 2013 (Friday)

AM development

Sparse DNN

  • Optimal Brain Damage (OBD).
  1. Online OBD on hold.
  2. Started investigating OBD + L1 norm.
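
For readers unfamiliar with OBD, the pruning criterion can be sketched as follows. This is a minimal illustration of the standard OBD saliency, s_k = 0.5 * h_kk * w_k^2, and not the implementation used in this work. The function names, the toy weights and the diagonal Hessian values are all hypothetical, and the Hessian diagonal is assumed to have been estimated elsewhere.

```python
def obd_saliency(weights, hessian_diag):
    """OBD saliency of each weight: 0.5 * h_kk * w_k^2."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def obd_prune(weights, hessian_diag, prune_fraction):
    """Zero out the given fraction of weights with the lowest saliency."""
    saliency = obd_saliency(weights, hessian_diag)
    n_prune = int(prune_fraction * len(weights))
    # Weight indices ordered from least to most salient.
    order = sorted(range(len(weights)), key=lambda k: saliency[k])
    drop = set(order[:n_prune])
    return [0.0 if k in drop else w for k, w in enumerate(weights)]


# Toy values (hypothetical, for illustration only).
weights = [0.5, -0.01, 2.0, 0.3]
hessian_diag = [1.0, 4.0, 0.1, 1.0]
pruned = obd_prune(weights, hessian_diag, prune_fraction=0.5)
print(pruned)
```

A small weight with a large curvature can still be pruned, because saliency depends on the product of curvature and squared weight rather than on magnitude alone.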


Engine optimization

  • Const FST applied. RT: original (GCC + vector FST) 0.35; optimized (GCC + const FST + cache) 0.29.
  • Started investigating LOUDS FST.
  • Investigating sparse computing.
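
The RT figures above can be turned into a relative improvement with a short calculation. This sketch assumes RT is a real-time factor (processing time divided by audio duration, so lower is better); the notes do not define it, and the function name is hypothetical.

```python
def relative_reduction(baseline, optimized):
    """Fraction by which the optimized value is lower than the baseline."""
    return (baseline - optimized) / baseline


rt_vector_fst = 0.35  # GCC + vector FST, from the notes
rt_const_fst = 0.29   # GCC + const FST + cache, from the notes

reduction = relative_reduction(rt_vector_fst, rt_const_fst)
print(f"RT reduced by {reduction:.1%}")  # about 17%
```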

English NN training

  • Training on WSJ + Chinglish data done.
  • Graph for 300 English songs looks fine.
  • Preparing the test.


LM development

NN LM

  • 3 iterations of 500M training done, 24 hours per iteration.
  • PPL 189 after 3 iterations.
  • NN-based CSLM merge done (10240*100*10240). Both PPL and WER are worse than those of the original 10 networks.
  • Need to investigate why the merge is not accurate.
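
As a reminder of what the PPL figure measures, perplexity is the exponential of the average negative log-probability the model assigns to each test word. The sketch below uses the standard definition; the function name and the toy probabilities are hypothetical and are not taken from the experiments above.

```python
import math


def perplexity(word_probs):
    """Perplexity from the per-word probabilities assigned by a language model."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


# A model that gives every word probability 1/189 has perplexity 189,
# i.e. it is as uncertain as a uniform choice among 189 words.
uniform_probs = [1.0 / 189] * 50
ppl = perplexity(uniform_probs)
print(round(ppl, 3))
```

Comparing the merged network with the original 10 networks on the same test words using this measure is one way to check where the merge loses accuracy.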


Embedded development

  • Embedded stream mode in progress.

Speech QA

  • Designed a QA recording client on mobiles.
  • Recording done for 15 people, 199 utterances per person.
  • Ready to do recognition.