2013-12-20

From cslt Wiki
Revision as of 01:32, 20 December 2013 (Fri) by Cslt

AM development

Sparse DNN

  • Optimal Brain Damage (OBD).
  1. Online OBD is on hold.
  2. Investigation of OBD + L1 norm has started.
  • Efficient computing
  1. Rearranging the matrix structure so that the zeros form contiguous blocks, which should lead to faster computation.
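
For reference, OBD ranks each weight by a saliency computed from a diagonal approximation of the Hessian, 0.5 * h_kk * w_k^2, and removes the weights with the lowest saliency. The sketch below illustrates that ranking step only, on made-up numbers. The function names and the flat-list representation are assumptions for illustration, not the implementation used in these experiments, and estimating the Hessian diagonal is left out.

```python
def obd_saliency(weights, hessian_diag):
    """OBD saliency per weight: 0.5 * h_kk * w_k^2 (diagonal Hessian)."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def obd_prune(weights, hessian_diag, n_prune):
    """Return a copy of `weights` with the n_prune least salient set to zero."""
    saliency = obd_saliency(weights, hessian_diag)
    # Indices sorted from least to most salient.
    order = sorted(range(len(weights)), key=lambda k: saliency[k])
    to_prune = set(order[:n_prune])
    return [0.0 if k in to_prune else w for k, w in enumerate(weights)]


# Toy values (hypothetical): a large weight with low curvature can still
# matter more than a small weight with high curvature.
weights = [0.5, -0.1, 2.0, 0.05]
hessian_diag = [1.0, 4.0, 0.1, 10.0]
pruned = obd_prune(weights, hessian_diag, n_prune=2)
print(pruned)
```

Pruning by saliency rather than by magnitude alone is what distinguishes OBD from simple thresholding. An L1 penalty, as in item 2, would instead push weights toward zero during training.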


Efficient DNN training

  1. Momentum-based training. m=0.2 gives the best WER, a 6.8% improvement. Values tried: 0.05, 0.1, 0.2, ..., 0.6, 0.8, 1.0.
  2. Asymmetric window: 20 frames to the left, 5 to the right. NN accuracy increases by 7%, but WER is slightly worse than the baseline. Moving back to Tencent 100h training.
  3. Frame-skipping is being implemented.
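
The asymmetric window in item 2 can be pictured as frame splicing: each input vector is the current frame stacked with more past frames than future ones. This is a minimal sketch, assuming "left 20, right 5" counts frames and that utterance edges are padded by repeating the first or last frame. The function name `splice` and the padding choice are illustrative assumptions, not details from the experiments.

```python
def splice(frames, left=20, right=5):
    """Stack each frame with `left` past and `right` future frames.

    `frames` is a list of feature vectors (lists of floats). Positions
    outside the utterance are filled by repeating the edge frame.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to valid range
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy 1-dimensional features so the context is easy to read.
frames = [[float(t)] for t in range(4)]
small = splice(frames, left=2, right=1)
print(small[0])  # first frame: left context is padded
print(small[3])  # last frame: right context is padded

# With the window from the notes, each input holds 20 + 1 + 5 frames.
full = splice(frames)
print(len(full[0]))
```

A shorter right context reduces the look-ahead needed before a frame can be scored, which is one common reason to try an asymmetric window.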

Optimal phoneset

  1. Experimenting with 3 phone sets: Tencent, CSLT, PQ.
  2. Some errors occurred in the pure CHS experiments.


Engine optimization

  • Investigating LOUDS FST. In progress.
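
As background, LOUDS (level-order unary degree sequence) stores the shape of a tree in about two bits per node: nodes are visited breadth-first and each contributes one '1' per child followed by a '0'. The sketch below shows only this encoding on a toy tree. How it would be applied to the FST is not stated in these notes, and the function name and dictionary input are assumptions for illustration.

```python
from collections import deque


def louds_encode(children, root):
    """Return (bit string, BFS node order) for a tree {node: [children]}."""
    bits = ["10"]  # conventional super-root with one child: the real root
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        kids = children.get(node, [])
        bits.append("1" * len(kids) + "0")  # unary degree, then terminator
        queue.extend(kids)
    return "".join(bits), order


# Toy tree (hypothetical): root -> a, b ; a -> c
tree = {"root": ["a", "b"], "a": ["c"]}
bits, order = louds_encode(tree, "root")
print(bits)
print(order)
```

For a tree with n nodes the sequence has 2n + 1 bits. Navigation (parent, first child) is then done with rank and select queries over the bit string, which this sketch does not implement.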


LM development

NN LM

  • Trained with 500M QA data, 110k vocabulary.
  • Tested different numbers of hidden layers (DNN): performance is better on some tests, but not on others.
  • Tested a larger projection layer (256 to 384): performance improves consistently.
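
To give a sense of what the larger projection layer costs, the following back-of-the-envelope sketch counts the parameters in the projection alone. It assumes the projection layer is a vocabulary-by-dimension lookup table and takes "110k" as exactly 110,000 words. Both are assumptions, and the rest of the network is ignored.

```python
def projection_params(vocab_size, projection_dim):
    """Parameters in a vocab_size x projection_dim projection table."""
    return vocab_size * projection_dim


vocab = 110_000  # assumed exact value for the "110k" vocabulary
small = projection_params(vocab, 256)
large = projection_params(vocab, 384)
print(small, large, large / small)
```

Under these assumptions the projection table grows by half, from roughly 28M to 42M parameters.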


Embedded development

  • Embedded stream mode is in progress.


Speech QA

  • SP-QA accuracy is 45.14% over all inputs (18*199).
  • Investigating the error patterns:
  • 70% of errors are caused by incorrect named entity recognition. Working on entity recovery (character, pinyin, ... distance penalty).
  • 8% of errors are caused by English names. A class-based LM will be used to address this; work is ready to start.
  • Use N-best lists to recover errors in QA.
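
One way to picture the entity recovery mentioned above is to map a misrecognized name to the closest entry in a known entity list, using an edit distance over characters or pinyin syllables as the penalty. This is a minimal sketch of that idea under stated assumptions: the entity list, the threshold `max_distance`, and the function names are all hypothetical, and the actual distance penalty used in the system is not described in these notes.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or tuples)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def recover_entity(hypothesis, entities, max_distance=2):
    """Return the nearest known entity, or the hypothesis if none is close."""
    best = min(entities, key=lambda e: edit_distance(hypothesis, e))
    if edit_distance(hypothesis, best) <= max_distance:
        return best
    return hypothesis


# Toy entity list as pinyin syllable tuples (made up for illustration).
entities = [("liu", "de", "hua"), ("zhang", "xue", "you")]
fixed = recover_entity(("niu", "de", "hua"), entities)
kept = recover_entity(("a", "b", "c", "d"), entities)
print(fixed, kept)
```

Comparing pinyin syllables rather than characters lets homophone errors score as close matches. The threshold keeps inputs that are far from every known entity unchanged.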