2013-12-20
From cslt Wiki
AM development
Sparse DNN
- Optimal Brain Damage (OBD).
- Online OBD is on hold.
- Started investigating OBD + L1 norm.
- Efficient computing
- Rearranging the matrix structure so that zeros are grouped into blocks by some smart approaches, leading to better computing speed.
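
As a rough illustration of the OBD item above, the sketch below prunes the weights with the lowest OBD saliency, 0.5 * h_ii * w_i^2, where h_ii is the diagonal of the Hessian of the training loss. The function names, the flat-list layout and the `sparsity` parameter are assumptions made for illustration; the diagonal Hessian is assumed to be supplied by the training code, and none of this is taken from the experiments reported here.

```python
def obd_saliency(weights, hessian_diag):
    """OBD saliency of each weight: 0.5 * h_ii * w_i^2."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def obd_prune(weights, hessian_diag, sparsity):
    """Zero out the fraction `sparsity` of weights with the lowest saliency.

    `weights` and `hessian_diag` are flat lists of equal length
    (hypothetical layout; a real DNN would do this per weight matrix).
    """
    saliency = obd_saliency(weights, hessian_diag)
    n_prune = int(round(sparsity * len(weights)))
    # Indices sorted from least to most salient.
    order = sorted(range(len(weights)), key=lambda i: saliency[i])
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    w = [1.0, -2.0, 0.5, 3.0]
    h = [4.0, 0.1, 1.0, 1.0]
    print(obd_prune(w, h, sparsity=0.5))
```

In this toy input the weight -2.0 is pruned even though its magnitude is larger than 1.0, because its curvature term is small. That is the difference between OBD and plain magnitude pruning.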
Efficient DNN training
- Momentum-based training. m=0.2 performs best, giving a 6.8% improvement in WER. Other settings tried: 0.05, 0.1, 0.2, ..., 0.6, 0.8, 1.0.
- Asymmetric window: left 20, right 5. NN accuracy increases by 7%; however, WER is slightly worse than the baseline. Moving back to Tencent 100h training.
- Frame-skipping is being implemented.
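
The asymmetric window mentioned above can be pictured as frame splicing with more left context than right context. The following is a minimal sketch, assuming each frame is a plain list of floats and that frames at the utterance edges are repeated as padding; `splice_frames` is a hypothetical helper, not the code used in the experiments.

```python
def splice_frames(frames, left=20, right=5):
    """Concatenate each frame with `left` past and `right` future frames.

    `frames` is a list of feature vectors (lists of floats). Indices that
    fall outside the utterance are clamped, so edge frames are repeated.
    Each output vector has (left + 1 + right) * dim values.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to [0, n - 1]
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


if __name__ == "__main__":
    feats = [[0.0], [1.0], [2.0], [3.0]]
    print(splice_frames(feats, left=2, right=1))
```

With left 20 and right 5, each DNN input covers 26 frames, and the network needs only 5 frames of look-ahead.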
Optimal phoneset
- Experimenting with 3 phone sets: Tencent, CSLT, PQ.
- Some errors occurred in the pure CHS experiments.
Engine optimization
- Investigating LOUDS FST. In progress.
LM development
NN LM
- Trained with 500M QA data, 110k vocabulary.
- Tested different numbers of hidden layers (DNN); performance is better on some tests but not on others.
- Tested a larger projection layer, from 256 to 384; performance improves consistently.
Embedded development
- Embedded stream mode is in progress.
Speech QA
- SP-QA accuracy is 45.14% over all inputs (18*199).
- Investigate the error patterns:
- 70% of errors are caused by incorrect named entity recognition. Working on entity recovery (character, pinyin, ... distance penalty).
- 8% of errors are caused by English names. A class-based LM will be used to solve the problem. Ready to start.
- Use N-best to recover errors in QA.
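
One way to picture the entity recovery and N-best ideas above is an edit-distance match between recognized entity strings and a list of known entities. The sketch below is illustrative only: it assumes the entity span has already been extracted from each N-best hypothesis, it uses plain Levenshtein distance as the penalty, and `recover_entity`, `max_dist` and the sample names are hypothetical. The same function works on character strings or on lists of pinyin syllables.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, 1):
        cur = [i]
        for j, item_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                       # deletion
                           cur[j - 1] + 1,                    # insertion
                           prev[j - 1] + (item_a != item_b)))  # substitution
        prev = cur
    return prev[-1]


def recover_entity(nbest, entities, max_dist=2):
    """Return (entity, distance) for the known entity closest to any
    N-best candidate, or None if nothing is within `max_dist`."""
    best = None
    for hyp in nbest:
        for ent in entities:
            d = edit_distance(hyp, ent)
            if d <= max_dist and (best is None or d < best[1]):
                best = (ent, d)
    return best


if __name__ == "__main__":
    print(recover_entity(["jon smith", "john smyth"],
                         ["john smith", "jane doe"]))
    # Pinyin syllables compared as list items rather than characters.
    print(edit_distance(["zhang", "san"], ["zhang", "shan"]))
```

Comparing at the pinyin level treats homophones or near-homophones as close even when the characters differ, which is presumably why both character and pinyin distances are listed.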