Cslt: Created page with "== AM development == === Sparse DNN === * Optimal Brain Damage(OBD). # Online OBD held. # OBD + L1 norm start to investigation. * Efficient computing # Conducti..."

2013-12-20T01:32:54Z


== AM development ==

=== Sparse DNN ===

* Optimal Brain Damage (OBD).

# Online OBD is on hold.
# Investigation of OBD combined with an L1-norm penalty has started.
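
For reference, the standard OBD criterion ranks each weight by the saliency s_k = h_kk * w_k^2 / 2, where h_kk is the corresponding diagonal entry of the Hessian of the training loss, and prunes the weights with the lowest saliency. The following is a minimal sketch of that ranking step only. The function names, the toy values and the pruning fraction are illustrative assumptions, not the setup used in these experiments, and the L1-norm variant is not covered.

```python
def obd_saliencies(weights, hessian_diag):
    """OBD saliency s_k = 0.5 * h_kk * w_k**2 for each weight."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def prune_lowest(weights, hessian_diag, fraction):
    """Return a copy of weights with the lowest-saliency fraction set to zero."""
    sal = obd_saliencies(weights, hessian_diag)
    n_prune = int(len(weights) * fraction)
    # Indices sorted from least to most salient.
    order = sorted(range(len(weights)), key=lambda k: sal[k])
    pruned = list(weights)
    for k in order[:n_prune]:
        pruned[k] = 0.0
    return pruned


# Toy example (hypothetical values): four weights and their Hessian diagonals.
weights = [0.5, -0.01, 1.2, 0.03]
hessian_diag = [2.0, 4.0, 0.5, 1.0]
pruned = prune_lowest(weights, hessian_diag, fraction=0.5)
```

In a real network the Hessian diagonal would be estimated by a backpropagation-style pass over the training data; here it is simply given as input.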

* Efficient computing

# Rearranging the matrix structure so that the zero entries form zero blocks, using some smart approaches, which should lead to better computing speed.

=== Efficient DNN training ===

# Momentum-based training. m=0.2 performs best, giving a 6.8% improvement in WER. Other settings tried: 0.05, 0.1, 0.2, ..0.6, 0.8, 1.0.
# Asymmetric window: left 20, right 5. NN accuracy increases by 7%, but WER is slightly worse than the baseline. Moving back to Tencent 100h training.
# Frame-skipping is being implemented.
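
The first two items can be illustrated with a small sketch, assuming that m denotes a classical momentum coefficient and that the asymmetric window means splicing 20 left-context and 5 right-context frames around the current frame. The function names, the learning rate and the edge handling (repeating the first or last frame) are assumptions made for illustration, not details taken from the experiments.

```python
def momentum_step(w, v, grad, lr, m):
    """One classical momentum update: v <- m*v - lr*grad, then w <- w + v."""
    v_new = [m * vi - lr * gi for vi, gi in zip(v, grad)]
    w_new = [wi + vi for wi, vi in zip(w, v_new)]
    return w_new, v_new


def splice(frames, t, left=20, right=5):
    """Concatenate frames t-left .. t+right into one input vector.

    Frames outside the utterance are replaced by the nearest edge frame
    (an assumption; other padding schemes are possible).
    """
    last = len(frames) - 1
    out = []
    for i in range(t - left, t + right + 1):
        out.extend(frames[min(max(i, 0), last)])
    return out


# Two momentum steps with m=0.2 on a single weight (toy values).
w1, v1 = momentum_step([1.0], [0.0], [0.5], lr=0.1, m=0.2)
w2, v2 = momentum_step(w1, v1, [0.5], lr=0.1, m=0.2)

# Thirty one-dimensional frames; splice around frame 22.
frames = [[float(t)] for t in range(30)]
ctx = splice(frames, t=22)
```

With left 20 and right 5 each spliced input covers 26 frames, so the network sees much more past than future context, which also reduces the look-ahead latency compared with a symmetric window of the same size.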

=== Optimal phone set ===

# Experimenting with 3 phone sets: Tencent, CSLT, PQ.
# Some errors occur in the pure CHS experiments.

=== Engine optimization ===

* Investigating LOUDS FST. In progress.

== LM development ==

=== NN LM ===

* Trained with 500M QA data, 110k vocabulary.
* Tested different numbers of hidden layers (DNN): performance is better on some tests, but not on others.
* Tested a larger projection layer, increased from 256 to 384: performance improves consistently.
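
To give a rough sense of the cost of the larger projection layer, the sketch below counts the parameters of the projection (embedding) table alone. It assumes one projection row per vocabulary word, as in a standard feed-forward NN LM; the actual model layout is not described here, so the figures are indicative only.

```python
VOCAB_SIZE = 110_000  # 110k vocabulary, as reported above


def projection_params(vocab_size, proj_dim):
    """Number of weights in a vocab_size x proj_dim projection table."""
    return vocab_size * proj_dim


small = projection_params(VOCAB_SIZE, 256)  # 28,160,000 weights
large = projection_params(VOCAB_SIZE, 384)  # 42,240,000 weights
ratio = large / small                       # 1.5
```

Under this assumption, going from 256 to 384 increases the projection table by half, so the consistent improvement comes at a 1.5x cost in that part of the model.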

== Embedded development ==

* Embedded stream mode in progress.

== Speech QA ==

* SP-QA accuracy is 45.14% over all inputs (18*199).
* Investigating the error patterns:
:* 70% of errors are caused by incorrect named entity recognition. Working on entity recovery (character, pinyin, ... distance penalty).
:* 8% of errors are caused by English names. A class-based LM will be used to solve the problem. Ready to start.
:* Use N-best lists to recover errors in QA.
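
One way to combine a distance penalty with N-best recovery is sketched below: each N-best hypothesis is compared against a list of known entities using edit distance over pinyin syllables, and the closest entity within a threshold is kept. This is an illustrative assumption about how such a recovery step could look, not the method under development; the function names, the threshold and the example names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or syllable tuples)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def recover_entity(nbest, entities, max_dist=2):
    """Return (entity, distance, hypothesis rank) of the closest match, or None."""
    best = None
    for rank, hyp in enumerate(nbest):
        for ent in entities:
            d = edit_distance(hyp, ent)
            if d <= max_dist and (best is None or d < best[1]):
                best = (ent, d, rank)
    return best


# Hypothetical entity list and N-best output, written as pinyin syllables.
entities = [("liu", "de", "hua"), ("zhang", "xue", "you")]
nbest = [("niu", "de", "hua"), ("liu", "de", "hua")]
match = recover_entity(nbest, entities)
```

In this toy case the top hypothesis is one syllable away from a known entity, while the second hypothesis matches it exactly, so the exact match from rank 1 is selected. The same function works at the character level by passing strings instead of syllable tuples.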
