Difference between revisions of "2013-06-07"

From cslt Wiki

Revision as of 08:40, 7 June 2013 (Fri)

Data sharing

  • LM count files still undelivered!

DNN progress

Experiments

  • sparse DNN: sticky training (retrain the nnet while keeping the sparsity)

Zeroing small values (test set: 1900), at extreme sparsity levels:

threshold            0      0.2    0.3    0.4    0.5
shrinkage (%)        0.0    66.4   81.6   90     94
WER, without sticky  7.55   9.46   53.23  98.99  -
WER, with sticky     7.55   7.56   7.87   8.81   9.87

Conclusion: An extremely sparse network can largely retain the performance of the DNN once it is retrained with sticky training. The network structure seems to matter more than the tuning of the parameters within that structure.
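The sticky training idea above can be sketched as follows. This is only a minimal illustration, not the setup used in the experiments: it assumes a single linear layer trained with squared error, and the helper names (`prune_mask`, `apply_mask`, `shrinkage`, `sticky_sgd_step`) are hypothetical. Weights whose magnitude falls below the threshold are zeroed once, and the resulting mask is re-applied after every update so that pruned weights stay at zero during retraining.

```python
def prune_mask(weights, threshold):
    """Return a 0/1 mask that drops every weight with |w| < threshold."""
    return [[0.0 if abs(w) < threshold else 1.0 for w in row] for row in weights]


def apply_mask(weights, mask):
    """Zero the pruned weights."""
    return [[w * m for w, m in zip(wr, mr)] for wr, mr in zip(weights, mask)]


def shrinkage(mask):
    """Fraction of weights removed by the mask."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1.0 - kept / total


def sticky_sgd_step(weights, mask, x, target, lr=0.1):
    """One SGD step on y = W x with squared error (an assumed toy model).

    The mask is re-applied after the update, so pruned weights never
    become non-zero again: the sparsity pattern is "sticky".
    """
    out = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    err = [o - t for o, t in zip(out, target)]
    return [
        [(w - lr * e * xi) * m for w, xi, m in zip(row, x, mrow)]
        for row, mrow, e in zip(weights, mask, err)
    ]


# Toy usage: prune at threshold 0.2, then retrain with the mask held fixed.
W = [[0.5, 0.1], [-0.05, -0.8]]
mask = prune_mask(W, 0.2)
W = apply_mask(W, mask)
for _ in range(20):
    W = sticky_sgd_step(W, mask, x=[1.0, 1.0], target=[1.0, 0.0])
print("shrinkage:", shrinkage(mask))
print("retrained weights:", W)
```

The "without sticky" rows in the table correspond to stopping after `apply_mask`; the "with sticky" rows correspond to continuing training under the fixed mask.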

  • fixed-point DNN forwarding
  1. working on porting the ATLAS library to ARM.
  2. working on an implementation that does not depend on ATLAS/MKL.
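As a rough illustration of what fixed-point forwarding involves, the sketch below computes one affine layer using integer arithmetic only, with no BLAS dependency. The Q-format (16-bit values with 8 fractional bits) and the function names are assumptions chosen for the example; the format used in the actual implementation is not stated here.

```python
def quantize(values, frac_bits=8, bits=16):
    """Map floats to signed fixed-point integers with `frac_bits` fractional bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    scale = 1 << frac_bits
    # Saturate instead of wrapping when a value does not fit.
    return [max(lo, min(hi, int(round(v * scale)))) for v in values]


def dequantize(values_q, frac_bits=8):
    """Convert fixed-point integers back to floats."""
    return [v / (1 << frac_bits) for v in values_q]


def fixed_point_affine(w_q, x_q, b_q, frac_bits=8):
    """Compute y = W x + b on fixed-point inputs using integers only.

    Each product w * x carries 2 * frac_bits fractional bits, so the bias
    is shifted up before accumulation and the sum is shifted back down.
    The right shift rounds toward negative infinity.
    """
    out = []
    for row, b in zip(w_q, b_q):
        acc = b << frac_bits
        for w, x in zip(row, x_q):
            acc += w * x
        out.append(acc >> frac_bits)
    return out


# Toy usage with values that are exactly representable in this format.
W = [[0.5, -0.25], [1.0, 2.0]]
x = [1.0, 2.0]
b = [0.125, -1.0]
y_q = fixed_point_affine([quantize(r) for r in W], quantize(x), quantize(b))
print(dequantize(y_q))
```

In a real engine the accumulator width and the saturation of the layer output would also need to be chosen; this sketch leaves the accumulator unbounded because Python integers do not overflow.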

Tencent exps

  • 6000h model training: expected to finish around the 25th.
  • working on DNN sequence discriminative training (DT); see "Error Back Propagation for Sequence Training of Context-Dependent Deep Networks for Conversational Speech Transcription"


GPU & CPU merge

  1. in progress.

RNN LM progress

  • Initial work started. Training on 100M data with a 10k vocabulary gave a perplexity of 180.
  • Further exploration is ongoing.
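For reference, perplexity is the exponential of the average negative log-probability the model assigns to each word of the test text. The sketch below shows the calculation; the function name and the use of natural logarithms are choices made for this example, not details of the RNN LM toolkit used.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-word natural-log probabilities."""
    if not log_probs:
        raise ValueError("need at least one word probability")
    return math.exp(-sum(log_probs) / len(log_probs))


# A model that assigns every word probability 1/180 has perplexity 180,
# i.e. it is as uncertain as a uniform choice among 180 words.
uniform = [math.log(1.0 / 180.0)] * 50
print(perplexity(uniform))
```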

Embedded progress

  • Status:
check the reference and change the compile options.
the large-scale AM training on the Tencent 400h data is done (continuous HMM).


Sys             WER   RT
SP model        8.81  0.07
Tencent tone    6.33  0.40
Tencent notone  5.04  0.31
  • To be done
  1. large scale parallel training.
  2. NN based engine(dynamic and static).
  3. Semi-continuous model with the Tencent data