2013-05-24

Data sharing

  • LM count files still undelivered!

DNN progress

Experiments

  • sparse DNN

Zero small weight values (WER on the 1900 test set):

  threshold    0        0.01     0.03     0.05     0.08     0.1
  shrinkage    0.0%     4.6%     13.5%    21.8%    33.4%    40.5%
  WER(1900)    7.25%    7.21%    7.28%    7.41%    7.61%    7.67%
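
For reference, a minimal sketch of the pruning step behind this table (an illustration, not the experiment code): weights whose magnitude falls below the threshold are zeroed, and the shrinkage is the fraction of weights removed. NumPy and the layer shape are assumptions for the example.

  import numpy as np

  def prune_small_weights(weights, threshold):
      """Zero entries with |w| < threshold; return pruned matrix and shrinkage."""
      mask = np.abs(weights) < threshold
      pruned = np.where(mask, 0.0, weights)
      shrinkage = mask.mean()          # fraction of weights set to zero
      return pruned, shrinkage

  # Example with a random matrix standing in for one DNN layer.
  w = np.random.randn(1024, 2048) * 0.05
  w_pruned, shrinkage = prune_small_weights(w, threshold=0.03)
  print("shrinkage: %.1f%%" % (100 * shrinkage))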


  • fixed-point DNN

  ORG                                     WER(1900): 7.25%
  val = -math.log(abs(vv)/1000.0)*20      WER(1900): 7.30%
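
The line above gives only the forward mapping; the sketch below (an assumption, not the original code) rounds the log-domain code to an integer for fixed-point storage and inverts the mapping to recover an approximate weight.

  import math

  def encode(vv):
      # Forward mapping quoted above: magnitude -> log-domain code.
      return -math.log(abs(vv) / 1000.0) * 20

  def decode(val, sign=1):
      # Assumed inverse: recover the magnitude and reapply the sign.
      return sign * 1000.0 * math.exp(-val / 20.0)

  vv = 3.7
  code = round(encode(vv))                           # integer code for fixed-point storage
  approx = decode(code, sign=1 if vv >= 0 else -1)
  print(code, approx)                                # 112 and a value close to 3.7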

  • fixed-point HCLG

  ORG       WER(1900): 7.25%
  INT 50    WER(1900): 7.30%
  INT 10    WER(1900): 7.12%
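
One plausible reading of "INT N" (an assumption; the actual encoding is not documented here) is that HCLG arc weights are rounded to an integer grid with scale factor N:

  def quantize_weight(w, scale):
      """Round a float arc weight to the nearest multiple of 1/scale."""
      return round(w * scale)

  def dequantize_weight(q, scale):
      return q / float(scale)

  w = -2.3471
  for scale in (50, 10):
      q = quantize_weight(w, scale)
      print(scale, q, dequantize_weight(q, scale))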

Tencent exps

  1. Training DNN models on 1000 hours of data, with two learning-rate experiments running in parallel: one with exponential learning-rate decay and one with the newbob schedule. The experiments are close to finishing and should all be done within a week. After comparing the results, the better decay scheme will be used to train DNN models on larger data sets. We are looking forward to the 1000-hour results.
  2. On the decoder side, SSE, fixed-point arithmetic and other acceleration optimizations were tried, but the real-time factor still cannot be brought below 1 under high concurrency. Applying low-rank matrix approximations directly at test time degrades performance considerably (a sketch of the low-rank idea follows this list); using them during training requires the formulas to be derived first. We probably need to rely on the sparse-net solution plus fixed-point computing.
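
As referenced in item 2, a minimal sketch of low-rank matrix approximation (assumed here to be the truncated-SVD variant; NumPy is used for illustration): a weight matrix W is replaced by two thin factors, so a matrix-vector product costs O(k(m+n)) instead of O(mn).

  import numpy as np

  def low_rank_factor(W, k):
      """Return (A, B) with A @ B approximating W using the top-k singular values."""
      U, s, Vt = np.linalg.svd(W, full_matrices=False)
      A = U[:, :k] * s[:k]          # m x k
      B = Vt[:k, :]                 # k x n
      return A, B

  W = np.random.randn(2048, 2048)
  A, B = low_rank_factor(W, k=256)
  x = np.random.randn(2048)
  y_full = W @ x
  y_lr = A @ (B @ x)               # two thin products instead of one dense product
  print(np.linalg.norm(y_full - y_lr) / np.linalg.norm(y_full))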

Work to be verified:
  1. Two pretraining strategies: RBM pretraining and discriminative pretraining. MS suggested the latter, while according to the publications the performance difference for large networks (more than 7 layers) is not significant. For large data it deserves a try, though the RBM approach is highly costly.
  2. After HMM-DNN training, re-align with the HMM-DNN model, update the transition probabilities, and then retrain the HMM-DNN to measure the performance. Should be promising.
  3. The performance gain from HMM-DNN plus sequential discriminative training.
  4. Using the low-rank approach on the DNN training side. (The low-rank approach is a bit strange to me: it is not directly related to a reasonable objective function, and the structure of the weight matrix has nothing to do with the objective.)


GPU & CPU merge

  1. Just started.


Kaldi/HTK merge

  • HTK2Kaldi: hold.
  • Kaldi2HTK: pdf error problem.
Kaldi Monophone: 30.91%  HDecode: 41.40%
  • Workaround: use the BN (bottleneck) feature to train HTK models, bypassing Kaldi training.

Embedded progress

  • Status:
  1. First embedded demo done; a 1000-word vocabulary takes 3.2 MB of memory.
  2. Accuracy test finished. The test data involve 3 speakers of Chongqing dialect recorded in a car, covering 1000 address names.
  3. Training an acoustic model for Sphinx. The AN4 training process is done, but the test seems problematic.
  Test set    #utt    ERR      RT
              806     23.33    0.07
              887     13.64    0.08
              876     17.58    0.07
  • To be done
  1. Finish the large-scale AM training.