2013-05-24


Revision as of 06:16, 24 May 2013

Data sharing

  • LM count files still undelivered!

DNN progress

Experiments

  • sparse DNN

Zeroing out small weight values (WER on the 1900 test set):

threshold    0       0.01    0.03    0.05    0.08    0.1
shrinkage    0.0%    4.6%    13.5%   21.8%   33.4%   40.5%
WER(1900)    7.25%   7.21%   7.28%   7.41%   7.61%   7.67%
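
The pruning itself is plain magnitude thresholding. A minimal sketch of the idea (numpy; the matrix below is a random stand-in for one DNN layer, not the actual model):

    import numpy as np

    def zero_small_values(weights, threshold):
        """Zero out weights whose magnitude is below the threshold.

        Returns the pruned matrix and the shrinkage, i.e. the fraction
        of weights that were set to zero.
        """
        mask = np.abs(weights) < threshold
        pruned = weights.copy()
        pruned[mask] = 0.0
        return pruned, mask.mean()

    W = np.random.randn(1024, 1024) * 0.1          # stand-in layer
    for threshold in (0.0, 0.01, 0.03, 0.05, 0.08, 0.1):
        _, shrinkage = zero_small_values(W, threshold)
        print("threshold=%.2f  shrinkage=%.1f%%" % (threshold, 100 * shrinkage))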


  • fixed-point DNN

ORG WER(1900) 7.25%

val=-math.log(abs(vv)/1000.0)*20

WER(1900): 7.30%
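
For reference, the mapping val = -math.log(abs(vv)/1000.0)*20 puts each weight vv on a logarithmic scale before it is stored as a small fixed-point value. A round-trip sketch (the integer rounding and the separate sign are assumptions; the note above does not spell them out):

    import math

    def encode(vv):
        """Log-domain fixed-point code for a non-zero weight vv; sign kept separately."""
        val = -math.log(abs(vv) / 1000.0) * 20     # formula quoted above
        return int(round(val)), 1 if vv >= 0 else -1

    def decode(val, sign):
        """Invert the mapping to get an approximate weight back."""
        return sign * 1000.0 * math.exp(-val / 20.0)

    vv = 0.37
    val, sign = encode(vv)
    print(val, decode(val, sign))   # small reconstruction error; the likely source of the 7.25% -> 7.30% WER change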

  • fixed-point HCLG

ORG WER(1900) 7.25%

INT 50 WER(1900) 7.27%

INT 10 WER(1900) 7.12%
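
What exactly INT 50 and INT 10 denote is not defined here; one possible reading, purely an assumption, is that the HCLG arc costs are snapped to a fixed-point grid with that many steps per unit. A sketch of that reading (hypothetical, not the actual implementation):

    def quantize_cost(cost, scale):
        """Round a graph cost to a fixed-point grid with `scale` steps per unit.
        Assumption: "INT 50" / "INT 10" are read as scale=50 / scale=10."""
        return round(cost * scale) / scale

    print(quantize_cost(3.14159, 50))   # 3.14
    print(quantize_cost(3.14159, 10))   # 3.1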

Tencent exps

  1. Training DNN models on 1000 hours of data, with two learning-rate experiments running in parallel: one uses an exponentially decaying learning rate, the other the newbob schedule. The experiments are close to finishing and should all be done before next week. After comparing the results, the better decay scheme will be used to train DNN models on larger amounts of data. (A sketch of the two schedules follows this list.)
We are looking forward to the 1000-hour results.
  2. On the decoder side we tried acceleration strategies such as SSE and fixed-point arithmetic, but under high concurrency the real-time factor still cannot be brought below 1. Applying low-rank matrix approximations directly at test time degrades accuracy considerably; using the method on the training side still requires the update formulas to be derived.
We probably need to rely on the sparse-net solution plus fixed-point computing. The low-rank approach seems less reasonable than L1: the idea behind low rank is to treat the weight matrix between two hidden layers as a mapping that spans a low-rank space, which may help recover some prominent patterns, but it is not directly related to the objective function and does not directly simplify the computation.
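
For item 1, the two schedules under comparison can be sketched as follows; the decay factor, halving factor, and improvement threshold below are illustrative guesses, not the values used in the experiments:

    def exponential_decay(initial_lr, epoch, decay=0.8):
        """Exponential schedule: lr_t = lr_0 * decay**t."""
        return initial_lr * decay ** epoch

    def newbob(current_lr, cv_improvement, start_halving=0.5, halving_factor=0.5):
        """Newbob-style schedule: halve the rate whenever the held-out
        improvement between epochs (e.g. frame-accuracy gain) falls below
        a threshold; otherwise keep it unchanged."""
        if cv_improvement < start_halving:
            return current_lr * halving_factor
        return current_lr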


Work to be verified:

  1. Two pretraining strategies: RBM pretraining and discriminative pretraining.
MS suggested the latter, although according to the publications the performance difference for large networks (more than 7 layers) is not significant. For large data it deserves a try, though the RBM approach is highly costly.
  2. After HMM-DNN training, realign the data with the HMM-DNN model, update the transition probabilities, and retrain the HMM-DNN to see how performance changes.
Should be promising.
  3. Measure the performance gain from HMM-DNN plus sequential DT training.
  4. Apply the low-rank approach on the DNN training side (sketched below).
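
On the low-rank point (item 2 of the Tencent notes and item 4 here), the construction is to replace an m-by-n weight matrix W by two factors A (m-by-r) and B (r-by-n) with W ~= A*B, so parameters and multiplies drop from m*n to r*(m+n) for small r. A sketch via truncated SVD (numpy; the sizes and rank are arbitrary):

    import numpy as np

    def low_rank_factors(W, r):
        """Best rank-r approximation of W in the least-squares sense: W ~= A @ B."""
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        A = U[:, :r] * s[:r]     # m x r
        B = Vt[:r, :]            # r x n
        return A, B

    W = np.random.randn(2048, 2048)
    A, B = low_rank_factors(W, r=256)
    print(W.size, A.size + B.size)   # 4194304 vs 1048576 parameters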


GPU & CPU merge

  1. In progress.


Kaldi/HTK merge

  • HTK2Kaldi: hold.
  • Kaldi2HTK: still under debugging.
  • Workaround: use the BN feature to train the HTK models, so no Kaldi training is needed.

Embedded progress

  • Status:
  1. Checked the VAD results, recalled some missed utterances, and obtained the new performance figures.
  2. Training an acoustic model for Sphinx: the an4 training process is done, but the test seems problematic.
Test Set   #utt   Acc     RT
cw         993    13.64   0.07
hfc        986    9.84    0.08
zz         984    16.87   0.08
  3. First large Sphinx Chinese model training done, with reasonable performance; need to investigate parallel training.
  • To be done
  1. Parallel training.
  2. Kaldi-based engine design.
  3. Debug the random output issue with the demo.