2013-05-10

From cslt Wiki
Revision as of 07:32, 13 May 2013 by Cslt (talk | contribs)


Data sharing

  • LM count files still undelivered!

DNN progress

Experiments

  • setups for input layer
s1: mfcc(13), splice +-5[143]
s2: mfcc(13), splice +-5(143), LDA[143]
s3: mfcc(13), delta(39), splice +-5(429), LDA[143]
s4: mfcc(13), delta(39), splice +-5(429), LDA[300]
  • setups for alignment
tri1: triphone training, feature input: mfcc(13), delta[39]. #pdfs 1651, #gaussians 10028
tri2: LDA/MLLT training, feature input: mfcc(13), delta(39), splice +-4(351), LDA[40]. #pdfs 3536, #gaussians 39995
  • other notes
about 100 hours training data
88k LM, biglm decoding (1e-5 / 1e-9)
gpu-based nnet training, in-1200-1200-1200-1200-out
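The input-layer dimensions above follow directly from the splice context; a minimal Python sketch that reproduces the arithmetic (the helper name `spliced_dim` is illustrative, not from the actual recipe):

```python
def spliced_dim(base_dim, context):
    """Feature dimension after splicing +-context frames around the center."""
    return base_dim * (2 * context + 1)

# s1/s2: mfcc(13) spliced +-5 frames -> 143 dims (s2 then applies LDA, keeping 143)
dim_s1 = spliced_dim(13, 5)
# s3/s4: mfcc+delta (39) spliced +-5 frames -> 429 dims, then LDA to 143 or 300
dim_s3 = spliced_dim(39, 5)
# tri2 GMM front-end: mfcc+delta (39) spliced +-4 frames -> 351 dims, then LDA to 40
dim_tri2 = spliced_dim(39, 4)

print(dim_s1, dim_s3, dim_tri2)  # 143 429 351
```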
  • results
Test Set   fMMI   s1/tri1  s2/tri1  s3/tri1  s4/tri1  s2/tri2  s4/tri2  CPU (like s4/tri1)
map         –     25.38    24.47    26.16    26.20    22.85    24.27    26.45
2044        –     23.58    22.82    23.84    24.13    21.45    22.76    24.66
notetp3     –     16.08    14.89    15.92    15.97    14.89    14.79    16.14
1900        –      8.55     8.43     8.66     8.90     7.30     7.91     8.23
general     –     36.18    34.79    35.88    35.90    33.06    33.79    38.02
online1     –     34.68    33.90    33.45    33.38    32.93    32.43    33.00
online2     –     27.27    26.61    26.26    26.36    25.94    25.69    26.63
speedup     –     24.97    24.40    24.55    25.42    23.04    23.67    27.17
  • conclusions
  1. the GPU approach is comparable with the CPU approach (see s4/tri1 & the CPU results); the former works slightly better in most cases.
  2. fine alignment training leads to significantly better performance than rough training (see s2/tri1 vs s2/tri2 & s4/tri1 vs s4/tri2).
  3. the delta features do not help and actually harm performance (see s2/tri1 vs s3/tri1 & s4/tri1).
  4. the linear LDA helps performance (see s1/tri1 vs s2/tri1).
  5. the best system is s2/tri2: no deltas, with a linear LDA applied.

Tencent exps

TODO

GPU & CPU merge

  1. the current plan is to migrate the GPU code to the CPU; coding will start this week.


L1 sparse initial training

  • experiments with L1 and L2 penalties
  1. LM: 4 GB LM, 1e-5 small LM
  2. AM: 100-hour tri4b_nn


Test Set    0      1e-06  1e-05  2.5e-05  5e-05  7.5e-05  1e-04
map        61.06  61.18  61.31  61.72    62.89  62.84    62.48
2044       49.53  49.58  49.84  49.94    50.71  51.08    51.08
notetp3    43.44  43.44  43.55  44.58    45.22  44.95    45.76
1900       38.50  38.54  39.00  39.21    40.33  40.53    40.60
general    61.24  61.22  61.69  61.84    62.71  62.83    62.93
online1    58.02  58.05  58.31  58.23    58.83  59.06    59.46
online2    53.62  53.70  54.03  53.93    54.65  54.94    55.51
speedup    57.51  57.49  57.93  58.31    59.75  60.08    59.52
  • Conclusions
  1. neither the L1 nor the L2 penalty works in the current nnet-GPU code.
  2. will check the L1 code and revise the penalty scheme.
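One candidate revision of the penalty scheme is to replace a naive subgradient term with a proximal soft-threshold applied after each SGD step, which drives small weights to exactly zero and so actually yields sparsity. A minimal sketch, assuming plain SGD; the function name and values are illustrative, not taken from the nnet-GPU code:

```python
import numpy as np

def sgd_step_l1(w, grad, lr=0.01, l1=1e-4):
    """SGD step on the data loss, followed by the L1 proximal operator.

    Soft-thresholding shrinks every weight toward zero by lr*l1 and
    clips weights smaller than that threshold to exactly zero.
    """
    w = w - lr * grad                                   # gradient step
    return np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)

w = np.array([0.5, 1e-6, -0.3, -5e-7])
w_new = sgd_step_l1(w, grad=np.zeros_like(w))
# weights below the threshold (lr*l1 = 1e-6) come out exactly zero
```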


Kaldi/HTK merge

  • HTK2Kaldi: on hold.
  • Kaldi2HTK: stuck; various sp models were tried, but none helped.
  • A thorough debugging pass is needed this week.


Embedded progress

  • Status:
  1. the first embedded demo is done; 1000 words take 3.2 MB of memory.
  2. the accuracy test is not yet finished.
  3. training an acoustic model for Sphinx.
  • To be done
  1. finish AM training
  2. run the offline test