2013-05-10

From cslt Wiki
Revision as of 07:25, 13 May 2013 (Monday) by Cslt

Data sharing

  • LM count files still undelivered!

DNN progress

Experiments

  • setups for input layer (intermediate dimensions in parentheses, final input dimension in brackets)
s1: mfcc(13), splice +-5[143]
s2: mfcc(13), splice +-5(143), LDA[143]
s3: mfcc(13), delta(39), splice +-5(429), LDA[143]
s4: mfcc(13), delta(39), splice +-5(429), LDA[300]
  • setups for alignment
tri1: triphone training, feature input: mfcc(13), delta[39]. #pdfs 1651, #gaussians 10028
tri2: LDA/MLLT training, feature input: mfcc(13), delta(39), splice +-4(351), LDA[40]. #pdfs 3536, #gaussians 39995
  • other notes
about 100 hours of training data
88k LM, biglm decoding (1e-5 / 1e-9)
GPU-based nnet training, topology in-1200-1200-1200-1200-out (four hidden layers of 1200 units each)
  • results
Test Set  fMMI  s1/tri1  s2/tri1  s3/tri1  s4/tri1  s2/tri2  s4/tri2  cpu-based (like s4/tri1)
map       -     25.38    24.47    26.16    26.20    22.85    24.27    26.45
2044      -     23.58    22.82    23.84    24.13    21.45    22.76    24.66
notetp3   -     16.08    14.89    15.92    15.97    14.89    14.79    16.14
1900      -     8.55     8.43     8.66     8.90     7.30     7.91     8.23
general   -     36.18    34.79    35.88    35.90    33.06    33.79    38.02
online1   -     34.68    33.90    33.45    33.38    32.93    32.43    33.00
online2   -     27.27    26.61    26.26    26.36    25.94    25.69    26.63
speedup   -     24.97    24.40    24.55    25.42    23.04    23.67    27.17
  • conclusion
  1. the GPU approach is comparable to the CPU approach (compare s4/tri1 with the CPU results); the GPU system is slightly better in most cases.
  2. training on the finer alignment (tri2) gives significantly better performance than training on the rougher alignment (tri1) (see s2/tri1 vs s2/tri2 and s4/tri1 vs s4/tri2).
  3. the delta features do not help and actually harm performance (see s2/tri1 vs s3/tri1 and s4/tri1).
  4. the LDA transform helps performance (see s1/tri1 vs s2/tri1).
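
The dimensions quoted in the input-layer and alignment setups follow from simple splicing arithmetic. The sketch below reproduces them, assuming that delta(39) means static, delta and delta-delta coefficients stacked together, and that splice +-k stacks k frames on each side of the centre frame. The function names are hypothetical and are not taken from any toolkit.

```python
# Sketch only: helper names are hypothetical, not part of Kaldi or any other toolkit.

def delta_dim(static_dim, orders=2):
    """Dimension after appending delta features up to the given order
    (assumption: orders=2 means static + delta + delta-delta)."""
    return static_dim * (orders + 1)


def splice_dim(frame_dim, context):
    """Dimension after stacking `context` frames on each side of the centre frame."""
    return frame_dim * (2 * context + 1)


mfcc = 13
print(splice_dim(mfcc, 5))             # s1, and s2 before LDA: 143
print(splice_dim(delta_dim(mfcc), 5))  # s3 and s4 before LDA: 429
print(splice_dim(delta_dim(mfcc), 4))  # tri2 before LDA: 351
```

The LDA step then projects the spliced vector down to the bracketed dimension (143, 300 or 40, depending on the setup).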

Tencent exps

TODO

GPU & CPU merge

TODO

L-1 sparse initial training

  • experiments on L1 penalty
Test Set  0      1.00E-06  1.00E-05  2.50E-05  5.00E-05  7.50E-05  1.00E-04
map       61.06  61.18     61.31     61.72     62.89     62.84     62.48
2044      49.53  49.58     49.84     49.94     50.71     51.08     51.08
notetp3   43.44  43.44     43.55     44.58     45.22     44.95     45.76
1900      38.50  38.54     39.00     39.21     40.33     40.53     40.60
general   61.24  61.22     61.69     61.84     62.71     62.83     62.93
online1   58.02  58.05     58.31     58.23     58.83     59.06     59.46
online2   53.62  53.70     54.03     53.93     54.65     54.94     55.51
speedup   57.51  57.49     57.93     58.31     59.75     60.08     59.52
  • TODO
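
The notes do not say how the L1 penalty is applied during training. As an illustration only, one common way to obtain sparse weights is a gradient step followed by soft-thresholding (a proximal update). The sketch below assumes that scheme; the names `soft_threshold` and `l1_sgd_step` are hypothetical and do not refer to the actual training code.

```python
# Sketch only: a generic proximal (soft-thresholding) L1 update,
# not the update rule used in the experiments above.

def soft_threshold(value, threshold):
    """Shrink `value` towards zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def l1_sgd_step(weights, grads, learning_rate, l1_penalty):
    """One SGD step on the loss gradient, then L1 shrinkage of each weight."""
    shrink = learning_rate * l1_penalty
    return [
        soft_threshold(w - learning_rate * g, shrink)
        for w, g in zip(weights, grads)
    ]


# Exaggerated penalty so the effect is visible: small weights become exactly zero.
print(l1_sgd_step([1.0, -1.0, 0.01], [0.0, 0.0, 0.0], 1.0, 0.5))
```

With penalties in the range tested above (1e-6 to 1e-4) the per-step shrinkage is tiny, so sparsity would only build up over many updates.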

To be done

TODO

Kaldi/HTK merge

  • HTK2Kaldi: on hold.
  • Kaldi2HTK: stuck. Various sp (short-pause) models were tried, but none helped.
  • To be done
  1. TODO

Embedded progress

  • Status:
  1. training an acoustic model for Sphinx
  • To be done
  1. finish AM training
  2. run test