2014-03-28

From cslt Wiki

Resource Building

  • The current text resources have been re-arranged and listed

Leftover questions

  • Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
  • Multi-GPU training: error encountered
  • Multilanguage training
  • Investigating LOUDS FST.
  • CLG embedded decoder plus online compiler.
  • DNN-GMM co-training

AM development

Sparse DNN

  • GA-based block sparsity
  • 88% element-sparsity: 25.09
  • 80% block-sparsity: 25.5
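
The two sparsity figures above differ in what gets zeroed: element sparsity removes individual weights, while block sparsity removes whole tiles of the weight matrix. A minimal Python sketch of the block variant follows. It ranks tiles by the sum of squared weights, which is an assumption made for illustration; the GA-based selection mentioned in the notes is not reproduced here, and the function name `block_sparsify` and the toy matrix are hypothetical.

```python
def block_sparsify(matrix, block, sparsity):
    """Zero whole (block x block) tiles until `sparsity` of the tiles are zero.

    Assumption: tiles are ranked by energy (sum of squared weights) and the
    lowest-energy tiles are dropped. This stands in for the GA-based
    selection in the notes, which is not shown here.
    """
    rows, cols = len(matrix), len(matrix[0])
    assert rows % block == 0 and cols % block == 0, "matrix must tile evenly"

    # Collect (energy, top-left row, top-left column) for every tile.
    tiles = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            energy = sum(matrix[i][j] ** 2
                         for i in range(r, r + block)
                         for j in range(c, c + block))
            tiles.append((energy, r, c))

    tiles.sort()                           # lowest energy first
    n_drop = int(sparsity * len(tiles))    # number of tiles to zero

    pruned = [row[:] for row in matrix]    # leave the input untouched
    for _, r, c in tiles[:n_drop]:
        for i in range(r, r + block):
            for j in range(c, c + block):
                pruned[i][j] = 0.0
    return pruned


# Toy 4x4 weight matrix (hypothetical values), 2x2 tiles, 50% block sparsity.
weights = [
    [0.90, -0.80, 0.01, 0.02],
    [0.70, 0.60, -0.03, 0.01],
    [0.02, 0.01, 0.50, -0.40],
    [-0.01, 0.03, 0.30, 0.60],
]
pruned = block_sparsify(weights, block=2, sparsity=0.5)
for row in pruned:
    print(row)
```

In the toy matrix the two off-diagonal tiles carry almost no energy, so they are the ones removed, and the surviving weights stay in contiguous blocks.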

Noise training

  • More experiments with no-noise
  • More experiments with additional noise types


AMR compression re-training

  • 1700h MPE adaptation
  • iter1:

amr: %WER 13.40 [ 6398 / 47753, 252 ins, 829 del, 5317 sub ]
wav: %WER 11.19 [ 5343 / 47753, 178 ins, 710 del, 4455 sub ]

  • iter2:

amr: %WER 13.31 [ 6358 / 47753, 255 ins, 798 del, 5305 sub ]
wav: %WER 11.33 [ 5409 / 47753, 180 ins, 732 del, 4497 sub ]
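
The bracketed counts in the scoring lines above fix the reported WER: the error count is insertions plus deletions plus substitutions, divided by the number of reference words. A small Python sketch that re-derives the four figures from those counts (the function name `wer_percent` is illustrative and not part of any toolkit):

```python
def wer_percent(ins, dels, subs, ref_words):
    """Word error rate in percent: (ins + del + sub) / reference words."""
    errors = ins + dels + subs
    return 100.0 * errors / ref_words


# Counts copied from the scoring lines: (ins, del, sub), 47753 reference words.
REF_WORDS = 47753
results = {
    ("iter1", "amr"): (252, 829, 5317),
    ("iter1", "wav"): (178, 710, 4455),
    ("iter2", "amr"): (255, 798, 5305),
    ("iter2", "wav"): (180, 732, 4497),
}

for (iteration, condition), counts in results.items():
    print(f"{iteration} {condition}: %WER {wer_percent(*counts, REF_WORDS):.2f}")
# iter1 amr: %WER 13.40
# iter1 wav: %WER 11.19
# iter2 amr: %WER 13.31
# iter2 wav: %WER 11.33
```

By these figures the amr/wav gap narrows from 2.21 to 1.98 points absolute between iter1 and iter2, partly because the wav WER rises slightly.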


GFbank

  • GFbank on Tencent 100h

Denoising

Word to Vector

  • LDA baseline (Sogou 1700*9 training set)
  • Memory usage exceeds 20 GB
  • Word-vector classification ongoing
  • Model based on category word-vector clustering


LM development

NN LM

  • Character-based NNLM (6700 chars, 7-gram): training on 500M data done
  • Boundary-involved char NNLM: training done
  • Investigate MS RNN LM training


Pronunciation scoring

  • 8k model delivered
  • MLP-based scoring completed


QA

FST-based matching

  • Char FST
  • Prepare FST-based QA patent

Speech QA

  • Class LM QA
  • Found that a smaller weight on the class FST gives better performance
  • It is currently very difficult to retrieve words that cannot be found by the original FST
  • Test negative weights
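
One way to read the weight observations above, assuming the weights act as costs in the tropical semiring (a path's cost is the sum of its arc costs, and the lowest total wins): an extra cost added to paths through the class FST decides whether the class path or the original-FST path is selected. The sketch below is only an illustration of that arithmetic. The path names, the cost values, and the function `best_path` are all hypothetical and do not come from the actual system.

```python
def best_path(paths, class_weight):
    """Return the name of the lowest-cost path.

    Each path is (name, cost, uses_class_fst). Paths that go through the
    class FST pay an additional `class_weight` (assumed to be a
    tropical-semiring cost, so lower totals are preferred).
    """
    def total(path):
        _name, cost, uses_class = path
        return cost + (class_weight if uses_class else 0.0)

    return min(paths, key=total)[0]


# Hypothetical competing hypotheses for one utterance.
paths = [
    ("original", 5.0, False),   # path found by the original FST
    ("class", 5.5, True),       # path that needs the class FST
]

for w in (2.0, 0.0, -1.0):
    print(w, best_path(paths, w))
# 2.0 original
# 0.0 original
# -1.0 class
```

With these toy costs the class path only wins once the class weight goes negative, which is the kind of effect the negative-weight test would probe.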