2014-07-18

From cslt Wiki
Revision as of 01:54, 18 July 2014 (Friday) by Cslt


Resource Building

Leftover questions

  • Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set.
  • Multi-GPU training: error encountered.
  • Multilanguage training
  • Investigating LOUDS FST.
  • CLG embedded decoder plus online compiler.
  • DNN-GMM co-training
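
The asymmetric window item refers to feeding the network an unequal number of past and future context frames. A minimal sketch of such frame splicing follows; the function name `splice_frames`, the context sizes (3 left, 1 right), and the edge handling by repeating boundary frames are illustrative assumptions, since the report does not state the actual configuration.

```python
def splice_frames(frames, left, right):
    """Stack each frame with `left` past and `right` future frames.

    Frames near the edges are padded by repeating the first or last frame.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to a valid index
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy 1-dimensional features; left=3 and right=1 are hypothetical values.
feats = [[0.0], [1.0], [2.0], [3.0], [4.0]]
out = splice_frames(feats, left=3, right=1)
```

With a shorter right context the decoder needs fewer look-ahead frames, which is one common motivation for an asymmetric window.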

AM development

Sparse DNN

  • GA-based block sparsity (++++++++++)

Noise training

  • Journal paper writing is ongoing

Multilingual ASR


LM = Tel201406.HW.v2.1.1

     AM\testset    |JS27H_100|  JS_2h  |ShanXi_2h|ShaanXi2h|Unknown2h|   ENG   |
 Tel201406.v1.0.S  |         |    -    |    -    |    -    |    -    |    -    |
 Tel201406.v1.1.S  |    -    |    -    |    -    |    -    |    -    |    -    |
Tel201406.HW.v2.0.B|  20.18  |  17.49  |  23.85  |  22.81  |  22.48  |  55.06  |
Tel201406.HW.v2.0.S|  19.95  |  17.74  |  23.73  |  22.36  |  22.49  |  37.63  |
Tel201406.HW.v2.1.B|  19.14  |  16.97  |  24.26  |  22.28  |  22.97  |  55.35  |
Tel201406.HW.v2.1.S|  19.44  |  17.62  |  24.49  |  23.06  |  23.60  |  44.81  |
  • v1.*: no English words involved.
  • v2.*: with English words involved.
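
Assuming the figures in the table are word error rates in percent (the table does not label its unit), the sketch below shows how such a rate is conventionally computed from a reference and a hypothesis transcript using edit distance. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration; a Chinese test set may instead be scored per character.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        cur = [i] + [0] * len(hyp)
        for j in range(1, len(hyp) + 1):
            substitution = prev[j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur[j] = min(substitution, deletion, insertion)
        prev = cur
    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
example = word_error_rate("a b c d", "a x c")
```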

Denoising & Farfield ASR

  • Sparse linear prediction. Need to correct the model.
  • Use xEnt (cross entropy) as the adaptation objective, instead of MSE-based feature mapping
  • Use the simulation tool to add reverberation.
  • [1]
  • Investigate the impact of speech rate. Use Tencent 200h data to conduct the experiments.
  • Investigate the correlation between phone speed & entropy.
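
Adding reverberation by simulation usually amounts to convolving clean speech with a room impulse response. The sketch below is not the simulation tool mentioned in the list; it assumes a synthetic impulse response made of exponentially decaying Gaussian noise, and the names `synthetic_rir` and `convolve` as well as the decay value are hypothetical.

```python
import math
import random


def synthetic_rir(length, decay, seed=0):
    """Build a toy impulse response: decaying noise with a unit direct path."""
    rng = random.Random(seed)
    rir = [rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    rir[0] = 1.0  # direct path arrives first with unit gain
    return rir


def convolve(signal, kernel):
    """Full linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


# Toy "dry" signal; real use would load audio samples instead.
dry = [1.0, 0.0, 0.0, 0.5]
wet = convolve(dry, synthetic_rir(length=8, decay=0.5))
```

A larger `decay` shortens the simulated reverberation tail; measured impulse responses can be substituted for the synthetic one without changing `convolve`.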

VAD

  • Waiting for engineering work

Scoring

  • Refine the acoustic model with the AMIDA database. The problem was solved by involving both WSJ and AMIDA.
  • Model ready for picking up


Embedded decoder

  • The first delivery is Emb201407_BG_v0.0
  • Train two smaller networks: 500x4+600, 400x4+500

LM development

Domain specific LM

Domain specific LM construction

TAG LM

  • Some problems with the tagging: all numbers are tagged.

Chatting LM

  • Building chatting lexicon


Word2Vector

W2V based doc classification

  • Initial results with the variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.

Semantic word tree

  • Version v2.0 released (filter with query log)
  • Please deliver to /nfs/disk/perm/data/corpora/semanticTree (Xingchao)
  • Version v3.0 in progress. Further refinement with the Baidu Baike hierarchy


NN LM

  • Character-based NNLM (6700 chars, 7gram), 500M data training done.
  • Inconsistent patterns in WER were found on the Tencent test sets
  • Probably need to use another test set for the investigation.
  • Investigate MS RNN LM training
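
For the character-based 7-gram NNLM, each training example pairs a history of six characters with the character to predict. A minimal sketch of that data preparation follows; the function name `char_ngram_examples` and the `<s>` start-of-sentence padding symbol are assumptions, as the report does not describe the actual pipeline.

```python
def char_ngram_examples(text, order=7, bos="<s>"):
    """Return (history, target) pairs with order-1 characters of history."""
    history = order - 1
    padded = [bos] * history + list(text)
    return [
        (tuple(padded[i:i + history]), padded[i + history])
        for i in range(len(text))
    ]


# Small order for readability; the report uses order 7.
trigram_pairs = char_ngram_examples("abc", order=3)
# Default order of 7 on a short Chinese string.
pairs = char_ngram_examples("语音识别")
```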

Speaker ID

  • Reading materials
  • Preparing to run SRE08

Translation

  • Initial version released
  • Collecting more data (Xinhua parallel text, Bible, named entities) for the second version