Difference between revisions of "2014-07-05"

From cslt Wiki

Revision as of 00:16, 9 July 2014

Resource Building

Leftover questions

  • Asymmetric window: large improvement on the training set (WER 34% → 24%), but the improvement is lost on the test set.
  • Multi-GPU training: error encountered.
  • Multilingual training.
  • Investigating LOUDS FST.
  • CLG embedded decoder plus online compiler.
  • DNN-GMM co-training.

AM development

Sparse DNN

  • GA-based block sparsity (+++++++++)

Noise training

  • Journal paper writing ongoing.

Multilingual ASR

WER (%) by acoustic model (rows) and test set (columns):

      AM\testset     |JS27H_100| JS_126H |  JS_2h  |ShanXi_2h|ShaanXi2h| Hubei2h |   ENG   |
 Tel201406.v1.0.S    |         |         |    -    |    -    |    -    |    -    |    -    |
 Tel201406.v1.1.S    |    -    |         |    -    |    -    |    -    |    -    |    -    |
 Tel201406.HW.v2.0.B |  20.51  |  18.30  |  17.61  |  24.18  |  23.04  |  22.51  |  56.26  |
 Tel201406.HW.v2.0.S |  20.07  |  17.80  |  17.75  |  23.79  |  22.44  |  22.53  |  36.77  |
 Tel201406.HW.v2.1.B |  19.24  |  16.53  |  17.09  |  24.35  |  22.29  |  22.89  |  55.74  |
 Tel201406.HW.v2.1.S |  19.48  |  16.81  |  17.68  |  24.56  |  23.02  |  23.58  |  44.69  |
  • v1.*: no English words involved.
  • v2.*: with English words involved.
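The numbers above are word error rates. As a reference, WER is the edit distance (substitutions + deletions + insertions) between hypothesis and reference word sequences, divided by the reference length. A minimal sketch (the function name is illustrative, not from this report):

```python
def wer(ref, hyp):
    """Word error rate in percent: (S + D + I) / len(ref) * 100,
    computed via Levenshtein distance over word sequences."""
    ref, hyp = ref.split(), hyp.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return 100.0 * dp[len(ref)][len(hyp)] / len(ref)
```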

Denoising & Farfield ASR

  • Reverberant data delivered
  • Global CMN-based spectrum checking done. The signal/feature transform with DNN does not seem to be a very reasonable approach here.
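Global CMN subtracts the per-dimension feature mean computed over the whole corpus, removing a stationary channel offset. A minimal NumPy sketch under that standard definition (the report does not give its implementation):

```python
import numpy as np

def global_cmn(feats):
    """Global cepstral mean normalization: subtract the per-dimension
    mean of the whole (frames x dims) feature matrix."""
    return feats - feats.mean(axis=0, keepdims=True)
```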

VAD

  • Waiting for engineering work

Scoring

  • Refined the acoustic model with the AMIDA database; the problem was solved by training on both WSJ and AMIDA.


Embedded decoder

  • WER vs RT vs graph size done.
  • The first delivery is Emb201407_BG_v0.0.
  • Demo done


LM development

Domain specific LM construction

Mixture LM

  • TAG model: tag analysis on the 127h HuaWei data done.
  • Performance of the NUM-tagged model is under testing.

Word2Vector

W2V based doc classification

  • Good performance obtained with SSA (semantic space allocation): train a general GMM, then represent each document as the vector of its GMM component weights.
  • APSIPA paper submitted
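One way to read the SSA representation above: score each word vector of a document against a pre-trained GMM and average the posterior component responsibilities into a fixed-length document vector. A NumPy sketch under that reading (all names are illustrative; a diagonal-covariance GMM is assumed):

```python
import numpy as np

def ssa_doc_vector(word_vecs, means, variances, weights):
    """Document vector = mean posterior responsibility of each Gaussian
    over the document's word vectors.
    word_vecs: (n, d); means, variances: (k, d); weights: (k,)."""
    x = word_vecs[:, None, :]                                # (n, 1, d)
    # log of diagonal Gaussian density for every (word, component) pair
    log_g = -0.5 * (((x - means) ** 2) / variances
                    + np.log(2 * np.pi * variances)).sum(-1)  # (n, k)
    log_p = np.log(weights) + log_g
    log_p -= log_p.max(axis=1, keepdims=True)                 # stabilize
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)                   # responsibilities
    return post.mean(axis=0)                                  # (k,) doc vector
```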

Semantic word tree

  • Version v2.0 released (filtered with the query log).
  • Please deliver to /nfs/disk/perm/data/corpora/semanticTree (Xingchao).
  • Version v3.0 ongoing: further refinement with the Baidu Baike hierarchy.


NN LM

  • Character-based NNLM (6700 characters, 7-gram): training on 500M data done.
  • Inconsistent WER patterns were found on the Tencent test sets; probably another test set is needed for investigation.
  • Investigate MS RNN LM training.
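A 7-gram character NNLM predicts each character from the preceding six. A minimal sketch of building such training pairs, padding the start of the text with a begin-of-sentence token (function name and pad symbol are hypothetical):

```python
def char_ngram_examples(text, order=7, pad="<s>"):
    """Yield (history, target) training pairs for a character n-gram
    NNLM: each pair is the previous (order-1) characters plus the
    character to predict."""
    hist = [pad] * (order - 1)
    pairs = []
    for ch in text:
        pairs.append((tuple(hist), ch))
        hist = hist[1:] + [ch]  # slide the fixed-length history window
    return pairs
```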