2014-06-06


Resource Building

  • Release management has been started

Leftover questions

  • Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set.
  • Multi-GPU training: error encountered.
  • Multilanguage training
  • Investigating LOUDS FST.
  • CLG embedded decoder plus online compiler.
  • DNN-GMM co-training

AM development

Sparse DNN

  • GA-based block sparsity (++++++)

Noise training

  • Paper writing will be started this week

GFbank

  • Running into Sinovoice 8k 1400 + 100 mixture training.
  • GFbank 14 xEnt iteration completed:
                                  Huawei disanpi     BJ mobile   8k English data
FBank non-stream (17 iteration)   22.01%             26.63%      -
GFbank stream (14 iteration)      22.47%             27.52%      -

Multilingual ASR

                                  Huawei disanpi     BJ mobile   8k English data
FBank non-stream                  -                  -           -

  • Multilingual LM decoding
  • TAG-based decoding is still problematic: decoding enters the subgraph, but the decoding results are incorrect.
  • Investigate with free-loop grammar.
  • Non-tag test should be conducted on both Baidu & microblog data
  • Should test the 8k shujutang data on the mixture model.


Denoising & Farfield ASR

  • Add artificial reverberation with various energy decays & time delays. Plot decay vs. WER and delay vs. WER.
  • Use more training data to do adaptation.
  • Record the wave with a single speaker & near-field microphone and do test again.
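
The artificial reverberation item above could be prototyped roughly as follows. This is only a minimal sketch, not the group's actual tooling: it assumes a toy impulse response made of a direct path plus a delayed, exponentially decaying noise tail, and the names `synthetic_rir`, `add_reverb`, `delay_s` and `rt60_s` are hypothetical.

```python
import numpy as np

def synthetic_rir(sample_rate, delay_s, rt60_s, seed=0):
    """Toy room impulse response (assumed model, not a measured one).

    delay_s : time before the reverberant tail starts (the "time delay")
    rt60_s  : time for the tail to decay by 60 dB (the "energy decay")
    """
    rng = np.random.default_rng(seed)
    n = int(round((delay_s + rt60_s) * sample_rate))
    d = int(round(delay_s * sample_rate))
    rir = np.zeros(n)
    rir[0] = 1.0  # direct path
    t = np.arange(n - d) / sample_rate
    # Amplitude envelope that reaches -60 dB at t = rt60_s.
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    rir[d:] += 0.5 * rng.standard_normal(n - d) * envelope
    return rir

def add_reverb(signal, rir):
    """Convolve a clean signal with the impulse response and peak-normalize."""
    wet = np.convolve(signal, rir)[: len(signal)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet
```

Sweeping `rt60_s` with `delay_s` fixed (and vice versa), then decoding each reverberated test set, would give the points for the decay vs. WER and delay vs. WER plots.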

VAD

  • DNN-based VAD (7.49) shows much better performance than energy-based VAD (45.74)
  • Need to test small scale network (+)
  • 600-800 network test
  • 100 X 4 + 2 network training
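
For reference, an energy-based VAD baseline of the kind compared above can be sketched in a few lines. This is an illustrative assumption, not the baseline actually used in these experiments: the function name `energy_vad`, the frame sizes and the `threshold_db` default are all hypothetical.

```python
import numpy as np

def energy_vad(signal, sample_rate, frame_ms=25, hop_ms=10, threshold_db=-35.0):
    """Mark a frame as speech when its energy is within threshold_db of the loudest frame."""
    frame = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    if len(signal) < frame:
        return np.zeros(0, dtype=bool)
    n_frames = 1 + (len(signal) - frame) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        chunk = signal[i * hop : i * hop + frame]
        energies[i] = np.mean(chunk ** 2)
    ref = energies.max()
    if ref <= 0:
        # All-silent input: no frame is speech.
        return np.zeros(n_frames, dtype=bool)
    # Frame energy in dB relative to the loudest frame; the small constant avoids log(0).
    db = 10.0 * np.log10((energies + 1e-12) / ref)
    return db > threshold_db
```

A fixed relative threshold like this has no model of noise, which is one plausible reason such a baseline degrades on noisy recordings where a trained DNN classifier does not.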

Scoring

  • Collect more data with human scoring to train discriminative models


Embedded decoder

1200 X 4 + 10k AM:

       150k       20k     10k      5k
WER    42.23      43.45   44.54    46.07
RT     1h31       48m     44m      43m

LM development

Domain specific LM

  • Retrieve both Baidu & microblog
  • Need to check into GitLab (+).


Word2Vector

  • Design network spider
  • Design semantic related word tree
  • First version based on pattern match done
  • Filter with query log
  • Further refinement with Baidu Baike hierarchy


NN LM

  • Character-based NNLM (6700 chars, 7gram), 500M data training done.
  • Inconsistent WER patterns were found on the Tencent test sets
  • Probably need to use another test set for the investigation.
  • Investigate MS RNN LM training