2014-05-09
From cslt Wiki
Resource Building
- Maxi onboard
- Release management should be started: Zhiyong (+)
- Blaster 0.1 & vivian 0.0 system release
Leftover questions
- Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
- Multi GPU training: Error encountered
- Multilanguage training
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
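For the asymmetric window item above, the idea of splicing a different number of past and future frames into each DNN input can be sketched as follows. This is only an illustration: the notes do not give the window sizes, so `left=2` and `right=1`, the function name `splice_frames`, and the edge-padding rule are all assumptions.

```python
def splice_frames(frames, left, right):
    """Concatenate each frame with `left` past and `right` future frames.

    Frames beyond either end of the utterance are replaced by the first
    or last frame (edge padding), which is an assumed convention here.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp index to [0, n-1]
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy input: 4 frames of 1-dimensional features (hypothetical values).
frames = [[0.0], [1.0], [2.0], [3.0]]

# Asymmetric context: 2 frames of history, 1 frame of look-ahead.
asymmetric = splice_frames(frames, left=2, right=1)
```

A symmetric window is the special case `left == right`; shrinking `right` reduces look-ahead latency at decode time.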
AM development
Sparse DNN
- GA-based block sparsity (++)
- Found a paper in 2000 with similar ideas.
- Try to get a student working on high performance computing to do the optimization
Noise training
- With-clean training done. Much better on clean testing
- Experiments done. Prepare paper.
GFbank
- GFBank sinovoice 1400 MPE stream
- GFBank sinovoice 6000 MPE stream
Multilingual ASR
- MPE-based training is not very sensitive to data imbalance for English & Chinese
- Data duplication can trade off the performance of the two languages
- Test sharing schemes
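The data duplication mentioned above might look like the following minimal sketch, which repeats the utterances of one language a whole number of times before pooling the training data. The language tags, utterance IDs, and the `duplicate_data` helper are hypothetical; the notes do not say which language was duplicated or by how much.

```python
def duplicate_data(utterances_by_language, copies):
    """Pool training utterances, repeating each language `copies[lang]` times.

    Languages missing from `copies` are included exactly once.
    """
    pool = []
    for language, utterances in utterances_by_language.items():
        pool.extend(utterances * copies.get(language, 1))
    return pool


# Hypothetical imbalanced corpus: more Chinese than English utterances.
data = {
    "zh": ["zh_utt1", "zh_utt2", "zh_utt3", "zh_utt4"],
    "en": ["en_utt1", "en_utt2"],
}

# Duplicate the English data once so both languages contribute equally.
balanced = duplicate_data(data, {"en": 2})
```

Raising or lowering the copy count for one language shifts the balance of the pooled data toward or away from it.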
Denoising & Farfield ASR
- Baseline: close-talk model decoding far-field speech: 92.65
- Will investigate DAE model.
VAD
- VAD bug fixed???
- Test frame VAD accuracy
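One simple way to measure frame-level VAD accuracy is the fraction of frames whose speech/non-speech decision matches a reference labelling. The sketch below assumes labels of 1 for speech and 0 for non-speech; the function name and the toy labels are illustrative, not taken from the actual test.

```python
def frame_vad_accuracy(reference, hypothesis):
    """Return the fraction of frames where the VAD decision matches the reference."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")
    if not reference:
        raise ValueError("at least one frame is required")
    correct = sum(1 for r, h in zip(reference, hypothesis) if r == h)
    return correct / len(reference)


# Hypothetical per-frame labels: 1 = speech, 0 = non-speech.
ref = [0, 0, 1, 1, 1, 0, 0, 1]
hyp = [0, 1, 1, 1, 0, 0, 0, 1]

accuracy = frame_vad_accuracy(ref, hyp)  # 6 of 8 frames agree
```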
Scoring
- Phone-sequence based graph decoding done
- Online scoring ongoing
Word to Vector
- Paper writing
LM development
NN LM
- Character-based NNLM (6700 chars, 7gram), 500M data training done.
- Inconsistent WER patterns were found on the Tencent test sets
- Probably need another test set for the investigation.
- Investigate MS RNN LM training
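For the character-based NNLM above, a 7-gram model predicts each character from the 6 characters before it. The sketch below shows how such (history, target) training pairs could be extracted from a sentence. The padding symbol `<s>` and the helper name are assumptions, and the usage line uses `order=3` only to keep the example short.

```python
def char_ngram_examples(text, order=7, pad="<s>"):
    """Build (history, target) pairs for a character-level n-gram NNLM.

    Each target character is paired with the `order - 1` characters that
    precede it; the start of the sentence is padded with `pad`.
    """
    history_len = order - 1
    chars = [pad] * history_len + list(text)
    examples = []
    for i in range(history_len, len(chars)):
        history = tuple(chars[i - history_len:i])
        examples.append((history, chars[i]))
    return examples


# Short illustration with a trigram context instead of the 7-gram in the notes.
examples = char_ngram_examples("语音识别", order=3)
```

Working on characters keeps the output layer small (about 6700 classes in the notes) compared with a word-level vocabulary.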
QA
FST-based matching
- Word-based FST matching takes 1-2 seconds with 1600 patterns; Huilan's implementation takes <1 second.
- THRAX toolkit for grammar to FST
- Investigate determinization of G embedding
- Refer to Kaldi new code