2014-06-06
From cslt Wiki
Resource Building
- Release management has been started
Leftover questions
- Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set.
- Multi GPU training: Error encountered
- Multilanguage training
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
- DNN-GMM co-training
AM development
Sparse DNN
- GA-based block sparsity (++++++)
Noise training
- Paper writing will be started this week
GFbank
- Running into Sinovoice 8k 1400 + 100 mixture training.
- GFbank 14 xEnt iteration completed:
Model | Huawei disanpi | BJ mobile | 8k English data
FBank non-stream (17 iterations) | 22.01% | 26.63% | -
GFbank stream (14 iterations) | 22.47% | 27.52% | -
Multilingual ASR
Model | Huawei disanpi | BJ mobile | 8k English data
FBank non-stream | - | - | -
- Multilingual LM decoding
- TAG-based decoding is still problematic: decoding enters the subgraph, but the decoding results are incorrect.
- Investigate with free-loop grammar.
- Non-tag test should be conducted on both Baidu & microblog data
- Should test the 8k shujutang data on the mixture model.
Denoising & Farfield ASR
- Add artificial reverberation with various energy decay & time delay values. Plot decay vs WER and delay vs WER.
- Use more training data to do adaptation.
- Record the waveform with a single speaker & near-field microphone and run the test again.
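The reverberation item above could be prototyped roughly as follows. This is only a minimal sketch, not the group's actual tool: the function names, the Gaussian-noise tail, the 0.3 tail gain, and the exponential decay model are all assumptions made for illustration. The idea is to build a toy impulse response whose tail starts after a given time delay and fades with a given decay constant, then convolve clean speech with it; sweeping the two parameters would give the data points for the decay vs WER and delay vs WER plots.

```python
import math
import random


def synthetic_rir(sample_rate, delay_s, decay_s, length_s, seed=0):
    """Toy room impulse response (hypothetical model, for illustration only).

    A unit direct path at t=0, followed by a Gaussian-noise tail that starts
    delay_s seconds later and decays as exp(-t / decay_s).
    """
    rng = random.Random(seed)
    n = round(length_s * sample_rate)
    start = max(round(delay_s * sample_rate), 1)
    rir = [0.0] * n
    rir[0] = 1.0  # direct path
    for i in range(start, n):
        t = (i - start) / sample_rate
        # 0.3 is an arbitrary tail gain chosen for this sketch
        rir[i] = 0.3 * rng.gauss(0.0, 1.0) * math.exp(-t / decay_s)
    return rir


def convolve(signal, rir):
    """Direct-form convolution; slow but dependency-free."""
    out = [0.0] * (len(signal) + len(rir) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out


if __name__ == "__main__":
    # Sweep a few (delay, decay) settings on a short 8 kHz test tone.
    clean = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(400)]
    for delay_s, decay_s in [(0.01, 0.02), (0.01, 0.05), (0.03, 0.05)]:
        rir = synthetic_rir(8000, delay_s, decay_s, 0.1)
        reverberant = convolve(clean, rir)
        print(delay_s, decay_s, len(reverberant))
```

For real experiments an FFT-based convolution and measured or simulated room responses would be more appropriate; the pure-Python loop is only meant to keep the sketch self-contained.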
VAD
- DNN-based VAD (7.49) shows much better performance than energy-based VAD (45.74)
- Need to test small scale network (+)
- 600-800 network test
- 100 X 4 + 2 network training
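For reference, the energy-based baseline mentioned above can be sketched as a simple per-frame threshold on log energy. This is an assumed, generic form of energy VAD, not the exact system that was evaluated: the function names, the 25 ms / 10 ms framing at 8 kHz, and the -40 dB threshold are all illustrative choices.

```python
import math


def frame_energies_db(samples, frame_len, hop):
    """Mean power of each frame in dB (samples assumed scaled to [-1, 1])."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(samples, frame_len=200, hop=80, threshold_db=-40.0):
    """Mark a frame as speech when its energy exceeds a fixed threshold.

    Defaults correspond to 25 ms frames with a 10 ms hop at 8 kHz;
    the threshold is an arbitrary value chosen for this sketch.
    """
    return [e > threshold_db
            for e in frame_energies_db(samples, frame_len, hop)]


if __name__ == "__main__":
    silence = [0.0] * 400
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(400)]
    print(energy_vad(silence + tone, frame_len=200, hop=200))
```

A fixed threshold like this is sensitive to background noise level, which is one plausible reason a DNN-based detector would do better; the notes themselves do not state the cause.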
Scoring
- Collect more data with human scoring to train discriminative models
Embedded decoder
1200 X 4 + 10k AM:
Metric | 150k | 20k | 10k | 5k
WER | 42.23 | 43.45 | 44.54 | 46.07
RT | 1h31 | 48m | 44m | 43m
LM development
Domain specific LM
- Retrieve both Baidu & microblog
- Need to check into GitLab (+).
Word2Vector
- Design network spider
- Design semantic related word tree
- First version based on pattern match done
- Filter with query log
- Further refinement with Baidu Baike hierarchy
NN LM
- Character-based NNLM (6700 chars, 7gram), 500M data training done.
- Inconsistent WER patterns were found on the Tencent test sets
- Probably need to use another test set for the investigation.
- Investigate MS RNN LM training
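As an illustration of what the character-based 7-gram setup above implies for data preparation, the sketch below builds a capped character vocabulary and turns a sentence into (6-character context, next character) training pairs. It is a hypothetical sketch only: the helper names, the `<unk>` and `<s>` symbols, and the padding scheme are assumptions, and the neural network itself is not shown.

```python
def build_vocab(text, max_size):
    """Map the most frequent characters to integer ids.

    Ids 0 and 1 are reserved for the assumed <unk> and <s> symbols,
    so max_size is expected to be at least 2.
    """
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    # Most frequent first; ties broken by character for determinism.
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    vocab = {"<unk>": 0, "<s>": 1}
    for ch in ranked[:max_size - 2]:
        vocab[ch] = len(vocab)
    return vocab


def ngram_examples(sentence, vocab, order=7):
    """Return (context_ids, target_id) pairs for an n-gram style NNLM.

    Each context holds the order-1 preceding character ids, padded
    with <s> at the start of the sentence.
    """
    pad = [vocab["<s>"]] * (order - 1)
    ids = pad + [vocab.get(ch, vocab["<unk>"]) for ch in sentence]
    return [(tuple(ids[i:i + order - 1]), ids[i + order - 1])
            for i in range(len(sentence))]


if __name__ == "__main__":
    corpus = "今天天气很好"
    vocab = build_vocab(corpus, max_size=6700)
    for context, target in ngram_examples("天气好", vocab, order=7):
        print(context, target)
```

With a vocabulary cap such as the 6700 characters in the notes, any character outside the cap is mapped to the `<unk>` id rather than dropped, so sentence length and context alignment are preserved.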