2014-02-14
AM development
Sparse DNN
- Optimal Brain Damage (OBD).
- GA-based block sparsity
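The report does not say how OBD was applied here. As a rough illustration only, the standard OBD criterion ranks each weight by the saliency s_i = 0.5 * h_ii * w_i^2, where h_ii is the diagonal of the Hessian of the training loss, and removes the weights with the lowest saliency. The function name `obd_prune`, the flat weight list, and the `prune_fraction` parameter below are hypothetical, and the Hessian diagonal is assumed to be computed elsewhere.

```python
def obd_prune(weights, hessian_diag, prune_fraction):
    """Zero out the lowest-saliency weights (illustrative OBD-style pruning).

    weights        -- flat list of weight values
    hessian_diag   -- matching list of diagonal Hessian entries (assumed given)
    prune_fraction -- fraction of weights to remove, between 0 and 1
    """
    if len(weights) != len(hessian_diag):
        raise ValueError("weights and hessian_diag must have the same length")

    # OBD saliency: estimated increase in loss if the weight is set to zero.
    saliencies = [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]

    # Indices sorted from least to most salient.
    order = sorted(range(len(weights)), key=lambda i: saliencies[i])
    n_prune = int(len(weights) * prune_fraction)

    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned, saliencies


# Toy values, not taken from the report.
toy_weights = [1.0, -2.0, 0.5, 3.0]
toy_hessian = [0.1, 0.1, 4.0, 0.01]
toy_pruned, toy_saliencies = obd_prune(toy_weights, toy_hessian, 0.5)
```

In the toy values the largest-magnitude weight is removed because its curvature is small, which is the main difference from plain magnitude-based pruning.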
Efficient DNN training
- Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
Multilanguage training
- Pure Chinese training reached 4.9%
- Chinese + English reduced to 7.9%
- The English phone set should distinguish between beginning phones and ending phones
- Should set up a multilingual network structure that shares the low layers but separates the languages at the high layers
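The proposed structure (shared low layers, language-specific high layers) can be sketched as a forward pass. This is a minimal illustration, not the group's implementation: the class name `SharedHiddenDNN`, the layer sizes, the sigmoid activations, and the random initialization are all assumptions.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class SharedHiddenDNN:
    """Low layers shared by all languages, one private top stack per language."""

    def __init__(self, input_dim, shared_dims, private_dim, output_dims, seed=0):
        rng = random.Random(seed)
        self.shared = []
        prev = input_dim
        for dim in shared_dims:
            self.shared.append(init_layer(rng, prev, dim))
            prev = dim
        # output_dims maps a language tag to its number of output targets.
        self.private = {}
        for lang, n_out in output_dims.items():
            hidden = init_layer(rng, prev, private_dim)
            output = init_layer(rng, private_dim, n_out)
            self.private[lang] = (hidden, output)

    def shared_features(self, x):
        for weights, bias in self.shared:
            x = sigmoid(dense(x, weights, bias))
        return x

    def forward(self, x, lang):
        h = self.shared_features(x)
        (hw, hb), (ow, ob) = self.private[lang]
        h = sigmoid(dense(h, hw, hb))
        return softmax(dense(h, ow, ob))


# Hypothetical sizes: 4-dim input, two shared layers, 6 Chinese and 7 English targets.
net = SharedHiddenDNN(4, [5, 5], 3, {"zh": 6, "en": 7})
frame = [0.1, 0.2, 0.3, 0.4]
zh_posteriors = net.forward(frame, "zh")
en_posteriors = net.forward(frame, "en")
```

During training, a frame from one language would update the shared layers and only that language's private layers, so each language keeps its own output targets while the low layers see data from both.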
Engine optimization
- Decoder RT dropped below 0.2 with HCLG + MKL + icc
- Investigating LOUDS FST.
Adaptation
- Using a linear hidden transform reduces WER from 14% to 11%.
Word to Vector
- Tested a training toolkit from Stanford University, which can incorporate global information into word2vector training
- C++ implementation (instead of python) for data pre-processing
- Ready for training 100M data
- Ready for training word sense
- Investigating the Senna toolkit from NEC, with the intention of implementing POS tagging based on word vectors.
LM development
NN LM
- Word-based and character-based NNLM using Google word2vector completed
- Character-based NNLM completed (6000 characters, 7-gram)
3T Sogou LM
- Split the data into 24 subsets, train a 3-gram LM for each subset, and prune with a threshold of 1e-9
- Merge completed with equal weights
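The report does not state how the merge was carried out. Assuming it is a linear interpolation of the 24 sub-models with equal weights, the merged probability is P(w | h) = (1/24) * sum_i P_i(w | h). The sketch below shows that arithmetic only; `interpolate` and `table_lm` are hypothetical helpers, and the dictionary lookup with a fixed floor is a toy stand-in for a real back-off n-gram model.

```python
def interpolate(prob_fns, word, history, weights=None):
    """Linearly interpolate P(word | history) across several language models.

    prob_fns -- list of callables, each returning P(word | history)
    weights  -- interpolation weights; defaults to equal weights
    """
    if weights is None:
        weights = [1.0 / len(prob_fns)] * len(prob_fns)
    if len(weights) != len(prob_fns):
        raise ValueError("need exactly one weight per model")
    return sum(w * fn(word, history) for w, fn in zip(weights, prob_fns))


def table_lm(table, floor=1e-7):
    """Toy sub-model: look up P(word | history) in a dict, else return a floor.

    A real n-gram model would back off to shorter histories instead.
    """
    def prob(word, history):
        return table.get((history, word), floor)
    return prob


# Two toy sub-models with made-up probabilities (the report uses 24 models).
lm_a = table_lm({(("the",), "cat"): 0.2})
lm_b = table_lm({(("the",), "cat"): 0.4})
merged_prob = interpolate([lm_a, lm_b], "cat", ("the",))
```

Equal weights are the simplest choice; weights tuned on held-out data would use the same function with an explicit `weights` list.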
Embedded development
- CLG embedded decoder is almost done. Online compiler is in progress.
- Zhiyong is working on layer-by-layer DNN training.
Speech QA
- Use N-best results to expand matching in QA. Better performance was obtained.
- 1-best matches 96/121
- 10-best matches 102/121
- Use N-best to recover errors in entity check.
- Design a non-entity pattern to discover the possible place of an entity
- By this position range, search entities within the N-best result
- Use Pinyin to recover errors in entity check. Future work.
- Design a non-entity pattern to discover the possible place of an entity (as above)
- Match the Pinyin strings of all the entities, and then match the Pinyin strings with the entity Pinyin
- Keep the best-matched entity based on Pinyin, subject to a threshold
- A bit worse than the original test.
- A possible problem is that the LM is too strong, which leads to unmatched Pinyin strings in the acoustic space
- Liu Rong will provide a weak LM to support the research.
- Investigate some errors in entity-based LM.
- Still some errors exist
- Running entity-based LM with a small entity list
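The two entity-recovery ideas in the Speech QA notes can be sketched as follows. This is a simplified illustration under several assumptions: the span where an entity is expected has already been located by the non-entity pattern, Pinyin is compared as plain strings with a normalized edit-distance score, and the threshold of 0.7 is arbitrary. The function names and the toy entity list are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def find_entity_in_nbest(nbest, entities):
    """Return (entity, rank) for the first hypothesis containing a known entity."""
    for rank, hypothesis in enumerate(nbest):
        for entity in entities:
            if entity in hypothesis:
                return entity, rank
    return None


def best_pinyin_entity(span_pinyin, entity_pinyin, threshold=0.7):
    """Pick the entity whose Pinyin is closest to the span, if close enough.

    span_pinyin   -- Pinyin string of the span where an entity is expected
    entity_pinyin -- dict mapping entity name to its Pinyin string
    """
    best_name, best_score = None, 0.0
    for name, pinyin in entity_pinyin.items():
        longest = max(len(span_pinyin), len(pinyin), 1)
        score = 1.0 - edit_distance(span_pinyin, pinyin) / longest
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Toy entity list, not from the report.
toy_entities = {"Beijing": "beijing", "Nanjing": "nanjing"}
recovered = best_pinyin_entity("beijin", toy_entities)
rejected = best_pinyin_entity("shanghai", toy_entities)
```

The threshold rejects spans that resemble no known entity, so a poor recognition result is left unchanged rather than forced onto the nearest entry.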