2014-03-21
Resource Building
- The current text resources have been re-arranged and listed
Leftover questions
- Asymmetric window: great improvement on the training set (WER 34% → 24%), but the improvement is lost on the test set. Overfitting? (See the sketch after this list.)
- Multi-GPU training: error encountered
- Multilingual training
- Investigating LOUDS FST.
- CLG embedded decoder plus online compiler.
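For reference, a minimal numpy sketch of the asymmetric-window idea: a long rising half-Hann over the past context joined to a short falling half-Hann over the future. The 300/100 split and the Hann shape are illustrative assumptions, not the window actually used in the experiment.
<pre>
import numpy as np

def asymmetric_window(n_left, n_right):
    # Rising half-Hann over the past n_left samples, falling half-Hann
    # over the future n_right samples. (Hypothetical shape: the actual
    # window used in the experiment is not recorded here.)
    left = np.hanning(2 * n_left)[:n_left]
    right = np.hanning(2 * n_right)[n_right:]
    return np.concatenate([left, right])

# Example: 400-sample frame (25 ms at 16 kHz), skewed toward past context.
w = asymmetric_window(n_left=300, n_right=100)
frame = np.random.randn(400)
windowed = frame * w
</pre>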
AM development
Sparse DNN
- GA-based block sparsity
  - Code ready; testing on pure matrix multiplication
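A rough sketch of what GA-based block sparsity can look like: evolve a binary mask over fixed-size blocks of a weight matrix, with fitness given by the reconstruction error of the masked product on a sample batch. All sizes and GA settings below are illustrative assumptions, not the actual implementation.
<pre>
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))      # dense weight matrix to sparsify
X = rng.standard_normal((128, 64))     # sample batch for the fitness proxy
B = 8                                  # block size; mask is (64/B, 64/B)
TARGET = 0.5                           # fraction of blocks kept

def apply_mask(W, mask):
    # Expand the block mask to element level; zero out dropped blocks.
    return W * np.kron(mask, np.ones((B, B)))

def fitness(mask):
    # Lower reconstruction error of X @ W => better mask.
    return -np.linalg.norm(X @ W - X @ apply_mask(W, mask))

def random_mask():
    return (rng.random((64 // B, 64 // B)) < TARGET).astype(float)

pop = [random_mask() for _ in range(20)]
for gen in range(50):
    pop.sort(key=fitness, reverse=True)
    survivors = pop[:10]
    children = []
    for _ in range(10):
        a, b = rng.choice(10, size=2, replace=False)
        cross = rng.random(survivors[0].shape) < 0.5   # uniform crossover
        child = np.where(cross, survivors[a], survivors[b])
        flip = rng.random(child.shape) < 0.02          # mutation
        child = np.abs(child - flip.astype(float))     # flip selected bits
        children.append(child)
    pop = survivors + children

best = max(pop, key=fitness)
print("fraction of blocks kept:", best.mean())
</pre>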
GMM/DNN co-training
- Co-training using Tencent data
  - Slightly better GMM modeling when using the DNN alignment
  - Worse performance when using the re-trained GMMs
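A toy illustration of the direction that helped (re-estimating GMM parameters from a DNN frame alignment), using hard argmax alignment and one diagonal Gaussian per state; the real recipe (full mixture models, Baum-Welch statistics) is more involved.
<pre>
import numpy as np

def realign_and_update(feats, dnn_post, n_states, floor=1e-3):
    # feats: (T, D) frames; dnn_post: (T, n_states) DNN state posteriors.
    # Hard-assign each frame to its best DNN state, then re-estimate a
    # diagonal Gaussian per state from the new alignment.
    align = dnn_post.argmax(axis=1)            # DNN-derived alignment
    D = feats.shape[1]
    means, vars_ = np.zeros((n_states, D)), np.ones((n_states, D))
    for s in range(n_states):
        sel = feats[align == s]
        if len(sel) > 0:
            means[s] = sel.mean(axis=0)
            vars_[s] = np.maximum(sel.var(axis=0), floor)  # variance floor
    return means, vars_

# Toy usage with random stand-ins for real features and posteriors.
T, D, S = 1000, 13, 5
means, vars_ = realign_and_update(np.random.randn(T, D),
                                  np.random.dirichlet(np.ones(S), size=T), S)
</pre>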
Noise training
- Single-noise injection
- Multi-noise injection
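A minimal sketch of additive noise injection at a target SNR; "multi-noise" then just means drawing a different noise source per utterance. The mixing rule is standard; the SNR range and noise inventory used in the experiments are not recorded here.
<pre>
import numpy as np

def add_noise(speech, noise, snr_db):
    # Mix noise into speech at the given SNR (both 1-D float arrays).
    if len(noise) < len(speech):                  # loop the noise if short
        noise = np.tile(noise, len(speech) // len(noise) + 1)
    noise = noise[:len(speech)]
    p_s = np.mean(speech ** 2)
    p_n = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(p_s / (p_n * 10 ** (snr_db / 10.0)))
    return speech + scale * noise

# Single-noise injection: one fixed noise source for all utterances.
# Multi-noise injection: draw the source per utterance, e.g.
#   noise = noises[np.random.randint(len(noises))]
</pre>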
AMR compression re-training
- 1700h AMR training ongoing
GFbank
- gfbank is better than gfcc
- gfbank is better than fbank
- gfbank + fbank seems to outperform both
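For reference, a compact sketch of gfbank-style features: log frame energies of a gammatone filterbank with ERB-spaced center frequencies. It follows the textbook gammatone/ERB definitions and is not necessarily the exact implementation used; frame sizes assume 16 kHz audio.
<pre>
import numpy as np
from scipy.signal import fftconvolve

def erb(f):                     # equivalent rectangular bandwidth (Hz)
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_ir(fc, fs, dur=0.04, order=4):
    t = np.arange(int(dur * fs)) / fs
    ir = (t ** (order - 1) * np.exp(-2 * np.pi * 1.019 * erb(fc) * t)
          * np.cos(2 * np.pi * fc * t))
    return ir / np.sqrt(np.sum(ir ** 2))          # unit-energy impulse response

def gfbank(signal, fs=16000, n_chan=24, fmin=100.0, frame=400, hop=160):
    # ERB-rate-spaced center frequencies between fmin and fs/2.
    erb_lo = 21.4 * np.log10(4.37e-3 * fmin + 1)
    erb_hi = 21.4 * np.log10(4.37e-3 * (fs / 2) + 1)
    fcs = (10 ** (np.linspace(erb_lo, erb_hi, n_chan) / 21.4) - 1) / 4.37e-3
    feats = []
    for fc in fcs:
        y = fftconvolve(signal, gammatone_ir(fc, fs), mode="same")
        n_frames = 1 + (len(y) - frame) // hop
        e = [np.log(np.sum(y[i * hop:i * hop + frame] ** 2) + 1e-10)
             for i in range(n_frames)]             # frame log energies
        feats.append(e)
    return np.array(feats).T                       # (frames, channels)

# "gfbank + fbank" is then just per-frame concatenation of the two
# feature matrices before training.
</pre>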
Word to Vector
- Data preparation
  - Prepared 7-category data, 500+ articles in total
  - Prepared Sogou 9-class text, 9×2000 articles in total
  - Obtained Fudan 11-class text data, for testing only
- Improving word vectors with multiple senses (see the first sketch after this list)
  - Almost impossible with the toolkit
  - Could pre-train vectors and then do clustering
- Word-vector-based keyword extraction (see the second sketch after this list)
  - Decided to use the Sogou data for extraction
  - Evaluate the extracted keywords in the classification task
- Word-vector-based classification
  - Decided to use the Sogou data
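First, a sketch of the "pre-train then cluster" idea for multi-sense vectors: represent each occurrence of a word by the mean vector of its context words, cluster the occurrences, and treat each cluster centroid as one sense. It assumes pre-trained vectors in a plain dict and uses sklearn's KMeans; this illustrates the idea, not a worked-out method.
<pre>
import numpy as np
from sklearn.cluster import KMeans

def sense_vectors(corpus, word, vec, window=5, n_senses=2):
    # corpus: list of token lists; vec: dict token -> np.ndarray.
    # Returns one centroid per induced sense of `word`.
    contexts = []
    for sent in corpus:
        for i, tok in enumerate(sent):
            if tok != word:
                continue
            ctx = sent[max(0, i - window):i] + sent[i + 1:i + 1 + window]
            ctx_vecs = [vec[w] for w in ctx if w in vec]
            if ctx_vecs:
                contexts.append(np.mean(ctx_vecs, axis=0))
    km = KMeans(n_clusters=n_senses, n_init=10).fit(np.array(contexts))
    return km.cluster_centers_
</pre>
Second, one simple baseline for word-vector-based keyword extraction (an assumed method, not necessarily the planned one): rank the words of a document by the cosine similarity of their vector to the document's centroid vector.
<pre>
import numpy as np

def extract_keywords(doc_tokens, vec, topk=10):
    # doc_tokens: list of tokens; vec: dict token -> np.ndarray.
    words = [w for w in set(doc_tokens) if w in vec]
    centroid = np.mean([vec[w] for w in words], axis=0)
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10)
    return sorted(words, key=lambda w: cos(vec[w], centroid),
                  reverse=True)[:topk]
</pre>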
LM development
NN LM
- Character-based NNLM (6,700 characters, 7-gram): training on 500M of data done (see the sketch after this list)
- Boundary-involved character NNLM training done
  - Testing ongoing
- Investigating MS RNN LM training
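To make the setup concrete, a numpy sketch of the forward pass of a character 7-gram NNLM in the Bengio feed-forward style: embed the 6 history characters, one tanh hidden layer, softmax over the 6,700-character vocabulary. The embedding and hidden sizes are invented for illustration.
<pre>
import numpy as np

V, N, E, H = 6700, 7, 100, 512   # vocab, n-gram order, embed dim, hidden dim
rng = np.random.default_rng(0)   # (E and H are assumed, not the real config)
C = rng.standard_normal((V, E)) * 0.01          # character embeddings
W1 = rng.standard_normal(((N - 1) * E, H)) * 0.01
W2 = rng.standard_normal((H, V)) * 0.01

def nnlm_logprobs(history):
    # history: (N-1,) char ids. Returns log P(next char | history).
    x = C[history].reshape(-1)                  # concatenate 6 embeddings
    h = np.tanh(x @ W1)
    logits = h @ W2
    logits -= logits.max()                      # numerical stability
    return logits - np.log(np.exp(logits).sum())

logp = nnlm_logprobs(np.array([5, 17, 42, 9, 300, 6699]))
</pre>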
Pronunciation scoring
- G-score done on the 16k English model
  - The distribution of frames over phone/frame posterior scores seems highly discriminative
  - The distribution of distances between the test utterance and the reference utterance also seems highly discriminative
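A sketch of the kind of posterior-based phone score these observations point at (a GOP-style score: mean log posterior of the intended phone over its aligned frames); the actual G-score definition may differ.
<pre>
import numpy as np

def phone_score(posteriors, start, end, phone_id):
    # posteriors: (T, n_phones) frame phone posteriors from the model;
    # [start, end) is the phone's aligned frame span.
    seg = posteriors[start:end, phone_id]
    return np.mean(np.log(seg + 1e-10))   # mean log posterior (GOP-like)

# Frames whose best-scoring phone disagrees with the intended one drag
# this score down, which is what makes the frame distribution discriminative.
</pre>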
QA
FST-based matching
- Code done; simple test done
- Ready for large-scale testing
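A self-contained toy of FST-based matching over token sequences (a trie-shaped acceptor with a single-token wildcard); the real system presumably builds on a proper FST toolkit such as OpenFst, so this only illustrates the matching logic.
<pre>
def build_fst(patterns):
    # patterns: list of token lists; '*' matches any single token.
    # Returns (transitions, final states) of a trie-shaped acceptor.
    trans, finals, next_state = {}, set(), 1
    for pat in patterns:
        s = 0
        for tok in pat:
            if (s, tok) not in trans:
                trans[(s, tok)] = next_state
                next_state += 1
            s = trans[(s, tok)]
        finals.add(s)
    return trans, finals

def matches(query, trans, finals):
    s = 0
    for tok in query:
        if (s, tok) in trans:
            s = trans[(s, tok)]
        elif (s, '*') in trans:               # wildcard arc
            s = trans[(s, '*')]
        else:
            return False
    return s in finals

fst = build_fst([["what", "is", "*"], ["who", "invented", "*"]])
print(matches(["who", "invented", "radio"], *fst))   # True
</pre>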
Speech QA
- Class LM QA
  - Found that a smaller weight on the class FST gives better performance
  - It is still very difficult to retrieve words that cannot be found by the original FST
  - Test negative weights (see the sketch below)
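Since the next step is to test negative weights, a sketch of the sweep, assuming n-best rescoring where the decoder reports the base score and the class-FST score separately per hypothesis (an assumption about the setup); sentence error is used as a stand-in metric.
<pre>
import numpy as np

def sweep_class_weight(nbest, refs, weights=np.arange(-1.0, 1.05, 0.1)):
    # nbest: per-utterance list of (text, base_score, class_score) with
    # lower scores better; refs: reference texts. Returns the weight
    # (possibly negative) minimizing sentence error rate on this dev set.
    best_w, best_err = None, float("inf")
    for w in weights:
        errs = sum(min(hyps, key=lambda h: h[1] + w * h[2])[0] != ref
                   for hyps, ref in zip(nbest, refs))
        if errs < best_err:
            best_w, best_err = w, errs
    return best_w, best_err / len(refs)
</pre>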