Difference between revisions of "2014-03-21"
From cslt Wiki
Revision as of 07:10, 21 March 2014 (Fri)
Resource Building
- Current text resources have been rearranged and listed
Leftover questions
- Asymmetric window: large improvement on the training set (WER 34% → 24%), but the gain is lost on the test set. Possibly overfitting?
- Multi-GPU training: errors encountered
- Multilingual training
- Investigating the LOUDS FST
- CLG embedded decoder plus online compiler
AM development
Sparse DNN
- Optimal Brain Damage (OBD)
- GA-based block sparsity
- Code ready; testing on pure matrix multiplication
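The block-sparsity idea above can be sketched as follows — a hypothetical illustration only, assuming the GA produces a binary mask over fixed-size weight blocks; the block size, matrix shapes, and function names are all made up for the sketch:

```python
import numpy as np

def apply_block_mask(W, mask, block=4):
    """Zero out whole weight blocks: mask[i, j] == 0 kills block (i, j) of W."""
    W = W.copy()
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i, j] == 0:
                W[i*block:(i+1)*block, j*block:(j+1)*block] = 0.0
    return W

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))          # dense weight matrix
mask = rng.integers(0, 2, size=(2, 2))   # GA-chosen keep/drop pattern over 4x4 blocks
W_sparse = apply_block_mask(W, mask)

x = rng.standard_normal(8)
y = W_sparse @ x                          # matrix multiplication with block-sparse weights
sparsity = 1.0 - np.count_nonzero(W_sparse) / W_sparse.size
```

Block (rather than elementwise) sparsity is what makes the speedup testable on pure matrix multiplication: zeroed blocks can be skipped wholesale.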
GMM - DNN co-training
- Co-training using Tencent data
Noise training
- Training with the WSJ database, corrupting the data with various noise types
- Single-noise injection
[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/7/7e/White-eps-converted-to.pdf White noise training]
[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/e/ec/Cafe-eps-converted-to.pdf Cafe noise training]
[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/3/39/Car-eps-converted-to.pdf Car noise training]
- Multi-noise injection
[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/f/fc/White_cafe_clean-eps-converted-to.pdf White+cafe noise training]
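Corrupting clean data with a noise type, as above, is typically done by additive mixing at a target SNR. A minimal sketch under that assumption — the signals and the 10 dB target here are stand-ins, not the report's actual setup:

```python
import numpy as np

def inject_noise(speech, noise, snr_db):
    """Mix noise into speech at the given signal-to-noise ratio (in dB)."""
    # Tile or trim the noise to match the speech length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale the noise so that 10*log10(p_speech / p_noise_scaled) == snr_db.
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(1)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # stand-in for a WSJ utterance
noise = rng.standard_normal(8000)                            # stand-in for white noise
noisy = inject_noise(speech, noise, snr_db=10.0)
```

Multi-noise injection is the same operation with the noise type (or a mixture) drawn per utterance.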
AMR compression re-training
- 1700h AMR training ongoing
GFbank
- GFbank outperforms GFCC
- GFbank outperforms Fbank
- GFbank + Fbank appears to outperform the others
Word to Vector
- Data preparation
- Prepared 7 categories, 500+ articles in total
- Prepared the Sogou 9-class text set, 9×2000 articles in total
- Obtained the Fudan 11-class text data, for testing only
- Improved word vectors with multiple senses
- Almost impossible with the current toolkit
- One option: pre-train the vectors, then do clustering
- Word-vector-based keyword extraction
- Decided to use the Sogou data for extraction
- Evaluate the keywords on the classification task
- Word-vector-based classification
- Decided to use the Sogou data
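The multi-sense workaround mentioned above (pre-train vectors, then cluster the occurrences of a word into senses) could look roughly like this — a toy sketch, where the "context embeddings" and the plain k-means routine are illustrative stand-ins for whatever the toolkit produces:

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain k-means: cluster occurrence (context) vectors into k senses."""
    # Deterministic init: spread the initial centers evenly through the data.
    centers = X[:: max(1, len(X) // k)][:k].copy()
    for _ in range(iters):
        # Assign each vector to its nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Recompute each center as the mean of its assigned vectors.
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(0)
    return labels, centers

# Toy context embeddings of one word drawn from two well-separated senses.
rng = np.random.default_rng(2)
sense_a = rng.standard_normal((50, 16)) + 5.0
sense_b = rng.standard_normal((50, 16)) - 5.0
X = np.vstack([sense_a, sense_b])
labels, centers = kmeans(X, k=2)
```

Each cluster center then serves as one sense vector for the word, sidestepping the toolkit's one-vector-per-word limitation.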
LM development
NN LM
- Character-based NNLM (6700 characters, 7-gram): training on 500M of data done
- Boundary-involved character NNLM: training done
- Testing ongoing
- Investigating MS RNN LM training
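A character 7-gram NNLM as above is trained on (6-character history → next character) pairs. A minimal sketch of that data preparation — the padding symbol and example string are illustrative; the real setup uses a ~6700-character vocabulary:

```python
def char_ngram_examples(text, order=7, pad="<s>"):
    """Yield (history, target) pairs for a character n-gram NNLM:
    each character is predicted from the previous order-1 characters."""
    chars = [pad] * (order - 1) + list(text)
    for i in range(order - 1, len(chars)):
        yield tuple(chars[i - order + 1:i]), chars[i]

examples = list(char_ngram_examples("清华大学", order=7))
# Each history has 6 entries; early histories are padded with <s>.
```

A boundary-involved variant would additionally insert word-boundary symbols into `chars` so the history encodes segmentation.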
3T Sogou LM
- 3T + Tencent LM combination
- 3T + QA model combination
Embedded development
- English scoring looks fine
QA
FST-based matching
- Code done; simple tests done
- Ready for large-scale testing
Speech QA
- Class LM QA
- Found that a smaller weight on the class FST gives better performance
- It is still very difficult to retrieve words that cannot be found by the original FST
- Next: test negative weights
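The class-FST weight above acts like a log-linear scale on the class-LM part of a path's cost, so shrinking (or negating) it makes class-expanded hypotheses cheaper relative to plain ones. A toy sketch of that effect — not the real decoder; all costs here are invented:

```python
def combined_cost(base_cost, class_cost, w):
    """Total path cost when the class FST's arc costs are scaled by w
    (costs are negative log-probabilities, so lower is better)."""
    return base_cost + w * class_cost

# Two competing hypotheses: a plain path vs. one through the class FST.
hyp_plain = {"base": 4.0, "cls": 0.0}   # no class-FST arcs on this path
hyp_class = {"base": 3.5, "cls": 3.0}   # goes through the class FST

def winner(w):
    c_plain = combined_cost(hyp_plain["base"], hyp_plain["cls"], w)
    c_class = combined_cost(hyp_class["base"], hyp_class["cls"], w)
    return "class" if c_class < c_plain else "plain"

# A large class weight penalizes class-FST paths; a small or negative weight
# lets them win -- consistent with the observation that smaller weights help.
results = {w: winner(w) for w in (1.0, 0.1, -1.0)}
```

A negative weight turns the class-FST cost into a bonus, which is what the planned negative-weight test would probe.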