2014-03-21
Revision as of 07:15, 21 March 2014

Resource Building

  • Current text resources have been re-arranged and listed

Leftover questions

  • Asymmetric window: great improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
  • Multi-GPU training: error encountered
  • Multi-language training
  • Investigating LOUDS FST.
  • CLG embedded decoder plus online compiler.

AM development

Sparse DNN

  • GA-based block sparsity (a toy sketch of the idea follows this list)
      • Code ready; testing on pure matrix multiplication
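
Below is a hypothetical sketch of what GA-based block sparsity can look like: a chromosome is a binary keep/drop mask over fixed-size weight blocks, and the GA searches for a mask that keeps the layer output close to the dense one at roughly a target sparsity. All sizes, names and the fitness function are illustrative assumptions, not the group's code.

# Hypothetical sketch of GA-based block sparsity on one weight matrix.
# A chromosome is a binary keep/drop mask over 8x8 weight blocks; the
# fitness rewards masks that preserve the layer's output at roughly
# the target block sparsity.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))      # one DNN layer's weights
X = rng.standard_normal((100, 64))     # sample inputs to the layer
BLOCK, KEEP = 8, 0.5                   # block size, fraction to keep
nb = W.shape[0] // BLOCK               # blocks per side

def apply_mask(mask):
    """Zero out the weight blocks where mask == 0."""
    return W * np.kron(mask, np.ones((BLOCK, BLOCK)))

def fitness(mask):
    """Higher is better: low output distortion, sparsity near KEEP."""
    err = np.linalg.norm(X @ W.T - X @ apply_mask(mask).T)
    return -err - 100.0 * abs(mask.mean() - KEEP)

pop = (rng.random((30, nb, nb)) < KEEP).astype(float)
for gen in range(50):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]   # truncation selection
    children = []
    for _ in range(len(pop) - len(parents)):
        a, b = parents[rng.integers(10, size=2)]
        child = np.where(rng.random((nb, nb)) < 0.5, a, b)  # crossover
        flip = rng.random((nb, nb)) < 0.02                  # mutation
        children.append(np.abs(child - flip))
    pop = np.concatenate([parents, np.array(children)])

best = pop[np.argmax([fitness(m) for m in pop])]
W_sparse = apply_mask(best)            # block-sparse weights to test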

GMM - DNN co-training

  • Co-training using Tencent data

Noise training

  • Single-noise injection (a mixing sketch follows this list)
  • Multi-noise injection
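
A minimal sketch of single-noise injection, assuming plain NumPy: scale a noise waveform to a target SNR and mix it into the clean speech before feature extraction; multi-noise injection would draw a different noise type per utterance. The signals and the SNR below are placeholders.

import numpy as np

def inject_noise(speech, noise, snr_db):
    """Scale `noise` to the target SNR against `speech`, then mix."""
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[:len(speech)]   # match lengths
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))
    return speech + scale * noise

# Usage: corrupt one (synthetic) utterance at 10 dB SNR.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)    # stand-in for a 1 s waveform
noise = rng.standard_normal(8000)      # stand-in for a noise recording
noisy = inject_noise(speech, noise, snr_db=10.0)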


AMR compression re-training

  • 1700h AMR training ongoing

GFbank

  • gfbank is better than gfcc
  • gfbank is better than fbank
  • gfbank + fbank seems to outperform the others (a feature-extraction sketch follows this list)
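
A rough sketch of gfbank extraction for contrast with fbank, assuming NumPy: filter the waveform with a bank of gammatone filters, then take log frame energies per channel. The filter order, bandwidths and the roughly log-spaced center frequencies are illustrative approximations, not the group's exact configuration.

import numpy as np

def gammatone_ir(fc, fs, dur=0.04, order=4):
    """Impulse response of a 4th-order gammatone filter centred at fc."""
    t = np.arange(int(dur * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # ERB bandwidth at fc
    env = t ** (order - 1) * np.exp(-2 * np.pi * 1.019 * erb * t)
    return env * np.cos(2 * np.pi * fc * t)

def gfbank(signal, fs=16000, n_chan=24, frame=400, hop=160):
    """Log frame energies of the gammatone-filtered signal."""
    fcs = np.geomspace(100, 0.9 * fs / 2, n_chan)  # ~log-spaced centres
    feats = []
    for fc in fcs:
        y = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
        n_frames = 1 + (len(y) - frame) // hop
        e = [np.sum(y[i * hop:i * hop + frame] ** 2)
             for i in range(n_frames)]
        feats.append(np.log(np.asarray(e) + 1e-10))
    return np.stack(feats, axis=1)     # shape: (frames, channels)

# Usage: features for one second of (synthetic) audio.
rng = np.random.default_rng(0)
feats = gfbank(rng.standard_normal(16000))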

Word to Vector

  • Data preparation
      • Prepared 7 categories, 500+ articles in total
      • Prepared the Sogou 9-class text set, 9*2000 articles in total
      • Acquired the Fudan 11-class text data, for testing only
  • Improve word vectors with multiple senses
      • Almost impossible with the toolkit
      • Could pre-train vectors and then do clustering (a sketch follows this list)
  • Word-vector-based keyword extraction
      • Decided to use the Sogou data for extraction
      • Evaluate the keywords in the classification task
  • Word-vector-based classification
      • Decided to use the Sogou data
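
A minimal sketch of the pre-train-then-cluster idea mentioned above, assuming gensim 4.x and scikit-learn: each occurrence of a target word is represented by the average vector of its context words, and the occurrences are clustered so that each cluster stands for one sense. The toy corpus and all parameters are placeholders, not the Sogou or Fudan data.

import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

corpus = [
    ["deposit", "money", "in", "the", "bank"],
    ["the", "bank", "raised", "interest", "rates"],
    ["we", "sat", "on", "the", "river", "bank"],
    ["fish", "near", "the", "bank", "of", "the", "river"],
]
model = Word2Vec(corpus, vector_size=20, window=2, min_count=1, seed=0)

def context_vec(sent, target):
    """Average vector of the words around `target` in one sentence."""
    ctx = [w for w in sent if w != target and w in model.wv]
    return np.mean([model.wv[w] for w in ctx], axis=0)

occ = np.array([context_vec(s, "bank") for s in corpus if "bank" in s])
km = KMeans(n_clusters=2, n_init=10, random_state=0)
print(km.fit_predict(occ))   # occurrences grouped into putative senses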


LM development

NN LM

  • Character-based NNLM (6700 chars, 7-gram): training on 500M data done (a model sketch follows this list)
  • Boundary-involved char NNLM training done
  • Testing ongoing
  • Investigate MS RNN LM training
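
For reference, a compact sketch of a character-based 7-gram NNLM in the style above (a Bengio-style feed-forward model, assuming PyTorch): the six preceding characters are embedded, concatenated and mapped to a softmax over the character vocabulary. Only the 6700-character vocabulary and the 7-gram order come from the note; the layer sizes and the random training batch are placeholders.

import torch
import torch.nn as nn

V, CTX, EMB, HID = 6700, 6, 64, 256    # vocab, context, layer sizes

class CharNNLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(V, EMB)
        self.net = nn.Sequential(
            nn.Linear(CTX * EMB, HID), nn.Tanh(), nn.Linear(HID, V))

    def forward(self, ctx):            # ctx: (batch, 6) char ids
        e = self.emb(ctx).flatten(1)   # concatenate the 6 embeddings
        return self.net(e)             # (batch, V) logits

model = CharNNLM()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# One toy training step on random 7-grams (6 chars -> next char).
ctx = torch.randint(0, V, (32, CTX))
nxt = torch.randint(0, V, (32,))
loss = loss_fn(model(ctx), nxt)
opt.zero_grad()
loss.backward()
opt.step()
print(loss.item())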

3T Sogou LM

  • 3T + Tencent LM combination (an interpolation sketch follows this list)
  • 3T + QA model combination
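
The combinations above are presumably linear interpolation of two LMs. A toy sketch of the idea on unigram probabilities, with the mixing weight tuned by held-out perplexity; the numbers are made up and merely stand in for the 3T, Tencent and QA models.

import math

p_3t = {"a": 0.5, "b": 0.3, "c": 0.2}   # stand-in for the 3T LM
p_qa = {"a": 0.2, "b": 0.2, "c": 0.6}   # stand-in for the QA LM

def perplexity(text, lam):
    """Held-out perplexity of the interpolated model."""
    lp = sum(math.log(lam * p_3t[w] + (1 - lam) * p_qa[w]) for w in text)
    return math.exp(-lp / len(text))

held_out = ["a", "c", "c", "b", "c"]
best = min((perplexity(held_out, l / 10), l / 10) for l in range(11))
print(best)   # lowest perplexity and the lambda that achieves it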

Embedded development

  • English scoring looks fine

QA

FST-based matching

  • Code done; simple test done
  • Ready for large-scale test (a matching sketch follows this list)
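
A minimal sketch of the matching idea only, not the group's implementation (which would sit on a real FST toolkit such as OpenFst): question patterns are compiled into a deterministic acceptor, a trie of token arcs, and a query matches iff it traces a path to a final state.

def build_fst(patterns):
    """Trie-shaped acceptor: state -> {token: next_state}, final states."""
    arcs, finals, n = {0: {}}, set(), 1
    for pat in patterns:
        s = 0
        for tok in pat.split():
            if tok not in arcs[s]:
                arcs[s][tok] = n
                arcs[n] = {}
                n += 1
            s = arcs[s][tok]
        finals.add(s)
    return arcs, finals

def matches(arcs, finals, query):
    """Accept the query iff it traces a path from state 0 to a final."""
    s = 0
    for tok in query.split():
        if tok not in arcs[s]:
            return False
        s = arcs[s][tok]
    return s in finals

arcs, finals = build_fst(["how old are you", "what time is it"])
print(matches(arcs, finals, "how old are you"))   # True
print(matches(arcs, finals, "how old"))           # False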


Speech QA

  • Class LM QA
      • Found that a smaller weight on the class FST gives better performance
      • It is still very difficult to retrieve words that cannot be found by the original FST
      • Will test negative weights