2014-03-21

From cslt Wiki

Revision as of 07:20, 21 March 2014 (Fri) by Cslt (talk | contribs)

Contents

1 Resource Building
2 Leftover questions
3 AM development
4 Word to Vector
5 LM development
- 5.1 NN LM
6 Pronunciation scoring
7 QA
- 7.1 FST-based matching
- 7.2 Speech QA

Resource Building

Current text resources have been re-arranged and listed.

Leftover questions

Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
Multi GPU training: Error encountered
Multilanguage training
Investigating LOUDS FST.
CLG embedded decoder plus online compiler.
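
For the asymmetric window item above, a minimal sketch of what asymmetric context splicing could look like: each frame is concatenated with more past frames than future frames. The function name `splice_frames`, the window sizes (2 left, 1 right) and the edge handling (repeat the nearest edge frame) are assumptions for illustration; the notes do not state the actual configuration.

```python
def splice_frames(frames, left, right):
    """Concatenate each frame with `left` past and `right` future frames.

    Positions beyond the utterance edges reuse the nearest edge frame.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to a valid frame index
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy utterance: 4 frames of 1-dimensional features (hypothetical values).
frames = [[0.0], [1.0], [2.0], [3.0]]
asym = splice_frames(frames, left=2, right=1)   # asymmetric: 2 past, 1 future
sym = splice_frames(frames, left=1, right=1)    # symmetric, for comparison
```

With `left != right` the network input covers an uneven span around the current frame; the input dimension is `(left + right + 1)` times the per-frame dimension.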

AM development

Sparse DNN

GA-based block sparsity

code ready, testing on pure matrix multiplication
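
As a rough illustration of why block sparsity is tested on pure matrix multiplication: when whole blocks of a weight matrix are pruned, the multiply can skip them entirely. The sketch below is an assumption-laden toy, not the code referred to above. The block mask is written by hand; how the mask is searched for (the GA part) is not shown, and the names `block_sparse_matvec`, `W` and `mask` are hypothetical.

```python
def block_sparse_matvec(weights, mask, x, block):
    """Compute y = W x, skipping every block whose mask entry is 0."""
    rows, cols = len(weights), len(weights[0])
    y = [0.0] * rows
    for bi in range(0, rows, block):
        for bj in range(0, cols, block):
            if not mask[bi // block][bj // block]:
                continue  # pruned block: no multiply-adds at all
            for i in range(bi, min(bi + block, rows)):
                row = weights[i]
                for j in range(bj, min(bj + block, cols)):
                    y[i] += row[j] * x[j]
    return y


# Toy 4x4 weight matrix split into 2x2 blocks; only the diagonal blocks are kept.
W = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [9.0, 1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0, 7.0]]
mask = [[1, 0],
        [0, 1]]
x = [1.0, 1.0, 1.0, 1.0]
y = block_sparse_matvec(W, mask, x, block=2)
```

Here half of the blocks are skipped, so half of the multiply-adds are avoided; a real speed-up depends on the block size and the hardware.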

GMM - DNN co-training

Co-training using Tencent data

slightly better in GMM modeling when using DNN alignment
worse performance when using the re-trained GMMs

Noise training

Single noise injection

Multi noise injection

white+cafe noise training
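
A minimal sketch of noise injection, assuming the noise is added to clean speech at a chosen signal-to-noise ratio. The SNR value, the 16 kHz sampling rate, and the synthetic signals are assumptions; in particular the "cafe" entry is only a placeholder for a real recording, and the helper names are hypothetical.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so that the result has the given SNR in dB."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


rng = random.Random(0)
n_samples = 16000  # one second at an assumed 16 kHz sampling rate
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(n_samples)]

# Single noise injection would use one entry; multi noise injection draws
# a noise type per utterance from a bank of several types.
noise_bank = {
    "white": [rng.gauss(0.0, 1.0) for _ in range(n_samples)],
    "cafe": [rng.uniform(-1.0, 1.0) for _ in range(n_samples)],  # placeholder, not real cafe noise
}
name = rng.choice(sorted(noise_bank))
noisy = mix_at_snr(speech, noise_bank[name], snr_db=10.0)
```

The gain follows from requiring `power(speech) / (gain**2 * power(noise))` to equal `10 ** (snr_db / 10)`.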

AMR compression re-training

1700h AMR training on going

GFbank

gfbank is better than gfcc
gfbank is better than fbank
gfbank + fbank seems to outperform the others

Word to Vector

Data preparation

Prepared 7 categories, 500+ articles in total
Prepared Sogou 9-class text, 9*2000 articles in total
Obtained Fudan 11-class text data, for testing only

Improved word vectors with multiple senses

Almost impossible with the toolkit
Can think of pre-training vectors and then doing clustering
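
One possible reading of the "pre-train, then cluster" idea, sketched under assumptions: for every occurrence of an ambiguous word, average the pre-trained vectors of its context words, then cluster these context vectors so that each cluster stands for one sense. The toy vectors, the example word and the plain k-means routine are all hypothetical; the notes do not specify the method.

```python
def mean(vectors):
    """Component-wise average of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10):
    """Plain k-means with a deterministic init (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return labels, centroids


# Hypothetical pre-trained vectors (2-dimensional for readability).
pretrained = {
    "river": [1.0, 0.0], "water": [0.9, 0.1],
    "money": [0.0, 1.0], "loan": [0.1, 0.9],
}
# Context words around four occurrences of one ambiguous word, e.g. "bank".
contexts = [["river", "water"], ["money", "loan"],
            ["water", "river"], ["loan", "money"]]
context_vectors = [mean([pretrained[w] for w in ctx]) for ctx in contexts]
labels, centroids = kmeans(context_vectors, k=2)
```

Each cluster centroid could then serve as a sense-specific vector for the word.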

Word-vector-based keyword extraction

Decided to use the Sogou data to do extraction
Evaluate the keywords in the classification task

Word vectors based on classification

Decided to use the Sogou data to do extraction

LM development

NN LM

Character-based NNLM (6700 chars, 7-gram): training on 500M data done.

boundary-involved char NNLM training done
Test on going
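
In a 7-gram character model each character is predicted from the six preceding characters. A minimal sketch of how such training examples could be cut from a sentence, assuming sentence-start padding and an end marker; the marker strings and the function name are hypothetical, and the boundary handling mentioned above is not covered here.

```python
def char_ngram_examples(sentence, order=7, bos="<s>", eos="</s>"):
    """Return (history, target) pairs; history holds the previous order-1 characters."""
    chars = [bos] * (order - 1) + list(sentence) + [eos]
    examples = []
    for i in range(order - 1, len(chars)):
        history = tuple(chars[i - order + 1:i])
        examples.append((history, chars[i]))
    return examples


# A short order (3) keeps the toy output readable; the notes use order 7.
examples = char_ngram_examples("今天天气好", order=3)
seven_gram = char_ngram_examples("今天天气好")
```

Every character of the sentence, plus the end marker, becomes one prediction target.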

Investigate MS RNN LM training

Pronunciation scoring

G-score done on 16k English model
The distribution of frames over phone/frame posterior scores seems highly discriminative
The distribution of the distance between the test utterance and the reference utterance seems to be a highly discriminative score

QA

FST-based matching

Code done. Simple test done
Ready for large scale test

Speech QA

Class LM QA

Found that a smaller weight on the class FST gives better performance
It is currently very difficult to retrieve words that cannot be found by the original FST
Test negative weights

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=2014-03-21&oldid=9441"