2014-02-14

AM development

Sparse DNN

Optimal Brain Damage (OBD).
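
For reference, OBD ranks each weight by the saliency s_k = h_kk * w_k^2 / 2, where h_kk is the diagonal of the loss Hessian, and removes the weights with the lowest saliency. The sketch below is only an illustration of that rule, not the code used in this work: the function names are hypothetical, and the Hessian diagonal is assumed to be supplied by the caller.

```python
def obd_saliencies(weights, hessian_diag):
    """OBD saliency of each weight: s_k = 0.5 * h_kk * w_k**2."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def prune_lowest(weights, hessian_diag, fraction):
    """Return a copy of `weights` with the lowest-saliency fraction set to zero."""
    saliency = obd_saliencies(weights, hessian_diag)
    n_prune = int(len(weights) * fraction)
    # Indices sorted from least to most salient.
    order = sorted(range(len(weights)), key=lambda k: saliency[k])
    pruned = list(weights)
    for k in order[:n_prune]:
        pruned[k] = 0.0
    return pruned
```

Note that a large weight can still be pruned when its curvature h_kk is small, which is the difference from plain magnitude-based pruning.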

GA-based block sparsity

Efficient DNN training

Asymmetric window: large improvement on the training set (WER from 34% to 24%), but the improvement is lost on the test set. Overfitting?

Multilanguage training

Pure Chinese training reached 4.9%
Chinese + English reduced to 7.9%
The English phone set should discriminate between beginning phones and ending phones
Should set up a multilingual network structure that shares the low layers but separates the languages at the high layers
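
The proposed structure (shared low layers, language-specific high layers) could look roughly like the forward pass below. This is a minimal sketch under assumptions: the class name, the sigmoid hidden units, and the single output layer per language are illustrative choices, not the design that was actually built.

```python
import math


def affine(x, W, b):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def sigmoid_layer(x, W, b):
    return [1.0 / (1.0 + math.exp(-z)) for z in affine(x, W, b)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class MultilingualDNN:
    """Hidden layers shared by all languages, one output layer per language."""

    def __init__(self, shared_layers, heads):
        self.shared_layers = shared_layers  # list of (W, b), shared
        self.heads = heads                  # dict: language -> (W, b)

    def forward(self, x, language):
        h = x
        for W, b in self.shared_layers:
            h = sigmoid_layer(h, W, b)
        W, b = self.heads[language]
        # Each language has its own output targets, so output sizes may differ.
        return softmax(affine(h, W, b))
```

During training, a frame from a given language would update the shared layers and only that language's output layer.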

Engine optimization

Decoder real-time factor (RT) dropped below 0.2 using HCLG + MKL + icc
Investigating LOUDS FST.

Adaptation

Using a linear hidden transform reduces WER from 14% to 11%.
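
As an illustration of the idea, a linear hidden transform can be modeled as an extra linear layer inserted after a hidden layer, initialized to the identity so that the unadapted network is unchanged, and then trained on adaptation data while the rest of the network stays fixed. The sketch below assumes this common formulation; the class and method names are hypothetical, and the gradient with respect to the transform output is assumed to come from backpropagation through the frozen upper layers.

```python
class LinearHiddenTransform:
    """Speaker-specific linear layer y = W h + b, initialized to identity."""

    def __init__(self, dim):
        self.W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def apply(self, h):
        return [sum(w * x for w, x in zip(row, h)) + bi
                for row, bi in zip(self.W, self.b)]

    def sgd_step(self, h, grad_out, lr):
        """One SGD update given dLoss/dy (`grad_out`) for input `h`."""
        for i, g in enumerate(grad_out):
            for j, x in enumerate(h):
                self.W[i][j] -= lr * g * x  # dLoss/dW[i][j] = g_i * h_j
            self.b[i] -= lr * g             # dLoss/db[i] = g_i
```

Only `W` and `b` are updated per speaker, so the adaptation parameters stay small compared with the full network.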

Word to Vector

Tested a training toolkit from Stanford University, which can incorporate global information into word2vector training
C++ implementation (instead of Python) for data pre-processing
Ready for training 100M data
Ready for training word sense
Investigating Senna toolkit from NEC. Intending to implement POS tagging based on word vectors.

LM development

NN LM

Word-based and character-based NNLM using Google word2vector completed
Character-based NNLM completed (6000 characters, 7-gram)

3T Sogou LM

Split the data into 24 subsets, train a 3-gram for each subset, prune with 1e-9
Merge completed with equal weights
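
Merging with equal weights amounts to a linear interpolation in which each of the 24 sub-models gets weight 1/24. The sketch below shows only that arithmetic on explicit n-gram probabilities. It is a simplification under stated assumptions: an n-gram missing from a sub-model contributes zero here, whereas a real merge of back-off models would use the back-off estimate, and the function name is hypothetical.

```python
def merge_equal_weights(models):
    """Interpolate n-gram probabilities with equal weights.

    `models` is a list of dicts mapping an n-gram tuple to its probability.
    Missing n-grams are treated as probability 0 (no back-off handling).
    """
    weight = 1.0 / len(models)
    merged = {}
    for model in models:
        for ngram, prob in model.items():
            merged[ngram] = merged.get(ngram, 0.0) + weight * prob
    return merged
```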

Embedded development

CLG embedded decoder is almost done. Online compiler is in progress.
Zhiyong is working on layer-by-layer DNN training.

Speech QA

Use N-best to expand matching in QA. Better performance was obtained.

1-best matches 96/121
10-best matches 102/121
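
The N-best expansion can be pictured as trying each hypothesis in rank order until one matches a known question. The sketch below assumes exact string matching against a set of known questions, which may be simpler than the matching actually used; `first_match` and its arguments are hypothetical names.

```python
def first_match(nbest, known_questions):
    """Return (rank, hypothesis) of the best-ranked hypothesis that matches.

    `nbest` is a list of recognition hypotheses, best first.
    Returns None when no hypothesis matches.
    """
    for rank, hypothesis in enumerate(nbest, start=1):
        if hypothesis in known_questions:
            return rank, hypothesis
    return None
```

With N = 1 this reduces to the 1-best case, so any query matched by the 1-best is also matched by the 10-best.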

Use N-best to recover errors in entity check.

Design a non-entity pattern to locate the possible position of an entity
Within this position range, search for entities in the N-best results

Use Pinyin to recover errors in entity check. Future work.

Design a non-entity pattern to locate the possible position of an entity (as above)
Generate the Pinyin strings of all the entities, then match the Pinyin string of the recognized segment against the entity Pinyin strings
Keep the best-matching entity based on Pinyin, subject to a threshold
A bit worse than the original test.
A possible problem is that the LM is too strong, which leads to unmatched Pinyin strings in acoustic space
Liu Rong will provide a weak LM to support the research.
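
The Pinyin matching step with a threshold might be sketched as follows. This is an assumption-laden illustration: it uses `difflib.SequenceMatcher` from the Python standard library as the similarity measure, which is a stand-in for whatever measure is actually used, and the function name, the entity table, and the default threshold of 0.8 are all hypothetical.

```python
from difflib import SequenceMatcher


def best_entity_by_pinyin(recognized_pinyin, entity_pinyin, threshold=0.8):
    """Return the entity whose Pinyin is most similar to the recognized Pinyin.

    `entity_pinyin` maps entity name -> Pinyin string.
    Returns None when the best similarity is below `threshold`.
    """
    best_entity, best_score = None, 0.0
    for entity, pinyin in entity_pinyin.items():
        score = SequenceMatcher(None, recognized_pinyin, pinyin).ratio()
        if score > best_score:
            best_entity, best_score = entity, score
    return best_entity if best_score >= threshold else None
```

For example, a recognized segment with Pinyin `"beijin"` would be mapped to an entity with Pinyin `"beijing"`, while a segment that resembles no entity is left unchanged.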

Investigate some errors in entity-based LM.

Some errors still exist
Running the entity-based LM with a small entity list
