2014-03-28

Resource Building

The current text resources have been rearranged and listed

Leftover questions

Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
Multi-GPU training: error encountered
Multilanguage training
Investigating LOUDS FST.
CLG embedded decoder plus online compiler.
DNN-GMM co-training

AM development

Sparse DNN

GA-based block sparsity

88% element-sparsity: 25.09
80% block-sparsity: 25.5
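
To make the difference between element sparsity and block sparsity concrete, here is a minimal sketch of block-wise pruning. Note the assumptions: blocks are selected by their sum of squared weights, which is a simple stand-in and not the GA-based selection used in these experiments; `block_sparsify`, the block size and the toy matrix are hypothetical.

```python
def block_sparsify(weights, block_size, sparsity):
    """Zero whole square blocks of a weight matrix (list of lists).

    Assumption: blocks are ranked by energy (sum of squared weights) and the
    lowest-energy fraction `sparsity` of blocks is zeroed. This stands in for
    the GA-based selection, which these notes do not describe.
    """
    rows, cols = len(weights), len(weights[0])
    if rows % block_size or cols % block_size:
        raise ValueError("matrix dimensions must be multiples of block_size")

    # Score every block by its energy.
    scores = []
    for br in range(0, rows, block_size):
        for bc in range(0, cols, block_size):
            energy = sum(
                weights[r][c] ** 2
                for r in range(br, br + block_size)
                for c in range(bc, bc + block_size)
            )
            scores.append((energy, br, bc))

    # Zero the lowest-energy blocks; the input matrix is left untouched.
    scores.sort()
    n_prune = int(len(scores) * sparsity)
    pruned = [row[:] for row in weights]
    for _, br, bc in scores[:n_prune]:
        for r in range(br, br + block_size):
            for c in range(bc, bc + block_size):
                pruned[r][c] = 0.0
    return pruned


toy = [
    [1.0, 1.0, 0.1, 0.1],
    [1.0, 1.0, 0.1, 0.1],
    [0.2, 0.2, 2.0, 2.0],
    [0.2, 0.2, 2.0, 2.0],
]
pruned_toy = block_sparsify(toy, block_size=2, sparsity=0.5)
```

With element sparsity each weight is kept or dropped individually; with block sparsity the zeros come in contiguous tiles, which is easier to exploit in matrix multiplication but leaves less freedom in choosing what to drop.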

Noise training

More experiments with no-noise
More experiments with additional noise types

AMR compression re-training

1700h MPE adaptation

iter1:

amr: %WER 13.40 [ 6398 / 47753, 252 ins, 829 del, 5317 sub ]
wav: %WER 11.19 [ 5343 / 47753, 178 ins, 710 del, 4455 sub ]

iter2:

amr: %WER 13.31 [ 6358 / 47753, 255 ins, 798 del, 5305 sub ]
wav: %WER 11.33 [ 5409 / 47753, 180 ins, 732 del, 4497 sub ]
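
As a sanity check on the scoring lines above, the reported %WER can be recomputed from the bracketed counts as (ins + del + sub) / reference words. A minimal sketch; the helper name `wer_percent` and the dictionary layout are illustrative, while the counts are copied from the results above.

```python
REF_WORDS = 47753  # reference word count shared by all four results


def wer_percent(ins, dels, subs, ref_words=REF_WORDS):
    """Return (total errors, WER in percent) from the three error counts."""
    errors = ins + dels + subs
    return errors, 100.0 * errors / ref_words


# (iteration, condition) -> (insertions, deletions, substitutions)
results = {
    ("iter1", "amr"): (252, 829, 5317),
    ("iter1", "wav"): (178, 710, 4455),
    ("iter2", "amr"): (255, 798, 5305),
    ("iter2", "wav"): (180, 732, 4497),
}

for (iteration, condition), counts in results.items():
    errors, wer = wer_percent(*counts)
    print(f"{iteration} {condition}: %WER {wer:.2f} [ {errors} / {REF_WORDS} ]")
```

Between iter1 and iter2 the amr condition drops by 40 errors (6398 to 6358) while the wav condition rises by 66 (5343 to 5409).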

GFbank

GFbank on Tencent 100h

Denoising

Word to Vector

LDA baseline (sogou 1700*9 training set)

Memory usage more than 20G

Word-vector classification ongoing

Model based on category word-vector clustering

LM development

NN LM

Character-based NNLM (6700 chars, 7-gram): training on 500M data done.

Boundary-involved char NNLM training done

Investigate MS RNN LM training

Pronunciation scoring

8k model delivered
MLP-based scoring completed

QA

FST-based matching

Char FST
Prepare FST-based QA patent

Speech QA

Class LM QA

Found that a smaller weight on the class FST gives better performance
It is currently very difficult to retrieve words that cannot be found by the original FST
Test negative weights