2014-05-16

Resource Building

Maxi onboard
Release management should be started: Zhiyong (+)
Blaster 0.1 & vivian 0.0 system release

Leftover questions

Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
Multi-GPU training: error encountered
Multilanguage training
Investigating LOUDS FST.
CLG embedded decoder plus online compiler.
DNN-GMM co-training

AM development

Sparse DNN

GA-based block sparsity (+++)

Found a paper in 2000 with similar ideas.
Try to get a student working on high performance computing to do the optimization

Noise training

More with-clean training completed. 2 conditions left

GFbank

8k train

GFBank sinovoice 1400 MPE stream

16k train

GFBank sinovoice 6000 MPE1 stream: worse than 1700h (10.18-11.11)

Multilingual ASR

Test sharing scheme:

decision tree share, xent improvement obtained, MPE no improvement (Chinese worse a bit, English a bit better).

English model

                             mic          tel
pure eng                    voxforge    fisher
chinese eng                 shujutang   convert-from-shujutang

Denoising & Farfield ASR

Baseline: close-talk model decode far-field speech: 92.65
Will investigate DAE model.

Kaiser Window

Window function test, 23 Mel channels, 8k WSJ database

window function    errors / words = %WER    ins    del    sub
kaiser             278 / 5643 = 4.93        39     15     224
povey              265 / 5643 = 4.70        34     14     217

Window function test, 30 Mel channels, 8k WSJ database

window function    errors / words = %WER    ins    del    sub
kaiser             270 / 5643 = 4.78        38     17     215
povey              283 / 5643 = 5.02        36     24     223
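
For reference, the two window shapes compared above can be sketched as follows. The povey window follows Kaldi's definition (a Hann window raised to the power 0.85). The Kaiser shape parameter used in these tests is not recorded in the notes, so `beta=8.0` is an assumption, as is the 200-sample frame (25 ms at 8 kHz).

```python
import math

def povey_window(n):
    """Kaldi-style 'povey' window: Hann window raised to the power 0.85."""
    return [(0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1))) ** 0.85
            for i in range(n)]

def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function, summed from its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2   # term_k = ((x/2)^k / k!)^2
        total += term
    return total

def kaiser_window(n, beta):
    """Kaiser window of length n; beta controls the main-lobe/side-lobe trade-off."""
    denom = bessel_i0(beta)
    window = []
    for i in range(n):
        r = 2.0 * i / (n - 1) - 1.0            # position mapped to [-1, 1]
        arg = beta * math.sqrt(max(0.0, 1.0 - r * r))
        window.append(bessel_i0(arg) / denom)
    return window

if __name__ == "__main__":
    frame_len = 200   # assumption: 25 ms frames at 8 kHz
    beta = 8.0        # assumption: value not given in the notes
    pv = povey_window(frame_len)
    kw = kaiser_window(frame_len, beta)
    for i in (0, 50, 100, 199):
        print(i, round(pv[i], 4), round(kw[i], 4))
```

Unlike the Hamming window, the povey window goes to zero at the frame edges; the Kaiser window stays slightly above zero there, by an amount that depends on beta.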

VAD

DNN-based VAD (24.77) shows better performance than energy-based VAD (45.73)
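
As a reminder of what the energy-based baseline typically looks like, here is a minimal sketch. It is not the implementation evaluated above: the frame length and shift (25 ms / 10 ms at 8 kHz), the `margin_db` threshold, and the use of the quietest frame as the noise floor are all assumptions made for illustration.

```python
import math

def frame_log_energy(samples, frame_len, frame_shift):
    """Per-frame mean energy in dB for a list of float samples."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_shift):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        energies.append(10.0 * math.log10(energy + 1e-10))  # avoid log(0)
    return energies

def energy_vad(samples, frame_len=200, frame_shift=80, margin_db=10.0):
    """Mark a frame as speech if it is margin_db above the quietest frame.

    The quietest frame is used as a crude noise-floor estimate (assumption).
    """
    energies = frame_log_energy(samples, frame_len, frame_shift)
    if not energies:
        return []
    floor = min(energies)
    return [e > floor + margin_db for e in energies]
```

A fixed margin over the noise floor is fragile under changing noise conditions, which is consistent with a trained DNN classifier performing better.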

Scoring

online scoring done??
checked into gitlab?

Word to Vector

Paper submitted

LM development

Domain specific LM

Prepare English lexicon

NN LM

Character-based NNLM (6700 chars, 7gram), 500M data training done.

Inconsistent WER patterns were found on the Tenent test sets;
we probably need to use another test set for the investigation.

Investigate MS RNN LM training
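
To make the character-based 7-gram setup concrete, the sketch below shows how training examples (6 characters of history, next character as target) could be extracted and how a capped character vocabulary could be built. The function names, the `<s>` padding symbol, and the `<unk>` symbol are hypothetical; the notes do not describe the actual data pipeline.

```python
from collections import Counter

def build_char_vocab(texts, max_size=6700, unk="<unk>", pad="<s>"):
    """Map the most frequent characters to ids; everything else maps to unk."""
    counts = Counter(ch for text in texts for ch in text)
    vocab = {pad: 0, unk: 1}
    for ch, _ in counts.most_common(max_size - len(vocab)):
        vocab[ch] = len(vocab)
    return vocab

def char_ngram_examples(text, order=7, pad="<s>"):
    """Return (history, target) pairs; history holds order-1 characters."""
    chars = [pad] * (order - 1) + list(text)
    return [(tuple(chars[i:i + order - 1]), chars[i + order - 1])
            for i in range(len(text))]

def encode(example, vocab, unk="<unk>"):
    """Convert one (history, target) pair to integer ids."""
    history, target = example
    unk_id = vocab[unk]
    return [vocab.get(ch, unk_id) for ch in history], vocab.get(target, unk_id)
```

Each encoded history would be the input to the network and the target id its output label.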

QA

FST-based matching

Word-based FST: 1-2 seconds with 1600 patterns. Huilan's implementation: <1 second.
Thrax toolkit for grammar-to-FST compilation

Investigate determinization of G embedding

Refer to Kaldi new code
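
For illustration, word-based pattern matching can be sketched with a trie, which is a simplified stand-in for a deterministic acyclic acceptor: no output labels, weights, or wildcards. This is neither the FST implementation timed above nor Huilan's implementation, and all names below are hypothetical.

```python
def build_trie(patterns):
    """Build a word-level trie; the key None marks a final state."""
    root = {}
    for pattern_id, words in enumerate(patterns):
        node = root
        for word in words:
            node = node.setdefault(word, {})
        node.setdefault(None, []).append(pattern_id)
    return root

def find_matches(trie, words):
    """Return (start, end, pattern_id) for every pattern found in words."""
    matches = []
    for start in range(len(words)):
        node = trie
        for end in range(start, len(words)):
            node = node.get(words[end])
            if node is None:
                break                      # no pattern continues with this word
            for pattern_id in node.get(None, []):
                matches.append((start, end + 1, pattern_id))
    return matches

if __name__ == "__main__":
    patterns = [["how", "are", "you"], ["are", "you"], ["weather"]]
    trie = build_trie(patterns)
    print(find_matches(trie, "hello how are you".split()))
```

Because shared prefixes are merged, the cost per query depends on the query length and the longest pattern rather than on the number of patterns, which is the main reason to compile the patterns into one automaton.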
