Revision as of 07:35, 1 April 2014 (Tuesday)

Environment setting

Corpora

Labeling of the new Beijing Mobile data is done (109h).
Next, the corrupted speech from Huawei (97h) will be labeled.
Text transcription for 300h will be ready in April.
In total, 1338h of telephone speech is now ready (470h + 346h + 105h BJ mobile + 200h PICC + 108h HBTc + 109h new BJ mobile).
16k data, about 6000h: 978h online data from DataTang + 656h online mobile data + 4300h recording data.
LM corpus preparation is done.
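The corpus totals above can be checked with a few lines of arithmetic. This is a minimal sketch: the dictionary keys are hypothetical labels chosen for illustration (the source does not name the 470h and 346h sets), and only the hour figures come from the list above.

```python
# Hours per corpus, copied from the list above.
# The keys are hypothetical labels, not names used by the team.
telephone_hours = {
    "set_470": 470,
    "set_346": 346,
    "bj_mobile": 105,
    "picc": 200,
    "hbtc": 108,
    "new_bj_mobile": 109,
}
wideband_16k_hours = {
    "datatang_online": 978,
    "online_mobile": 656,
    "recording": 4300,
}


def total_hours(corpora):
    """Sum the hours of all corpora in a {label: hours} mapping."""
    return sum(corpora.values())


telephone_total = total_hours(telephone_hours)
wideband_total = total_hours(wideband_16k_hours)

print(f"telephone speech: {telephone_total}h")
print(f"16k speech: {wideband_total}h")
```

The telephone figures sum to the stated 1338h. The three 16k figures sum to 5934h, so "6000h" is a rounded label for that set.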

Acoustic modeling

Telephone model training

1000h Training

Baseline: 8k states, 470+300 MPE4, 20.29
Jietong phone, 200 hour seed, 10k states training:

An error was found in training.
Xent training has reached iteration 5.

CSLT phone, 8k states training

MPE1: 20.60
MPE2: 20.37
MPE3: 20.37

6000 hour 16k training

Training progress

6000h/CSLT phone set alignment/denlattice completed

Xent: 12.83
MPE1: 9.21
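To put the Xent and MPE1 figures side by side, the relative reduction can be computed as below. This is a small sketch that assumes both numbers are error rates in percent measured on the same test set, which the source does not state explicitly; the function name is hypothetical.

```python
def relative_reduction(before, after):
    """Return the relative reduction from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100.0 * (before - after) / before


# Figures from the 6000h / CSLT phone set run above.
xent = 12.83
mpe1 = 9.21

reduction = relative_reduction(xent, mpe1)
print(f"Xent -> MPE1 relative reduction: {reduction:.1f}%")
```

Under that assumption, the first MPE iteration gives roughly a 28% relative reduction over the Xent model.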

6000h/jt phone set alignment/denlattice completed

denlattice re-run

Train Analysis

The Qihang model used a subset of the 6k data

2500+950H+tang500h*+20131220, approximately 1700+2400 hours

GMM training using this subset achieved 22.47%. Xiaoming's result is 16.1%.

The database still seems to be inconsistent.
Xiaoming kicked off a job to reproduce the Qihang training using this subset.

Language modeling

Training data ready
Xiaoxi and Wufei will collaborate to become familiar with the training and testing process.
Initial optimization from telecom.

DNN Decoder

Online decoder adaptation

Incremental training finished (stream mode)
Online decoder completed

Test the proportion of decoding time spent in the DNN forward pass versus graph search.
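One way to measure this split is to accumulate wall-clock time per stage and report each stage's share. The sketch below is not the team's decoder: `dnn_forward` and `graph_search` are hypothetical stand-ins that do trivial work, and `StageTimer` is an illustrative helper built only on the Python standard library.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def measure(self, stage):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[stage] += time.perf_counter() - start

    def proportions(self):
        """Return each stage's share of the total measured time."""
        total = sum(self.totals.values())
        if total == 0:
            return {}
        return {stage: t / total for stage, t in self.totals.items()}


# Hypothetical stand-ins for the real decoder stages (assumptions).
def dnn_forward(frames):
    return [sum(frame) for frame in frames]


def graph_search(scores):
    return max(range(len(scores)), key=scores.__getitem__)


timer = StageTimer()
frames = [[0.1 * i, 0.2 * i, 0.3 * i] for i in range(1000)]
for _ in range(20):
    with timer.measure("dnn_forward"):
        scores = dnn_forward(frames)
    with timer.measure("graph_search"):
        best = graph_search(scores)

shares = timer.proportions()
for stage, share in shares.items():
    print(f"{stage}: {share:.1%}")
```

In a real decoder the two `with` blocks would wrap the actual forward computation and search calls. Wall-clock timing is used here for simplicity; if the forward pass runs asynchronously on a GPU, the measurement would need an explicit synchronization point.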

