Differences between revisions of "Sinovoice-2014-04-01"
From cslt Wiki
Revision as of 07:35, 1 April 2014 (Tuesday)
Environment setting
Corpora
- New Beijing Mobile labeling done (109h).
- Next, the corrupted speech from Huawei (97h) will be labeled.
- 300h text transcription will be ready in April.
- A total of 1338h of telephone speech is now ready (470 + 346 + 105h BJ mobile + 200h PICC + 108h HBTc + 109h New BJ mobile).
- 16k 6000h data: 978h online data from DataTang + 656h online mobile data + 4300h recording data.
- LM corpus preparation done.
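The corpus totals above can be cross-checked with a few lines of arithmetic. This is only a sketch: the two unnamed telephone sets (470h and 346h) are given placeholder labels that do not appear in the report, and the dictionary-free layout is just one convenient way to hold the figures.

```python
# Hours per source, copied from the Corpora list.
# "unnamed set 1/2" are placeholder labels (assumption); the report gives no names.
telephone_hours = [
    ("unnamed set 1", 470),
    ("unnamed set 2", 346),
    ("BJ mobile", 105),
    ("PICC", 200),
    ("HBTc", 108),
    ("New BJ mobile", 109),
]
wideband_hours = [
    ("DataTang online", 978),
    ("online mobile", 656),
    ("recording", 4300),
]


def total_hours(parts):
    """Sum the hour figures of a list of (label, hours) pairs."""
    return sum(hours for _, hours in parts)


telephone_total = total_hours(telephone_hours)
wideband_total = total_hours(wideband_hours)
print(f"telephone (8k): {telephone_total}h")
print(f"wideband (16k): {wideband_total}h")
```

The telephone figures add up to the stated 1338h exactly; the three 16k components add up to 5934h, so the "6000h" label appears to be a rounded figure.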
Acoustic modeling
Telephone model training
1000h Training
- Baseline: 8k states, 470+300 MPE4, 20.29
- Jietong phone, 200 hour seed, 10k states training:
- An error was found in training.
- Xent training has run 5 iterations.
- CSLT phone, 8k states training
- MPE1: 20.60
- MPE2: 20.37
- MPE3: 20.37
6000 hour 16k training
Training progress
- 6000h/CSLT phone set alignment/denlattice completed
- Xent: 12.83
- MPE1: 9.21
- 6000h/jt phone set alignment/denlattice completed
- denlattice re-run
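To put the Xent and MPE1 figures for the CSLT phone set side by side, the relative reduction can be computed as below. This is a sketch that assumes both numbers are error rates in percent measured on the same test set; the report does not state the metric, and `relative_reduction` is a helper name introduced here for illustration.

```python
def relative_reduction(before, after):
    """Relative reduction of `after` with respect to `before`, as a fraction."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (before - after) / before


# Figures from the 6000h/CSLT phone set entries above (assumed to be error rates in %).
xent = 12.83
mpe1 = 9.21

gain = relative_reduction(xent, mpe1)
print(f"Xent -> MPE1 relative reduction: {gain:.1%}")
```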
Train Analysis
- The Qihang model used a subset of the 6k data
- 2500+950H+tang500h*+20131220, approximately 1700+2400 hours
- GMM training using this subset achieved 22.47%. Xiaoming's result is 16.1%.
- The database still does not seem very consistent.
- Xiaoming kicked off the job to reproduce the Qihang training using this subset
Language modeling
- Training data ready
- Xiaoxi and Wufei will work together to familiarize themselves with the training and testing process.
- Initial optimization from telecom.
DNN Decoder
Online decoder adaptation
- Incremental training finished (stream mode)
- Online decoder completed
- Test the proportion of decoding spent in the DNN forward pass versus graph search.
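One way to run the last test is to accumulate wall-clock time per decoding stage and report each stage's share. The sketch below assumes "proportion" means share of decoding time; `StageTimer`, `dnn_forward`, and `graph_search` are hypothetical stand-ins and not the actual decoder code.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock seconds per named stage."""

    def __init__(self):
        self.seconds = defaultdict(float)

    @contextmanager
    def measure(self, stage):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds[stage] += time.perf_counter() - start

    def proportions(self):
        """Return each stage's share of the total measured time."""
        total = sum(self.seconds.values())
        if total == 0:
            return {}
        return {stage: s / total for stage, s in self.seconds.items()}


# Hypothetical stand-ins for the two decoder stages (not the real decoder).
def dnn_forward(frames):
    return [sum(frame) for frame in frames]


def graph_search(scores):
    return max(range(len(scores)), key=scores.__getitem__)


timer = StageTimer()
frames = [[0.1 * i, 0.2 * i] for i in range(1000)]
for _ in range(10):
    with timer.measure("dnn_forward"):
        scores = dnn_forward(frames)
    with timer.measure("graph_search"):
        best = graph_search(scores)

shares = timer.proportions()
for stage, share in shares.items():
    print(f"{stage}: {share:.1%}")
```

In a real decoder the two `with` blocks would wrap the actual forward-pass and search calls; the printed shares depend on the machine and workload.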