Sinovoice-2014-01-20


Revision as of 08:13, 20 January 2014 (Monday) by Cslt



Contents

1 DNN training
2 DNN Decoder

DNN training

Environment setting

Cluster accounts rearrangement
Withdraw root/sudo privileges
Increased the NFS server to 40 processes, in the hope of improving disk read speed
Create a RAID-0 array from 3 or 4 3 TB disks

Corpora

Change the data labeling strategy: do not label gender and the length of noise in the rest of the corpora.
Automatic labeling

Xiaoming will work with Zhiyong to find out how to generate transcriptions with embedded confidence scores.
The first step is to measure the raw accuracy on the domain-dependent test, and then assess the quality of the automatic labeling.
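As a rough illustration of how confidence-scored transcriptions might be used for automatic labeling, the sketch below keeps only the hypotheses whose confidence reaches a threshold. The `Hypothesis` record, the `select_confident` helper, the 0.9 threshold, and the sample utterances are all hypothetical; the notes do not say how the scores will be produced or consumed.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    utt_id: str        # utterance identifier (hypothetical format)
    text: str          # automatically generated transcription
    confidence: float  # assumed to be a score in [0, 1]


def select_confident(hyps, threshold=0.9):
    """Keep only hypotheses whose confidence is at or above the threshold."""
    return [h for h in hyps if h.confidence >= threshold]


# Made-up examples, only to show the filtering step.
hyps = [
    Hypothesis("utt1", "hello world", 0.95),
    Hypothesis("utt2", "noisy guess", 0.42),
]
kept = select_confident(hyps)
for h in kept:
    print(h.utt_id, h.text, h.confidence)
```

The threshold would presumably be tuned after the raw accuracy on the domain-dependent test is known.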

Xiao Na to prepare 300 hours of telephone data (Sinovoice recordings) to improve the 8k model.

470 hour 8k training

MPE training done

Model	CE	MPE1	MPE2	MPE3	MPE4
4k states	23.27/22.85	21.35/18.87	21.18/18.76	21.07/18.54	20.93/18.32
8k states	22.16/22.22	20.55/18.03	20.36/17.94	20.32/17.78	20.29/17.80
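To make the gain from MPE easier to read, the sketch below computes the relative reduction from CE to MPE4 for each model and for each of the two figures in a cell. It assumes the figures are error rates in percent (lower is better); the notes do not state what the two numbers in each cell refer to, so they are treated simply as a first and a second value.

```python
# CE and MPE4 values copied from the table above: (first figure, second figure).
results = {
    "4k states": {"CE": (23.27, 22.85), "MPE4": (20.93, 18.32)},
    "8k states": {"CE": (22.16, 22.22), "MPE4": (20.29, 17.80)},
}


def relative_reduction(baseline, improved):
    """Relative reduction in percent, assuming lower values are better."""
    return 100.0 * (baseline - improved) / baseline


summary = {}
for model, row in results.items():
    summary[model] = tuple(
        round(relative_reduction(ce, mpe), 2)
        for ce, mpe in zip(row["CE"], row["MPE4"])
    )
    print(model, summary[model])
```

Under that assumption, the second figure improves by roughly twice as much as the first for both models.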

6000 hour 16k training

Feature extraction done. Three problems in the data were solved: (1) short wave, (2) unmatched file length, (3) unmatched sample rate.
Training has reached tri4b, with a quick increase in the number of states/pdfs.
DNN training can start on Tuesday.
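Data problems such as a mismatched sample rate can be caught before feature extraction with a simple header check. The sketch below uses Python's standard `wave` module; the `check_wav` helper, the 16000 Hz expected rate, and the 0.1 s minimum duration are hypothetical choices, and this is not necessarily how the problems above were actually detected.

```python
import wave


def check_wav(path, expected_rate=16000, min_seconds=0.1):
    """Return a list of problems found in a WAV file header (empty if none)."""
    problems = []
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        duration = w.getnframes() / float(rate)
    if rate != expected_rate:
        problems.append("sample rate %d != %d" % (rate, expected_rate))
    if duration < min_seconds:
        problems.append("duration %.3fs < %.3fs" % (duration, min_seconds))
    return problems
```

Running `check_wav` over the file list and logging any non-empty result would flag files to fix or exclude before training.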

DNN Decoder

Sinovoice decoder: some errors in FST building. Many triphones are lost after graph building. Problems in cdgen?
Kaldi decoder:

A minor difference between the CLG and HCLG results was found; debugging is in progress.
CLG RT is comparable to HCLG RT: 0.3-0.4 on CSLT grid-2.
Additional optimization of pdf pre-computing will be investigated.
Code to be delivered today.
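Assuming RT above denotes the real-time factor (decoding time divided by audio duration), the figure can be computed as in the sketch below. The `real_time_factor` helper and the timings are hypothetical, chosen only to land inside the reported 0.3-0.4 range.

```python
def real_time_factor(decode_seconds, audio_seconds):
    """Real-time factor: time spent decoding per second of audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


# Hypothetical timings: 35 s of decoding for 100 s of audio.
rtf = real_time_factor(35.0, 100.0)
print(rtf)
```

A value below 1.0 means the decoder runs faster than real time.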

Retrieved from “http://index.cslt.org/mediawiki/index.php?title=Sinovoice-2014-01-20&oldid=9106”