Revision as of 07:43, 20 January 2014 (Mon)

DNN training

Environment setting

Cluster accounts rearrangement
Withdrew root/sudo privileges
Set the NFS server to 40 processes, in the hope of improving disk read speed
Create a RAID-0 array from three or four 3 TB disks

Corpora

Change the data labeling strategy: do not label gender or noise length in the rest of the corpora.
Automatic labeling

Xiaoming will work with Zhiyong to work out how to generate transcriptions with embedded confidence scores.
The first step is to measure the raw accuracy on the domain-dependent test, and then judge the quality of automatic labeling.

470 hour 8k training

MPE training done

Model	CE	MPE1	MPE2	MPE3	MPE4
4k states	23.27/22.85	21.35/18.87	21.18/18.76	21.07/18.54	20.93/18.32
8k states	22.16/22.22	20.55/18.03	20.36/17.94	20.32/17.78	20.29/17.80
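The gain from MPE over CE can be read off the table as a relative reduction. The sketch below assumes that both figures in each cell are error rates in percent (lower is better), which the notes do not state explicitly; the names `RESULTS`, `relative_reduction` and `summarize` are illustrative only.

```python
# Scores copied from the table above: (first figure, second figure) per stage.
# Assumption: both figures are error rates in percent, so lower is better.
RESULTS = {
    "4k states": {"CE": (23.27, 22.85), "MPE4": (20.93, 18.32)},
    "8k states": {"CE": (22.16, 22.22), "MPE4": (20.29, 17.80)},
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction in percent when going from baseline to improved."""
    return 100.0 * (baseline - improved) / baseline


def summarize(results):
    """Map each model to its CE -> MPE4 relative reductions, one per figure."""
    summary = {}
    for model, stages in results.items():
        summary[model] = tuple(
            round(relative_reduction(ce, mpe), 2)
            for ce, mpe in zip(stages["CE"], stages["MPE4"])
        )
    return summary


if __name__ == "__main__":
    for model, reductions in summarize(RESULTS).items():
        print(model, reductions)
```

Under that assumption, the second figure improves by roughly a fifth for both model sizes, while the first figure improves by about a tenth or less.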

6000 hour 16k training

Feature extraction done: solved three problems in the data: (1) short wave, (2) unmatched file length, (3) unmatched sample rate
Training has reached tri4b, with a rapid increase in the number of states/pdfs
DNN training can start on Tuesday
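Two of the data problems listed above (short wave and unmatched sample rate) can be screened for before feature extraction. The following is a minimal sketch using Python's standard `wave` module, not the check actually used in the project: the 0.5 s minimum duration is a hypothetical threshold, the 16000 Hz target is taken from the "16k" in the section title, and the "unmatched file length" problem is left out because the notes do not define it.

```python
import wave

MIN_DURATION_SEC = 0.5    # hypothetical threshold for a "short wave"
EXPECTED_RATE_HZ = 16000  # assumed from the "16k" in the section title


def check_wav(path, min_duration=MIN_DURATION_SEC, expected_rate=EXPECTED_RATE_HZ):
    """Return a list of problem labels for one WAV file (empty if none found)."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        n_frames = wav.getnframes()
    if rate != expected_rate:
        problems.append("unmatched sample rate")
    # Duration is the frame count divided by the file's own sample rate.
    if n_frames / rate < min_duration:
        problems.append("short wave")
    return problems
```

Files for which `check_wav` returns a non-empty list could then be excluded or repaired before MFCC extraction.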

DNN Decoder

Sinovoice decoder: some errors in FST building. Many triphones are lost after graph building. Problems in cdgen?
Kaldi decoder:

A minor difference between the CLG and HCLG results was found. Debugging is in progress.
CLG RT is comparable to the HCLG RT, 0.3-0.4 on CSLT grid-2.
Additional optimization on pdf-pre-computing will be investigated.
Code will be delivered today.
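Assuming "RT" above denotes the real-time factor, that is, decoding time divided by audio duration, the reported 0.3-0.4 means decoding takes 30-40% of the audio length. A minimal sketch of the computation follows; the function name is illustrative and the numbers in the usage note are hypothetical, not measurements from the notes.

```python
def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Decoding time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds
```

For example, a decoder that spends 36 s on 100 s of audio has a real-time factor of 36 / 100 = 0.36.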

@@ Line 1: / Line 1: @@
-=Project management=
-* Xiaoming and Xiao Na were added into the mail list
-* Potential Huawei conference-transcribing project was discussed
 =DNN training=
 ==Environment setting==
-* New disk space (3T) was created and mounted at /nfs/disk1
+* Cluster accounts rearrangement
-* Jobs with 100 threads work fine on the cluster
+* Withdrew root/sudo privileges
+* Set the NFS server to 40 processes, in the hope of improving disk read speed
+* Create a RAID-0 array from three or four 3 TB disks
 ==Corpora==
-* 60 hour data were cut this week
+* Change the data labeling strategy: do not label gender or noise length in the rest of the corpora.
-* Just send out to vendors for labeling
+* Automatic labeling
-* Waiting for out-source platform construction
+:* Xiaoming will work with Zhiyong to work out how to generate transcriptions with embedded confidence scores.
-* We assume 60 hour data per week in future
+:* The first step is to measure the raw accuracy on the domain-dependent test, and then judge the quality of automatic labeling.
 ==470 hour 8k training==
-* CE training done
+* MPE training done
-* MPE training partially done
 {| class="wikitable"
 ! Model !! CE !! MPE1!! MPE2 !! MPE3 !! MPE4
 |-
-|4k states||23.27/22.85 || 21.35/18.87 || 21.18/18.76 || 21.07/18.54
+|4k states ||23.27/22.85 || 21.35/18.87 || 21.18/18.76 || 21.07/18.54 || 20.93/18.32
 |-
-|8k states ||22.16/22.22 || - ||20.36/17.94 || - ||
+|8k states ||22.16/22.22 || 20.55/18.03 ||20.36/17.94  || 20.32/17.78 || 20.29/17.80
 |-
 |}
@@ Line 33: / Line 29: @@
 ==6000 hour 16k training==
-* Audio files ready. Files with incorrect sampling rates were removed
+* Feature extraction done: solved three problems in the data: (1) short wave, (2) unmatched file length, (3) unmatched sample rate
-* Lexicon and LM were ready
+* Training has reached tri4b, with a rapid increase in the number of states/pdfs
-* Making MFCC features
+* DNN training can start on Tuesday
-* Initial model (6 iterations etc) can be delivered before the spring holiday
 =DNN Decoder=
-* Initial trial of the DNN decoder based on the Sinovoice code failed, largely due to the FST compiler
-* Change the strategy to an integrated approach: use the Sinovoice system to control connections, and use the Kaldi base for the ASR engine
+* Sinovoice decoder: some errors in FST building. Many triphones are lost after graph building. Problems in cdgen?
-* Xiaoming will do some investigation on the Sinovoice FST compiler, while Liu Chao will focus on the Kaldi-based decoder
+* Kaldi decoder:
+:* A minor difference between the CLG and HCLG results was found. Debugging is in progress.
+:* CLG RT is comparable to the HCLG RT, 0.3-0.4 on CSLT grid-2.
+:* Additional optimization on pdf-pre-computing will be investigated.
+:* Code will be delivered today.

Differences between revisions of "Sinovoice-2014-01-20"
