Difference between revisions of "Sinovoice-2014-01-20"

From cslt Wiki
 

Latest revision as of 10:12, 20 January 2014 (Monday)

DNN training

Environment setting

  • Account rearrangement is done on the SGE cluster. Work is now done without root.
  • Changed the NFS server to 40 processes, hoping to increase disk read speed.
  • Agreed to withdraw root/sudo privileges.
  • Agreed to create a RAID-0 array from another three 3 TB disks.

Corpora

  • Changed the data labeling strategy: gender and noise length will not be labeled for the next several corpora.
  • Automatic labeling:
      • Xiaoming will work with Zhiyong to find out how to generate transcriptions that keep their confidence scores.
      • The first step is to measure the raw accuracy on the domain-dependent test, and then decide whether automatic labeling is appropriate.
  • Xiao Na will prepare 300 h of telephone speech data (Sinovoice recordings), which will be used to improve the 8k model.


470 hour 8k training

  • MPE training done
    Model      CE           MPE1         MPE2         MPE3         MPE4
    4k states  23.27/22.85  21.35/18.87  21.18/18.76  21.07/18.54  20.93/18.32
    8k states  22.16/22.22  20.55/18.03  20.36/17.94  20.32/17.78  20.29/17.80
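
To make the gain from MPE easier to read, the relative change from the CE column to the MPE4 column can be computed as in the sketch below. It assumes that lower numbers are better (for example an error rate in percent); the notes do not name the metric or say what the two numbers in each cell stand for, so the code simply treats them as two separate columns. The names `RESULTS`, `relative_reduction` and `summarize` are illustrative only.

```python
# Scores copied from the table above; each cell holds two numbers "a/b".
# Assumption: lower is better (e.g. an error rate in percent). The source
# does not name the metric or the two conditions behind "a" and "b".
RESULTS = {
    "4k states": {"CE": (23.27, 22.85), "MPE4": (20.93, 18.32)},
    "8k states": {"CE": (22.16, 22.22), "MPE4": (20.29, 17.80)},
}


def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


def summarize(results):
    """Map each model to its per-column relative reduction from CE to MPE4."""
    summary = {}
    for model, scores in results.items():
        summary[model] = tuple(
            round(relative_reduction(ce, mpe), 2)
            for ce, mpe in zip(scores["CE"], scores["MPE4"])
        )
    return summary


if __name__ == "__main__":
    for model, reductions in summarize(RESULTS).items():
        print(model, reductions)
```

Under that reading, the second number in each cell improves roughly twice as much as the first for both model sizes.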

6000 hour 16k training

  • Feature extraction is done. Several problems in the data were solved: (1) too-short wave files, (2) unmatched file lengths, (3) unmatched sample rates.
  • Training has reached tri4b, with a quick increase in the number of states/pdfs.
  • DNN training will start on Tuesday.
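
The three data problems listed above could be screened for with a small check like the following. This is only a sketch using Python's standard `wave` module, not the script actually used. The function name `check_wav`, the 16000 Hz expected rate and the 0.1 s minimum duration are assumptions, and "unmatched file length" is interpreted here as a mismatch between the frame count declared in the header and the frames that can actually be read.

```python
import wave


def check_wav(path, expected_rate=16000, min_seconds=0.1):
    """Return a list of problems found in one WAV file (empty if none).

    expected_rate and min_seconds are hypothetical defaults, not values
    taken from the meeting notes.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        declared = wav.getnframes()  # frame count implied by the header
        frame_bytes = wav.getsampwidth() * wav.getnchannels()
        data = wav.readframes(declared)
    actual = len(data) // frame_bytes  # frames that could really be read

    if rate != expected_rate:
        problems.append(f"sample rate {rate} Hz, expected {expected_rate} Hz")
    if actual != declared:
        problems.append(f"header declares {declared} frames, read {actual}")
    if rate > 0 and actual / rate < min_seconds:
        problems.append(f"too short: {actual / rate:.3f} s")
    return problems
```

Running `check_wav` over every file before feature extraction and logging the non-empty results gives a list of files to fix or drop.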

DNN Decoder

  • Sinovoice decoder: some errors in FST building. Many triphones were lost after composition with C. Is the problem in cdgen?
  • Kaldi decoder:
      • A minor difference between the CLG and HCLG results was found; debugging is under way.
      • The CLG RT is comparable to that of HCLG, roughly 0.3-0.4 on CSLT grid-2.
      • Further optimization of pdf pre-computing will be investigated.
      • The code will be delivered today.
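
For reference, the RT figures above are presumably real-time factors, that is, decoding time divided by audio duration; the notes do not define the term, so this is an assumption. Under that reading, a value of 0.3-0.4 means decoding takes about a third of the audio's length. A minimal sketch, with a hypothetical function name and made-up timings:

```python
def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Return decoding time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical timings: 35 s of decoding for 100 s of audio.
    print(real_time_factor(35.0, 100.0))
```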