Difference between revisions of "Sinovoice-2014-01-06"

From cslt Wiki
(Created page with "=Project negotiation= =DNN training= ==Environment setting== ==Corpora== ==470hour training== 1. System environment completed in sinovoice. Prepare a document for t...")
 
Line 1: Line 1:
=Project negotiation=
+
=Project management=
 +
 
 +
* Negotiation of work items is done
 +
* 2014 contract setup
 +
* The first amount (150k) has been delivered.
 +
* Project team setup
  
 
=DNN training=
 
Line 5: Line 10:
 
==Environment setting==
 
  
==Corpora==
+
* Wiki setup
 +
* Weekly meeting setup
 +
* SGE environment set up at Sinovoice
  
==470hour training==
+
==Corpora==
1. System environment completed at Sinovoice. Prepare a document for the building process.
+
* A new standard for data labeling has been set
 +
* The current standard covers regular sentences and noise; regular sentences may contain noise words
  
==600 hour training==
+
==470 hour training==
2. 470h training started on the Sinovoice server. Reached the 11th DNN iteration. Training Acc 48., CV Acc 47.15. Higher than the 170h results.
+
  470h training just started on the CSLT cluster; it is in the monophone step.
+
  
  470h training with 8400 states, running the first iteration.
+
* 470h training started on the Sinovoice server and reached the 11th DNN iteration: training accuracy 48, CV accuracy 47.15.
 +
* 470h training with 8400 states is also running on the Sinovoice cluster.
 +
* A parallel 470h training run has just started on the CSLT cluster.
 +
* Xiaoming will prepare the test set.
 +
* More configurations are scheduled.
  
3. Prepare 6k hour data.
+
==6000 hour training==
  
4. Zhiyong & Xiaoming work on training recipe and configurations.
+
* Data preparation should be done in 1 day
5. Xiaoming works on test set preparation.
+
* Training should start in 2 days
  
 
=Decoder=
 
6. CLG decoder. LiuChao needs to handle some code changes for (1) Kaldi tree loading, (2) bigLM composition, (3) DNN feature computing
+
* Chao needs to investigate the code changes with Dr. Chen.
 +
* The work items involve (1) Kaldi tree loading, (2) bigLM composition, and (3) DNN feature computing

Revision as of 10:51, 6 January 2014 (Monday)

Project management

  • Negotiation of work items is done
  • 2014 contract setup
  • The first amount (150k) has been delivered.
  • Project team setup

DNN training

Environment setting

  • Wiki setup
  • Weekly meeting setup
  • SGE environment set up at Sinovoice

Corpora

  • A new standard for data labeling has been set
  • The current standard covers regular sentences and noise; regular sentences may contain noise words

470 hour training

  • 470h training started on the Sinovoice server and reached the 11th DNN iteration: training accuracy 48, CV accuracy 47.15.
  • 470h training with 8400 states is also running on the Sinovoice cluster.
  • A parallel 470h training run has just started on the CSLT cluster.
  • Xiaoming will prepare the test set.
  • More configurations are scheduled.

6000 hour training

  • Data preparation should be done in 1 day
  • Training should start in 2 days

Decoder

  • Chao needs to investigate the code changes with Dr. Chen.
  • The work items involve (1) Kaldi tree loading, (2) bigLM composition, and (3) DNN feature computing