Difference between revisions of "2013-04-12"

From cslt Wiki
Revision as of 06:23, 12 April 2013 (Fri)

1. Data sharing

(1) Acoustic data ready. The feature is PLP with HLDA, and the model is PLP+HLDA+MPE. All software is ready.
(2) The LM data and model are being transferred.

2. DNN progress

(1) 400-hour BN training is done. MFCC+LDA (300/1200/1200/1220/40/1200/38xx), followed by (MFCC+BN) with LDA.
(2) Comparison between MFCC and BN (fMPE applied). Relative improvement is xx%.
(3) BN system vs. hybrid system: relative performance comparison is %xx vs %xx.
(4) GPU vs. CPU style comparison: still in progress. Working on data checking. SGE is still problematic (Chao can help). Hopefully done in one or two weeks.
(5) RTF comparison between DNN hybrid and GMM: %xx vs %xx.
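
For reference, the two metrics used in items (2) to (5) can be sketched as below. This is a minimal illustration, assuming that "relative improvement" means relative error-rate reduction and that RTF is decoding time divided by audio duration; the function names are hypothetical and the notes do not state which error measure was used.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Relative error reduction in percent (assumed definition).

    Positive values mean the new system makes fewer errors than the baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: processing time divided by audio duration.

    Values below 1.0 mean decoding runs faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds
```

With made-up inputs (not results from these notes), `relative_improvement(20.0, 18.0)` gives a 10% relative reduction, and `real_time_factor(30.0, 60.0)` gives an RTF of 0.5.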


3. Kaldi/HTK merge

(1) HTK2Kaldi: The tool provided by Kaldi is problematic. The HMM structure seems erroneous. It needs to be corrected (hopefully within 1 week).
(2) Kaldi2HTK: need to design a new tool (possibly within 1 week).

4. Embedded progress

(1) GFCC training/testing. GFCC seems highly robust to noise, but is not as good as MFCC in silence.

(2) Prototype design. Application design is ongoing. Plan to deliver a DNN decoder. Sparse DNN might be a good solution.

(3) QA LM training: word files are ready. Trying to start the LM training. Refer to the doc Chao provided. /nfs/asrhome/asr/lm/chs.lm/lm.qa