Difference between revisions of "2013-09-27"

From cslt Wiki
Revision as of 15:24, 29 September 2013

Data sharing

  • LM count files still undelivered!

DNN progress

Sparse DNN

  • Optimal Brain Damage-based sparsity is ongoing. Preparing the algorithm.
  • An interesting investigation is to drop out 50% of the weights after each iteration, and then re-train without stickiness (pruned weights are not held at zero).
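The drop-out idea above can be sketched as magnitude-based pruning (a minimal illustration, not the actual training code; the 50% fraction follows the text, everything else is assumed):

```python
import numpy as np

def prune_smallest(weights, frac=0.5):
    """Zero out the `frac` of weights with smallest magnitude.

    Returns the pruned matrix and the keep-mask. Re-training
    "without stickiness" means the mask is discarded afterwards,
    so pruned weights may regrow in the next iteration.
    """
    thresh = np.quantile(np.abs(weights), frac)
    mask = np.abs(weights) >= thresh
    return weights * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_pruned, mask = prune_smallest(W, frac=0.5)
print(int(mask.sum()))  # half of the 16 weights survive
```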

Report here

FBank features

1000 hour testing: here

Tencent exps

N/A


Noisy training

Sample noise segments randomly for each utterance: use a Dirichlet distribution to sample the noise-type mixture over the various types, and a Gaussian to sample the SNR.
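The sampling scheme above can be sketched as follows (a minimal illustration; the concentration parameter, the SNR mean/std, and the third mixture component are assumptions, not values from the report):

```python
import numpy as np

def sample_noise_config(rng, base, concentration=10.0,
                        snr_mean=15.0, snr_std=5.0):
    """Draw a per-utterance noise configuration.

    `base` is the base noise-type distribution; the Dirichlet draw
    perturbs it (larger `concentration` stays closer to `base`).
    The SNR (dB) is drawn from a Gaussian with assumed mean/std.
    """
    mix = rng.dirichlet(concentration * base)   # per-utterance noise-type mixture
    snr_db = rng.normal(snr_mean, snr_std)      # per-utterance SNR in dB
    return mix, snr_db

rng = np.random.default_rng(0)
# white and car noise at 1/3 each in the base distribution
base = np.array([1/3, 1/3, 1/3])
mix, snr_db = sample_noise_config(rng, base)
print(np.round(mix, 2), round(snr_db, 1))
```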

White noise and car noise each take 1/3 in the base distribution. The performance report is here:

[http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:Chart3.png click here]

The conclusions are:

1. By sampling noises, most of the noise patterns can be learned efficiently, which improves performance on noisy test data.
2. By sampling noises with high variance, performance on clean speech is largely retained.

Continuous LM

1. SogouQ n-gram building: 500M of text data, 110k-word vocabulary. Two tests:

(1) using the Tencent online1 and online2 transcriptions: online1 PPL 1651, online2 PPL 1512
(2) using the 70k SogouQ test set: PPL 33
 This means the SogouQ text is significantly different from the Tencent online1 and online2 sets, due to the highly different domains.
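For reference, the perplexity (PPL) figures above follow the standard definition over per-word log-probabilities (a generic textbook formula, not the lab's evaluation script):

```python
import math

def perplexity(logprobs):
    """Perplexity from per-word natural-log probabilities."""
    return math.exp(-sum(logprobs) / len(logprobs))

# toy check: a model assigning probability 1/8 to each of 4 words has PPL 8
print(perplexity([math.log(1/8)] * 4))  # ≈ 8.0
```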

2. NN LM

  Using 11k words as input and 192 hidden units. 500M text data from QA data. Tested on the online2 transcription.
 (1) Take 1-1024 from the NN LM, others predicted by the 4-gram. n-gram baseline: 402.37; NN+n-gram: 122.54
 (2) Take 1-2048 from the NN LM, others predicted by the 4-gram. n-gram baseline: 402.37; NN+n-gram: 127.59
 (3) Take 1024-2048 from the NN LM, others predicted by the 4-gram. n-gram baseline: 402.37; NN+n-gram: 118.92
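A sketch of the combination scheme above, assuming the ranges 1-1024 etc. refer to vocabulary frequency ranks; `nn_lm` and `ngram_lm` are hypothetical scoring callables, not real APIs:

```python
from dataclasses import dataclass

@dataclass
class Word:
    token: str
    rank: int  # frequency rank in the vocabulary (1 = most frequent)

def combined_score(word, context, nn_lm, ngram_lm, nn_lo=1, nn_hi=1024):
    """Score `word` with the NN LM when its rank lies in [nn_lo, nn_hi],
    otherwise fall back to the 4-gram model."""
    if nn_lo <= word.rank <= nn_hi:
        return nn_lm(word, context)
    return ngram_lm(word, context)

# toy stand-ins for the two models
nn = lambda w, ctx: -1.0
ng = lambda w, ctx: -2.0
print(combined_score(Word("the", 3), (), nn, ng))      # -1.0 (NN range)
print(combined_score(Word("rare", 5000), (), nn, ng))  # -2.0 (n-gram fallback)
```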


Conclusions: the NN LM substantially outperforms the n-gram LM, due to its smoothing capability.