Difference between revisions of "2013-11-15"
From cslt Wiki
Latest revision as of 06:48, 18 November 2013 (Mon)
Data sharing
- LM count files still undelivered!
AM development
Sparse DNN
- Optimal Brain Damage(OBD).
- Basic OBD done, with the ICASSP paper submitted.
- Online OBD running
- Try 3 configurations: batch size=256, 13000 (10 prunings), whole data.
- The current results show that accuracy follows the order: Acc(whole data) > Acc(256) > Acc(13000).
- Investigate an intermediate update schedule, e.g., updating twice per iteration.
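As a reminder of what a single OBD pruning step does, here is a minimal sketch. It assumes the diagonal Hessian entries are already available and uses the standard OBD saliency 0.5 * h_kk * w_k^2; the function name `obd_prune_mask` and the toy numbers are illustrative and are not taken from the experiments above.

```python
def obd_prune_mask(weights, hessian_diag, prune_fraction):
    """Return a keep-mask (True = keep) that drops the lowest-saliency weights.

    weights, hessian_diag: flat lists of equal length (hypothetical inputs).
    prune_fraction: fraction of weights to remove in this pruning step.
    """
    if len(weights) != len(hessian_diag):
        raise ValueError("weights and hessian_diag must have the same length")
    if not 0.0 <= prune_fraction <= 1.0:
        raise ValueError("prune_fraction must lie in [0, 1]")

    # OBD saliency: estimated increase in loss if weight k is set to zero.
    saliency = [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]

    # Prune the weights whose removal is estimated to hurt the least.
    n_prune = int(prune_fraction * len(weights))
    order = sorted(range(len(weights)), key=lambda k: saliency[k])
    pruned = set(order[:n_prune])
    return [k not in pruned for k in range(len(weights))]


# Toy example: four weights, prune half of them.
weights = [0.9, -0.05, 0.4, 0.01]
hessian_diag = [1.0, 2.0, 0.5, 4.0]
mask = obd_prune_mask(weights, hessian_diag, 0.5)
```

In an online variant, the saliencies would presumably be re-estimated from a batch (e.g., 256 or 13000 frames) before each pruning step; that part is not sketched here.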
Noisy training
- An ICASSP paper submitted.
- Simulated Annealing training.
- Rejected with small noises.
- Even when using just the clean speech, it is still rejected. This is a bit strange.
- Noise concentrated training
- Using pure noise (no silence, narrow SNR band). Most of the results are expected.
- Need to check the case of training with car noise at 20/25 dB and testing with white noise at 20 dB.
- Noise-adding modification
- Need to re-implement the noise adding so that it is applied before the fbank computation.
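Adding noise before the fbank computation means mixing at the waveform level. The sketch below shows one common way to do that at a target SNR; it assumes both signals are plain lists of samples at the same sampling rate, and the helper names (`mean_power`, `add_noise_at_snr`) and the synthetic signals are hypothetical, not the actual implementation used here.

```python
import math


def mean_power(signal):
    """Average power of a list of samples."""
    return sum(x * x for x in signal) / len(signal)


def add_noise_at_snr(clean, noise, snr_db):
    """Mix noise into a clean waveform so the result has the given SNR in dB."""
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]

    p_clean = mean_power(clean)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")

    # Choose the gain so that p_clean / (gain^2 * p_noise) = 10^(snr_db / 10).
    gain = math.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [c + gain * n for c, n in zip(clean, noise)]


# Synthetic example: a 440 Hz tone at 16 kHz plus a deterministic pseudo-noise.
clean = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(1600)]
noise = [((t * 7919) % 101) / 50.0 - 1.0 for t in range(1600)]
noisy = add_noise_at_snr(clean, noise, 20.0)
```

The fbank features would then be extracted from `noisy` rather than from the clean waveform.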
Tencent exps
N/A
LM development
NN LM
- Results show better performance with NN rescoring.
scal | 2044 | map | notetp3 | record1900 | general | online1 | online2 | speedup |
---|---|---|---|---|---|---|---|---|
0.5 | 28.69 | 34.52 | 20.56 | 14.53 | 45.52 | 41.3 | 34.48 | 33.53 |
0.6 | 28.3 | 34.28 | 20.67 | 14.05 | 45.34 | 40.73 | 33.81 | 32.71 |
0.7 | 27.84 | 33.81 | 20.18 | 13.74 | 45.13 | 40.29 | 33.17 | 31.86 |
0.8 | 27.58 | 33.87 | 19.16 | 13.53 | 44.92 | 40 | 32.82 | 31.74 |
0.9 | 27.86 | 33.92 | 19.05 | 13.41 | 44.9 | 39.65 | 32.5 | 31.89 |
0.95 | 27.79 | 34.07 | 19.05 | 13.56 | 44.83 | 39.76 | 32.41 | 31.68 |
0.96 | 27.9 | 34.1 | 18.83 | 13.53 | 44.83 | 39.79 | 32.43 | 31.68 |
0.97 | 27.94 | 34.15 | 18.83 | 13.47 | 44.82 | 39.78 | 32.44 | 31.89 |
0.99 | 28.02 | 34.2 | 19 | 13.49 | 44.86 | 39.82 | 32.47 | 32.01 |
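The notes do not define `scal`. One plausible reading is an interpolation weight between the n-gram LM score and the NN LM score during n-best rescoring. The sketch below illustrates that reading only; the log-linear combination, the `lm_weight` parameter, and the toy hypotheses are all assumptions, not the setup that produced the numbers above.

```python
def rescore_nbest(nbest, scal, lm_weight=1.0):
    """Pick the best hypothesis after interpolating two LM scores.

    nbest: list of (text, am_logp, ngram_logp, nn_logp) tuples (hypothetical layout).
    scal: weight on the NN LM score; (1 - scal) goes to the n-gram LM score.
    """
    if not 0.0 <= scal <= 1.0:
        raise ValueError("scal must lie in [0, 1]")

    def total_score(hyp):
        _, am_logp, ngram_logp, nn_logp = hyp
        # Assumed log-linear interpolation of the two LM log-probabilities.
        lm_logp = (1.0 - scal) * ngram_logp + scal * nn_logp
        return am_logp + lm_weight * lm_logp

    return max(nbest, key=total_score)[0]


# Toy n-best list: the n-gram LM prefers "a", the NN LM prefers "b".
nbest = [
    ("hypothesis a", -100.0, -20.0, -30.0),
    ("hypothesis b", -101.0, -22.0, -18.0),
]
best_ngram_only = rescore_nbest(nbest, 0.0)
best_mostly_nn = rescore_nbest(nbest, 0.8)
```

Under this reading, sweeping `scal` as in the table amounts to rerunning the same rescoring with different weights and comparing the resulting scores per test set.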
QA LM
The QA model training is done. Tested on the Sogou Q text.
Data | lexicon | size | size2 | PPL | PPL2 |
---|---|---|---|---|---|
Q (10G) | 15w | 1.5G | 800M | 301.64 | 317.19 |
QA(100G) | 11w | 4.5G | 1G | 287.134 | 315.695 |
QA(100G) | 8w8 | 4.5G | 1G | 559.029 | 626.146 |
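For reference, perplexity is the exponential of the negative average per-token log-probability. The sketch below assumes natural-log probabilities for each test token; the function name and the uniform toy model are illustrative and unrelated to the Sogou test above.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    avg_log_prob = sum(token_log_probs) / len(token_log_probs)
    return math.exp(-avg_log_prob)


# Sanity check: a model that assigns every token probability 1/300
# has a perplexity of 300 (up to floating-point rounding).
uniform_log_probs = [math.log(1.0 / 300.0)] * 1000
ppl = perplexity(uniform_log_probs)
```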