Difference between revisions of "Sinovoice-2014-01-13"
From cslt Wiki
Latest revision as of 07:11, 13 January 2014 (Monday)
Project management
- Xiaoming and Xiao Na were added to the mailing list
- A potential Huawei conference-transcription project was discussed
DNN training
Environment setting
- New disk space (3 TB) was created and mounted at /nfs/disk1
- Jobs with 100 threads work fine on the cluster
Corpora
- 60 hours of data were cut this week
- They have just been sent out to vendors for labeling
- Waiting for the outsourcing platform to be built
- We assume 60 hours of data per week in the future
470 hour 8k training
- CE training done
- MPE training partially done
Model | CE | MPE1 | MPE2 | MPE3 | MPE4 |
---|---|---|---|---|---|
4k states | 23.27/22.85 | 21.35/18.87 | 21.18/18.76 | 21.07/18.54 | |
8k states | 22.16/22.22 | - | 20.36/17.94 | - | |
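To make the gain from MPE training easier to read, the table figures can be turned into relative reductions. The sketch below is only illustrative: it assumes the numbers are error rates in percent (lower is better) and that the two numbers in each cell refer to two separate test sets, neither of which the report states. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction, in percent, when going from `baseline` to `improved`."""
    return (baseline - improved) / baseline * 100.0


# Figures copied from the 4k-states row of the table (CE and MPE3 columns).
# Assumption: each cell holds two error rates, one per test set.
ce_4k = (23.27, 22.85)
mpe3_4k = (21.07, 18.54)

for label, base, new in zip(("first figure", "second figure"), ce_4k, mpe3_4k):
    print(f"4k states, {label}: {relative_reduction(base, new):.1f}% relative reduction")
```

The same function can be applied to the 8k-states row once the missing MPE cells are filled in.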
6000 hour 16k training
- Audio files ready. Files with incorrect sampling rates were removed
- Lexicon and LM are ready
- Making MFCC features
- An initial model (6 iterations, etc.) can be delivered before the spring holiday
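The sampling-rate check mentioned in the list above could be done along the following lines. This is a minimal sketch, not the script actually used: it assumes the audio is stored as PCM WAV files, that the expected rate is 16000 Hz (inferred from the "16k" in the section title), and the function names are hypothetical. It relies only on the Python standard library.

```python
import wave
from pathlib import Path

EXPECTED_RATE = 16000  # assumption: "16k" in the section title means 16 kHz


def has_expected_rate(path, expected=EXPECTED_RATE):
    """Return True if the WAV file at `path` has the expected sampling rate."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate() == expected


def split_by_rate(directory, expected=EXPECTED_RATE):
    """Partition all *.wav files under `directory` into (kept, rejected) lists."""
    kept, rejected = [], []
    for path in sorted(Path(directory).rglob("*.wav")):
        try:
            ok = has_expected_rate(path, expected)
        except (wave.Error, EOFError):
            ok = False  # unreadable or truncated header: treat as rejected
        (kept if ok else rejected).append(path)
    return kept, rejected
```

Returning the rejected list rather than deleting files directly keeps the step reversible; the caller can log or move the rejected files before feature extraction starts.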
DNN Decoder
- The initial trial of a DNN decoder based on the Sinovoice code failed, largely due to the FST compiler
- The strategy was changed to an integrated approach: use the Sinovoice system to control connections, and use Kaldi as the base for the ASR engine
- Xiaoming will investigate the Sinovoice FST compiler, while Liu Chao will focus on the Kaldi-based decoder