Difference between revisions of "Sinovoice-2014-01-13"

From cslt Wiki

Revision as of 05:56, 13 January 2014 (Mon)

Contents

1 Project management
2 DNN training
3 DNN Decoder

Project management

Xiaoming and Xiao Na were added to the mailing list
A potential Huawei conference-transcription project was discussed

DNN training

Environment setting

New disk space (3 TB) was created and mounted at /nfs/disk1
Jobs with 100 threads work fine on the cluster

Corpora

How much extra data was obtained?

470-hour 8 kHz training

CE training done
MPE training partially done

Model	CE	MPE1	MPE2	MPE3	MPE4
4k states	23.27/22.85	21.35/18.87	21.18/18.76	21.07/18.54
8k states	22.16/22.22	-	20.36/17.94	-

6000-hour 16 kHz training

Audio files done. Files with incorrect sampling rates were removed
Lexicon and LM were done
MFCC feature extraction is in progress
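
The sampling-rate filtering mentioned above could be sketched roughly as follows. This is only an illustration, not the script actually used: it assumes the audio is stored as PCM WAV files and that 16000 Hz is the expected rate for this set. The function names has_expected_rate and split_by_rate are hypothetical.

```python
import wave


def has_expected_rate(path, expected_hz=16000):
    """Return True if the WAV file at `path` has the expected sampling rate."""
    try:
        with wave.open(str(path), "rb") as wav:
            return wav.getframerate() == expected_hz
    except (wave.Error, EOFError):
        # Files that cannot be parsed as PCM WAV are treated as mismatches.
        return False


def split_by_rate(paths, expected_hz=16000):
    """Partition `paths` into (kept, rejected) lists by sampling rate."""
    kept, rejected = [], []
    for p in paths:
        if has_expected_rate(p, expected_hz):
            kept.append(p)
        else:
            rejected.append(p)
    return kept, rejected
```

The rejected list can be reviewed before anything is deleted, which is safer than removing files inside the loop.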

DNN Decoder

The initial trial of a DNN decoder based on the Sinovoice code failed, largely due to the FST compiler
The strategy was changed to an integrated approach: use the Sinovoice system to control connections, and use Kaldi as the base for the ASR engine

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=Sinovoice-2014-01-13&oldid=9032"