Revision as of 06:08, 11 March 2014 (Tue)

Environment setting

Raid212/Raid215/Disk212 done

Corpora

PICC data labeling (200h) is done.
A total of 1121h of telephone speech (470 + 346 + 105 BJ mobile + 200 PICC) is now ready.
16k 6000h data: 978h online data from DataTang + 656h online mobile data + 4300h recording data
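
The hour totals quoted above can be checked with a short sum. This is only a sanity-check sketch: the dictionary keys are illustrative labels, not names from the original data sets, and only the hour figures come from the report.

```python
# Hours per source, copied from the corpus list above.
# The keys are hypothetical labels chosen for readability.
telephone_hours = {"set A": 470, "set B": 346, "BJ mobile": 105, "PICC": 200}
wideband_hours = {"DataTang online": 978, "online mobile": 656, "recording": 4300}

telephone_total = sum(telephone_hours.values())
wideband_total = sum(wideband_hours.values())

print(telephone_total)  # 1121
print(wideband_total)   # 5934
```

The 16k components add up to 5934h, which suggests that "6000h" is a rounded figure.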

DNN training

Telephone model training

470 + 300h + BJ mobile 105h training

Training condition      No noise    Noise in LM    Opt noise    Noise LM + opt noise
No noise                30.61%      -              -            -
Noise phone added       31.88%      30.76%         31.27%       31.07%

BJ mobile incremental training

(1) Original 470 + 300 model: 30.24% WER

MPE2      MPE3         MPE3+iLM       MPE4+iLM
27.01%     26.72%       25.09%         24.53%

PICC dedicated training

Baseline (470+300h): 45.03
+ PICC 105h incremental training (th=0.9): 41.89
+ PICC 105h incremental training (th=0.8): 41.64
+ PICC 105h labelled training: 34.78
+ PICC 105h labelled training + PICC text LM: 29.18
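
To compare the PICC configurations above on a common scale, the relative WER reduction against the 45.03 baseline can be computed. This is a minimal sketch; the function name `relative_wer_reduction` and the list layout are assumptions made for illustration, and only the WER figures come from the report.

```python
def relative_wer_reduction(baseline: float, new: float) -> float:
    """Return the relative WER reduction in percent: (baseline - new) / baseline * 100."""
    if baseline <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline - new) / baseline * 100.0


# WER figures copied from the PICC results above.
baseline_wer = 45.03
picc_results = [
    ("incremental training (th=0.9)", 41.89),
    ("incremental training (th=0.8)", 41.64),
    ("labelled training", 34.78),
    ("labelled training + PICC text LM", 29.18),
]

for name, wer in picc_results:
    reduction = relative_wer_reduction(baseline_wer, wer)
    print(f"{name}: {reduction:.1f}% relative reduction")
```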

6000 hour 16k training

Training progress

Ran DNN MPE to iteration 5.
Recipe

100h MPE training
1700h MPE alignment/lattice
1700h MPE training

1 week to complete 3 MPE iterations
MPE2 result: 1e-9: 10.67% (8.61%), 1e-10: 10.34% (8.27%)
MPE3 result: 1e-9: 10.48% (8.43%), 1e-10: 10.12% (8.05%)
MPE4 result: 1e-9: 10.34% (8.31%), 1e-10: 10.03% (7.97%)
MPE5 result:

Training Analysis

Shared-tree GMM model training is complete; its WER is similar to that of the non-shared model.
Selected 100h of online data and trained two systems: (1) a di-syllable system and (2) a jt-phone system.

        di-syl      jt-ph
GMM:      -         20.86%
Xent    15.42%      14.78%       
MPE1    14.46%      14.23%
MPE2    14.22%      14.09%
MPE3    14.26%      13.80%
MPE4    14.24%      13.68%

HTK training on the same database

HLDA: 18.22
HLDA+MPE: 14.40

Hubei telecom

Hubei telecom data (127h): retrieved 60k sentences with a confidence threshold of 0.9, amounting to 50% of the data.

xEnt org:  -             wer_15 29.05
MPE iter1: wer_14 29.23; wer_15 29.38
MPE iter2: wer_14 29.05; wer_15 29.11
MPE iter3: wer_14 29.32; wer_15 29.28
MPE iter4: wer_14 29.29; wer_15 29.28
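
The confidence-based retrieval used for the Hubei telecom data can be sketched as a simple filter over decoded utterances. This is a hedged illustration only: the report does not say how the confidence score is computed or whether the comparison is inclusive, so the `(id, text, confidence)` tuple layout, the `>=` comparison, and the sample utterances are all assumptions.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (utterance id, decoded text, confidence in [0, 1]).
Utterance = Tuple[str, str, float]


def select_by_confidence(utts: Iterable[Utterance],
                         threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Keep utterances whose confidence is at least `threshold` (assumed inclusive)."""
    return [(uid, text) for uid, text, conf in utts if conf >= threshold]


# Made-up sample data, for illustration only.
decoded = [
    ("utt001", "hello", 0.97),
    ("utt002", "account balance", 0.88),
    ("utt003", "thank you", 0.92),
]

selected = select_by_confidence(decoded, threshold=0.9)
print(f"kept {len(selected)} of {len(decoded)} utterances")
```

Raising the threshold, for example to 0.95, trades the amount of retrieved data for higher expected transcript quality.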

Retrieved 30k sentences with a confidence threshold of 0.95, amounting to 25% of the data, and added them to the original 770h data.

xEnt org:     -             wer_15  29.05
MPE iter1:    -             wer_15: 29.36

DNN Decoder

Online decoder

CMN code delivered. Integration is done.
CMN pipe code delivered. Model adaptation is ongoing.
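
For readers unfamiliar with online CMN (cepstral mean normalization), the general idea can be sketched as a running mean that starts from a prior estimate and is updated frame by frame. This is not the delivered code: the class name `OnlineCMN`, the `prior_mean` and `prior_weight` parameters, and the update-then-subtract order are assumptions made for illustration.

```python
from typing import List, Sequence


class OnlineCMN:
    """Running-mean CMN sketch: the mean starts at a prior and adapts per frame."""

    def __init__(self, prior_mean: Sequence[float], prior_weight: float = 100.0) -> None:
        if prior_weight < 0:
            raise ValueError("prior_weight must be non-negative")
        self.dim = len(prior_mean)
        # The prior acts like `prior_weight` previously seen frames with mean `prior_mean`.
        self.weight = float(prior_weight)
        self.total = [m * self.weight for m in prior_mean]

    def normalize(self, frame: Sequence[float]) -> List[float]:
        """Update the running mean with `frame`, then subtract the mean from it."""
        if len(frame) != self.dim:
            raise ValueError("frame dimension does not match the prior mean")
        self.total = [t + x for t, x in zip(self.total, frame)]
        self.weight += 1.0
        mean = [t / self.weight for t in self.total]
        return [x - m for x, m in zip(frame, mean)]


# Usage with made-up 2-dimensional features.
cmn = OnlineCMN(prior_mean=[0.0, 0.0], prior_weight=1.0)
first = cmn.normalize([2.0, 4.0])
print(first)
```

A larger `prior_weight` makes the mean adapt more slowly to the incoming audio, while a smaller one lets the first few frames dominate the estimate.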


Difference between revisions of "Sinovoice-2014-03-11"
