DNN training

Environment setting

Cluster power-off scheduled for 3/06, to build a RAID-0 array on train212
The new RAID-0 array uses three new disks on train212
NFS name changes: disk1 -> /nfs/disk212, RAID disks -> /nfs/raid212, disk2 -> /nfs/raid215

Corpora

PICC data (200h) are being labeled and should be ready in one week.
105h data from BJ mobile
127h Hubei telecom
In total, 1121h (470 + 346 + 105 + 200) of telephone speech will be ready soon.
16k 6000h data: 978h online data from DataTang + 656h online mobile data + 4300h recording data

Telephone model training

470 + 300h + BJ mobile 105h training

Training condition     NO NOISE    NOISE in LM    opt noise    NOISE LM + opt noise
No noise:              30.61%      -              -            -
noise phone added:     31.88%      30.76%         31.27%       31.07%

BJ mobile incremental training

(1) Original 470 + 300 model: 30.24% WER

MPE2      MPE3         MPE3+iLM       MPE4+iLM
27.01%     26.72%       25.09%         24.53%
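
The gain at each stage can also be read as a relative WER reduction. The sketch below assumes the 30.24% figure in (1) is the baseline for the table above; the function name is illustrative only.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent; both inputs are WERs in percent."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Figures copied from the notes above (assumed baseline: 30.24%).
baseline = 30.24
results = {"MPE2": 27.01, "MPE3": 26.72, "MPE3+iLM": 25.09, "MPE4+iLM": 24.53}

for name, wer in results.items():
    print(f"{name}: {relative_wer_reduction(baseline, wer):.1f}% relative reduction")
```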

6000 hour 16k training

Training progress

Ran CE DNN to iteration 11 (8400 states, 80000 pdf)
Test WER is down to 12.49% (Sinovoice result: 10.49%).

Model	WER	RT
small LM, it 4, -5/-9	15.80	1.18
large LM, it 4, -5/-9	15.30	1.50
large LM, it 4, -6/-9	15.36	1.30
large LM, it 4, -7/-9	15.25	1.30
large LM, it 5, -5/-9	14.17	1.10
large LM, it 5, -5/-10	13.77	1.29
large LM, it 6, -5/-9	13.64	1.12
large LM, it 6, -5/-10	13.25	1.33
large LM, it 7, -5/-9	13.29	1.12
large LM, it 7, -5/-10	12.87	1.17
large LM, it 8, -5/-9	13.09	-
large LM, it 8, -5/-10	12.69	-
large LM, it 9, -5/-9	12.87	-
large LM, it 9, -5/-10	12.55	-
large LM, it 10, -5/-9	12.83	1.51
large LM, it 10, -5/-10	12.48	1.65
large LM, it 11, -5/-9	12.87	1.61
large LM, it 11, -5/-10	12.46	1.28
large LM, it 12, -5/-9	12.91	1.61
large LM, it 12, -5/-10	12.49	1.28

xEnt training is done

Training Analysis

Shared-tree GMM model training is complete; WER is similar to that of the non-shared model.
Selected 100h of online data and trained two systems: (1) a di-syllable system and (2) a jt-phone system

        di-syl      jt-ph
Xent    15.42%      14.78%       
MPE1    14.46%      14.23%
MPE2    14.22%      14.09%
MPE3    14.26%      13.80%
MPE4    14.24%      13.68%

Xiaoming is working on HTK training on the same database

Hybrid training

Recipe

100h MPE training
1700h MPE alignment/lattice
1700h MPE training

1 week to complete 3 MPE iterations
MPE2 result: 1e-9: 10.67% (8.61%), 1e-10: 10.34% (8.27%)

Auto Transcription

PICC

PICC development set decoding obtained 45% WER.
PICC auto-trans incremental DT training completed

Threshold  WER
org:     45.03%
0.9:     41.89%
0.8:     41.64%

The training data selected at threshold 0.8 comprise 80k sentences, about 60h.
Sampling 60h of labelled data to enrich the training
Preparing to compare unsupervised incremental training with supervised training
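
The threshold-based selection above can be sketched as follows. This is a minimal illustration, not the actual pipeline: the `Utterance` record, its field names, and the toy data are all assumptions, and it presumes the decoder supplies one confidence score per sentence.

```python
from typing import List, NamedTuple


class Utterance(NamedTuple):
    # Hypothetical record layout, for illustration only.
    utt_id: str
    hypothesis: str      # automatic transcription from the decoder
    confidence: float    # sentence-level confidence in [0, 1]
    duration_s: float    # audio length in seconds


def select_by_confidence(utts: List[Utterance], threshold: float) -> List[Utterance]:
    """Keep only utterances whose confidence reaches the threshold."""
    return [u for u in utts if u.confidence >= threshold]


def total_hours(utts: List[Utterance]) -> float:
    """Total audio duration of the given utterances, in hours."""
    return sum(u.duration_s for u in utts) / 3600.0


# Toy data, not taken from the PICC set.
pool = [
    Utterance("utt1", "hypothesis one", 0.95, 1800.0),
    Utterance("utt2", "hypothesis two", 0.85, 1800.0),
    Utterance("utt3", "hypothesis three", 0.60, 3600.0),
]

for thr in (0.9, 0.8):
    kept = select_by_confidence(pool, thr)
    print(f"threshold {thr}: {len(kept)} sentences, {total_hours(kept):.2f}h")
```

Lowering the threshold admits more data at the cost of noisier transcripts, which is the trade-off the 0.9 versus 0.8 comparison probes.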

Hubei telecom

Hubei telecom data (127h): retrieved 60k sentences with confidence threshold 0.9, amounting to 50%

             wer_14    wer_15
xEnt org:    -         29.05
MPE iter1:   29.23     29.38
MPE iter2:   29.05     29.11
MPE iter3:   29.32     29.28
MPE iter4:   29.29     29.28

Retrieved 30k sentences with confidence threshold 0.95, amounting to 25%, plus the original 770h data

             wer_14    wer_15
xEnt org:    -         29.05
MPE iter1:   -         29.36

DNN Decoder

Online decoder

Tests of various CMN implementations

200ms/500ms frame block adaptation
10ms frame block adaptation: results are completely wrong

prior weight	-1	1	5	10	20	50	100
200ms	28.29	37.53	35.50	34.08	32.90	32.30	32.77
500ms	28.29	31.28	30.83	30.22	29.50	29.32	29.36
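
One plausible reading of block-wise CMN with a prior weight is sketched below. The notes do not describe the implementation that was tested, so everything here is an assumption: the mean is re-estimated once per block, the prior mean is an offline estimate, and the prior weight is counted in frames. `online_cmn` and its parameters are hypothetical names.

```python
def online_cmn(frames, prior_mean, prior_weight, block_size):
    """Block-wise cepstral mean normalization with a weighted prior mean.

    frames:       list of feature vectors (lists of floats), one per frame
    prior_mean:   offline mean estimate, same dimension as each frame
    prior_weight: how many frames the prior counts for (assumed >= 0)
    block_size:   frames per adaptation block (e.g. 20 for 200ms at a 10ms shift)
    """
    dim = len(prior_mean)
    running_sum = [0.0] * dim
    n_seen = 0
    normalized = []
    for start in range(0, len(frames), block_size):
        block = frames[start:start + block_size]
        # Accumulate statistics from the current block.
        for frame in block:
            for d in range(dim):
                running_sum[d] += frame[d]
        n_seen += len(block)
        # Interpolate between the prior mean and the mean of frames seen so far.
        mean = [
            (prior_weight * prior_mean[d] + running_sum[d]) / (prior_weight + n_seen)
            for d in range(dim)
        ]
        # Normalize the block with the updated mean.
        normalized.extend([[frame[d] - mean[d] for d in range(dim)] for frame in block])
    return normalized


# Toy one-dimensional features.
feats = [[1.0], [3.0]]
print(online_cmn(feats, prior_mean=[0.0], prior_weight=0, block_size=2))
print(online_cmn(feats, prior_mean=[0.0], prior_weight=2, block_size=2))
```

In this formulation a larger prior weight keeps the estimate close to the offline mean while few frames have been seen, and shorter blocks update the mean more often from less data.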

CMN code delivery
Online model adaptation

Sinovoice-2014-03-04
