2013-08-23

Latest revision as of 03:13, 23 August 2013 (Friday)

Data sharing

  • LM count files still undelivered!

DNN progress

Discriminative DNN

  • Running the 1200-3620 NN; graph generation is done. Training is still running, albeit slowly.

Sparse DNN

  • Iterative sparse sticky training is running; a minimal sketch of the pruning idea follows below.
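
A minimal numpy sketch of the idea behind iterative sparse "sticky" training, as I understand it: weights below a threshold are zeroed and their mask is then kept fixed ("sticky") for all later updates. The function name, learning rate, and threshold are illustrative assumptions, not the actual training code.

<pre>
import numpy as np

def sticky_prune_step(weights, grad, mask, lr=0.01, prune_threshold=0.01):
    """One update of iterative sparse 'sticky' training (illustrative only).

    weights, grad : arrays of the same shape (a single weight matrix).
    mask          : binary array; 0 marks weights already pruned.
    Pruned weights stay at zero ('sticky'); surviving weights are updated,
    then any weight whose magnitude drops below the threshold is pruned too.
    """
    # Gradient step only on surviving weights.
    weights = (weights - lr * grad) * mask
    # Prune small-magnitude weights and make the pruning permanent.
    mask = mask * (np.abs(weights) >= prune_threshold)
    return weights * mask, mask

# Toy usage: one 4x4 layer, random gradients, a few iterations.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(4, 4))
m = np.ones_like(w)
for _ in range(5):
    g = rng.normal(scale=0.1, size=w.shape)
    w, m = sticky_prune_step(w, g, m)
print("remaining nonzero weights:", int(m.sum()))
</pre>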

Tencent exps

DNN Confidence estimation

  • Tested on a high-WER test set. The distribution curve is still bizarre: for both correct and incorrect words there is a high peak around zero.
  • Accumulated DNN confidence is under development (see the sketch after this list).
  • Generate lattice-based confidence.
  • Prepare MLP-based confidence integration.
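
A minimal sketch of one way frame-level DNN posteriors could be accumulated into a word-level confidence (averaging the log posterior of the aligned senone over each word's frames). This only illustrates the general idea; the array names are hypothetical and this is not the confidence code used in these experiments.

<pre>
import numpy as np

def word_confidence(frame_posteriors, aligned_states, word_spans):
    """Accumulate frame-level DNN posteriors into per-word confidences.

    frame_posteriors : (T, S) array of per-frame senone posteriors.
    aligned_states   : length-T array, the aligned senone index per frame.
    word_spans       : list of (start_frame, end_frame) pairs, one per word.
    Returns the average log-posterior of the aligned senone per word
    (higher = more confident).  Illustrative only.
    """
    confidences = []
    for start, end in word_spans:
        frames = np.arange(start, end)
        post = frame_posteriors[frames, aligned_states[start:end]]
        confidences.append(float(np.mean(np.log(post + 1e-10))))
    return confidences

# Toy usage: 10 frames, 5 senones, two "words".
T, S = 10, 5
rng = np.random.default_rng(1)
post = rng.dirichlet(np.ones(S), size=T)   # each row sums to 1
align = rng.integers(0, S, size=T)
print(word_confidence(post, align, [(0, 4), (4, 10)]))
</pre>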


GFCC DNN

  • GFCC computation is very slow: 100 hours of speech costs 16 hours of CPU time, an RT factor of around 0.2, which is intolerable.
  • 100-hour GFCC-based DNN, Tencent test results:

No noise added:

1,MFCC 100_1200_1200_1200_1200_3580
       map: %WER 23.75 [ 3474 / 14628, 134 ins, 373 del, 2967 sub ]
       2044: %WER 21.47 [ 4991 / 23241, 304 ins, 664 del, 4023 sub ]
       notetp3: %WER 13.17 [ 244 / 1853, 10 ins, 26 del, 208 sub ]
       record1900: %WER 8.10 [ 963 / 11888, 217 ins, 299 del, 447 sub ]
       general: %WER 34.41 [ 12943 / 37619, 779 ins, 785 del, 11379 sub ]
       online1: %WER 33.02 [ 9388 / 28433, 522 ins, 1465 del, 7401 sub ]
       online2: %WER 25.99 [ 15363 / 59101, 873 ins, 2408 del, 12082 sub ]
       speedup: %WER 23.52 [ 1236 / 5255, 72 ins, 213 del, 951 sub ]
       ----
2,GFCC 100_1200_1200_1200_1200_3625
       map: %WER 22.95 [ 3357 / 14628, 109 ins, 471 del, 2777 sub ]
       2044: %WER 20.93 [ 4865 / 23241, 387 ins, 748 del, 3730 sub ]
       notetp3: %WER 15.43 [ 286 / 1853, 41 ins, 26 del, 219 sub ]
       record1900: %WER 7.32 [ 870 / 11888, 107 ins, 266 del, 497 sub ]
       general: %WER 31.57 [ 11878 / 37619, 587 ins, 861 del, 10430 sub ]
       online1: %WER 31.83 [ 9049 / 28433, 519 ins, 1506 del, 7024 sub ]
       online2: %WER 25.20 [ 14894 / 59101, 839 ins, 2434 del, 11621 sub ]
       speedup: %WER 22.97 [ 1207 / 5255, 73 ins, 221 del, 913 sub ]
       ----

White noise added to the test data:

1, NOISE LEVEL: about 15 dB
  1) MFCC 100_1200_1200_1200_1200_3580
    map: %WER 65.24 [ 9544 / 14628, 48 ins, 2841 del, 6655 sub ]
    2044: %WER 48.93 [ 11372 / 23241, 176 ins, 2803 del, 8393 sub ]
    notetp3: %WER 55.91 [ 1036 / 1853, 9 ins, 476 del, 551 sub ]
    record1900: %WER 25.43 [ 3023 / 11888, 27 ins, 1387 del, 1609 sub ]
    general: %WER 70.05 [ 26352 / 37619, 141 ins, 5336 del, 20875 sub ]
    online1: %WER 50.40 [ 14329 / 28433, 431 ins, 3827 del, 10071 sub ]
    online2: %WER 48.45 [ 28632 / 59101, 664 ins, 7930 del, 20038 sub ]
    speedup: %WER 64.78 [ 3404 / 5255, 13 ins, 1084 del, 2307 sub ]
    ----
  2) GFCC 100_1200_1200_1200_1200_3625
    map: %WER 62.99 [ 9214 / 14628, 63 ins, 3113 del, 6038 sub ]
    2044: %WER 46.34 [ 10769 / 23241, 251 ins, 2897 del, 7621 sub ]
    notetp3: %WER 52.46 [ 972 / 1853, 18 ins, 545 del, 409 sub ]
    record1900: %WER 26.62 [ 3164 / 11888, 133 ins, 1181 del, 1850 sub ]
    general: %WER 66.04 [ 24843 / 37619, 404 ins, 5277 del, 19162 sub ]
    online1: %WER 46.61 [ 13254 / 28433, 466 ins, 3725 del, 9063 sub ]
    online2: %WER 44.49 [ 26292 / 59101, 813 ins, 7552 del, 17927 sub ]
    speedup: %WER 60.38 [ 3173 / 5255, 25 ins, 1061 del, 2087 sub ]

  • GFCC is generally better than MFCC, particularly in noise.
  • The impact of noise is significant; de-noising algorithms are needed.
  • Try noise-robust training (see the noise-adding sketch below).
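
For reference, a minimal sketch of adding white noise to a waveform at a target SNR (e.g. ~15 dB, as in the test above). This is not the tool used to build the noisy test set; it only illustrates the computation, and the same idea could generate multi-condition data for noise-robust training.

<pre>
import numpy as np

def add_white_noise(signal, snr_db=15.0, seed=0):
    """Add white Gaussian noise to a signal at the requested SNR (in dB).

    The noise power is scaled so that
        10 * log10(signal_power / noise_power) == snr_db.
    Illustrative only; the actual noising procedure is not specified above.
    """
    rng = np.random.default_rng(seed)
    signal = signal.astype(np.float64)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(scale=np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Toy usage: a 1 kHz tone at 8 kHz sampling rate, corrupted at ~15 dB SNR.
sr = 8000
t = np.arange(sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 1000 * t)
noisy = add_white_noise(clean, snr_db=15.0)
measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
print("measured SNR (dB): %.1f" % measured)
</pre>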

Stream decoding

  • The server-side interface is done; the embedded-side interface is under development.

To do:

  • Global CMN initialization (see the sketch below).
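
A minimal sketch of what global CMN initialization might look like for stream decoding: start the running cepstral mean from a global mean estimated offline, then update it frame by frame as audio arrives, so early frames are normalized reasonably before enough per-stream statistics exist. The class name, prior weight, and update scheme are assumptions, not the decoder's actual implementation.

<pre>
import numpy as np

class StreamingCMN:
    """Cepstral mean normalization for stream decoding (illustrative only).

    The running mean is initialized from a global mean estimated on training
    data, so the very first frames of a stream are already normalized
    sensibly; it is then updated incrementally as frames arrive.
    """
    def __init__(self, global_mean, prior_weight=100.0):
        self.mean = np.array(global_mean, dtype=np.float64)
        self.count = float(prior_weight)  # how strongly the global mean is trusted

    def normalize(self, frame):
        frame = np.asarray(frame, dtype=np.float64)
        # Update the running mean, then subtract it.
        self.count += 1.0
        self.mean += (frame - self.mean) / self.count
        return frame - self.mean

# Toy usage: 13-dim cepstra, global mean taken from "training data".
rng = np.random.default_rng(2)
global_mean = rng.normal(size=13)
cmn = StreamingCMN(global_mean)
for _ in range(5):
    out = cmn.normalize(global_mean + rng.normal(scale=0.1, size=13))
print("normalized frame mean magnitude: %.3f" % np.abs(out).mean())
</pre>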


Subgraph integration

  • Compressing the subgraph HCLG is done. Integration takes around 1-2 seconds.
  • G.fst integration encounters a problem: after G+L composition, determinization does not complete (see the sketch below).
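
A minimal sketch of the failing step (composing L with G, then determinizing), assuming OpenFst's pywrapfst bindings; the file names are placeholders. In the standard recipe L must carry disambiguation symbols for L∘G to be determinizable, and a missing-disambiguation issue is one common reason determinization never finishes, though I cannot confirm that is the cause here.

<pre>
import pywrapfst as fst

# Placeholder paths; the real graphs are whatever the subgraph recipe produces.
L = fst.Fst.read("L_disambig.fst")   # lexicon FST, ideally with disambig symbols
G = fst.Fst.read("G.fst")            # grammar / language-model FST

# Composition requires the matching labels to be arc-sorted.
L.arcsort(sort_type="olabel")
G.arcsort(sort_type="ilabel")

LG = fst.compose(L, G)
# This is the step reported to hang: if L lacks disambiguation symbols,
# determinization of L o G may not terminate.
LG_det = fst.determinize(LG)
LG_det.write("LG.fst")
</pre>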


Embedded progress

  • GFCC-based engine testing has just started.