2013-08-10

From cslt Wiki

Revision as of 04:56, 20 August 2013 (Tuesday) by Cslt

Contents

1 Data sharing
2 DNN progress
3 DNN Confidence estimation
4 GFCC DNN
5 Stream decoding
6 Subgraph integration
7 Embedded progress

Data sharing

The LM count files have still not been delivered!

DNN progress

Discriminative DNN

Running the 1200-3620 NN; graph generation is done. DT training should be finished in 3 days.

Sparse DNN

Iterative sparse sticky training is running. More sparsity is expected.

Tencent exps

online support
garbage model training
VAD optimization

DNN Confidence estimation

The distribution graph has been obtained. The performance seems poor.
A possible reason is that the decoding is LM-based while the confidence is purely acoustic. So (1) errors at the linguistic layer are not necessarily errors at the acoustic layer, and (2) the search automatically chooses the almost-correct phones/states.
The conclusion is that DNN confidence is most suitable for grammar-based applications, or at least for cases where the LM information is not very strong.
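
As an illustration of what "only acoustic related" means, the sketch below computes one common kind of purely acoustic confidence: the geometric mean of the DNN posteriors of the aligned states over a segment. This is an assumption for illustration, not necessarily the measure used in these experiments; the function name and the input layout (one posterior list per frame, one state index per frame) are hypothetical.

```python
import math

def acoustic_confidence(posteriors, alignment, floor=1e-10):
    """Geometric mean of the aligned-state posteriors over a segment.

    posteriors: list of per-frame posterior lists (hypothetical layout).
    alignment:  list of aligned state indices, one per frame.
    """
    if not alignment or len(posteriors) != len(alignment):
        raise ValueError("need one aligned state per frame, and at least one frame")
    log_sum = 0.0
    for frame_post, state in zip(posteriors, alignment):
        # Floor the posterior so a zero does not produce log(0).
        log_sum += math.log(max(frame_post[state], floor))
    return math.exp(log_sum / len(alignment))

# Two frames, both aligned to state 0.
conf = acoustic_confidence([[0.9, 0.1], [0.8, 0.2]], [0, 0])
print(f"confidence: {conf:.3f}")
```

No LM score enters this value, so a word that is wrong only at the linguistic layer can still receive a high score.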

To be done:

CI phone confidence, ongoing
No-tone confidence, ongoing

GFCC DNN

GFCC computation is very slow. 100 hours of speech costs 16 hours of CPU time, so the RT factor is around 0.2. This is intolerable.
GFCC-based DNN training on 100 hours of speech data is done. The noise-robust performance needs to be tested within 2 days.
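
The RT figure can be checked directly from the numbers above, assuming the usual definition of the real-time factor as processing time divided by audio duration. The function name is hypothetical.

```python
def real_time_factor(processing_hours, audio_hours):
    """Real-time factor: processing time divided by audio duration."""
    if audio_hours <= 0:
        raise ValueError("audio duration must be positive")
    return processing_hours / audio_hours

# 16 hours of CPU time for 100 hours of speech, as reported above.
rtf = real_time_factor(16.0, 100.0)
print(f"RT factor: {rtf:.2f}")
```

Under this definition the figures give 0.16, consistent with the "around 0.2" quoted for feature extraction alone.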

Stream decoding

The code is done. Simple testing is completed.
Problem 1: CMN initialization is not perfect. Need to train a better initial CMN model.
Problem 2: balance for posterior-based silence detection.
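
To make Problem 1 concrete, here is a minimal sketch of online CMN in which a prior mean stands in for the "initial CMN model". It assumes the prior is a per-dimension mean estimated offline and blended with the running statistics through a pseudo-count. The class name, `prior_mean`, and `prior_weight` are hypothetical and are not taken from the actual decoder code.

```python
class OnlineCMN:
    """Streaming cepstral mean normalization seeded with a prior mean."""

    def __init__(self, prior_mean, prior_weight=100.0):
        # The prior acts like `prior_weight` previously seen frames.
        self.total = [m * prior_weight for m in prior_mean]
        self.count = float(prior_weight)

    def process(self, frame):
        # Update the running statistics with the current frame,
        # then subtract the updated mean from it.
        self.total = [t + x for t, x in zip(self.total, frame)]
        self.count += 1.0
        return [x - t / self.count for x, t in zip(frame, self.total)]

cmn = OnlineCMN(prior_mean=[1.0, 2.0], prior_weight=1.0)
print(cmn.process([3.0, 2.0]))
```

With a poor prior the first frames of an utterance are normalized badly, which is one way an imperfect initialization would show up; a larger `prior_weight` trusts the prior for longer.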

Subgraph integration

G.fst integration is done and the initial test passed. Zero probability appears to work better for the NUM class.
HCLG integration is done. A bug was fixed, and the initial test passed.
Online integration costs 1 minute. This needs to be optimized.
Need thorough testing with the Tencent test suite.
Need to tune the subgraph feeding probability.

Embedded progress

GFCC-based engine test
Obtain a performance curve: RT, memory size, and package size vs. vocabulary size.
A new demo has been released for 4600 song names.

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=2013-08-10&oldid=8060"