2014-06-06

Resource Building

Release management has been started

Leftover questions

Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set.
Multi GPU training: Error encountered
Multilanguage training
Investigating LOUDS FST.
CLG embedded decoder plus online compiler.
DNN-GMM co-training

AM development

Sparse DNN

GA-based block sparsity (++++++)

Noise training

Paper writing will be started this week

GFbank

Running into Sinovoice 8k 1400 + 100 mixture training.
GFbank: 14 xEnt iterations completed:

                                  Huawei disanpi     BJ mobile   8k English data

FBank non-stream (17 iterations)  22.01%             26.63%      -
GFbank stream (14 iterations)     22.47%             27.52%      -

Multilingual ASR

                                  Huawei disanpi     BJ mobile   8k English data

FBank non-stream                  -                  -           -

Multilingual LM decoding
TAG-based decoding is still problematic: decoding enters the subgraph, but the decoding results are incorrect.
Investigate with a free-loop grammar.
Non-tag test should be conducted on both Baidu & microblog data.
Should test the 8k shujutang data on the mixture model.

Denoising & Farfield ASR

Add artificial reverberation with various energy decays & time delays. Plot decay vs WER and delay vs WER.
Use more training data to do adaptation.
Record the wave with a single speaker & near-field microphone and do test again.
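
The artificial reverberation item above could be prototyped roughly as follows. This is only a sketch under stated assumptions, not the setup actually used: the impulse response is a toy model (direct path plus an exponentially decaying noise tail), and the names synthetic_rir, convolve, delay_s and decay_s are hypothetical. Sweeping decay_s and delay_s and decoding the resulting audio would give the points for the decay vs WER and delay vs WER plots.

```python
import math
import random


def synthetic_rir(sample_rate, delay_s, decay_s, length_s, seed=0):
    """Toy room impulse response (an assumption, not a measured RIR).

    Direct path at sample 0, then a Gaussian-noise tail that starts
    after delay_s seconds and decays as exp(-t / decay_s).
    """
    rng = random.Random(seed)
    n = round(length_s * sample_rate)
    start = max(round(delay_s * sample_rate), 1)
    rir = [0.0] * n
    rir[0] = 1.0  # direct path
    for i in range(start, n):
        t = (i - start) / sample_rate
        rir[i] = 0.1 * rng.gauss(0.0, 1.0) * math.exp(-t / decay_s)
    return rir


def convolve(signal, kernel):
    """Plain full convolution; fine for short test signals, slow for real audio."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


# Usage: reverberate a short clean signal with one (delay, decay) setting.
clean = [1.0, 0.5, 0.0, -0.5]
rir = synthetic_rir(sample_rate=1000, delay_s=0.01, decay_s=0.05, length_s=0.1)
reverberant = convolve(clean, rir)
```

For real utterances an FFT-based convolution would be the practical choice; the direct loop is kept here only to make the operation explicit.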

VAD

DNN-based VAD (7.49) shows much better performance than energy-based VAD (45.74)
Need to test small scale network (+)

600-800 network test
100 X 4 + 2 network training
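
For reference, an energy-based VAD of the kind used as the baseline can be sketched as below. This is a minimal illustration, not the system that produced the number above: the frame length, hop and threshold are assumed values, and the function names are hypothetical.

```python
import math


def frame_log_energy(samples, frame_len, hop):
    """Mean energy per frame, in dB (a small floor avoids log of zero)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        mean_energy = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(mean_energy + 1e-10))
    return energies


def energy_vad(samples, frame_len=200, hop=80, threshold_db=-30.0):
    """Mark a frame as speech when its log energy exceeds a fixed threshold."""
    return [e > threshold_db for e in frame_log_energy(samples, frame_len, hop)]


# Usage: 400 samples of silence followed by 400 samples of a 100 Hz tone at 8 kHz.
silence = [0.0] * 400
tone = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(400)]
decisions = energy_vad(silence + tone, frame_len=200, hop=200)
```

A fixed threshold like this reacts to any loud noise, which is one plausible reason for the gap to the DNN-based VAD.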

Scoring

Collect more data with human scoring to train discriminative models

Embedded decoder

1200 X 4 + 10k AM:

       150k       20k     10k      5k

WER    42.23      43.45   44.54    46.07
RT     1h31       48m     44m      43m

LM development

Domain specific LM

Retrieve both Baidu & microblog
Need to check into GitLab (+).

Word2Vector

Design network spider
Design semantic related word tree

First version based on pattern match done
Filter with query log
Further refinement with Baidu Baike hierarchy

NN LM

Character-based NNLM (6700 chars, 7gram), 500M data training done.

Inconsistent patterns in WER were found on the Tencent test sets;
probably need to use another test set to investigate.
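
As an illustration of what a 7-gram character-based setup implies for the training data, the sketch below turns a sentence into (6-character history, next character) pairs. It is an assumption about the data preparation, not the actual pipeline: the function name char_ngram_examples and the "<s>" padding symbol are hypothetical.

```python
def char_ngram_examples(text, order=7, pad="<s>"):
    """Return (history, target) pairs for a character n-gram model.

    Each history holds order - 1 characters; the start of the sentence
    is padded so that every character of the text is predicted once.
    """
    chars = [pad] * (order - 1) + list(text)
    examples = []
    for i in range(len(chars) - order + 1):
        history = tuple(chars[i:i + order - 1])
        target = chars[i + order - 1]
        examples.append((history, target))
    return examples


# Usage: one training pair per character of the input sentence.
pairs = char_ngram_examples("语音识别", order=7)
```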

Investigate MS RNN LM training
