2014-04-18

From cslt Wiki

Resource Building

Quota on /nfs/disk starts this Saturday
Release management should be started: Zhiyong
Blaster 0.1 & vivian 0.0 system release

Leftover questions

Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
Multi-GPU training: error encountered
Multilanguage training
Investigating LOUDS FST.
CLG embedded decoder plus online compiler.
DNN-GMM co-training
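
For reference, the WER figures quoted for the asymmetric window item are word-level edit distance divided by reference length. A minimal sketch of that computation follows; the function name and the whitespace tokenization are assumptions for illustration, not the group's scoring tool.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative only: tokens are split on whitespace (an assumption).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Computing this separately on the training and test sets is what exposes the gap noted above: a drop on training data with no matching drop on test data.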

AM development

Sparse DNN

GA-based block sparsity

Found a paper from 2000 with similar ideas.
Try to find a student working on high-performance computing to do the optimization

Noise training

More experiments with no-noise
More experiments with additional noise types

AMR compression re-training

1700h MPE adaptation done
1700h stream-mode adaptation has run through MPE4
Stream model is better than non-stream wave

GFbank

GFBank Sinovoice test on 100h MPE
Tencent 100h MPE training done

Multilingual ASR

All-phone strategy baseline done
Testing on Mandarin & English individually

Denoising & Farfield ASR

Re-recording done
Prepare to construct the baseline

VAD

Code ready; to be migrated to the VAD code framework

Scoring

g-score based on MLP is done
t-score based on linear regression improves the performance

Word to Vector

Dimension of the low-dimensional space varies from 10 to 100
8-thread word vector generation is 3 times faster than LDA.

LM development

NN LM

Character-based NNLM (6700 chars, 7-gram), training on 500M data done.

Non-boundary char LM is better than boundary char LM

Investigate MS RNN LM training
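
A minimal sketch of how training samples for a character-based n-gram NNLM might be generated, with and without boundary symbols. It assumes "boundary" means a marker token inserted after each segmented word; the notes do not define this, and the function name and the `<b>` / `<s>` symbols are hypothetical.

```python
def char_ngram_samples(words, order=7, use_boundary=False,
                       boundary="<b>", pad="<s>"):
    """Turn a word-segmented sentence into (context, target) pairs.

    Each context holds the previous (order - 1) tokens and the target is
    the next token. With use_boundary=True a marker follows every word
    (an assumed reading of "boundary char LM").
    """
    tokens = []
    for word in words:
        tokens.extend(word)          # split the word into characters
        if use_boundary:
            tokens.append(boundary)  # hypothetical word-boundary marker

    # Left-pad so the first character also has a full-length context.
    padded = [pad] * (order - 1) + tokens
    return [
        (tuple(padded[i:i + order - 1]), padded[i + order - 1])
        for i in range(len(tokens))
    ]
```

Under this reading, the boundary variant spends part of each 7-token context on marker symbols, so it sees fewer real characters of history per prediction.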

QA

FST-based matching

Word-based FST matching takes 1-2 seconds with 1600 patterns; Huilan's implementation takes <1 second. Reason for the gap unclear.
Char-FST implementation is done. Not very effective.
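
As a point of comparison for the timings above, word-level pattern matching can be sketched with a trie, which is a deterministic acyclic acceptor over word sequences. This is a simplified stand-in, not the FST toolkit code or Huilan's implementation; all names and patterns are hypothetical.

```python
_END = object()  # sentinel key marking the end of a pattern in the trie


def build_trie(patterns):
    """Build a word-level trie from whitespace-separated pattern strings."""
    root = {}
    for pattern_id, pattern in enumerate(patterns):
        node = root
        for word in pattern.split():
            node = node.setdefault(word, {})
        node.setdefault(_END, []).append(pattern_id)
    return root


def find_matches(trie, words):
    """Return (start, end, pattern_id) for every pattern found in words.

    end is exclusive, so words[start:end] is the matched span.
    """
    matches = []
    for start in range(len(words)):
        node = trie
        for end in range(start, len(words)):
            node = node.get(words[end])
            if node is None:
                break  # no pattern continues with this word
            for pattern_id in node.get(_END, []):
                matches.append((start, end + 1, pattern_id))
    return matches
```

Because shared prefixes are merged, the cost per start position depends on the longest pattern rather than on the number of patterns, which is one plausible reason a matcher over 1600 patterns can stay well under a second.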

Speech QA

Investigate determinization of G embedding

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=2014-04-18&oldid=9727"