ASR:2015-07-13
From cslt Wiki
Latest revision as of 05:40, 16 July 2015 (Thursday)
Speech Processing
AM development
Environment
- the GPU of grid-14 does not work
RNN AM
- hold
- morpheme RNN --zhiyuan
- train using large dataset --mengyuan
Mic-Array
- hold
- compute EER with kaldi
Data selection unsupervised learning
- acoustic feature based submodular using Pinan dataset --zhiyong
- write code to speed up --zhiyong
RNN-DAE (Deep Auto-Encoder based RNN)
- hold
- deliver to mengyuan
Speaker ID
- DNN-based sid --Lantian
Ivector&Dvector based ASR
- hold --Tian Lan
- Cluster the speakers into speaker classes, then use the distance or the posterior probability as the metric
- dark knowledge using i-vector
- train on WSJ (test sets: dev93 + eval92)
- hold
Dark knowledge
- test random last output layer when training MPE --zhiyuan, mengyuan
language vector
- train using language vector with the dataset of 1400h_CN + 100h_EN --mengyuan
- write a paper --zhiyuan
rectifier
- hold
- WER is worse on Aurora4 --zhiyuan
- train using other dataset
- rectifier RNN
Audio embedding
- audio embedding --Wei Xu
Text Processing
RNN LM
- character-lm rnn (hold)
- lstm+rnn
- check how the lstm-rnnlm code initializes and updates the learning rate (hold)
Neural Based Document Classification
- (hold)
Order representation
- Nested Dropout
- semi-linear --> neural based auto-encoder.
- modify the objective function (hold)
Balance Representation
- Find error signal
Recommendation
- Reproduce baseline.
- LDA matrix dissolve.
- LDA (Text classification & Recommendation System) --> AAAI
DSSM based QA
- Demo Release.
Seq to Seq(09-15)
- Review papers. (Reported on 07-08)
- Reproduce baseline.
Text Group Intern Project
Buddhist Process
(hold)
RNN Poem Process
(hold)
RNN Document Vector
(hold)
Image Baseline
- Demo Release.
- Paper Report.