ASR:2015-10-19: Difference between revisions

From cslt Wiki

(Created page with "==Speech Processing == === AM development === ==== Environment ==== * repair laptop ==== RNN AM==== *train monophone RNN --zhiyuan :* decode using 5-gram :* the t...")

(6 intermediate revisions by the same user not shown)
Line 3: Line 3:

==== Environment ====
- * repair laptop
+ * in disaster

==== RNN AM ====
* train monophone RNN --zhiyuan
- :* decode using 5-gram
+ :* end-to-end MPE
- :* the batch training method
+ :* http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=446
- :* test using another test set

* train RNN MPE using large dataset --mengyuan
- :* divergence problem
+ :* hold
- :* try adaptation method
+ :* better mpe result observed; unknown errors in the previous lstm mpe compilation of kaldi
+ :* http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=403

- ==== Learning rate tuning ====
+ ==== Adaptive learning rate method ====

* sequence training --Xiangyu
+ :* http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=458

==== Mic-Array ====

Line 30: Line 29:

==== RNN-DAE (RNN-based Deep Auto-Encoder) ====
+ * hold
* RNN-DAE has worse performance than DNN-DAE because the training dataset is small
* extract real room impulses to generate WSJ reverberation data, and then train RNN-DAE

=== Ivector&Dvector based ASR ===
- * dark knowledge
+ * learning from ivector --Lantian
- :* has much worse performance than baseline (EER: base 29%, dark knowledge 48%)
+ :* CNN ivector learning
+ :* DNN ivector learning
* binary ivector
* metric learning
+ * LDA-vector Transfer Learning
+ * write a technique report

=== language vector ===
- * hold
* write a paper --zhiyuan
+ :* hold
* language vector is added to multiple hidden layers --zhiyuan
+ :* write code done
+ :* check code
+ :* http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=480
* RNN language vector
:* hold

Line 48: Line 54:

=== multi-GPU ===
* multi-stream training --Sheng Su
- :* two GPUs work well, but four GPUs diverge
+ :* write a technique report
- * solve the buffer problem --Sheng Su

* kaldi-nnet3 --Xuewei
+ :* 7*2048 8k 1400h tdnn training Xent done
+ :* nnet3 mpe code is under investigation
+ :* http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=472
+ * train 7*2048 tdnn using 4000h data --Mengyuan

=== multi-task ===
- * write code according to self-information neural structure learning --mengyuan
+ * test according to self-information neural structure learning --mengyuan
+ :* hold
+ :* write code done
+ :* no significant performance improvement observed

* speech rate learning --xiangyu
+ :* http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=zxw&step=view_request&cvssid=483
+ :* test using extreme data

- === Neural picture style transfer ===
- * hold
- * reproduced the result of the paper "A Neural Algorithm of Artistic Style" --Zhiyuan, Xuewei
- * limited by GPU memory, restricted to the Inception net with the SGD optimizer (the VGG network with the default L-BFGS optimizer gives better results but consumes much more memory)

- === Multi-task learning ===
- * train model using speech rate --xiangyu
- * speech recognition plus speaker recognition --xiangyu, lantian, zhiyuan

== Text Processing ==

Latest revision as of 07:27, 16 November 2015

Speech Processing

AM development

Environment

  • in disaster

RNN AM

  • train monophone RNN --zhiyuan
  • train RNN MPE using large dataset--mengyuan

Adaptive learning rate method

  • sequence training --Xiangyu

Mic-Array

  • hold
  • compute EER with kaldi (see the sketch below)
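
For reference, a minimal numpy sketch of what the equal error rate computation does, assuming target and nontarget trial scores have already been extracted from the scoring output; the score arrays below are synthetic, not real trial scores.

```python
import numpy as np

def compute_eer(target_scores, nontarget_scores):
    """EER: the operating point where false-alarm rate equals miss rate."""
    scores = np.concatenate([target_scores, nontarget_scores])
    labels = np.concatenate([np.ones(len(target_scores)),
                             np.zeros(len(nontarget_scores))])
    order = np.argsort(scores)              # sweep the threshold upward
    labels = labels[order]
    # miss rate: fraction of targets scored at or below the threshold
    fnr = np.cumsum(labels) / max(labels.sum(), 1)
    # false-alarm rate: fraction of nontargets scored above the threshold
    fpr = 1.0 - np.cumsum(1 - labels) / max((1 - labels).sum(), 1)
    idx = np.nanargmin(np.abs(fnr - fpr))   # where the two curves cross
    return (fnr[idx] + fpr[idx]) / 2.0

# toy usage with synthetic well-separated scores
rng = np.random.RandomState(0)
eer = compute_eer(rng.normal(1.0, 1.0, 1000), rng.normal(-1.0, 1.0, 1000))
print("EER: %.2f%%" % (100 * eer))
```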

Data selection unsupervised learning

  • hold
  • acoustic feature based submodular selection using the Pingan dataset --zhiyong (see the sketch below)
  • write code to speed up --zhiyong
  • curriculum learning --zhiyong
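
The submodular item refers to picking a subset of utterances whose acoustic features best cover the whole set. A minimal sketch of greedy facility-location selection, with random feature vectors and a made-up budget standing in for the Pingan data:

```python
import numpy as np

def facility_location_select(features, budget):
    """Greedily pick `budget` rows maximizing sum_i max_{j in S} sim(i, j)."""
    # cosine similarity between utterance-level feature vectors
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    selected, coverage = [], np.zeros(len(features))
    for _ in range(budget):
        # marginal coverage gain of adding each candidate utterance
        gains = np.maximum(sim, coverage).sum(axis=1) - coverage.sum()
        gains[selected] = -np.inf            # never re-pick an utterance
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, sim[best])
    return selected

# toy usage: 200 utterances with 40-dim mean-pooled features
rng = np.random.RandomState(0)
subset = facility_location_select(rng.randn(200, 40), budget=20)
print(subset[:5])
```

The greedy loop is the standard approximation for monotone submodular objectives; swapping in different similarity kernels changes what "coverage" means.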

RNN-DAE (RNN-based Deep Auto-Encoder)

  • hold
  • RNN-DAE has worse performance than DNN-DAE because the training dataset is small
  • extract real room impulses to generate WSJ reverberation data, and then train RNN-DAE (see the sketch below)
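
The last item proposes convolving clean WSJ audio with measured room impulse responses. A minimal sketch of that generation step, assuming mono wav files, soundfile/scipy availability, and placeholder file names:

```python
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

def reverberate(clean_wav, rir_wav, out_wav):
    clean, sr = sf.read(clean_wav)
    rir, sr_rir = sf.read(rir_wav)
    assert sr == sr_rir, "resample the RIR to the speech sample rate first"
    # convolve with the real room impulse, keep the original length
    wet = fftconvolve(clean, rir)[: len(clean)]
    # renormalize so the reverberant copy has the clean signal's energy
    wet *= np.sqrt((clean ** 2).sum() / ((wet ** 2).sum() + 1e-10))
    sf.write(out_wav, wet, sr)

# placeholder file names, for illustration only
reverberate("wsj_clean.wav", "room_impulse.wav", "wsj_reverb.wav")
```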

Ivector&Dvector based ASR

  • learning from ivector --Lantian
  • CNN ivector learning
  • DNN ivector learning
  • binary ivector
  • metric learning
  • LDA-vector Transfer Learning
  • write a technique report

language vector

  • write a paper --zhiyuan
      • hold
  • language vector is added to multiple hidden layers --zhiyuan (see the sketch below)
  • RNN language vector
      • hold
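
One plausible reading of "language vector is added to multiple hidden layers" is concatenating a per-language vector onto the input of every hidden layer of the acoustic model. A minimal PyTorch-style sketch under that assumption; the class name, sizes, and framework are illustrative, not the group's actual model:

```python
import torch
import torch.nn as nn

class LangVectorDNN(nn.Module):
    def __init__(self, feat_dim=40, lang_dim=8, hidden=512, layers=4, pdfs=3000):
        super().__init__()
        dims = [feat_dim] + [hidden] * layers
        # each layer sees its normal input plus the language vector
        self.layers = nn.ModuleList(
            nn.Linear(d + lang_dim, hidden) for d in dims[:-1])
        self.out = nn.Linear(hidden, pdfs)

    def forward(self, feats, lang_vec):
        h = feats
        for layer in self.layers:
            h = torch.relu(layer(torch.cat([h, lang_vec], dim=-1)))
        return self.out(h)

model = LangVectorDNN()
feats = torch.randn(16, 40)                          # a batch of frames
lang_vec = torch.zeros(16, 8); lang_vec[:, 0] = 1.0  # one-hot language id
print(model(feats, lang_vec).shape)                  # torch.Size([16, 3000])
```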

multi-GPU

  • multi-stream training --Sheng Su (see the sketch below)
      • write a technique report
  • kaldi-nnet3 --Xuewei
  • train 7*2048 tdnn using 4000h data --Mengyuan
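
For reference, synchronous multi-stream training has each of N streams (one per GPU) compute a gradient on its own minibatch, then averages them before every update. A pure-numpy toy stand-in, not the group's Kaldi code, with a linear-regression gradient as the per-stream work:

```python
import numpy as np

def sgd_multi_stream(w, data_streams, grad_fn, lr=0.1, steps=100):
    for _ in range(steps):
        # one minibatch gradient per stream (per GPU)
        grads = [grad_fn(w, next(stream)) for stream in data_streams]
        w -= lr * np.mean(grads, axis=0)   # synchronous average
    return w

# toy usage: 4 streams of random least-squares batches
rng = np.random.RandomState(0)
def make_stream():
    while True:
        x = rng.randn(32, 5)
        yield x, x @ np.arange(5.0) + 0.1 * rng.randn(32)
def grad(w, batch):
    x, y = batch
    return 2 * x.T @ (x @ w - y) / len(y)

w = sgd_multi_stream(np.zeros(5), [make_stream() for _ in range(4)], grad)
print(np.round(w, 2))   # approaches [0, 1, 2, 3, 4]
```

One common culprit when adding more streams causes divergence (as the diff notes for four GPUs) is that the effective batch size grows while the learning rate stays fixed.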

multi-task

  • test according to self-information neural structure learning --mengyuan
      • hold
      • write code done
      • no significant performance improvement observed
  • speech rate learning --xiangyu
      • test using extreme data

Text Processing

RNN LM

  • character-lm rnn (hold)
  • lstm+rnn
  • check the lstm-rnnlm code for how to initialize and update the learning rate (hold; see the sketch below)
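
The learning-rate logic in Mikolov-style rnnlm tools, which the item above is auditing, keeps the rate fixed while validation entropy improves enough, then halves it each epoch and stops at the second stall. A minimal sketch of that schedule; the 1.003 threshold matches the rnnlm toolkit's conventional default, but treat the specifics as assumptions:

```python
def rnnlm_lr_schedule(valid_entropy_per_epoch, lr0=0.1, min_improvement=1.003):
    lr, halving, prev = lr0, False, float("inf")
    schedule = []
    for ent in valid_entropy_per_epoch:
        schedule.append(lr)
        if prev / ent < min_improvement:   # too little progress this epoch
            if halving:
                break                      # second stall: stop training
            halving = True
        if halving:
            lr /= 2.0                      # halve every epoch once stalled
        prev = ent
    return schedule

# toy usage with a slowly flattening validation-entropy curve
print(rnnlm_lr_schedule([5.2, 5.0, 4.9, 4.87, 4.86, 4.859]))
```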

Neural Based Document Classification

  • (hold)

RNN Rank Task

  • Test.
  • Paper: RNN Rank Net.
  • (hold)
  • Output rank information.

Graph RNN

  • Entity path embedded to entity.
  • (hold)

RNN Word Segment

  • Set boundaries for word segmentation.
  • (hold)

Seq to Seq(09-15)

  • Review papers.
  • Reproduce baseline. (08-03 <--> 08-17)

Order representation

  • Nested Dropout (see the sketch below)
  • semi-linear --> neural based auto-encoder.
  • modify the objective function (hold)
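
Nested Dropout (Rippel et al., 2014) drops a random suffix of the code units, so earlier units learn coarser information and the representation becomes ordered. A minimal numpy sketch of the mask, with a geometric prior and illustrative sizes:

```python
import numpy as np

def nested_dropout_mask(batch, code_dim, p=0.1, rng=np.random):
    """Keep the first b units and drop the rest, with b ~ Geometric(p)."""
    b = rng.geometric(p, size=(batch, 1)).clip(max=code_dim)  # b in 1..code_dim
    idx = np.arange(code_dim)[None, :]
    return (idx < b).astype(np.float32)

# toy usage: mask a batch of auto-encoder codes
rng = np.random.RandomState(0)
codes = rng.randn(4, 16)
masked = codes * nested_dropout_mask(4, 16, rng=rng)
print((masked != 0).sum(axis=1))   # length of the prefix kept per example
```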

Balance Representation

  • Find error signal

Recommendation

  • Reproduce baseline.
  • LDA matrix dissolve.
  • LDA (Text classification & Recommendation System) --> AAAI

RNN based QA

  • Read Source Code.
  • Attention based QA.
  • Coding.

RNN Poem Process

  • Seq based BP.
  • (hold)

Text Group Intern Project

Buddhist Process

  • (hold)

RNN Poem Process

  • Done by Haichao Yu & Chaoyuan Zuo. Mentor: Tianyi Luo.

RNN Document Vector

  • (hold)

Image Baseline

  • Demo Release.
  • Paper Report.
  • Read CNN Paper.

Text Intuitive Idea

Trace Learning

  • (Hold)

Match RNN

  • (Hold)

financial group

model research

  • RNN
  • online model, updated every day
  • modify cost function and learning method
  • add more feature

rule combination

  • GA method to optimize the model (see the sketch below)
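
A minimal sketch of how a GA could search rule-combination weights: keep the best half, uniform crossover, Gaussian mutation. The fitness function here is a placeholder for an actual backtest score:

```python
import numpy as np

def ga_optimize(fitness, dim, pop_size=40, generations=60,
                mutation=0.1, rng=np.random):
    pop = rng.randn(pop_size, dim)
    for _ in range(generations):
        scores = np.array([fitness(w) for w in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]        # keep the best half
        # uniform crossover: mix random parent pairs gene by gene
        pa = parents[rng.randint(len(parents), size=pop_size)]
        pb = parents[rng.randint(len(parents), size=pop_size)]
        mask = rng.rand(pop_size, dim) < 0.5
        pop = np.where(mask, pa, pb)
        pop += mutation * rng.randn(pop_size, dim)   # mutation noise
    return pop[np.argmax([fitness(w) for w in pop])]

# toy usage: recover a known optimum as a stand-in for a backtest score
target = np.arange(5.0)
best = ga_optimize(lambda w: -np.sum((w - target) ** 2), dim=5,
                   rng=np.random.RandomState(0))
print(np.round(best, 1))   # close to [0, 1, 2, 3, 4]
```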

basic rule

  • classical tenth model

multiple-factor

  • add more factor
  • use sparse model

display

  • bug fixed
  • buy rule fixed

data

  • data api
  • download futures data and factor data