<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://index.cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://index.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Task_previous</id>
		<title>Task previous - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://index.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Task_previous"/>
		<link rel="alternate" type="text/html" href="http://index.cslt.org/mediawiki/index.php?title=Task_previous&amp;action=history"/>
		<updated>2026-04-16T22:40:37Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://index.cslt.org/mediawiki/index.php?title=Task_previous&amp;diff=23059&amp;oldid=prev</id>
		<title>Tangzy: Created page with “ =Task To Do= ==Speech Recognition==  ===CTC expanded=== *Voice activity detection :*LSTM+CTC :*TDNN+CTC ::* BLANK as silence, others as speech  *Keyword detection :...”</title>
		<link rel="alternate" type="text/html" href="http://index.cslt.org/mediawiki/index.php?title=Task_previous&amp;diff=23059&amp;oldid=prev"/>
				<updated>2016-10-16T12:17:42Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with “ =Task To Do= ==Speech Recognition==  ===CTC expanded=== *Voice activity detection :*LSTM+CTC :*TDNN+CTC ::* BLANK as silence, others as speech  *Keyword detection :...”&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;
=Task To Do=&lt;br /&gt;
==Speech Recognition==&lt;br /&gt;
&lt;br /&gt;
===CTC expanded===&lt;br /&gt;
*Voice activity detection&lt;br /&gt;
:*LSTM+CTC&lt;br /&gt;
:*TDNN+CTC&lt;br /&gt;
::* BLANK as silence, others as speech&lt;br /&gt;
&lt;br /&gt;
*Keyword detection&lt;br /&gt;
:* Character/Word-level, external key-word fst&lt;br /&gt;
:* Does the G-fst need to be single-word?&lt;br /&gt;
&lt;br /&gt;
*Emotion recognition&lt;br /&gt;
:* LSTM-CTC&lt;br /&gt;
&lt;br /&gt;
===Network architecture test===&lt;br /&gt;
*chain model:&lt;br /&gt;
:*tdnn + simple-lstm&lt;br /&gt;
::* only keep forget gate&lt;br /&gt;
 &lt;br /&gt;
*ctc+mpe&lt;br /&gt;
:* similar to chain training&lt;br /&gt;
&lt;br /&gt;
===Spiral Joint Training of SPEECH and SPEAKER===&lt;br /&gt;
*Parallel training of ASR &amp;amp; SID for mutual benefit&lt;br /&gt;
&lt;br /&gt;
===Small data-set and Big model===&lt;br /&gt;
*Investigate the efficiency of pre-training on small/big models using dark knowledge&lt;br /&gt;
&lt;br /&gt;
===Low-resource language improvement===&lt;br /&gt;
*SID&lt;br /&gt;
:* How to improve low-resource speaker&lt;br /&gt;
&lt;br /&gt;
===End-to-End speech recognition===&lt;br /&gt;
* Discriminative-Learning code implementation&lt;br /&gt;
:* Zhiyuan Tang&lt;br /&gt;
&lt;br /&gt;
===Multi-task===&lt;br /&gt;
* Fusion of speech-recognition and speech-rate&lt;br /&gt;
:* Xiangyu Zeng&lt;br /&gt;
* Self-informed neural network structure learning&lt;br /&gt;
:* Mengyuan Zhao&lt;br /&gt;
&lt;br /&gt;
===Integrate the class information to HCLG fst for speech recognition===&lt;br /&gt;
*Zhiyuan&lt;br /&gt;
&lt;br /&gt;
===Distant speech recognition===&lt;br /&gt;
*RNN-DAE: echo or reverberation&lt;br /&gt;
:*Xuewei Zhang/Zhiyuan Tang/Mengyuan Zhao/Zhiyong Zhang&lt;br /&gt;
*Reverberation&lt;br /&gt;
:*Multi-microphone&lt;br /&gt;
:*(Lasso), Xuewei Zhang&lt;br /&gt;
&lt;br /&gt;
===Voice conversation===&lt;br /&gt;
*hold&lt;br /&gt;
&lt;br /&gt;
===Sparse DNN===&lt;br /&gt;
*Zhiyuan Tang&lt;br /&gt;
&lt;br /&gt;
===Correlation based SENONE cluster===&lt;br /&gt;
&lt;br /&gt;
===NN Multi-GPU parallel training===&lt;br /&gt;
*Multi-GPU using data parallelization&lt;br /&gt;
:*Sheng Su&lt;br /&gt;
* nnet3 mpe&lt;br /&gt;
:* Xuewei Zhang&lt;br /&gt;
&lt;br /&gt;
===Audio Embedding===&lt;br /&gt;
*Ke Ning&lt;br /&gt;
&lt;br /&gt;
===RNN training acceleration===&lt;br /&gt;
&lt;br /&gt;
===Data selection===&lt;br /&gt;
*Zhiyong Zhang&lt;br /&gt;
*Sub-modular data selection&lt;br /&gt;
*Objective-function loss training self-adaptation&lt;br /&gt;
&lt;br /&gt;
===Decoder===&lt;br /&gt;
*Confidence output as required by the task&lt;br /&gt;
&lt;br /&gt;
==Speaker Verification==&lt;br /&gt;
===binary code===&lt;br /&gt;
*Lantian Li&lt;br /&gt;
&lt;br /&gt;
===RNN-ivector===&lt;br /&gt;
*Lantian Li&lt;br /&gt;
&lt;br /&gt;
===DNN clustering===&lt;br /&gt;
*Lantian Li&lt;br /&gt;
&lt;br /&gt;
=Task DONE=&lt;br /&gt;
==Multi-Mode features based VAD==&lt;br /&gt;
* Shi Yin&lt;br /&gt;
&lt;br /&gt;
==DNN based Language identification and Speaker identification==&lt;br /&gt;
* Xuewei Zhang/Zhiyuan Tang&lt;br /&gt;
&lt;br /&gt;
==Neural network visualization==&lt;br /&gt;
* Mian Wang, DONE&lt;br /&gt;
&lt;br /&gt;
==Dark knowledge==&lt;br /&gt;
* Mengyuan Zhao, Xiangyu Zeng, Zhiyong Zhang, Chao Liu&lt;br /&gt;
&lt;br /&gt;
==Normal RNN speech recognition==&lt;br /&gt;
* Mengyuan Zhao&lt;br /&gt;
&lt;br /&gt;
==Momentum-like Hessian-Free acceleration==&lt;br /&gt;
* Nesterov/Adagrad/AdaDelta/Adam&lt;br /&gt;
* Zhiyong Zhang/Xiangyu Zeng&lt;br /&gt;
&lt;br /&gt;
==Activation value normalization through time --Batch Normalization==&lt;br /&gt;
* Zhiyong Zhang&lt;br /&gt;
&lt;br /&gt;
==Mix-training Balance decision tree==&lt;br /&gt;
* Zhiyong Zhang&lt;br /&gt;
&lt;br /&gt;
==20-h Chinese data-set release==&lt;br /&gt;
* Xuewei Zhang&lt;br /&gt;
&lt;br /&gt;
==Unbounded activation function (Rectifier/Maxout/Pnorm) go-through searching method==&lt;br /&gt;
* nnet3 test --Xuewei Zhang&lt;br /&gt;
&lt;br /&gt;
=Technical Report To Write=&lt;br /&gt;
 1, DNN-DAE based noise cancellation -- Xiangyu Zeng / Mengyuan Zhao / Zhiyong Zhang  --DONE&lt;br /&gt;
 2, Speech Rate DNN speech recognition --Shi Yin/Xiangyu Zeng --DONE&lt;br /&gt;
 3, CNN+fbank feature combination --Mian Wang /Yiye Lin /Mengyuan Zhao /Shi Yin&lt;br /&gt;
 4, Uyghur low-resource acoustic model enhancement -- Shi Yin / Mengyuan Zhao / Zhiyong Zhang --DONE&lt;br /&gt;
 5, Uyghur 20h database release --Kaer /Shi Yin --DONE&lt;br /&gt;
 6, Dark-Knowledge Transfer&lt;br /&gt;
    *: Xiangyu Zeng/ Mengyuan Zhao / Zhiyong Zhang&lt;br /&gt;
&lt;br /&gt;
=Paper to Write=&lt;br /&gt;
&lt;br /&gt;
=Patent done=&lt;br /&gt;
* A method of new word enhancement for speech recognition --Yue Zhang&lt;br /&gt;
&lt;br /&gt;
=Project=&lt;br /&gt;
* Xiaomi TV&lt;br /&gt;
:*Mengyuan Zhao/Zhiyong Zhang&lt;br /&gt;
:*TAG-lm &amp;amp; Domain-specific general lm&lt;br /&gt;
*Chinese-English mix-training&lt;/div&gt;</summary>
		<author><name>Tangzy</name></author>	</entry>

	</feed>