<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://index.cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://index.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-08-29</id>
		<title>2014-08-29 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://index.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=2014-08-29"/>
		<link rel="alternate" type="text/html" href="http://index.cslt.org/mediawiki/index.php?title=2014-08-29&amp;action=history"/>
		<updated>2026-04-16T20:39:33Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://index.cslt.org/mediawiki/index.php?title=2014-08-29&amp;diff=10824&amp;oldid=prev</id>
		<title>Cslt: Created page with "==Resource Building==  == Leftover questions==  * Investigating LOUDS FST.  * CLG embedded decoder plus online compiler. * DNN-GMM co-training * NN LM  == AM developmen..."</title>
		<link rel="alternate" type="text/html" href="http://index.cslt.org/mediawiki/index.php?title=2014-08-29&amp;diff=10824&amp;oldid=prev"/>
				<updated>2014-08-29T02:14:01Z</updated>
		
		<summary type="html">&lt;p&gt;Created page with "==Resource Building==  == Leftover questions==  * Investigating LOUDS FST.  * CLG embedded decoder plus online compiler. * DNN-GMM co-training * NN LM  == AM developmen..."&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;==Resource Building==&lt;br /&gt;
&lt;br /&gt;
== Leftover questions==&lt;br /&gt;
&lt;br /&gt;
* Investigating LOUDS FST. &lt;br /&gt;
* CLG embedded decoder plus online compiler.&lt;br /&gt;
* DNN-GMM co-training&lt;br /&gt;
* NN LM&lt;br /&gt;
&lt;br /&gt;
== AM development ==&lt;br /&gt;
&lt;br /&gt;
=== Sparse DNN ===&lt;br /&gt;
* Sparse DNN on WSJ did not obtain further improvement&lt;br /&gt;
&lt;br /&gt;
===Noise training===&lt;br /&gt;
&lt;br /&gt;
:* Error found in data setting. Re-run the test with gamma=20,30&lt;br /&gt;
:* Re-run test with gamma=1,0.1&lt;br /&gt;
:* Noisy training journal paper almost done.&lt;br /&gt;
&lt;br /&gt;
==Drop out &amp;amp; Rectification &amp;amp; convolutive network==&lt;br /&gt;
&lt;br /&gt;
* After changing the learning rate to 0.001, the training process can be started: &lt;br /&gt;
*# Changed the dropout probability from 0.5 to 0.2: frame accuracy is improved, but WER seems problematic.&lt;br /&gt;
*# Experimented with learning rates 1 and 8; no results yet (NA) &lt;br /&gt;
&lt;br /&gt;
* Rectification&lt;br /&gt;
# Rectification by itself failed when weights grew large.&lt;br /&gt;
# Adding an L1 penalty enables training, but performance is very poor.&lt;br /&gt;
# Try capping the rectifier output at a maximum value&lt;br /&gt;
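&lt;br /&gt;
A minimal sketch of the capped-rectifier idea in the last item, assuming a fixed cap (the value 6.0 is an arbitrary choice, not from these experiments):&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch of capping the rectifier output at a fixed maximum, which bounds
# activations when weights grow large (the cap value 6.0 is arbitrary).
import numpy as np

def capped_relu(x, cap=6.0):
    return np.minimum(np.maximum(x, 0.0), cap)

print(capped_relu(np.array([-2.0, 3.0, 9.0])))   # [0. 3. 6.]
```
&lt;/pre&gt;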
&lt;br /&gt;
* Convolutive network&lt;br /&gt;
# Test more configurations &lt;br /&gt;
&lt;br /&gt;
===Denoising &amp;amp; Farfield ASR===&lt;br /&gt;
&lt;br /&gt;
* Lasso-based dereverberation obtained reasonable results&lt;br /&gt;
:# Found some suspicious problems with frequency-dependent Lasso.&lt;br /&gt;
:# Proposed full-frequency Lasso &amp;amp; full frequency-temporal Lasso.&lt;br /&gt;
:# Good performance was obtained with frequency-dependent Lasso&lt;br /&gt;
:* Near data: 10.79 -&amp;gt; 10.35 (lambda=0.05)&lt;br /&gt;
:* Far data: 40.53 -&amp;gt; 35.65 (lambda=0.15)&lt;br /&gt;
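&lt;br /&gt;
As a rough illustration of the frequency-dependent Lasso above, a per-bin sketch; the exact formulation, tap count, and lambda value are assumptions, and the Lasso is solved with plain ISTA rather than any particular toolkit:&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch only: per-frequency-bin Lasso dereverberation. The model assumed
# here (late reverberation of frame t as a sparse combination of the D
# preceding frames) and all hyper-parameters are illustrative guesses.
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dereverb_bin(y, taps=10, lam=0.05, iters=200):
    """Subtract estimated late reverberation from one frequency bin."""
    # Regressors: the magnitude envelope delayed by 1..taps frames.
    X = np.stack([np.roll(y, d) for d in range(1, taps + 1)], axis=1)
    X[:taps] = 0.0                      # zero the wrapped-around history
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-8)
    w = np.zeros(taps)
    for _ in range(iters):              # ISTA: gradient step + shrinkage
        w = soft_threshold(w - step * (X.T @ (X @ w - y)), step * lam)
    return np.maximum(y - X @ w, 0.0)   # spectral subtraction, floored at 0

rng = np.random.default_rng(0)
y = np.abs(rng.standard_normal(200))    # stand-in magnitude envelope
clean = dereverb_bin(y)
```
&lt;/pre&gt;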
&lt;br /&gt;
===VAD===&lt;br /&gt;
&lt;br /&gt;
* Some discrepancy between CSLT results &amp;amp; Puqiang results&lt;br /&gt;
[http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=wangd&amp;amp;step=view_request&amp;amp;cvssid=207]&lt;br /&gt;
&lt;br /&gt;
:* check if the label is really problematic&lt;br /&gt;
:* check if short-time spike noise is the major problem (can be solved by spike filtering)&lt;br /&gt;
:* check if low-energy babble noise caused mismatch (can be solved by global energy detection)&lt;br /&gt;
:* test noise data trained model&lt;br /&gt;
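&lt;br /&gt;
The spike-filtering idea in the checklist could look like the following sketch; the window length and the majority-vote rule are assumptions:&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch of the spike-filtering idea: majority-vote smoothing of per-frame
# VAD decisions with a short median filter (window length is a guess).
import numpy as np

def median_smooth(decisions, win=5):
    pad = win // 2
    d = np.pad(np.asarray(decisions, dtype=float), pad, mode="edge")
    return [int(round(np.median(d[i:i + win]))) for i in range(len(decisions))]

# A 1-frame spike is removed; short gaps are filled.
print(median_smooth([0, 0, 0, 1, 0, 0, 1, 1, 1, 1]))
# [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```
&lt;/pre&gt;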
&lt;br /&gt;
===Speech rate training===&lt;br /&gt;
&lt;br /&gt;
* Some interesting results with the simple speech-rate change algorithm were obtained on the WSJ db&lt;br /&gt;
[http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=wangd&amp;amp;step=view_request&amp;amp;cvssid=268]&lt;br /&gt;
&lt;br /&gt;
* The ROS model seems superior to the normal one on faster speech&lt;br /&gt;
* Need to check the distribution of ROS on WSJ&lt;br /&gt;
* Suggest extracting speech data of different ROS to construct a new test set&lt;br /&gt;
* Suggest using the Tencent training data&lt;br /&gt;
* Suggest removing silence when computing ROS&lt;br /&gt;
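&lt;br /&gt;
A sketch of ROS computation with silence removed, as suggested above; the alignment format (phone, start, end) and the silence labels are hypothetical:&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch: rate of speech (ROS) from a phone alignment, with silence removed
# as suggested above. The alignment format and silence labels are hypothetical.
def rate_of_speech(alignment, silence_labels=("sil", "sp")):
    """Phones per second, counting only non-silence segments."""
    speech = [(p, s, e) for p, s, e in alignment if p not in silence_labels]
    duration = sum(e - s for _, s, e in speech)
    return len(speech) / duration if duration else 0.0

ali = [("sil", 0.0, 0.5), ("n", 0.5, 0.6), ("iy", 0.6, 0.8),
       ("d", 0.8, 0.9), ("sil", 0.9, 1.4)]
print(rate_of_speech(ali))   # about 7.5 phones per second
```
&lt;/pre&gt;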
&lt;br /&gt;
&lt;br /&gt;
===Scoring===&lt;br /&gt;
&lt;br /&gt;
* hold&lt;br /&gt;
&lt;br /&gt;
===Confidence===&lt;br /&gt;
&lt;br /&gt;
* Implement a tool for data labeling&lt;br /&gt;
* Finished extraction of two features: DNN posterior + lattice posterior&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==LM development==&lt;br /&gt;
&lt;br /&gt;
===Domain specific LM===&lt;br /&gt;
&lt;br /&gt;
====G determinization problem re-opened====&lt;br /&gt;
&lt;br /&gt;
====NUM tag LM====&lt;br /&gt;
&lt;br /&gt;
27h JS test:  20.16 vs 20.19&lt;br /&gt;
2h  JS test:  17.48 vs 17.49&lt;br /&gt;
&lt;br /&gt;
* Analysis of the tag LM's properties: (1) a random NUM should obtain better performance; (2) other words are not seriously impacted.&lt;br /&gt;
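&lt;br /&gt;
The NUM-tag preprocessing presumably maps numeric tokens to a single class tag before LM training; a sketch, where the tag name and number pattern are assumptions:&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch of the assumed NUM-tag preprocessing: numeric tokens are mapped to
# one class tag before n-gram counting, so all numbers share one distribution.
import re

NUM_RE = re.compile(r"^[0-9]+([.,][0-9]+)?$")

def tag_numbers(tokens, tag="NUM"):
    return [tag if NUM_RE.match(t) else t for t in tokens]

print(tag_numbers("call 10086 at 9.30 tomorrow".split()))
# ['call', 'NUM', 'at', 'NUM', 'tomorrow']
```
&lt;/pre&gt;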
&lt;br /&gt;
&lt;br /&gt;
==Word2Vector==&lt;br /&gt;
&lt;br /&gt;
===W2V based doc classification===&lt;br /&gt;
&lt;br /&gt;
* Initial results with the variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.&lt;br /&gt;
* Interest group set up; reading scheduled every Thursday&lt;br /&gt;
* Non-linear inter-language transform (English-Spanish-Czech): word-vector model training done; transform model under investigation&lt;br /&gt;
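&lt;br /&gt;
For reference, the linear baseline of such an inter-language transform can be sketched as a least-squares mapping between paired word vectors; the data below is synthetic and the dimensions arbitrary (the non-linear variant under investigation is not shown):&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch of the linear baseline for the inter-language transform: fit a
# matrix W mapping source word vectors onto their translations' vectors by
# least squares. Data here is synthetic; dimensions are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
d_src, d_tgt, n_pairs = 50, 40, 500
X = rng.standard_normal((n_pairs, d_src))   # source-language embeddings
W_true = rng.standard_normal((d_src, d_tgt))
Y = X @ W_true                              # target-language embeddings

W, *_ = np.linalg.lstsq(X, Y, rcond=None)   # minimise the squared error
mapped = X @ W                              # source projected into target space
```
&lt;/pre&gt;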
&lt;br /&gt;
&lt;br /&gt;
==RNN LM==&lt;br /&gt;
&lt;br /&gt;
* New toolkit from Thomas obtained.&lt;br /&gt;
* Prepare WSJ database, re-test RNN.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Speaker ID==&lt;br /&gt;
&lt;br /&gt;
* Second model done&lt;br /&gt;
&lt;br /&gt;
==Emotion detection==&lt;br /&gt;
&lt;br /&gt;
* Initial performance obtained&lt;br /&gt;
[http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=wangd&amp;amp;step=view_request&amp;amp;cvssid=271]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Translation==&lt;br /&gt;
* Failed due to running out of memory &lt;br /&gt;
* Restarted the training due to some errors on the grid&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>