<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://index.cslt.org/mediawiki/skins/common/feed.css?303"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="zh-cn">
		<id>http://index.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Sinovoice-2014-04-22</id>
		<title>Sinovoice-2014-04-22 - Revision history</title>
		<link rel="self" type="application/atom+xml" href="http://index.cslt.org/mediawiki/index.php?action=history&amp;feed=atom&amp;title=Sinovoice-2014-04-22"/>
		<link rel="alternate" type="text/html" href="http://index.cslt.org/mediawiki/index.php?title=Sinovoice-2014-04-22&amp;action=history"/>
		<updated>2026-04-13T11:14:04Z</updated>
		<subtitle>Revision history for this page on the wiki</subtitle>
		<generator>MediaWiki 1.23.3</generator>

	<entry>
		<id>http://index.cslt.org/mediawiki/index.php?title=Sinovoice-2014-04-22&amp;diff=9778&amp;oldid=prev</id>
		<title>Cslt: Created new page with content "h1. Environment setting  * Sinovoice internal server deployment. Usage standard draft is released * Email notification is problematic. Need obtain an SMTP server  * Wil..."</title>
		<link rel="alternate" type="text/html" href="http://index.cslt.org/mediawiki/index.php?title=Sinovoice-2014-04-22&amp;diff=9778&amp;oldid=prev"/>
				<updated>2014-04-23T04:01:30Z</updated>
		
		<summary type="html">&lt;p&gt;Created new page with content "h1. Environment setting  * Sinovoice internal server deployment. Usage standard draft is released * Email notification is problematic. Need obtain an SMTP server  * Wil..."&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;h1. Environment setting&lt;br /&gt;
&lt;br /&gt;
* Sinovoice internal server deployment. A usage standard draft has been released&lt;br /&gt;
* Email notification is problematic; need to obtain an SMTP server&lt;br /&gt;
* Will train a Redmine administrator for Sinovoice&lt;br /&gt;
&lt;br /&gt;
h1. Corpora&lt;br /&gt;
&lt;br /&gt;
* Transcription of the 300h Guangxi telecom data is in progress; 180h completed.&lt;br /&gt;
* In total, 1338h (470 + 346 + 105 BJ mobile + 200 PICC + 108h HBTc + 109h new BJ mobile) of telephone speech is now ready.&lt;br /&gt;
* 16k 6000h data: 978h online data from DataTang + 656h online mobile data + 4300h recording data.&lt;br /&gt;
* Standard established for LM-speech-text labeling (speech data transcription for LM enhancement)&lt;br /&gt;
* Xiaona is preparing the noise database, extracting noise data from the original wav files.&lt;br /&gt;
&lt;br /&gt;
h1. Acoustic modeling&lt;br /&gt;
&lt;br /&gt;
h2. Telephone model training&lt;br /&gt;
&lt;br /&gt;
h3. 1000h Training&lt;br /&gt;
&lt;br /&gt;
* Baseline: 8k states, 470+300 MPE4, 20.29&lt;br /&gt;
* Jietong phone, 200 hour seed, 10k states training:&lt;br /&gt;
:* Xent (16 iterations): 22.90&lt;br /&gt;
:* MPE1 : 20.89&lt;br /&gt;
:* MPE2 : 20.68&lt;br /&gt;
:* MPE3 : 20.61&lt;br /&gt;
:* MPE4 : 20.56&lt;br /&gt;
&lt;br /&gt;
* CSLT phone, 8k states training&lt;br /&gt;
:* MPE1: 20.60&lt;br /&gt;
:* MPE2: 20.37&lt;br /&gt;
:* MPE3: 20.37&lt;br /&gt;
:* MPE4: 20.37&lt;br /&gt;
&lt;br /&gt;
* Found a problem in data processing: some data were cut off incorrectly. Re-training the model.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
h2. 6000 hour 16k training&lt;br /&gt;
&lt;br /&gt;
h3. Training progress&lt;br /&gt;
&lt;br /&gt;
* Baseline: 1700h, MPE5, JT phone. 9.91&lt;br /&gt;
&lt;br /&gt;
* 6000h/CSLT phone set training&lt;br /&gt;
&lt;br /&gt;
:* Xent: 12.83&lt;br /&gt;
:* MPE1: 9.21&lt;br /&gt;
:* MPE2: 9.13&lt;br /&gt;
:* MPE3: 9.10&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* 6000h/JT phone set training&lt;br /&gt;
:* MPE1: 10.63&lt;br /&gt;
&lt;br /&gt;
h3. Training Analysis&lt;br /&gt;
&lt;br /&gt;
* The Qihang model used a subset of the 6k data&lt;br /&gt;
:* 2500+950H+tang500h*+20131220, approximately 1700+2400 hours&lt;br /&gt;
&lt;br /&gt;
* GMM training using this subset achieved 22.47%. Xiaoming's result is 16.1%.&lt;br /&gt;
:* Seems the database is still not very consistent&lt;br /&gt;
:* Xiaoming kicked off the job to reproduce the Qihang training using this subset&lt;br /&gt;
&lt;br /&gt;
h3. Multilanguage Training&lt;br /&gt;
&lt;br /&gt;
* Prepare Chinglish data: will try to select 100h first to train a baseline model&lt;br /&gt;
* AMIDA database downloading&lt;br /&gt;
* Prepare shared DNN structure for multilingual training&lt;br /&gt;
* Baseline Chinese-English system is done&lt;br /&gt;
* Need to tune the hidden-layer sizes; need more sharing in the structure&lt;br /&gt;
* Need to investigate knowledge-based phone sharing&lt;br /&gt;
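The shared-structure idea above can be sketched as a DNN whose hidden layers are language-independent while each language owns its softmax output layer. A minimal sketch only; the layer sizes, output dimensions, and language names below are illustrative, not the configuration under discussion:

```python
# Shared-hidden-layer multilingual DNN sketch: one hidden stack for all
# languages, a separate softmax head per language. Sizes are made up.
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    # Small random weights and zero biases for the sketch
    return rng.standard_normal((n_in, n_out)) * 0.01, np.zeros(n_out)

shared = [layer(40, 256), layer(256, 256)]                # shared hidden stack
heads = {"zh": layer(256, 3000), "en": layer(256, 2000)}  # per-language softmax

def forward(x, lang):
    h = x
    for w, b in shared:                 # language-independent layers
        h = np.maximum(h @ w + b, 0.0)  # ReLU
    w, b = heads[lang]                  # language-specific output layer
    z = h @ w + b
    e = np.exp(z - z.max())             # stable softmax
    return e / e.sum()

post = forward(rng.standard_normal(40), "zh")
print(post.shape)  # (3000,)
```

During multilingual training, gradients from every language update the shared stack, while each head only sees its own language's data.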
&lt;br /&gt;
h3. Noise robust feature&lt;br /&gt;
&lt;br /&gt;
* GFbank can be propagated to Sinovoice&lt;br /&gt;
&lt;br /&gt;
:* 1700h JT phone, MPE3: Fbank 10.48, GFbank 10.23&lt;br /&gt;
:* Preparing to train on the 1000h telephone speech&lt;br /&gt;
&lt;br /&gt;
* Liuchao will prepare fast computing code&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
h1. Language modeling&lt;br /&gt;
&lt;br /&gt;
h2. Domain specific atom-LM construction&lt;br /&gt;
&lt;br /&gt;
h3. Some potential problems&lt;br /&gt;
* Unclear domain definition&lt;br /&gt;
* Using the same development set (8k transcription) is not very appropriate &lt;br /&gt;
&lt;br /&gt;
h3. Text data filtering&lt;br /&gt;
&lt;br /&gt;
* A telecom specific word list is ready. Will work with Xiaona to prepare a new version of lexicon.&lt;br /&gt;
* A comparison of document classification methods was done by LiuRong:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
                 Finance   IT      Health   Sports   Travel   Education   Recruiting   Culture   Military   Overall&lt;br /&gt;
vsm              0.92      0.906   0.921    0.983    0.954    0.916       0.953        0.996     0.9339     0.94&lt;br /&gt;
lda (50)         0.84      0.39    0.79     0.85     0.60     0.368       0.61         0.31      0.86       0.62&lt;br /&gt;
w2v (50)         0.69      0.77    0.67     0.59     0.70     0.62        0.74         0.79      0.88       0.73&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
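For reference, the vsm row corresponds to the classical vector-space model: TF-IDF document vectors compared by cosine similarity. A toy sketch, assuming a nearest-labeled-document decision rule; LiuRong's actual features, corpus, and classifier are not specified in the notes:

```python
# Vector-space-model (VSM) classification sketch: TF-IDF weights plus
# cosine similarity. Toy two-domain corpus; all data here is made up.
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists; returns a sparse dict vector per doc."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf})
    return vecs

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(query, labeled):
    """labeled: list of (label, vector); pick the most similar document."""
    return max(labeled, key=lambda lv: cosine(query, lv[1]))[0]

# Toy corpus: two domains plus one unlabeled query document
train = [("sports", ["match", "team", "goal"]),
         ("finance", ["stock", "market", "bank"])]
vecs = tfidf_vectors([d for _, d in train] + [["team", "goal", "win"]])
labeled = list(zip([l for l, _ in train], vecs[:2]))
print(classify(vecs[2], labeled))  # prints: sports
```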
&lt;br /&gt;
&lt;br /&gt;
h1. DNN Decoder&lt;br /&gt;
&lt;br /&gt;
h2. decoder optimization&lt;br /&gt;
&lt;br /&gt;
* Tested the computation cost of each step&lt;br /&gt;
:* beam 9/5000: netforward 65%&lt;br /&gt;
:* beam 13/7000: netforward 28%&lt;br /&gt;
:* This has been verified by Liuchao with the CSLT engine&lt;br /&gt;
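A per-step breakdown like the one above can be obtained by timing each decoder stage per frame and reporting its share of the total. A minimal sketch with stand-in stages and simulated workloads; the real engine's stage names and costs differ:

```python
# Per-stage cost profiling sketch for a decoder loop: accumulate wall time
# per component over all frames, then report each component's fraction.
import time
from collections import defaultdict

def profile(steps, n_frames=100):
    """steps: {name: fn}; run each fn once per frame, total time per name."""
    totals = defaultdict(float)
    for _ in range(n_frames):
        for name, fn in steps.items():
            t0 = time.perf_counter()
            fn()
            totals[name] += time.perf_counter() - t0
    total = sum(totals.values())
    return {name: t / total for name, t in totals.items()}

# Toy stages with different simulated costs (busy loops stand in for work)
shares = profile({
    "netforward": lambda: sum(i * i for i in range(2000)),
    "token_pass": lambda: sum(i * i for i in range(800)),
    "lattice":    lambda: sum(i * i for i in range(200)),
})
print(max(shares, key=shares.get))  # netforward dominates here
```

With a wider beam, the search stages grow while netforward stays fixed per frame, which matches the 65 percent versus 28 percent shares reported above.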
&lt;br /&gt;
* The acceleration code was checked into Git, with a small modification to heap management.&lt;br /&gt;
&lt;br /&gt;
h2. Frame-skipping&lt;br /&gt;
&lt;br /&gt;
* Zhiyong &amp;amp; Liuchao will deliver the frame-skipping approach.&lt;br /&gt;
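Frame-skipping amortizes the DNN forward pass by evaluating the network only every k-th frame and reusing the cached posterior for the skipped frames. A toy sketch of that idea only; the actual delivered approach may differ (e.g., batching or interpolating between evaluated frames):

```python
# Frame-skipping sketch: run the expensive net_forward only on every
# `skip`-th frame; skipped frames reuse the last computed posterior.
def forward_with_skipping(net_forward, frames, skip=2):
    posts = []
    last = None
    for i, f in enumerate(frames):
        if i % skip == 0:
            last = net_forward(f)   # expensive DNN evaluation
        posts.append(last)          # cached output for skipped frames
    return posts

# Toy check: a counting "network" shows how many evaluations happen
calls = []
def toy_net(x):
    calls.append(x)
    return x * 2

out = forward_with_skipping(toy_net, list(range(6)), skip=2)
print(len(calls), out)  # 3 evaluations for 6 frames
```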
&lt;br /&gt;
h2. BigLM optimization&lt;br /&gt;
&lt;br /&gt;
* Investigate BigLM retrieval optimization.&lt;/div&gt;</summary>
		<author><name>Cslt</name></author>	</entry>

	</feed>