Revision as of 06:56, 10 February 2014 (Mon)

DNN training

Model	CE	MPE1	MPE2	MPE3	MPE4
4k states	23.27/22.85	21.35/18.87	21.18/18.76	21.07/18.54	20.93/18.32
8k states	22.16/22.22	20.55/18.03	20.36/17.94	20.32/17.78	20.29/17.80
8k states + IT	-	20.04/17.38	20.01/17.32	20.07/17.44	19.94/17.65

Code is ready for direct adaptation, insertion adaptation and KL-regularized adaptation
50 sentences for adaptation, 834 sentences for testing
WER reduced from 14.56 to 11.13
Hidden layer adaptation is better than input and output adaptation
Before-linear adaptation is better than after-linear adaptation
Results are here
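
For readers unfamiliar with KL-regularized adaptation, a minimal sketch of the commonly published formulation follows. It is not the code referred to above: the function names and the interpolation weight `rho` are hypothetical, and the sketch assumes the regularizer is applied by interpolating each frame's one-hot label with the posterior of the unadapted (speaker-independent) model before computing cross-entropy.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_regularized_target(label, si_posterior, rho):
    """Interpolate the one-hot label with the speaker-independent posterior.

    rho is an assumed regularization weight in [0, 1]:
    rho = 0 gives plain cross-entropy adaptation,
    rho = 1 ignores the label and pulls the adapted output
    toward the speaker-independent output.
    """
    return [(1.0 - rho) * (1.0 if k == label else 0.0) + rho * p
            for k, p in enumerate(si_posterior)]


def kl_regularized_loss(adapted_logits, label, si_posterior, rho):
    """Cross-entropy of the adapted output against the interpolated target."""
    q = softmax(adapted_logits)
    target = kl_regularized_target(label, si_posterior, rho)
    # Skip zero-weight terms so that log() is only taken where it matters.
    return -sum(t * math.log(p) for t, p in zip(target, q) if t > 0.0)


if __name__ == "__main__":
    # Toy frame with 3 states; all numbers are illustrative only.
    si_post = [0.2, 0.5, 0.3]
    logits = [0.1, 1.2, -0.3]
    for rho in (0.0, 0.5, 1.0):
        print(rho, kl_regularized_loss(logits, 1, si_post, rho))
```

With only 50 adaptation sentences, a larger `rho` keeps the adapted model closer to the original one, which is the usual motivation for this kind of regularization.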

CLG decoder uses less memory in decoding
HCLG is faster and more accurate than CLG, and more amenable to beam control here
std::exp/std::log result in very slow computation in train203. The problem was solved by replacing them with the standard exp() and log().

@@ Line 61: / Line 61: @@
 * Comparison between CLG and HCLG decoder
 :* CLG decoder uses less memory in decoding
-:* HCLG is faster and more accurate than HCLG, and more amiable to beam control [http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?step=view_request&cvssid=156 here]
+:* HCLG is faster and more accurate than CLG, and more amiable to beam control [http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?step=view_request&cvssid=156 here]
 :* std::exp/std::log result in very slow computation in train203. Solved the problem by replacing to standard exp() and log().