Difference between revisions of "NLP Status Report 2017-5-22"

Revision as of 06:08, 24 May 2017 (Wed)

Date	People	Last Week	This Week
2017/5/22	Jiyuan Zhang
	Aodong LI	BLEU of baseline = 43.87. The 2nd translator uses concat(Chinese, machine-translated English) as training data: with hidden_size, emb_size, lr = 500, 310, 0.001, BLEU = 43.53 (best); with hidden_size, emb_size, lr = 700, 510, 0.001, BLEU = 45.21 (best), but most results are under 43.1; with hidden_size, emb_size, lr = 700, 510, 0.0005, BLEU = 42.19 (best). Double-decoder model with joint loss (final loss = 1st decoder's loss + 2nd decoder's loss): BLEU = 40.11 (best). The 1st decoder's output is generally better than the 2nd decoder's output. The training process of the double-decoder model without joint loss is problematic.	Overfitting? Train the 2nd translator on large data. Replace the teacher forcing mechanism in the training process with a beam search mechanism.
	Shiyue Zhang	Tried using external word vectors instead of training the embeddings; most of these attempts gave bad results. Only the 3-layer RNN + no dropout model got 25.54 BLEU, which is about 2 points worse than the original baseline. Trained the original baseline on new data (the data fixed the reverse sentence problem) and got BLEU = 27.88; Moses BLEU = 32.47.	Try more models to get results similar to the original baseline on the new data. Run the m-nmt model on the new data.
	Shipan Ren	Learn the implementation of the seq2seq model. Read the tf_translate code.	Understand the meaning of the main code. Start writing documents.
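
The joint loss mentioned in the double-decoder entry (final loss = 1st decoder's loss + 2nd decoder's loss) can be illustrated with a minimal sketch. It is not the code used in the report: the function names, the toy probabilities, the use of mean token-level cross-entropy, and the assumption that both decoders are scored against the same target sequence are all hypothetical choices made for illustration.

```python
import math

def cross_entropy(probs, target_ids):
    """Mean negative log-likelihood of the target tokens.

    probs: one probability distribution (list of floats) per decoding step.
    target_ids: the gold token index for each step.
    """
    nll = [-math.log(step[t]) for step, t in zip(probs, target_ids)]
    return sum(nll) / len(nll)

def joint_loss(probs_dec1, probs_dec2, target_ids):
    """Final loss = 1st decoder's loss + 2nd decoder's loss.

    Assumption: both decoders are trained against the same target sequence.
    """
    loss1 = cross_entropy(probs_dec1, target_ids)
    loss2 = cross_entropy(probs_dec2, target_ids)
    return loss1 + loss2, loss1, loss2

# Toy example: vocabulary of 3 tokens, target sequence of 2 tokens.
target = [0, 2]
dec1 = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]  # hypothetical 1st-decoder outputs
dec2 = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]  # hypothetical 2nd-decoder outputs

total, l1, l2 = joint_loss(dec1, dec2, target)
print(f"decoder 1: {l1:.4f}, decoder 2: {l2:.4f}, joint: {total:.4f}")
```

Because the two losses are simply summed, gradients from both decoders reach any shared encoder parameters with equal weight; a weighted sum would be one possible variation, though the report does not mention one.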

@@ Line 17: / Line 17: @@
 * The training process of double-decoder model '''without''' joint loss is problematic.
 ||
-* Overfitting? Training large data on 2nd translator
+* Overfitting? Train large data on 2nd translator
 * Replace the force teaching mechanism in training process with beam search mechanism.
 |-