Difference between revisions of “NLP Status Report 2017-5-22”

Latest revision as of 09:03, 31 May 2017 (Wed)

{|
! Date !! People !! Last Week !! This Week
|-
| rowspan="6"|2017/5/22
| Jiyuan Zhang
|
|
|-
| Aodong LI
|
* bleu of baseline = 43.87
* 2nd translator uses as training data the concat(Chinese, machine translated English):
*: hidden_size, emb_size, lr = 500, 310, 0.001: bleu = 43.53 (best)
*: hidden_size, emb_size, lr = 700, 510, 0.001: bleu = 45.21 (best), but most results are under 43.1
*: hidden_size, emb_size, lr = 700, 510, 0.0005: bleu = 42.19 (best)
* double-decoder model with joint loss (final loss = 1st decoder's loss + 2nd decoder's loss):
*: bleu = 40.11 (best)
*: The 1st decoder's output is generally better than the 2nd decoder's output.
* The training process of the double-decoder model '''without''' joint loss is problematic.
|
* Replace the teacher forcing mechanism in the training process with a beam search mechanism.
|-
| Shiyue Zhang
|
* tried using external word vectors instead of training the embedding; most of these attempts gave bad results, and only the 3-layer rnn + no dropout model got 25.54 bleu, which is about 2 points worse than the original baseline
* trained the original baseline on the new data (the new data fixes the reverse sentence problem), got bleu = 27.88; moses bleu = 32.47
|
* try more models to get results similar to the original baseline on the new data
* m-nmt model on the new data
|-
| Shipan Ren
|
* learned the implementation of the seq2seq model
* read tf_translate code
|
* understand the meaning of the main code
* start writing documents
|}
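
The "concat(Chinese, machine translated English)" input of the 2nd translator can be pictured with a minimal sketch. The report does not say how the two parts are joined, so the separator token `SEP` and the function name `build_second_pass_source` below are hypothetical, chosen only for illustration.

```python
from typing import List

# Hypothetical separator token: the report does not specify how the
# Chinese source and the first-pass English output are delimited.
SEP = "<sep>"


def build_second_pass_source(chinese_tokens: List[str],
                             mt_english_tokens: List[str]) -> List[str]:
    """Concatenate the Chinese source with its machine-translated English.

    The result is one token sequence that a second translator could take
    as its source side.
    """
    return chinese_tokens + [SEP] + mt_english_tokens


if __name__ == "__main__":
    source = build_second_pass_source(["我", "爱", "你"], ["i", "love", "you"])
    print(source)
```

Whether a separator is used at all, and whether the two vocabularies are shared, are design choices the report leaves open.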

@@ Line 2: / Line 2: @@
 !Date !! People !! Last Week !! This Week
 |-
-| rowspan="6"|2017/4/5
+| rowspan="6"|2017/5/22
 |Jiyuan Zhang ||
 ||
 |-
 |Aodong LI ||
+* bleu of baseline = 43.87
+* 2nd translator uses as training data the concat(Chinese, machine translated English):
+  hidden_size, emb_size, lr = 500, 310, 0.001 bleu = 43.53 (best)
+  hidden_size, emb_size, lr = 700, 510, 0.001 bleu = 45.21 (best) but most results are under 43.1
+  hidden_size, emb_size, lr = 700, 510, 0.0005 bleu = 42.19 (best)
+* double-decoder model with joint loss (final loss = 1st decoder's loss + 2nd decoder's loss):
+  bleu = 40.11 (best)
+  The 1st decoder's output is generally better than 2nd decoder's output.
+* The training process of double-decoder model '''without''' joint loss is problematic.
 ||
+* Replace the teacher forcing mechanism in the training process with a beam search mechanism.
 |-
 |Shiyue Zhang ||
@@ Line 19: / Line 28: @@
 |-
 |Shipan Ren ||
-* learn the implementation of the seq2seq model
+* learned the implementation of the seq2seq model
 * read tf_translate code
 ||
@@ Line 26: / Line 35: @@
 |-
-||
 |}
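
The joint loss named in the report (final loss = 1st decoder's loss + 2nd decoder's loss) can be illustrated with a small framework-free sketch. It assumes each decoder's loss is a mean token-level cross-entropy and that both decoders are scored against the same reference tokens; neither detail is stated in the report, and the function names are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(step_probs: Sequence[Sequence[float]],
                  targets: Sequence[int]) -> float:
    """Mean negative log-likelihood of the target token at each step.

    step_probs[i] is a probability distribution over the vocabulary at
    decoding step i; targets[i] is the index of the reference token.
    """
    nll = [-math.log(probs[t]) for probs, t in zip(step_probs, targets)]
    return sum(nll) / len(nll)


def joint_loss(probs_dec1: Sequence[Sequence[float]],
               probs_dec2: Sequence[Sequence[float]],
               targets: Sequence[int]) -> float:
    """Unweighted sum of the two decoders' losses, as in the report.

    Assumption: both decoders share the same reference sequence.
    """
    return cross_entropy(probs_dec1, targets) + cross_entropy(probs_dec2, targets)


if __name__ == "__main__":
    dec1 = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
    dec2 = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]
    print(joint_loss(dec1, dec2, targets=[0, 1]))
```

An unweighted sum treats both decoders equally; a weighted sum would be a different design that the report does not mention.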