Difference between revisions of "NLP Status Report 2017-5-31"

Latest revision as of 09:00, 31 May 2017 (Wed)

Date

People

Last Week

This Week

2017/5/31

Jiyuan Zhang

Aodong LI

coded a double-attention model with final_attn = alpha * attn_ch + beta * attn_en
baseline BLEU = 43.87
experiments with randomly initialized embedding:

Shiyue Zhang

found a dropout bug, fixed it, and reran the baseline: baseline 35.21, baseline(outproj=emb) 35.24
tried several embed set models; all failed
embedded other words into the model embedding space (trained on the training data, not the big data), then used them directly in baseline(outproj=emb)

30000	50000	70000	90000
35.24	34.52	33.73	33.16
4564 (6666)	4535	4469	4426

Shipan Ren
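
The attention merge in Aodong LI's entry, final_attn = alpha * attn_ch + beta * attn_en, can be sketched as below. This is only an illustrative sketch, not the project's code: it assumes attn_ch and attn_en are attention outputs (for example context vectors) of the same dimension, and the function name merge_attention is hypothetical. Note that every (alpha, beta) pair reported in the experiments sums to 2.

```python
from typing import List, Sequence


def merge_attention(
    attn_ch: Sequence[float],
    attn_en: Sequence[float],
    alpha: float,
    beta: float,
) -> List[float]:
    """Return alpha * attn_ch + beta * attn_en, element by element.

    Assumption: both inputs are plain vectors of equal length. In a real
    model they would be tensors produced by two attention modules.
    """
    if len(attn_ch) != len(attn_en):
        raise ValueError("attn_ch and attn_en must have the same dimension")
    return [alpha * c + beta * e for c, e in zip(attn_ch, attn_en)]


if __name__ == "__main__":
    # Toy vectors, made up for illustration only.
    attn_ch = [0.3, 0.6, 0.9]
    attn_en = [0.6, 0.3, 0.0]
    # The three weight settings listed in the report.
    for alpha, beta in [(1.0, 1.0), (4 / 3, 2 / 3), (2 / 3, 4 / 3)]:
        print(alpha, beta, merge_attention(attn_ch, attn_en, alpha, beta))
```

A fixed weighted sum keeps the merge free of extra parameters; alpha and beta are treated here as hand-set constants, as the report's tables suggest.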

@@ Line 2: / Line 2: @@
 !Date !! People !! Last Week !! This Week
 |-
-| rowspan="6"|2017/5/22
+| rowspan="6"|2017/5/31
 |Jiyuan Zhang ||
 ||
 |-
 |Aodong LI ||
+* coded a double-attention model with '''final_attn = alpha * attn_ch + beta * attn_en'''
+* baseline BLEU = '''43.87'''
+* experiments with '''randomly''' initialized embedding:
+{| class="wikitable"
+|-
+! alpha
+! beta
+! result (bleu)
+|-
+| 1
+| 1
+| 43.50
+|-
+| 4/3
+| 2/3
+| 43.58 (w/o retraining)
+|-
+| 2/3
+| 4/3
+| 41.22 (w/o retraining)
+|-
+| 2/3
+| 4/3
+| 42.36 (w/ retraining)
+|}
+* experiments with '''constant''' initialized embedding:
+{| class="wikitable"
+|-
+! alpha
+! beta
+! result (bleu)
+|-
+| 1
+| 1
+| '''45.41'''
+|-
+| 4/3
+| 2/3
+| '''45.79'''
+|-
+| 2/3
+| 4/3
+| '''45.32'''
+|}
+* 1.4 to 1.9 BLEU improvement over the 43.87 baseline with constant initialized embedding
+* This model is similar to multi-source neural translation but uses fewer resources
 ||
+* Test the model on big data
+* Explore different attention merge strategies
+* Explore hierarchical model
 |-
 |Shiyue Zhang ||
@@ Line 39: / Line 86: @@
 |-
 |Shipan Ren ||
+* wrote documentation for the tf_translate project
+* read a neural machine translation paper
+* read the tf_translate code
+* ran and tested the tf_translate code
 ||