Difference between revisions of "NLP Status Report 2017-5-31"

Revision as of 04:42, 31 May 2017 (Wed)

Date

People

Last Week

This Week

2017/5/31

Jiyuan Zhang

Aodong LI

coded a double-attention model with final_attn = alpha * attn_ch + beta * attn_en
baseline BLEU = 43.87
experiments with randomly initialized embeddings:
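
The weighted merge final_attn = alpha * attn_ch + beta * attn_en can be illustrated with a minimal NumPy sketch. It is not the report's implementation: it assumes attn_ch and attn_en are context vectors produced by attending over two separate encoders (guessed here to be a Chinese-side and an English-side encoder), and the helpers `softmax`, `attend`, and `merge_attention`, the dot-product scoring, and all shapes are hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def attend(query, keys):
    """Dot-product attention (an assumed scoring function).

    query: (d,) decoder state; keys: (T, d) encoder states.
    Returns a (d,) context vector.
    """
    weights = softmax(keys @ query)  # (T,) attention weights, sum to 1
    return weights @ keys            # weighted sum of encoder states


def merge_attention(attn_ch, attn_en, alpha=1.0, beta=1.0):
    """Weighted sum from the report: alpha * attn_ch + beta * attn_en."""
    return alpha * attn_ch + beta * attn_en


# Toy data; the dimensions are arbitrary assumptions.
rng = np.random.default_rng(0)
query = rng.normal(size=8)        # hypothetical decoder state
enc_ch = rng.normal(size=(5, 8))  # hypothetical first-encoder states
enc_en = rng.normal(size=(7, 8))  # hypothetical second-encoder states

attn_ch = attend(query, enc_ch)
attn_en = attend(query, enc_en)

# One of the (alpha, beta) settings listed in the report's tables.
final_attn = merge_attention(attn_ch, attn_en, alpha=4 / 3, beta=2 / 3)
```

In every setting reported in the tables, alpha + beta = 2, so the merged vector is a reweighting of the plain sum attn_ch + attn_en rather than a convex combination.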

Shiyue Zhang

found a dropout bug, fixed it, and reran the baseline: baseline 35.21, baseline(outproj=emb) 35.24
tried several embed set models, all of which failed
embedded other words into the model's embedding space (trained on the training data, not the big data), then used them directly in baseline(outproj=emb)

30000	50000	70000	90000
35.24	34.52	33.73	33.16
4564 (6666)	4535	4469	4426

Shipan Ren

@@ Line 7: / Line 7: @@
 |-
 |Aodong LI ||
+* coded a double-attention model with '''final_attn = alpha * attn_ch + beta * attn_en'''
+* baseline BLEU = '''43.87'''
+* experiments with '''randomly''' initialized embeddings:
+{| class="wikitable"
+|-
+! alpha
+! beta
+! result (BLEU)
+|-
+| 1
+| 1
+| 43.50
+|-
+| 4/3
+| 2/3
+| 43.58 (w/o retraining)
+|-
+| 2/3
+| 4/3
+| 42.22 (w/o retraining)
+|-
+| 2/3
+| 4/3
+| 42.36 (w/ retraining)
+|}
+* experiments with '''constant''' initialized embedding:
+{| class="wikitable"
+|-
+! alpha
+! beta
+! result (BLEU)
+|-
+| 1
+| 1
+| '''45.41'''
+|-
+| 4/3
+| 2/3
+| '''45.79'''
+|-
+| 2/3
+| 4/3
+| '''45.32'''
+|}
+* This model is similar to multi-source neural translation but uses fewer resources
 ||
+* Explore different attention merge strategies
+* Explore a hierarchical model
 |-
 |Shiyue Zhang ||