NLP Status Report 2017-5-31

From cslt Wiki

Latest revision as of 09:00, 31 May 2017 (Wed)

Date: 2017/5/31

Jiyuan Zhang
  Last Week:
  This Week:

Aodong LI
  Last Week:
  • coded a double-attention model with final_attn = alpha * attn_ch + beta * attn_en
  • baseline BLEU = 43.87
  • experiments with randomly initialized embeddings:
      alpha   beta   BLEU
      1       1      43.50
      4/3     2/3    43.58 (w/o retraining)
      2/3     4/3    41.22 (w/o retraining)
      2/3     4/3    42.36 (w/ retraining)
  • experiments with constant-initialized embeddings:
      alpha   beta   BLEU
      1       1      45.41
      4/3     2/3    45.79
      2/3     4/3    45.32
  • 1.4~1.9 BLEU score improvement
  • this model is similar to multi-source neural translation but uses fewer resources
  This Week:
  • test the model on big data
  • explore different attention-merge strategies
  • explore a hierarchical model
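The weighted merge final_attn = alpha * attn_ch + beta * attn_en can be sketched as below. This is a minimal NumPy illustration of the combination step only, not the tf_translate implementation; the function and variable names are made up for this sketch.

```python
import numpy as np

def merge_attention(attn_ch, attn_en, alpha=1.0, beta=1.0):
    """Weighted merge of the Chinese-side and English-side attention
    outputs: final_attn = alpha * attn_ch + beta * attn_en."""
    attn_ch = np.asarray(attn_ch, dtype=float)
    attn_en = np.asarray(attn_en, dtype=float)
    return alpha * attn_ch + beta * attn_en

# e.g. the 4/3 vs 2/3 weighting from the experiments above:
ch = np.array([0.6, 0.3, 0.1])   # attention over the Chinese-side source
en = np.array([0.2, 0.5, 0.3])   # attention over the English-side source
merged = merge_attention(ch, en, alpha=4/3, beta=2/3)
```

Note that with alpha + beta != 1 the merged vector is no longer a normalized distribution, which matters only if it is used directly as attention weights rather than to mix context vectors.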
Shiyue Zhang
  Last Week:
  • found a dropout bug, fixed it, and reran the baselines: baseline 35.21, baseline(outproj=emb) 35.24
  • tried several embedding-set models; none worked
  • embedded other words into the model embedding space (trained on the training data, not the big data), then used them directly in baseline(outproj=emb):
      vocab size   30000        50000   70000   90000
      BLEU         35.24        34.52   33.73   33.16
                   4564 (6666)  4535    4469    4426
  • m-nmt is running
  This Week:
  • get word2vec on the big data and compare with word2vec from the training data
  • test the m-nmt model; increase the vocab size and test
  • review zh-uy/uy-zh related work; start writing the paper
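One common way to embed external words into a model's embedding space, as in the experiment above, is a least-squares linear map fitted on the shared vocabulary. The sketch below is an illustrative assumption (the report does not say how the mapping was done); all names and numbers are hypothetical.

```python
import numpy as np

def fit_linear_map(external, model):
    """Fit W so that external @ W approximates model on the shared
    vocabulary (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(external, model, rcond=None)
    return W

def project_oov(external_vecs, W):
    """Map vectors of words unseen by the model into its embedding space."""
    return external_vecs @ W

# hypothetical shared-vocabulary vectors (illustrative numbers only)
shared_external = np.array([[1., 0.], [0., 1.], [1., 1.]])
shared_model = 2.0 * shared_external   # pretend the model space is a scaled copy
W = fit_linear_map(shared_external, shared_model)
oov_projected = project_oov(np.array([[3., 4.]]), W)
```

The projected vectors can then be appended to the output-projection matrix when it is tied to the embeddings, which is how a larger vocabulary could be tested without retraining.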
Shipan Ren
  Last Week:
  • wrote documentation for the tf_translate project
  • read neural machine translation papers
  • read the tf_translate code
  • ran and tested the tf_translate code
  This Week: