Difference between revisions of "NLP Status Report 2017-5-31"

From cslt Wiki

Revision as of 04:50, 31 May 2017 (Wednesday)

Date People Last Week This Week
2017/5/31 Jiyuan Zhang
Aodong LI
  • coded a double-attention model with final_attn = alpha * attn_ch + beta * attn_en
  • baseline BLEU = 43.87
  • experiments with randomly initialized embeddings:
    alpha   beta   BLEU
    1       1      43.50
    4/3     2/3    43.58 (w/o retrained)
    2/3     4/3    42.22 (w/o retrained)
    2/3     4/3    42.36 (w/ retrained)
  • experiments with constant-initialized embeddings:
    alpha   beta   BLEU
    1       1      45.41
    4/3     2/3    45.79
    2/3     4/3    45.32
  • 1.4~1.9 BLEU improvement over the baseline (43.87) with constant-initialized embeddings
  • This model is similar to multi-source neural translation but uses fewer resources
  • Test the model on big data
  • Explore different attention merge strategies
  • Explore hierarchical model
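The attention merge reported above for the double-attention model, final_attn = alpha * attn_ch + beta * attn_en, can be sketched in a few lines. This is only an illustration, not the code used in the experiments: it assumes attn_ch and attn_en are equal-length vectors, the function name merge_attention is hypothetical, and the toy vector values are made up. The (alpha, beta) pairs and the BLEU scores are the ones listed in the report.

```python
from fractions import Fraction


def merge_attention(attn_ch, attn_en, alpha, beta):
    """Element-wise weighted sum of two equal-length attention vectors."""
    if len(attn_ch) != len(attn_en):
        raise ValueError("attention vectors must have the same length")
    return [alpha * c + beta * e for c, e in zip(attn_ch, attn_en)]


# (alpha, beta) settings taken from the report.
SETTINGS = [
    (Fraction(1), Fraction(1)),
    (Fraction(4, 3), Fraction(2, 3)),
    (Fraction(2, 3), Fraction(4, 3)),
]

# Toy vectors (made-up values, for illustration only).
attn_ch = [0.3, 0.6, 0.9]
attn_en = [0.6, 0.3, 0.0]

for alpha, beta in SETTINGS:
    merged = merge_attention(attn_ch, attn_en, float(alpha), float(beta))
    print(f"alpha={alpha} beta={beta} final_attn={merged}")

# BLEU gains over the baseline for the constant-initialized runs in the report.
BASELINE_BLEU = 43.87
CONSTANT_INIT_BLEU = [45.41, 45.79, 45.32]
gains = [round(score - BASELINE_BLEU, 2) for score in CONSTANT_INIT_BLEU]
print("gains over baseline:", gains)
```

In every reported setting alpha + beta = 2, so the settings shift weight between the two attention sources without changing the total scale of the merged vector.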
Shiyue Zhang
  • found a dropout bug, fixed it, and reran the baseline: baseline 35.21, baseline (outproj=emb) 35.24
  • tried several embed set models; all failed
  • embedded other words into the model embedding space (trained on the training data, not big data), then used them directly in baseline (outproj=emb)
30000         50000   70000   90000
35.24         34.52   33.73   33.16
4564 (6666)   4535    4469    4426
  • m-nmt is running
  • obtain word2vec embeddings from big data and compare them with word2vec embeddings from the training data
  • test the m-nmt model, then increase the vocab size and test again
  • review zh-uy/uy-zh related work and start writing the paper
Shipan Ren