Date: 2017/5/31

People: Jiyuan Zhang
Last Week:
This Week:

People: Aodong LI
Last Week:
- coded a double-attention model with final_attn = alpha * attn_ch + beta * attn_en (a minimal sketch appears after this entry's results)
- baseline BLEU = 43.87
- experiments with randomly initialized embeddings:

  alpha | beta | result (BLEU)
  1     | 1    | 43.50
  4/3   | 2/3  | 43.58 (w/o retraining)
  2/3   | 4/3  | 41.22 (w/o retraining)
  2/3   | 4/3  | 42.36 (w/ retraining)

- experiments with constant-initialized embeddings:

  alpha | beta | result (BLEU)
  1     | 1    | 45.41
  4/3   | 2/3  | 45.79
  2/3   | 4/3  | 45.32

- 1.4~1.9 BLEU score improvement
- This model is similar to multi-source neural translation but uses fewer resources
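
A minimal sketch of the merge step named above, assuming attn_ch and attn_en are the context vectors produced by separate attention over the Chinese-side and English-side inputs; the function name and numpy usage are illustrative, not taken from the tf_translate code:

  import numpy as np

  def merge_attention(attn_ch, attn_en, alpha=1.0, beta=1.0):
      """Weighted merge of two attention context vectors:
      final_attn = alpha * attn_ch + beta * attn_en."""
      return alpha * attn_ch + beta * attn_en

  # toy usage: batch of 2 decoder steps, hidden size 4
  attn_ch = np.random.randn(2, 4)
  attn_en = np.random.randn(2, 4)
  final_attn = merge_attention(attn_ch, attn_en, alpha=4/3, beta=2/3)
  print(final_attn.shape)  # (2, 4)

Per the tables above, the (alpha, beta) = (4/3, 2/3) setting with constant-initialized embeddings gave the best score (45.79).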

This Week:
- Test the model on big data
- Explore different attention merge strategies
- Explore hierarchical models

People: Shiyue Zhang
Last Week:
- found a dropout bug, fixed it, and reran the baselines: baseline 35.21, baseline (outproj=emb) 35.24
- tried several embed-set models; they failed
- embedded other words into the model embedding space (trained on the train data, not the big data) and used them directly in baseline (outproj=emb); results by vocab size below (a mapping sketch follows the table)

  vocab size | 30000       | 50000 | 70000 | 90000
  BLEU       | 35.24       | 34.52 | 33.73 | 33.16
             | 4564 (6666) | 4535  | 4469  | 4426

This Week:
- get word2vec on the big data and compare it with word2vec from the train data
- test the m-nmt model, then increase the vocab size and test again
- review zh-uy/uy-zh related work and start writing the paper

People: Shipan Ren
Last Week:
- wrote documentation for the tf_translate project
- read neural machine translation papers
- read the tf_translate code
- ran and tested the tf_translate code
This Week: