{| class="wikitable"
! Date !! People !! Last Week !! This Week
|-
| rowspan="5" | 2017/1/3
| Yang Feng
|
* tested and analyzed the results on the cs-en data set (30.4 on the held-out training set and 7.3 on the dev set)
* added masks to the baseline (44.4 on cn-en); see the masking sketch after this table
* added encoder masks and memory masks to the alpha-gamma method and fixed the bugs, getting an improvement of 0.5 over the masked baseline [report]
* rewrote the softmax_cross_entropy function myself to avoid doing softmax twice (still training); see the fused-loss sketch after this table
|
* analyze and improve the alpha-gamma method
|-
| Jiyuan Zhang
|
* finished reproducing the planning neural network
* chose the best attention_memory model for huilian and ran it on the big training dataset (about 370k)
|
* work on the keyword expansion model
* collect more poems from the Internet
* recruiting
|-
| Andi Zhang
|
|
|-
| Shiyue Zhang
|
* added the trained memory-attention model to the neural model (43.0) and got a 2+ BLEU gain (45.19), but this needs more validation and improvement; see the memory-attention sketch after this table
* ran the baseline model on the cs-en data and found it did well on the training set but poorly on the test set
* ran the baseline model on the en-fr data and hit an 'inf' problem
* fixed the 'inf' problem by debugging the code of the mask-added baseline model
* re-running on the cs-en and en-fr data
|
* go on with the baseline on big data: get results on the cs-en and en-fr data, and train on the zh-en data from [http://www.statmt.org/wmt17/translation-task.html#download WMT17]
* go on refining the memory-attention model: retrain to find out if the 2+ BLEU gain is just by chance, and try more memory-attention structures (relu, a(t-1), y(t-1), ...)
|-
| Peilun Xiao
|
|
|}
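Notes on the masks and the 'inf' problem above: a minimal sketch, assuming the usual cause, namely that the attention softmax ran over padded source positions whose garbage scores overflow exp(). All names here are illustrative, not the group's actual code.

<pre>
import numpy as np

def masked_softmax(scores, mask):
    """Attention softmax that ignores padded source positions.

    scores: (batch, src_len) raw attention energies
    mask:   (batch, src_len) 1.0 for real tokens, 0.0 for padding
    """
    # Push padded positions to a large negative value so they get
    # ~zero probability instead of feeding garbage into the softmax.
    scores = np.where(mask > 0, scores, -1e9)
    # Subtract the row max before exp() for numerical stability;
    # skipping this step is a classic source of 'inf' losses.
    scores = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(scores) * mask
    return exp / exp.sum(axis=1, keepdims=True)
</pre>

The same idea applies on the target side: padded token positions should be masked out of the per-token loss before averaging.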
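On rewriting softmax_cross_entropy to avoid doing softmax twice: the standard fix is to fuse log-softmax into the loss via the log-sum-exp trick, so the probabilities are never materialized separately and then re-normalized inside the loss. A sketch under that assumption; the function and variable names are mine, not the report's.

<pre>
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Cross-entropy computed straight from logits in one pass.

    logits: (batch, vocab) unnormalized scores
    labels: (batch,) gold token indices
    """
    # log softmax via log-sum-exp: stable, and softmax is computed once.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of each gold label.
    nll = -log_probs[np.arange(len(labels)), labels]
    return nll.mean()
</pre>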
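On the memory-attention structures to try (relu, a(t-1), y(t-1)): the report does not describe the model, so the following is purely an illustrative reading — an additive attention score over memory entries where the nonlinearity can be swapped for relu and the score can be conditioned on the previous attention vector a(t-1); y(t-1) could be wired in the same way with its own projection. All weights and names are hypothetical.

<pre>
import numpy as np

def memory_attn_scores(h_dec, mem, W_h, W_m, v, a_prev=None, W_a=None,
                       act=np.tanh):
    """Additive attention scores over memory entries (hypothetical sketch).

    h_dec:  (hid,) current decoder state
    mem:    (slots, hid) memory entries
    a_prev: optional previous attention vector a(t-1)
    act:    nonlinearity; pass lambda x: np.maximum(x, 0) for the relu variant
    """
    pre = mem @ W_m + h_dec @ W_h          # (slots, hid)
    if a_prev is not None:
        pre = pre + a_prev @ W_a           # condition on a(t-1)
    return act(pre) @ v                    # (slots,) raw scores
</pre>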