NLP Status Report 2017-3-13
From cslt Wiki
Latest revision as of 01:46, 14 March 2017

{| class="wikitable"
!Date !! People !! Last Week !! This Week
|-
| rowspan="6"|2017/1/3
|Yang Feng ||
* tested and analyzed the results on the cs-en data set (30.4 on the held-out training set and 7.3 on the dev set);
* added masks to the baseline (44.4 on cn-en);
* added encoder masks and memory masks to the alpha-gamma method and fixed the bugs; got an improvement of 0.5 against the masked baseline [[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b8/Nmt_mn_report_continue.pdf report]] (see the masking sketch after the table);
* rewrote the softmax_cross_entropy function myself to avoid computing the softmax twice (still training; see the sketch after the table).
||
* analyze and improve the alpha-gamma method.
|-
|Jiyuan Zhang ||
* finished reproducing the planning neural network;
* chose the best attention_memory model for huilian and trained it on a big data set (about 370k) [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b9/Model_with_different_dataset.pdf result].
||
* keyword expansion model;
* collect more poems from the Internet;
* recruiting.
|-
|Andi Zhang ||
* ran the baseline without masks and found that the model with masks gets a slightly better BLEU score;
* tried a way to handle OOV words, but it cannot predict the '_EOS' symbol.
||
* try to fix the '_EOS' problem.
|-
|Shiyue Zhang ||
* added the trained memory-attention model to the neural model (43.0) and got a 2+ BLEU gain (45.19), but this needs more validation and improvement;
* ran the baseline model on cs-en data and found it did well on the training set but poorly on the test set;
* ran the baseline model on en-fr data and found an 'inf' problem;
* fixed the 'inf' problem by debugging the code of the mask-added baseline model;
* running on cs-en and en-fr data again.
||
* go on with the baseline on big data: get results on the cs-en and en-fr data, and train on the zh-en data from [http://www.statmt.org/wmt17/translation-task.html#download WMT17];
* go on refining the memory-attention model: retrain to find out whether the 2+ gain is just by chance, and try more memory-attention structures (relu, a(t-1), y(t-1), ...).
|-
|Peilun Xiao ||
||
|}
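Yang Feng's note about avoiding a double softmax refers to the standard fused formulation: since -log softmax(z)[y] = logsumexp(z) - z[y], the loss can be computed straight from the logits, so the softmax probabilities never have to be materialized a second time. Below is a minimal numpy sketch of that trick; the function name and tensor shapes are illustrative assumptions, not the group's actual code.

<pre>
import numpy as np

def softmax_cross_entropy_from_logits(logits, labels):
    """Per-example cross-entropy computed directly from logits.

    Uses -log softmax(z)[y] = logsumexp(z) - z[y], so no explicit
    softmax pass is needed for the loss.

    logits: (batch, vocab) unnormalized scores
    labels: (batch,) gold token ids
    """
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_norm = np.log(np.exp(shifted).sum(axis=1))         # logsumexp per row
    gold_score = shifted[np.arange(labels.shape[0]), labels]
    return log_norm - gold_score                           # shape: (batch,)
</pre>

Subtracting the row maximum before exponentiation also keeps the logsumexp from overflowing, which a naive softmax-then-log pipeline would not guarantee.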
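Several items above (encoder/memory masks, the 'inf' blow-up on en-fr) revolve around masking padded source positions before the attention softmax. The following is a hedged numpy sketch of that standard technique, not the group's code; in particular, using a large finite negative constant instead of a literal -inf is one common way such 'inf'/nan failures are avoided, and may or may not be the fix the report describes.

<pre>
import numpy as np

NEG_INF = -1e9  # finite stand-in for -inf; a literal -inf can turn rows into nan

def masked_attention_weights(scores, mask):
    """Attention softmax that ignores padded source positions.

    scores: (batch, src_len) raw attention energies
    mask:   (batch, src_len) 1.0 at real tokens, 0.0 at padding
    """
    masked = np.where(mask > 0, scores, NEG_INF)           # bury padded positions
    shifted = masked - masked.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)    # rows sum to 1
</pre>

Padded positions end up with weight exp(NEG_INF - max) ≈ 0, and because NEG_INF is finite the softmax stays well defined even if an entire row were masked, whereas a true -inf would produce nan there.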