NLP Status Report 2017-3-13

From cslt Wiki
 
Latest revision as of 01:46, 14 March 2017

Date: 2017/1/3

Yang Feng
  Last week:
  • tested and analyzed the results on the cs-en data set (30.4 on the held-out training set and 7.3 on the dev set);
  • added masks to the baseline (44.4 on cn-en);
  • added encoder masks and memory masks to the alpha-gamma method and fixed the bugs; got an improvement of 0.5 over the masked baseline [report];
  • rewrote the softmax_cross_entropy function to avoid doing softmax twice (still training); see the sketch after this entry.
  This week:
  • analyze and improve the alpha-gamma method.
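The softmax_cross_entropy rewrite presumably fuses the log-softmax into the loss so the probabilities are computed once, rather than running a softmax and then a separate cross-entropy. A minimal NumPy sketch of that fusion with target-side masking; the shapes and mask convention are assumptions, not the group's actual code:

import numpy as np

def masked_softmax_cross_entropy(logits, targets, mask):
    """Cross-entropy straight from raw logits via the log-sum-exp trick,
    so a separate softmax pass is never materialized.

    logits:  (batch, time, vocab) raw decoder scores   [assumed shapes]
    targets: (batch, time) gold token ids
    mask:    (batch, time) 1.0 for real tokens, 0.0 for padding
    """
    # stable log-softmax: log p = x - max(x) - log(sum(exp(x - max(x))))
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # negative log-likelihood of each gold token
    b, t = np.indices(targets.shape)
    nll = -log_probs[b, t, targets]
    # padded positions contribute nothing; normalize by real token count
    return (nll * mask).sum() / mask.sum()

This is the same fusion that TensorFlow's sparse_softmax_cross_entropy_with_logits performs internally.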
Jiyuan Zhang
  Last week:
  • finished reproducing the planning neural network;
  • chose the best attention_memory model for huilian and ran it on a big training dataset (about 370k); result: http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b9/Model_with_different_dataset.pdf
  This week:
  • keyword expansion model;
  • collect more poems from the Internet;
  • recruiting.
Andi Zhang
  Last week:
  • ran the baseline without masks and found that the model with masks gets a slightly better BLEU score;
  • tried a way to deal with OOV words, but the model then fails to predict the '_EOS' symbol.
  This week:
  • fix that problem (one plausible cause is sketched below).
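One plausible cause of the '_EOS' failure is that the OOV handling rebuilds or truncates the vocabulary and the reserved symbols fall out of it, so the decoder literally cannot emit '_EOS'. A hedged preprocessing sketch; build_vocab, encode, and the reserved-symbol list are illustrative assumptions, not the actual code:

from collections import Counter

# Reserved symbols must keep their slots no matter how the vocabulary is
# truncated; if '_EOS' is dropped, the decoder can never end a sentence.
RESERVED = ["_PAD", "_GO", "_EOS", "_UNK"]

def build_vocab(word_counts, max_size):
    """Reserved symbols first, then the most frequent words up to max_size."""
    frequent = [w for w, _ in word_counts.most_common() if w not in RESERVED]
    words = (RESERVED + frequent)[:max_size]
    assert "_EOS" in words, "_EOS must survive vocabulary truncation"
    return {w: i for i, w in enumerate(words)}

def encode(sentence, vocab):
    """Map OOV words to _UNK instead of dropping them; always end with _EOS."""
    unk = vocab["_UNK"]
    return [vocab.get(w, unk) for w in sentence.split()] + [vocab["_EOS"]]

counts = Counter("the cat sat on the mat".split())
vocab = build_vocab(counts, max_size=8)
print(encode("the dog sat", vocab))  # 'dog' is OOV -> _UNK id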
Shiyue Zhang
  Last week:
  • added the trained memory-attention model to the neural model (43.0) and got a 2+ BLEU gain (45.19), though this needs more validation and improvement;
  • ran the baseline model on cs-en data; it was good on the training set but poor on the test set;
  • ran the baseline model on en-fr data and hit an 'inf' problem;
  • fixed the 'inf' problem by debugging the code of the mask-added baseline model (see the sketch after this entry);
  • rerunning on cs-en and en-fr data.
  This week:
  • go on with the baseline on big data: get results on cs-en and en-fr data, and train on zh-en data from WMT17 (http://www.statmt.org/wmt17/translation-task.html#download);
  • go on refining the memory-attention model: retrain to find out whether the 2+ gain is just by chance, and try more memory-attention structures (relu, a(t-1), y(t-1), ...).
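The report does not say what caused the 'inf'; a common source in mask-added attention code is exp() overflow on unbounded scores, or a fully padded row that sums to zero before a log or division. A minimal NumPy sketch of the usual stabilization, assuming (batch, src_len) attention scores; the actual fix in the group's code may differ:

import numpy as np

def stable_masked_softmax(scores, mask, eps=1e-12):
    """Attention softmax that cannot produce inf/nan.

    scores: (batch, src_len) raw attention energies   [assumed shapes]
    mask:   (batch, src_len) 1.0 for real source tokens, 0.0 for padding
    """
    # push padded positions toward -inf *before* exponentiating,
    # instead of multiplying probabilities by 0 afterwards
    very_neg = np.finfo(scores.dtype).min / 2
    scores = np.where(mask > 0, scores, very_neg)
    # subtract the row max so exp() cannot overflow to inf
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores) * mask
    # eps guards the degenerate all-padding row against 0/0
    return weights / (weights.sum(axis=-1, keepdims=True) + eps)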
Peilun Xiao