Difference between revisions of "Schedule"
From cslt Wiki
(→Daily Report)
(35 intermediate revisions by 2 users not shown)
Line 587:
|- | |- | ||
+ | | rowspan="1"|2017/07/10 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+ | * dataset: zh-en (small)
+ | |- | ||
+ | | rowspan="1"|2017/07/11 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * tested these checkpoints
+ | * found that the new version takes less training time
+ | * found that the two versions have similar perplexity and BLEU scores
+ | * found that the BLEU score stays high even when the model overfits
+ | * (reason: the test set and training set of the small dataset are similar in content and style)
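The checkpoint evaluation above comes down to computing BLEU against the reference translations. A minimal self-contained sketch of sentence-level BLEU (modified n-gram precision with add-one smoothing plus a brevity penalty); the tokens are invented toy data, not the zh-en corpus:

```python
# Minimal BLEU: geometric mean of smoothed n-gram precisions times a brevity penalty.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU with uniform weights over 1..max_n grams."""
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hypothesis, n)
        ref_ngrams = ngrams(reference, n)
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        # add-one smoothing so one missing n-gram order doesn't zero the score
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(sum(log_precisions) / max_n)

print(bleu("the cat sat".split(), "the cat sat".split()))  # prints: 1.0
```

Production evaluation would use multi-bleu.perl or sacrebleu instead; this only shows the arithmetic behind the reported scores.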
+ | |- | ||
+ | | rowspan="1"|2017/07/12 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+ | * dataset: zh-en (big)
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/13 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * an OOM (Out Of Memory) error occurred when version 0.1 was trained on the large dataset, but version 1.0 worked
+ | * reason: improper resource allocation by the tensorflow 0.1 framework exhausts memory
+ | * after retrying 4 times (just re-entering the same command), version 0.1 eventually worked
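The manual workaround above (re-running the same command until it succeeds) can be scripted. A small sketch; `train.sh` is a hypothetical stand-in for the actual training command:

```python
# Retry a flaky training command a few times before giving up.
import subprocess
import time

def run_with_retries(cmd, max_attempts=4, wait_seconds=5):
    """Run `cmd` until it exits 0; return the attempt number that succeeded."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} failed (exit {result.returncode}), retrying...")
        time.sleep(wait_seconds)
    raise RuntimeError(f"command failed after {max_attempts} attempts")

# Example (hypothetical script name): run_with_retries(["bash", "train.sh"])
```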
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/14 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * tested these checkpoints
+ | * found that the new version takes less training time
+ | * found that the two versions have similar perplexity and BLEU scores
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/17 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * downloaded the WMT2014 datasets and processed them
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/18 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * processed data | ||
+ | |||
+ | |- | ||
| rowspan="1"|2017/07/18 | | rowspan="1"|2017/07/18 | ||
|Jiayu Guo || 8:30|| 22:00 || 14 || | |Jiayu Guo || 8:30|| 22:00 || 14 || | ||
* read model code. | * read model code. | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/19 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * processed data | ||
|- | |- | ||
| rowspan="1"|2017/07/19 | | rowspan="1"|2017/07/19 | ||
|Jiayu Guo || 9:00|| 22:00 || 13 || | |Jiayu Guo || 9:00|| 22:00 || 13 || | ||
* read papers of bleu. | * read papers of bleu. | ||
+ | |- | ||
+ | | rowspan="1"|2017/07/20 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * processed data | ||
|- | |- | ||
| rowspan="1"|2017/07/20 | | rowspan="1"|2017/07/20 | ||
Line 600 / Line 652:
* read papers of attention mechanism. | * read papers of attention mechanism. | ||
|- | |- | ||
+ | | rowspan="1"|2017/07/21
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+ | * dataset: WMT2014 en-de
+ | |- | ||
| rowspan="1"|2017/07/21 | | rowspan="1"|2017/07/21 | ||
|Jiayu Guo || 10:00|| 23:00 || 13 || | |Jiayu Guo || 10:00|| 23:00 || 13 || | ||
* process document | * process document | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/24 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * tested these checkpoints on the en-de dataset
+ | * found that the new version takes less training time
+ | * found that the two versions have similar perplexity and BLEU scores
+ | |||
|- | |- | ||
| rowspan="1"|2017/07/24 | | rowspan="1"|2017/07/24 | ||
Line 609 / Line 673:
* read model code. | * read model code. | ||
|- | |- | ||
+ | | rowspan="1"|2017/07/25
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
+ | * dataset: WMT2014 en-fr
+ | |- | ||
| rowspan="1"|2017/07/25 | | rowspan="1"|2017/07/25 | ||
|Jiayu Guo || 9:00|| 23:00 || 14 || | |Jiayu Guo || 9:00|| 23:00 || 14 || | ||
* process document | * process document | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/26 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * read papers about memory-augmented NMT
+ | |||
|- | |- | ||
| rowspan="1"|2017/07/26 | | rowspan="1"|2017/07/26 | ||
|Jiayu Guo || 10:00|| 24:00 || 14 || | |Jiayu Guo || 10:00|| 24:00 || 14 || | ||
* process document | * process document | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/27 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * read papers about memory-augmented NMT
+ | |||
|- | |- | ||
| rowspan="1"|2017/07/27 | | rowspan="1"|2017/07/27 | ||
|Jiayu Guo || 10:00|| 24:00 || 14 || | |Jiayu Guo || 10:00|| 24:00 || 14 || | ||
* process document | * process document | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/28 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * read the memory-augmented NMT code
+ | |||
|- | |- | ||
| rowspan="1"|2017/07/28 | | rowspan="1"|2017/07/28 | ||
|Jiayu Guo || 9:00|| 24:00 || 15 || | |Jiayu Guo || 9:00|| 24:00 || 15 || | ||
* process document
| | | | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/07/31 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * read the memory-augmented NMT code
|- | |- | ||
| rowspan="1"|2017/07/31 | | rowspan="1"|2017/07/31 | ||
Line 632 / Line 722:
* split ancient language text to single word | * split ancient language text to single word | ||
| | | | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/1 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * tested these checkpoints on the en-fr dataset
+ | * found that the new version takes less training time
+ | * found that the two versions have similar perplexity and BLEU scores
|- | |- | ||
| rowspan="1"|2017/08/1 | | rowspan="1"|2017/08/1 | ||
Line 637 / Line 733:
* run seq2seq_model | * run seq2seq_model | ||
| | | | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/2 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * surveyed the published performance (BLEU scores) of other models
+ | * datasets: WMT2014 en-de and en-fr
+ | |||
|- | |- | ||
| rowspan="1"|2017/08/2 | | rowspan="1"|2017/08/2 | ||
|Jiayu Guo || 10:00|| 23:00 || 13 || | |Jiayu Guo || 10:00|| 23:00 || 13 || | ||
* process document | * process document | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/3 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * surveyed the published performance (BLEU scores) of other seq2seq models
+ | * datasets: WMT2014 en-de and en-fr
+ | |||
|- | |- | ||
| rowspan="1"|2017/08/3 | | rowspan="1"|2017/08/3 | ||
|Jiayu Guo || 10:00|| 23:00 || 13 || | |Jiayu Guo || 10:00|| 23:00 || 13 || | ||
* process document | * process document | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/4 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * learned Moses
+ | |||
|- | |- | ||
| rowspan="1"|2017/08/4 | | rowspan="1"|2017/08/4 | ||
|Jiayu Guo || 10:00|| 23:00 || 13 || | |Jiayu Guo || 10:00|| 23:00 || 13 || | ||
* search new data(Songshu) | * search new data(Songshu) | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/08/7 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * installed and built Moses on the server | ||
+ | |||
|- | |- | ||
| rowspan="1"|2017/08/7 | | rowspan="1"|2017/08/7 | ||
|Jiayu Guo || 9:00|| 22:00 || 13 || | |Jiayu Guo || 9:00|| 22:00 || 13 || | ||
* process document | * process document | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/08/8 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * trained a statistical machine translation model and tested it
+ | * dataset: zh-en (small)
+ | * verified that Moses works correctly
+ | |||
|- | |- | ||
| rowspan="1"|2017/08/8 | | rowspan="1"|2017/08/8 | ||
|Jiayu Guo || 10:00|| 21:00 || 11 || | |Jiayu Guo || 10:00|| 21:00 || 11 || | ||
* read tensorflow | * read tensorflow | ||
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/08/9 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * wrote automation scripts to process the data, train the model, and test it
+ | * toolkit: Moses
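A core step in such data-processing scripts is corpus cleaning. A sketch of the length and length-ratio filter that Moses's clean-corpus-n.perl applies, reimplemented in Python on invented toy pairs (not the actual script used here):

```python
# Filter parallel sentence pairs the way Moses's clean-corpus-n.perl does:
# drop pairs that are empty, too long, or have an extreme length ratio.
def clean_corpus(pairs, min_len=1, max_len=80, max_ratio=9.0):
    """pairs: list of (src, tgt) whitespace-tokenized strings; returns kept pairs."""
    kept = []
    for src, tgt in pairs:
        src_tokens, tgt_tokens = src.split(), tgt.split()
        if not (min_len <= len(src_tokens) <= max_len):
            continue
        if not (min_len <= len(tgt_tokens) <= max_len):
            continue
        ratio = len(src_tokens) / len(tgt_tokens)
        if ratio > max_ratio or ratio < 1.0 / max_ratio:
            continue
        kept.append((src, tgt))
    return kept

pairs = [("你 好", "hello"),                        # kept
         ("", "dropped because source is empty"),    # dropped: empty source
         ("一", " ".join(["word"] * 20))]            # dropped: ratio 1/20
print(len(clean_corpus(pairs)))  # prints: 1
```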
+ | |||
|- | |- | ||
| rowspan="1"|2017/08/9 | | rowspan="1"|2017/08/9 | ||
|Jiayu Guo || 10:00|| 23:00 || 13 || | |Jiayu Guo || 10:00|| 23:00 || 13 || | ||
* ran the model on data whose ancient-language content was split into single characters. | * run model with the data of which ancient content was split by single character. | ||
+ | |||
|- | |- | ||
+ | | rowspan="1"|2017/08/10 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * trained statistical machine translation models and tested them
+ | * datasets: zh-en (big), WMT2014 en-de, WMT2014 en-fr
+ | |- | ||
| rowspan="1"|2017/08/10 | | rowspan="1"|2017/08/10 | ||
|Jiayu Guo || 9:00|| 23:00 || 13 || | |Jiayu Guo || 9:00|| 23:00 || 13 || | ||
* process data of Songshu | * process data of Songshu | ||
* read papers of CNN | * read papers of CNN | ||
+ | |||
|- | |- | ||
| rowspan="1"|2017/08/11 | | rowspan="1"|2017/08/11 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * collated experimental results
+ | * compared our baseline model with Moses
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/08/11 | ||
+ | |Jiayu Guo || 9:00|| 20:00 || 11 || | ||
+ | * test results. | ||
+ | |||
+ | |- | ||
+ | |||
+ | | rowspan="1"|2017/08/14 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * read paper about THUMT | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/14 | ||
|Jiayu Guo || 10:00|| 23:00 || 13 || | |Jiayu Guo || 10:00|| 23:00 || 13 || | ||
* learn about Graphic Model of LSTM-Projected BPTT | * learn about Graphic Model of LSTM-Projected BPTT | ||
* search for data available for translation (Twenty-four-Shi) | * search for data available for translation (Twenty-four-Shi) | ||
− | |- | + | |- |
− | | rowspan="1"|2017/08/ | + | |
+ | | rowspan="1"|2017/08/15 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * read THUMT manual and learn how to use it | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/15 | ||
|Jiayu Guo || 11:00|| 23:30 || 12 || | |Jiayu Guo || 11:00|| 23:30 || 12 || | ||
* ran the model on data including Shiji and Zizhitongjian. | * run model with data including Shiji、Zizhitongjian. | ||
|- | |- | ||
+ | | rowspan="1"|2017/08/16
+ | |Shipan Ren || 9:00 || 20:00 || 11 ||
+ | * trained translation models and tested them
+ | * toolkit: THUMT
+ | * dataset: zh-en (small)
+ | * verified that THUMT works correctly
+ | |||
+ | |- | ||
+ | | rowspan="1"|2017/08/16 | ||
+ | |Jiayu Guo || 10:00|| 23:00 || 10|| | ||
checkpoint-100000 translation model | checkpoint-100000 translation model | ||
BLEU: 11.11 | BLEU: 11.11 | ||
Line 697 / Line 866:
*3. using Shiji and Zizhitongjian data (430,000 pairs), we get a BLEU of about 9.
*4. using Shiji and Zizhitongjian data (430,000 pairs), with the ancient-language text split character by character, we get a BLEU of at most 11.11.
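The character-by-character split in item 4 is simple in Python, since each CJK character is a single code point. A minimal sketch:

```python
# Split Chinese text into single characters, space-separated
# (the usual preprocessing when no reliable word segmenter is available).
def split_chars(line):
    """Turn '天下大乱' into '天 下 大 乱'; existing whitespace is dropped."""
    return " ".join(ch for ch in line if not ch.isspace())

print(split_chars("天下大乱"))  # prints: 天 下 大 乱
```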
− | * | + | |- |
+ | |||
+ | | rowspan="1"|2017/08/17 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * wrote automation scripts to process the data, train the model, and test it
+ | * trained translation models and tested them
+ | * toolkit: THUMT
+ | * dataset: zh-en (big)
|- | |- | ||
+ | | rowspan="1"|2017/08/17 | ||
+ | |Jiayu Guo || 13:00|| 23:00 || 10 || | ||
+ | * read source code. | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/18 | ||
+ | |Shipan Ren || 9:00 || 20:00 || 11 || | ||
+ | * tested translation models with a single reference and with multiple references
+ | * organized all the experimental results (our baseline system, Moses, THUMT)
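Multi-reference BLEU (as in the comparison above) clips each hypothesis n-gram against the maximum count it has in any single reference. A small sketch of just the clipped unigram precision, on invented sentences, to show the difference a second reference makes:

```python
# Clipped unigram precision against one or several references.
from collections import Counter

def clipped_unigram_precision(hypothesis, references):
    """Fraction of hypothesis tokens matched, clipping each token's count
    by the maximum count it has in ANY single reference."""
    hyp_counts = Counter(hypothesis)
    max_ref_counts = Counter()
    for ref in references:
        for token, count in Counter(ref).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)
    clipped = sum(min(count, max_ref_counts[token])
                  for token, count in hyp_counts.items())
    return clipped / len(hypothesis)

hyp = "the cat sat on the mat".split()
ref1 = "the cat is on the mat".split()
ref2 = "a cat sat on a mat".split()
print(clipped_unigram_precision(hyp, [ref1]))        # 5/6 with one reference
print(clipped_unigram_precision(hyp, [ref1, ref2]))  # 1.0 with both
```

Adding references can only raise the score, which is why single- and multi-reference BLEU must be reported separately.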
+ | |- | ||
+ | | rowspan="1"|2017/08/18 | ||
+ | |Jiayu Guo || 13:00|| 22:00 || 9 || | ||
+ | * read source code. | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/21 | ||
+ | |Shipan Ren || 10:00 || 22:00 || 12 || | ||
+ | * read published information about other translation systems
+ | |- | ||
+ | | rowspan="1"|2017/08/21 | ||
+ | |Jiayu Guo || 9:30 || 21:30 || 12 || | ||
+ | * read the source code and learn tensorflow | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/22 | ||
+ | |Shipan Ren || 10:00 || 22:00 || 12 || | ||
+ | * cleaned up the code | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/22 | ||
+ | |Jiayu Guo || 9:00 || 22:00 || 12 || | ||
+ | * read the source code | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/23 | ||
+ | |Shipan Ren || 10:00 || 21:00 || 11 || | ||
+ | * wrote the documents | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/23 | ||
+ | |Jiayu Guo || 9:00 || 22:00 || 11 || | ||
+ | * read the source code and learn tensorflow | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/24 | ||
+ | |Shipan Ren || 10:00 || 20:00 || 10 || | ||
+ | * wrote the documents | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/24 | ||
+ | |Jiayu Guo || 9:10 || 22:00 || 10.5 || | ||
+ | * read the source code and learn tensorflow | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/25 | ||
+ | |Shipan Ren || 10:00 || 20:00 || 10 || | ||
+ | * check experimental results | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/25 | ||
+ | |Jiayu Guo || 8:50 || 22:00 || 10.5 || | ||
+ | * read the source code and learn tensorflow | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/28 | ||
+ | |Shipan Ren || 10:00 || 20:00 || 10 || | ||
+ | * wrote the paper of ViVi_NMT(version 1.0) | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/28 | ||
+ | |Jiayu Guo || 8:10 || 21:00 || 11 || | ||
+ | * read the source code and learn tensorflow | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/29 | ||
+ | |Shipan Ren || 10:00 || 20:00 || 10 || | ||
+ | * wrote the paper of ViVi_NMT(version 1.0) | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/29 | ||
+ | |Jiayu Guo || 11:00 || 21:00 || 10 || | ||
+ | * read the source code and learn tensorflow | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/30 | ||
+ | |Shipan Ren || 10:00 || 20:00 || 10 || | ||
+ | * wrote the paper of ViVi_NMT(version 1.0) | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/30 | ||
+ | |Jiayu Guo || 11:30 || 21:00 || 9 || | ||
+ | * learn VV model | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/31 | ||
+ | |Shipan Ren || 10:00 || 20:00 || 10 || | ||
+ | * wrote the paper of ViVi_NMT(version 1.0) | ||
+ | |- | ||
+ | | rowspan="1"|2017/08/31 | ||
+ | |Jiayu Guo || 10:00 || 20:00 || 10 || | ||
+ | * clean up the code | ||
+ | |- | ||
} | } | ||
Latest revision as of 07:41, 4 September 2017 (Mon)
Contents
NLP Schedule
Members
Current Members
- Yang Feng (冯洋)
- Jiyuan Zhang (张记袁)
- Aodong Li (李傲冬)
- Andi Zhang (张安迪)
- Shiyue Zhang (张诗悦)
- Li Gu (古丽)
- Peilun Xiao (肖培伦)
- Shipan Ren (任师攀)
- Jiayu Guo (郭佳雨)
Former Members
- Chao Xing (邢超) : FreeNeb
- Rong Liu (刘荣) : 优酷
- Xiaoxi Wang (王晓曦) : 图灵机器人
- Xi Ma (马习) : 清华大学研究生
- Tianyi Luo (骆天一) : phd candidate in University of California Santa Cruz
- Qixin Wang (王琪鑫) : MA candidate in University of California
- DongXu Zhang (张东旭): --
- Yiqiao Pan (潘一桥) : MA candidate in University of Sydney
- Shiyao Li (李诗瑶) : BUPT
- Aiting Liu (刘艾婷) : BUPT
Work Progress
Daily Report
Date | Person | start | leave | hours | status | |
---|---|---|---|---|---|---|
2017/04/02 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/03 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/04 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/05 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/06 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/07 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/08 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/09 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/10 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/11 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/12 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/13 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/14 | Andy Zhang | 9:30 | 18:30 | 8 |
| |
Peilun Xiao | ||||||
2017/04/15 | Andy Zhang | 9:00 | 15:00 | 6 |
| |
Peilun Xiao | ||||||
2017/04/18 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/04/19 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/04/20 | Aodong Li | 12:00 | 20:00 | 8 |
| |
2017/04/21 | Aodong Li | 12:00 | 20:00 | 8 |
| |
2017/04/24 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/04/25 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/04/26 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/04/27 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/04/28 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/04/30 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/05/01 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/05/02 | Aodong Li | 11:00 | 20:00 | 8 |
| |
2017/05/06 | Aodong Li | 14:20 | 17:20 | 3 |
| |
2017/05/07 | Aodong Li | 13:30 | 22:00 | 8 |
| |
2017/05/08 | Aodong Li | 11:30 | 21:00 | 8 |
| |
2017/05/09 | Aodong Li | 13:00 | 22:00 | 9 |
small data; the 1st and 2nd translators use the same training data; the 2nd translator uses randomly initialized embeddings
BASELINE: 43.87; best result of our model: 42.56 | |
2017/05/10 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/05/10 | Aodong Li | 13:30 | 22:00 | 8 |
small data; the 1st and 2nd translators use different training data (22000 and 22017 sentence pairs respectively); the 2nd translator uses randomly initialized embeddings
BASELINE: 36.67 (36.67 is the model at 4750 updates, but to avoid overfitting we used the model at 3000 updates, BLEU 34.96, to generate the 2nd translator's training data); best result of our model: 29.81. This may suggest that whether the 2nd translator shares the 1st translator's training data or not has little influence on its performance; if anything, sharing the data may be better, at least judging from these results. But the smaller training set compared to yesterday's model must also be taken into account.
| |
2017/05/11 | Shipan Ren | 10:00 | 19:30 | 9.5 |
| |
2017/05/11 | Aodong Li | 13:00 | 21:00 | 8 |
small data; the 1st and 2nd translators use the same training data; the 2nd translator uses constant, untrainable embeddings imported from the 1st translator's decoder
BASELINE: 43.87; best result of our model: 43.48. Experiments show that this kind of series or cascade model will impair the final performance due to information loss as information flows through the network end to end. The decoder's smaller vocabulary compared to the encoder's (9000+ -> 6000+) demonstrates this. The intention of this experiment was to find a map that fixes meaning shift using the 2nd translator, but whether the map is learned is obscured by the smaller-vocabulary effect.
| |
2017/05/12 | Aodong Li | 13:00 | 21:00 | 8 |
| |
2017/05/13 | Shipan Ren | 10:00 | 19:00 | 9 |
| |
2017/05/14 | Aodong Li | 10:00 | 20:00 | 9 |
small data; the 2nd translator takes as training data concat(Chinese, machine-translated English), with randomly initialized embeddings
BASELINE: 43.87 best result of our model: 43.53
| |
2017/05/15 | Shipan Ren | 9:30 | 19:00 | 9.5 |
| |
2017/05/17 | Shipan Ren | 9:30 | 19:30 | 10 |
| |
Aodong Li | 13:30 | 24:00 | 9 |
| ||
2017/05/18 | Shipan Ren | 10:00 | 19:00 | 9 |
| |
Aodong Li | 12:30 | 21:00 | 8 |
| ||
2017/05/19 | Aodong Li | 12:30 | 20:30 | 8 |
| |
2017/05/21 | Aodong Li | 10:30 | 18:30 | 8 |
hidden_size = 700 (previously 500), emb_size = 510 (previously 310); small data, the 2nd translator takes as training data concat(Chinese, machine-translated English), with randomly initialized embeddings
BASELINE: 43.87; best result of our model: 45.21. But only one checkpoint outperforms the baseline; the other results are generally under 43.1.
| |
2017/05/22 | Aodong Li | 14:00 | 22:00 | 8 |
| |
2017/05/23 | Aodong Li | 13:00 | 21:30 | 8 |
hidden_size = 700, emb_size = 510, learning_rate = 0.0005 (previously 0.001); small data, the 2nd translator takes as training data concat(Chinese, machine-translated English), with randomly initialized embeddings
BASELINE: 43.87; best result of our model: 42.19. Overfitting? Overall the 2nd translator performs worse than the baseline.
hidden_size = 500, emb_size = 310, learning_rate = 0.001; small data, double-decoder model with a joint loss, i.e. final loss = 1st decoder's loss + 2nd decoder's loss
BASELINE: 43.87; best result of our model: 39.04. The 1st decoder's output is generally better than the 2nd decoder's. The reason may be that the second decoder only learns from the first decoder's hidden states, because their states are almost the same.
The reason the double-decoder without a joint loss generalizes badly is that the gap between teacher forcing (training) and beam search (decoding) propagates and amplifies errors toward the output end, which breaks the model at decoding time.
Next: try training the double-decoder model without a joint loss but with beam search on the 1st decoder. | |
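The joint objective described above, written out (notation is mine, not from the log: \(\theta\) are the model parameters, \(y^{(1)}, y^{(2)}\) the two decoders' target sequences for source \(x\)):

```latex
\mathcal{L}(\theta) = \mathcal{L}_{1}(\theta) + \mathcal{L}_{2}(\theta)
= -\sum_{t} \log p\left(y^{(1)}_{t} \mid y^{(1)}_{<t}, x; \theta\right)
  -\sum_{t} \log p\left(y^{(2)}_{t} \mid y^{(2)}_{<t}, x; \theta\right)
```

Both cross-entropy terms are summed, so gradients from the 2nd decoder also reach the shared encoder, which is what distinguishes this from the cascade setup.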
2017/05/24 | Aodong Li | 13:00 | 21:30 | 8 |
| |
2017/05/24 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
2017/05/25 | Shipan Ren | 9:30 | 18:30 | 9 |
| |
Aodong Li | 13:00 | 22:00 | 9 |
| ||
2017/05/27 | Shipan Ren | 9:30 | 18:30 | 9 |
| |
2017/05/28 | Aodong Li | 15:00 | 22:00 | 7 |
hidden_size = 500, emb_size = 310, learning_rate = 0.001; small data, the 2nd translator takes as training data both the Chinese and the machine-translated English; Chinese and English use different encoders and different attention; final_attn = attn_1 + attn_2; the 2nd translator uses randomly initialized embeddings
BASELINE: 43.87. When decoding: final_attn = attn_1 + attn_2, best result of our model: 43.50; final_attn = 2/3 attn_1 + 4/3 attn_2, best result: 41.22; final_attn = 4/3 attn_1 + 2/3 attn_2, best result: 43.58 | |
2017/05/30 | Aodong Li | 15:00 | 21:00 | 6 |
hidden_size = 500, emb_size = 310, learning_rate = 0.001; small data, the 2nd translator takes as training data both the Chinese and the machine-translated English; Chinese and English use different encoders and different attention; final_attn = 2/3 attn_1 + 4/3 attn_2; the 2nd translator uses randomly initialized embeddings
BASELINE: 43.87; best result of our model: 42.36
final_attn = 2/3 attn_1 + 4/3 attn_2; 2nd translator uses constant initialized embeddings
BASELINE: 43.87; best result of our model: 45.32
final_attn = attn_1 + attn_2; 2nd translator uses constant initialized embeddings
BASELINE: 43.87; best result of our model: 45.41, and it seems more stable | |
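The weighted attention combination swept above can be sketched as follows. The vectors are made up, and the real model combines context vectors inside the decoder; this only shows the weighting and renormalization:

```python
# Weighted combination of two attention distributions (toy sketch).
def combine_attention(attn_1, attn_2, w1=1.0, w2=1.0):
    """Weighted sum of two attention distributions, renormalized to sum to 1."""
    combined = [w1 * a + w2 * b for a, b in zip(attn_1, attn_2)]
    total = sum(combined)
    return [c / total for c in combined]

attn_zh = [0.7, 0.2, 0.1]   # attention over the Chinese source (made up)
attn_en = [0.5, 0.3, 0.2]   # attention over the MT-English input (made up)
final_attn = combine_attention(attn_zh, attn_en, w1=4/3, w2=2/3)
print(round(sum(final_attn), 6))  # prints: 1.0
```

Renormalizing keeps the result a valid distribution whatever w1 and w2 are, so the 2/3 vs 4/3 sweep only changes the relative emphasis on the two encoders.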
2017/05/31 | Shipan Ren | 10:00 | 19:30 | 9.5 |
| |
Aodong Li | 12:00 | 20:30 | 8.5 |
final_attn = 4/3attn_1 + 2/3attn_2 2nd translator uses constant initialized embedding
BASELINE: 43.87 best result of our model: 45.79
| ||
2017/06/01 | Aodong Li | 13:00 | 24:00 | 11 |
| |
2017/06/02 | Aodong Li | 13:00 | 22:00 | 9 |
| |
2017/06/03 | Aodong Li | 13:00 | 21:00 | 8 |
| |
2017/06/05 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/06 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/07 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/08 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/09 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/12 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/13 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/14 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/15 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/16 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/19 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/20 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/21 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/22 | Aodong Li | 10:00 | 19:00 | 8 |
| |
2017/06/23 | Shipan Ren | 10:00 | 21:00 | 11 |
| |
Aodong Li | 10:00 | 19:00 | 8 |
| ||
2017/06/26 | Shipan Ren | 10:00 | 21:00 | 11 |
| |
Aodong Li | 10:00 | 19:00 | 8 |
| ||
2017/06/27 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
Aodong Li | 10:00 | 19:00 | 8 |
| ||
2017/06/28 | Shipan Ren | 10:00 | 19:00 | 9 |
| |
Aodong Li | 10:00 | 19:00 | 8 |
| ||
2017/06/29 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
Aodong Li | 10:00 | 19:00 | 8 |
| ||
2017/06/30 | Shipan Ren | 10:00 | 24:00 | 14 |
| |
Aodong Li | 10:00 | 19:00 | 8 |
| ||
2017/07/03 | Shipan Ren | 9:00 | 21:00 | 12 |
| |
2017/07/04 | Shipan Ren | 9:00 | 21:00 | 12 |
| |
2017/07/05 | Shipan Ren | 9:00 | 21:00 | 12 |
| |
2017/07/06 | Shipan Ren | 9:00 | 21:00 | 12 |
| |
2017/07/07 | Shipan Ren | 9:00 | 21:00 | 12 |
| |
2017/07/08 | Shipan Ren | 9:00 | 21:00 | 12 |
| |
2017/07/10 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/11 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/12 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/13 | Shipan Ren | 9:00 | 20:00 | 11 |
reason: improper distribution of resources by the tensorflow0.1 frame leads to exhaustion of memory resources
| |
2017/07/14 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/17 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/18 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/18 | Jiayu Guo | 8:30 | 22:00 | 14 |
| |
2017/07/19 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/19 | Jiayu Guo | 9:00 | 22:00 | 13 |
| |
2017/07/20 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/20 | Jiayu Guo | 9:00 | 22:00 | 13 |
| |
2017/07/21 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/21 | Jiayu Guo | 10:00 | 23:00 | 13 |
| |
2017/07/24 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/24 | Jiayu Guo | 9:00 | 22:00 | 13 |
| |
2017/07/25 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/25 | Jiayu Guo | 9:00 | 23:00 | 14 |
| |
2017/07/26 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/26 | Jiayu Guo | 10:00 | 24:00 | 14 |
| |
2017/07/27 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/27 | Jiayu Guo | 10:00 | 24:00 | 14 |
| |
2017/07/28 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/28 | Jiayu Guo | 9:00 | 24:00 | 15 |
|
|
2017/07/31 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/07/31 | Jiayu Guo | 10:00 | 23:00 | 13 |
|
|
2017/08/1 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/1 | Jiayu Guo | 10:00 | 23:00 | 13 |
|
|
2017/08/2 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/2 | Jiayu Guo | 10:00 | 23:00 | 13 |
| |
2017/08/3 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/3 | Jiayu Guo | 10:00 | 23:00 | 13 |
| |
2017/08/4 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/4 | Jiayu Guo | 10:00 | 23:00 | 13 |
| |
2017/08/7 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/7 | Jiayu Guo | 9:00 | 22:00 | 13 |
| |
2017/08/8 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/8 | Jiayu Guo | 10:00 | 21:00 | 11 |
| |
2017/08/9 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/9 | Jiayu Guo | 10:00 | 23:00 | 13 |
| |
2017/08/10 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/10 | Jiayu Guo | 9:00 | 23:00 | 13 |
| |
2017/08/11 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/11 | Jiayu Guo | 9:00 | 20:00 | 11 |
| |
2017/08/14 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/14 | Jiayu Guo | 10:00 | 23:00 | 13 |
| |
2017/08/15 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/15 | Jiayu Guo | 11:00 | 23:30 | 12 |
| |
2017/08/16 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/16 | Jiayu Guo | 10:00 | 23:00 | 10 |
checkpoint-100000 translation model BLEU: 11.11
| |
2017/08/17 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/17 | Jiayu Guo | 13:00 | 23:00 | 10 |
| |
2017/08/18 | Shipan Ren | 9:00 | 20:00 | 11 |
| |
2017/08/18 | Jiayu Guo | 13:00 | 22:00 | 9 |
| |
2017/08/21 | Shipan Ren | 10:00 | 22:00 | 12 |
| |
2017/08/21 | Jiayu Guo | 9:30 | 21:30 | 12 |
| |
2017/08/22 | Shipan Ren | 10:00 | 22:00 | 12 |
| |
2017/08/22 | Jiayu Guo | 9:00 | 22:00 | 12 |
| |
2017/08/23 | Shipan Ren | 10:00 | 21:00 | 11 |
| |
2017/08/23 | Jiayu Guo | 9:00 | 22:00 | 11 |
| |
2017/08/24 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
2017/08/24 | Jiayu Guo | 9:10 | 22:00 | 10.5 |
| |
2017/08/25 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
2017/08/25 | Jiayu Guo | 8:50 | 22:00 | 10.5 |
| |
2017/08/28 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
2017/08/28 | Jiayu Guo | 8:10 | 21:00 | 11 |
| |
2017/08/29 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
2017/08/29 | Jiayu Guo | 11:00 | 21:00 | 10 |
| |
2017/08/30 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
2017/08/30 | Jiayu Guo | 11:30 | 21:00 | 9 |
| |
2017/08/31 | Shipan Ren | 10:00 | 20:00 | 10 |
| |
2017/08/31 | Jiayu Guo | 10:00 | 20:00 | 10 |
|
Date | Yang Feng | Jiyuan Zhang |
---|