Difference between revisions of "Schedule"

From cslt Wiki
Aiting Liu
Daily Report
 
(550 intermediate revisions by 12 users not shown)
Line 1: Line 1:
=Text Processing Team Schedule=
+
=NLP Schedule=
  
 
==Members==
 
===Former Members===
 
* Rong Liu (刘荣) : Youku
* Xiaoxi Wang (王晓曦) : Turing Robot
* Xi Ma (马习) : graduate student at Tsinghua University
* DongXu Zhang (张东旭) : --
* Yiqiao Pan (潘一桥) : continuing graduate study
 
  
 
===Current Members===
 
* Tianyi Luo (骆天一)
+
 
* Chao Xing (邢超)
+
* Yang Feng (冯洋)
* Qixin Wang (王琪鑫)
+
* Jiyuan Zhang (张记袁)
 
* Aodong Li (李傲冬)
 
* Aiting Liu (刘艾婷)
+
* Andi Zhang (张安迪)
* Ziwei Bai (白子薇)
+
* Shiyue Zhang (张诗悦)
 +
* Li Gu (古丽)
 +
* Peilun Xiao (肖培伦)
 +
* Shipan Ren (任师攀)
 +
* Jiayu Guo (郭佳雨)
  
==Work Process==
+
===Former Members===
===Paper Share===
+
* '''Chao Xing (邢超)'''    :  FreeNeb
====2016-06-23====
+
* '''Rong Liu (刘荣)'''      :  Youku
Learning Better Embeddings for Rare Words Using Distributional Representations [http://aclweb.org/anthology/D15-1033 pdf]
+
* '''Xiaoxi Wang (王晓曦)''' :  Turing Robot
* '''Xi Ma (马习)'''        :  graduate student at Tsinghua University
* '''Tianyi Luo (骆天一)'''  :  PhD candidate at the University of California, Santa Cruz
* '''Qixin Wang (王琪鑫)'''  :  MA candidate at the University of California
* '''DongXu Zhang (张东旭)''': --
* '''Yiqiao Pan (潘一桥)'''  :  MA candidate at the University of Sydney
* '''Shiyao Li (李诗瑶)''' : BUPT
* '''Aiting Liu (刘艾婷)'''  :  BUPT
  
Hierarchical Attention Networks for Document Classification [https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf pdf]
+
==Work Progress==
 +
===Daily Report===
  
Hierarchical Recurrent Neural Network for Document Modeling [http://www.aclweb.org/anthology/D/D15/D15-1106.pdf pdf]
+
{|class="wikitable"
 +
! Date !! Person !! Start !! Leave !! Hours !! Status
 +
|-
 +
| rowspan="2"|2017/04/02
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/03
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/04
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/05
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/06
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/07
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/08
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/09
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/10
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/11
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/12
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/13
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/14
 +
|Andy Zhang||9:30 ||18:30 ||8 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="2"|2017/04/15
 +
|Andy Zhang||9:00 ||15:00 ||6 ||
 +
*preparing EMNLP
 +
|-
 +
|Peilun Xiao || || || ||
 +
|-
 +
| rowspan="1"|2017/04/18
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Pick up new task in news generation and do literature review
 +
|-
 +
| rowspan="1"|2017/04/19
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Literature review
 +
|-
 +
| rowspan="1"|2017/04/20
 +
|Aodong Li||12:00 ||20:00 ||8 ||
 +
*Literature review
 +
|-
 +
| rowspan="1"|2017/04/21
 +
|Aodong Li||12:00 ||20:00 ||8 ||
 +
*Literature review
 +
|-
 +
| rowspan="1"|2017/04/24
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Adjust literature review focus
 +
|-
 +
| rowspan="1"|2017/04/25
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Literature review
 +
|-
 +
| rowspan="1"|2017/04/26
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Literature review
 +
|-
 +
| rowspan="1"|2017/04/27
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Try to reproduce sc-lstm work
 +
|-
 +
| rowspan="1"|2017/04/28
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Transfer to new task in machine translation and do literature review
 +
|-
 +
| rowspan="1"|2017/04/30
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Literature review
 +
|-
 +
| rowspan="1"|2017/05/01
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Literature review
 +
|-
 +
| rowspan="1"|2017/05/02
 +
|Aodong Li||11:00 ||20:00 ||8 ||
 +
*Literature review and code review
 +
|-
 +
| rowspan="1"|2017/05/06
 +
|Aodong Li||14:20 ||17:20||3 ||
 +
*Code review
 +
|-
 +
| rowspan="1"|2017/05/07
 +
|Aodong Li||13:30 ||22:00||8 ||
 +
*Code review and experiment started, but version discrepancy encountered
 +
|-
 +
| rowspan="1"|2017/05/08
 +
|Aodong Li||11:30 ||21:00 ||8 ||
 +
*Code review and version discrepancy solved
 +
|-
 +
| rowspan="1"|2017/05/09
 +
|Aodong Li||13:00 ||22:00 ||9 ||
 +
*Code review and experiment
 +
*details about experiment:
 +
  small data,
 +
  1st and 2nd translators use the same training data,
 +
  2nd translator uses '''random initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: 42.56
 +
|-
 +
| rowspan="1"|2017/05/10
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
*Entry procedures
 +
*Machine Translation paper reading
 +
|-
 +
| rowspan="1"|2017/05/10
 +
|Aodong Li || 13:30 || 22:00 || 8 ||
 +
*experiment setting:
 +
  small data,
 +
  1st and 2nd translators use different training data, counting 22000 and 22017 respectively
 +
  2nd translator uses '''random initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 36.67 (36.67 is the model at 4750 updates; to prevent overfitting, the model at 3000
                    updates, whose BLEU is 34.96, is used to generate the 2nd translator's training data)
  best result of our model: 29.81
  This may suggest that using either the same training data as the 1st translator or different
                    data won't influence the 2nd translator's performance; if anything, using the same
                    data may be better, at least judging from these results. But I have to take into
                    account the smaller size of the training data compared to yesterday's model.
 +
*code 2nd translator with constant embedding
 +
|-
 +
| rowspan="1"|2017/05/11
 +
|Shipan Ren || 10:00 || 19:30 || 9.5 ||
 +
*Configure environment
 +
*Run tf_translate code
 +
*Read Machine Translation paper
 +
|-
 +
| rowspan="1"|2017/05/11
 +
|Aodong Li || 13:00 ||  21:00|| 8 ||
 +
*experiment setting:
 +
  small data,
 +
  1st and 2nd translators use the same training data,
 +
  2nd translator uses '''constant untrainable embedding''' imported from 1st translator's decoder
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: 43.48
 +
  Experiments show that this kind of series or cascade model will definitely impair the final
                      performance due to information loss as the information flows through the network
                      from end to end. The decoder's smaller vocabulary size compared to the encoder's
                      demonstrates this (9000+ -> 6000+).
  The intention of this experiment is to look for a map that solves meaning shift using the 2nd translator,
                      but whether the map is learned or not is obscured by the smaller vocabulary size
                      phenomenon.
 +
*literature review on hierarchical machine translation
 +
|-
 +
| rowspan="1"|2017/05/12
 +
|Aodong Li||13:00 ||21:00 ||8 ||
 +
*Code double decoding model and read multilingual MT paper
 +
|-
 +
| rowspan="1"|2017/05/13
 +
|Shipan Ren || 10:00 || 19:00 || 9 ||
 +
*read machine translation paper
 +
*learn the LSTM model and the seq2seq model
 +
|-
 +
| rowspan="1"|2017/05/14
 +
|Aodong Li || 10:00 || 20:00 || 9 ||
 +
*Code double decoding model and experiment
 +
*details about experiment:
 +
  small data,
 +
  2nd translator uses as training data the concat(Chinese, machine translated English),
 +
  2nd translator uses '''random initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: 43.53
 +
*NEXT: 2nd translator uses '''trained constant embedding'''
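The concat(Chinese, machine translated English) input described above can be sketched as follows. This is only an illustration under assumptions: the function name and the `<sep>` separator token are hypothetical, and the log does not say whether any separator was used between the two parts.

```python
from typing import List

# Hypothetical separator token; the log does not state whether one was used.
SEP = "<sep>"


def build_second_pass_input(chinese_tokens: List[str],
                            mt_english_tokens: List[str]) -> List[str]:
    """Concatenate the Chinese source with the 1st translator's English output."""
    return chinese_tokens + [SEP] + mt_english_tokens


example = build_second_pass_input(["我", "爱", "你"], ["i", "love", "you"])
print(example)
```

The 2nd translator would then be trained on pairs of this concatenated input and the reference English sentence.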
 +
|-
 +
| rowspan="1"|2017/05/15
 +
|Shipan Ren || 9:30 || 19:00 || 9.5 ||
 +
* understand the difference between lstm model and gru model
 +
* read the implementation code of the seq2seq model
 +
|-
 +
| rowspan="2"|2017/05/17
 +
|Shipan Ren || 9:30 || 19:30 || 10 ||
 +
* read neural machine translation paper
 +
* read tf_translate code
 +
|-
 +
|Aodong Li || 13:30 || 24:00 || 9||
 +
* code and debug double-decoder model
 +
* alter the size of the 2017/05/14 model; will try it after NIPS
 +
|-
 +
| rowspan="2"|2017/05/18
 +
|Shipan Ren || 10:00 || 19:00 || 9 ||
 +
* read neural machine translation paper
 +
* read tf_translate code
 +
|-
 +
|Aodong Li || 12:30 || 21:00 || 8 ||
 +
* train double-decoder model on small data set but encounter decode bugs
 +
|-
 +
| rowspan="1"|2017/05/19
 +
|Aodong Li || 12:30 || 20:30 || 8 ||
 +
* debug double-decoder model
 +
* the model performs well on the development set, but performs badly on test data. I want to figure out the reason.
 +
|-
 +
| rowspan="1"|2017/05/21
 +
|Aodong Li || 10:30 || 18:30 || 8 ||
 +
*details about experiment:
 +
  hidden_size = 700 (500 in prior)
 +
  emb_size = 510 (310 in prior)
 +
  small data,
 +
  2nd translator uses as training data the concat(Chinese, machine translated English),
 +
  2nd translator uses '''random initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: '''45.21'''
 +
  But only one checkpoint outperforms the baseline; the other results are commonly under 43.1
 +
* debug double-decoder model
 +
|-
 +
| rowspan="1"|2017/05/22
 +
|Aodong Li || 14:00 || 22:00 || 8 ||
 +
*double-decoder without joint loss generalizes very badly
*I'm trying the double-decoder model with joint loss
 +
|-
 +
| rowspan="1"|2017/05/23
 +
|Aodong Li || 13:00 || 21:30 || 8 ||
 +
*details about experiment 1:
 +
  hidden_size = 700
 +
  emb_size = 510
 +
  learning_rate = 0.0005 (0.001 in prior)
 +
  small data,
 +
  2nd translator uses as training data the concat(Chinese, machine translated English),
 +
  2nd translator uses '''random initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: '''42.19'''
 +
  Overfitting? Overall, the 2nd translator performs worse than the baseline
 +
*details about experiment 2:
 +
  hidden_size = 500
 +
  emb_size = 310
 +
  learning_rate = 0.001
 +
  small data,
 +
  double-decoder model with joint loss which means the final loss  = 1st decoder's loss + 2nd
 +
  decoder's loss
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: '''39.04'''
 +
  The 1st decoder's output is generally better than 2nd decoder's output. The reason may be that
 +
  the second decoder only learns from the first decoder's hidden states because their states are
 +
  almost the same.
 +
*DISCOVERY:
 +
  The reason why double-decoder without joint loss generalizes very badly is that the gap between
  teacher forcing (training process) and beam search (decoding process)
  propagates and expands the error to the output end, which destroys the model when decoding.
 +
*next:
 +
  Try to train double-decoder model without joint loss but with beam search on 1st decoder.
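The joint loss from experiment 2 above (final loss = 1st decoder's loss + 2nd decoder's loss) can be sketched in plain Python. This is a minimal illustration, not the code used in the experiments: the function names, the choice of mean per-token cross-entropy, and the assumption that both decoders are scored against the same reference tokens are all hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[Sequence[float]],
                  targets: Sequence[int]) -> float:
    """Mean per-token cross-entropy; logits has one row of scores per token."""
    total = 0.0
    for row, target in zip(logits, targets):
        peak = max(row)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[target]  # equals -log softmax(row)[target]
    return total / len(targets)


def joint_loss(logits_1: Sequence[Sequence[float]],
               logits_2: Sequence[Sequence[float]],
               targets: Sequence[int]) -> float:
    """Sum of both decoders' losses (assumed to share the same targets)."""
    return cross_entropy(logits_1, targets) + cross_entropy(logits_2, targets)


uniform = [[0.0] * 4 for _ in range(3)]
print(joint_loss(uniform, uniform, [0, 1, 2]))
```

Because the two terms are simply added, gradients from the 2nd decoder's loss also reach whatever parameters the two decoders share.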
 +
|-
 +
| rowspan="1"|2017/05/24
 +
|Aodong Li || 13:00 || 21:30 || 8 ||
 +
*code double-attention one-decoder model
 +
*code double-decoder model
 +
|-
  
Learning Distributed Representations of Sentences from Unlabelled Data [http://arxiv.org/pdf/1602.03483.pdf pdf]
+
| rowspan="1"|2017/05/24
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
*read neural machine translation paper
 +
*read tf_translate code
 +
|-
  
Speech Synthesis Based on HiddenMarkov Models [http://www.research.ed.ac.uk/portal/files/15269212/Speech_Synthesis_Based_on_Hidden_Markov_Models.pdf pdf]
+
| rowspan="2"|2017/05/25
 +
|Shipan Ren || 9:30 || 18:30 || 9 ||
 +
*write document of tf_translate project
 +
*read neural machine translation paper
 +
*read tf_translate code
 +
|-
 +
|Aodong Li || 13:00 || 22:00 || 9 ||
 +
* code and debug double attention model
 +
|-
  
===Research Task===
+
| rowspan="1"|2017/05/27
====Binary Word Embedding(Aiting)====
+
|Shipan Ren || 9:30 || 18:30 || 9 ||
[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/9/97/Binary.pdf binary]
+
*read tf_translate code
 +
*write document of tf_translate project
 +
|-
 +
| rowspan="1"|2017/05/28
 +
|Aodong Li || 15:00 || 22:00 || 7 ||
 +
*details about experiment:
 +
  hidden_size = 500
 +
  emb_size = 310
 +
  learning_rate = 0.001
 +
  small data,
 +
  2nd translator uses as training data both Chinese and machine translated English
 +
  Chinese and English use different encoders and different attention
 +
  '''final_attn = attn_1 + attn_2'''
 +
  2nd translator uses '''random initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  when decoding:
 +
    final_attn = attn_1 + attn_2 best result of our model: '''43.50'''
 +
    final_attn = 2/3attn_1 + 4/3attn_2 best result of our model: '''41.22'''
 +
    final_attn = 4/3attn_1 + 2/3attn_2 best result of our model: '''43.58'''
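The weighted combinations tried at decoding time above (final_attn = w1 * attn_1 + w2 * attn_2) can be sketched as an element-wise weighted sum. This assumes attn_1 and attn_2 are attention outputs of equal dimension; the function name and the list-of-floats representation are hypothetical, and the log does not say which encoder each term belongs to.

```python
from typing import List, Sequence


def combine_attention(attn_1: Sequence[float],
                      attn_2: Sequence[float],
                      w1: float = 1.0,
                      w2: float = 1.0) -> List[float]:
    """Element-wise weighted sum of two attention outputs of equal length."""
    if len(attn_1) != len(attn_2):
        raise ValueError("attention vectors must have the same length")
    return [w1 * a + w2 * b for a, b in zip(attn_1, attn_2)]


# Plain sum, then the 4/3 and 2/3 weighting from the log.
plain = combine_attention([1.0, 2.0], [3.0, 4.0])
weighted = combine_attention([3.0, 0.0], [0.0, 3.0], w1=4 / 3, w2=2 / 3)
print(plain, weighted)
```

In all three settings from the log the weights sum to 2, so the overall scale of the combined vector stays comparable to the plain sum.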
 +
|-
 +
| rowspan="1"|2017/05/30
 +
|Aodong Li || 15:00 || 21:00 || 6 ||
 +
*details about experiment 1:
 +
  hidden_size = 500
 +
  emb_size = 310
 +
  learning_rate = 0.001
 +
  small data,
 +
  2nd translator uses as training data both Chinese and machine translated English
 +
  Chinese and English use different encoders and different attention
 +
  '''final_attn = 2/3attn_1 + 4/3attn_2'''
 +
  2nd translator uses '''random initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: '''42.36'''
 +
* details about experiment 2:
 +
  '''final_attn = 2/3attn_1 + 4/3attn_2'''
 +
  2nd translator uses '''constant initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: '''45.32'''
 +
* details about experiment 3:
 +
  '''final_attn = attn_1 + attn_2'''
 +
  2nd translator uses '''constant initialized embedding'''
 +
*results (BLEU):
 +
  BASELINE: 43.87
 +
  best result of our model: '''45.41''' and it seems more stable
 +
|-
 +
| rowspan="2"|2017/05/31
 +
|Shipan Ren || 10:00 || 19:30 || 9.5 ||
 +
*run and test tf_translate code
 +
*write document of tf_translate project
 +
|-
 +
|Aodong Li || 12:00 || 20:30 || 8.5 ||
 +
* details about experiment 1:
 +
  '''final_attn = 4/3attn_1 + 2/3attn_2'''
 +
  2nd translator uses '''constant initialized embedding'''
 +
*results (BLEU):  
 +
  BASELINE: 43.87
 +
  best result of our model: '''45.79'''
 +
* Keeping only the encoder's English word embedding constant while training all other embeddings and parameters achieves an even higher BLEU score of 45.98, and the results are stable.
* The quality of the English embedding at the encoder plays a pivotal role in this model.
 +
* Preparation of big data.
 +
|-
 +
| rowspan="1"|2017/06/01
 +
|Aodong Li || 13:00 || 24:00 || 11 ||
 +
* Only make the English encoder's embedding constant -- 45.98
 +
* Only initialize the English encoder's embedding and then finetune it -- 46.06
 +
* Share the attention mechanism and then directly add them -- 46.20
 +
* Run double-attention model on large data
 +
|-
 +
| rowspan="1"|2017/06/02
 +
|Aodong Li || 13:00 || 22:00 || 9 ||
 +
* Baseline bleu on large data is 30.83 with '''30000''' output vocab
 +
* Our best result is 31.53 with '''20000''' output vocab
 +
|-
 +
| rowspan="1"|2017/06/03
 +
|Aodong Li || 13:00 || 21:00 || 8 ||
 +
* Train the model with 40 batch size and with concat(attn_1, attn_2)
 +
* the best result of model with 40 batch size and with add(attn_1, attn_2) is 30.52
 +
|-
 +
| rowspan="1"|2017/06/05
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/06
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/07
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/08
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/09
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/12
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/13
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/14
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
|-
 +
| rowspan="1"|2017/06/15
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
* Read paper about MT involving grammar
 +
|-
 +
| rowspan="1"|2017/06/16
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Prepare for APSIPA paper
 +
* Read paper about MT involving grammar
 +
|-
 +
| rowspan="1"|2017/06/19
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Completed APSIPA paper
 +
* Took new task in style translation
 +
|-
 +
| rowspan="1"|2017/06/20
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Tried synonym substitution
 +
|-
 +
| rowspan="1"|2017/06/21
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Tried post-editing such as synonym substitution, but this didn't work
 +
|-
 +
| rowspan="1"|2017/06/22
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Trained a GRU language model to determine similar words
 +
|-
 +
| rowspan="2"|2017/06/23
 +
|Shipan Ren || 10:00 || 21:00 || 11 ||
 +
* read neural machine translation paper
 +
* read and run tf_translate code
 +
|-
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Trained a GRU language model to determine similar words
 +
* This didn't work because semantics is not captured
 +
|-
 +
| rowspan="2"|2017/06/26
 +
|Shipan Ren || 10:00 || 21:00 || 11 ||
 +
* read paper: LSTM Neural Networks for Language Modeling
 +
* read and run ViVi_NMT code
 +
|-
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Tried to figure out new ways to change the text style
 +
|-
 +
| rowspan="2"|2017/06/27
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* read the API of tensorflow
 +
* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0
 +
|-
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Trained seq2seq model to solve this problem
 +
* An encoder stores the semantics in a fixed-length vector, and a decoder generates sequences from this vector
 +
|-
 +
| rowspan="2"|2017/06/28
 +
|Shipan Ren || 10:00 || 19:00 || 9 ||
 +
* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
 +
* installed tensorflow0.1 and tensorflow1.0 on my pc and debugged ViVi_NMT
 +
|-
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Cross-domain seq2seq w/o attention and w/ attention models didn't work because of overfitting
 +
|-
 +
| rowspan="2"|2017/06/29
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* read the API of tensorflow
 +
* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
 +
|-
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Read style transfer papers
 +
|-
 +
| rowspan="2"|2017/06/30
 +
|Shipan Ren || 10:00 || 24:00 || 14 ||
 +
* debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
 +
* accomplished this task
 +
* found the new version saves more time, has lower complexity, and gets a better BLEU than before
 +
|-
 +
|Aodong Li || 10:00 || 19:00 || 8 ||
 +
* Read style transfer papers
 +
|-
 +
| rowspan="1"|2017/07/03
 +
|Shipan Ren || 9:00 || 21:00 || 12 ||
 +
* run two versions of the code on small data sets (Chinese-English)
 +
* tested these checkpoints
  
2016-06-05: find out that tensorflow does not provide logical derivation method.
+
|-
 +
| rowspan="1"|2017/07/04
 +
|Shipan Ren || 9:00 || 21:00 || 12 ||
 +
* recorded experimental results
 +
* found that version 1.0 of the code saves more training time and has lower complexity, and that the two versions have similar BLEU values
* found that the BLEU is still good when the model is overfitting
* reason: the test set and training set are similar in content and style on the small data set
  
2016-06-01: complete the first version of binary word embedding model
+
|-
 +
| rowspan="1"|2017/07/05
 +
|Shipan Ren || 9:00 || 21:00 || 12 ||
 +
* run two versions of the code on big data sets (Chinese-English)
 +
* read NMT papers
  
2016-05-28: complete the word2vec model in tensorflow
+
|-
 +
| rowspan="1"|2017/07/06
 +
|Shipan Ren || 9:00 || 21:00 || 12 ||
 +
* an out-of-memory (OOM) error occurred when version 0.1 of the code was trained on the large data set, but version 1.0 worked
* reason: improper allocation of resources by tensorflow0.1 leads to exhaustion of memory
* after many retries, version 0.1 worked
 +
|-
 +
| rowspan="1"|2017/07/07
 +
|Shipan Ren || 9:00 || 21:00 || 12 ||
 +
* tested these checkpoints and recorded experimental results
 +
* the version 1.0 code took 0.06 seconds less per step than the version 0.1 code
 +
|-
 +
| rowspan="1"|2017/07/08
 +
|Shipan Ren || 9:00 || 21:00 || 12 ||
 +
* downloaded the wmt2014 data set
 +
* used the English-French data set to run the code and found the translation is not good
 +
* reason: no data preprocessing was done
  
2016-05-25: write my own version of word2vec model
+
|-
 +
| rowspan="1"|2017/07/10
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
* dataset: zh-en small
 +
|-
 +
| rowspan="1"|2017/07/11
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* tested these checkpoints
 +
* found the new version takes less time
 +
* found these two versions have similar complexity and bleu values
 +
* found that the BLEU is still good when the model is overfitting
* (reason: the test set and the training set of the small data set are similar in content and style)
  
2016-05-23:
+
|-
 +
| rowspan="1"|2017/07/12
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
* dataset: zh-en big
  
        1.get tensorflow's word2vec model from(https://github.com/tensorflow/tensorflow/tree/master/tensorflow/models/embedding)
+
|-
        2.learn word2vec_basic model
+
| rowspan="1"|2017/07/13
        3.run word2vec.py and word2vec_optimized.py
+
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* an OOM (Out Of Memory) error occurred when version 0.1 was trained on the large data set, but version 1.0 worked
    reason: improper allocation of resources by the tensorflow0.1 framework leads to exhaustion of memory
* after 4 tries (entering the same command each time), version 0.1 worked
  
2016-05-22:
+
|-
 +
| rowspan="1"|2017/07/14
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* tested these checkpoints
 +
* found the new version takes less time
 +
* found these two versions have similar complexity and bleu values
  
        1.find the tf.logical_xor(x,y) method in tensorflow to compute Hamming distance.
+
|-
        2.learn tensorflow's word2vec model
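The XOR-based Hamming distance mentioned in the 2016-05-22 entry can be illustrated in plain Python. This sketch deliberately avoids TensorFlow; the function name is hypothetical, and `a != b` plays the role of the element-wise logical XOR that `tf.logical_xor(x, y)` would provide for boolean tensors.

```python
from typing import Sequence


def hamming_distance(x: Sequence[int], y: Sequence[int]) -> int:
    """Count positions where two binary codes differ (XOR, then sum)."""
    if len(x) != len(y):
        raise ValueError("binary codes must have the same length")
    # bool(a) != bool(b) is the logical XOR of the two bits.
    return sum(bool(a) != bool(b) for a, b in zip(x, y))


print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
```

For binary word embeddings this count acts as the distance between two words' codes, with smaller values meaning more similar codes.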
+
| rowspan="1"|2017/07/17
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* downloaded the wmt2014 data sets and processed them
  
2016-05-21:
+
|-
 +
| rowspan="1"|2017/07/18
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* processed data
  
        1.read Lantian's paper 'Binary Speaker Embedding'
+
|-
        2.try to find a formula in tensorflow to compute Hamming distance.
+
| rowspan="1"|2017/07/18
 +
|Jiayu Guo || 8:30|| 22:00 || 14 ||
 +
* read model code.
  
====Ordered Word Embedding(Aodong)====
+
|-
 +
| rowspan="1"|2017/07/19
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* processed data
 +
|-
 +
| rowspan="1"|2017/07/19
 +
|Jiayu Guo || 9:00|| 22:00 || 13 ||
 +
* read papers of bleu.
 +
|-
 +
| rowspan="1"|2017/07/20
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* processed data
 +
|-
 +
| rowspan="1"|2017/07/20
 +
|Jiayu Guo || 9:00|| 22:00 || 13 ||
 +
* read papers of attention mechanism.
 +
|-
 +
| rowspan="1"|2017/07/21
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
* dataset: WMT2014 en-de
 +
|-
 +
| rowspan="1"|2017/07/21
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* process document
  
: 2016-07-05 :
+
|-
    Code predict process
+
| rowspan="1"|2017/07/24
    Although I've got a low cost value, the prediction results are not as expected, even when the input to the prediction process comes from the training set
+
|Shipan Ren || 9:00 || 20:00 || 11 ||
    When I tried the Weibo data, the program crashed with an out-of-memory error.
+
* tested these checkpoints of en-de dataset
: 2016-07-04 : Complete Coding training process
+
* found the new version takes less time
: 2016-07-01, 02: The cost function is very bumpy; debugging it, which is quite difficult!
+
* found these two versions have similar complexity and bleu values
: 2016-06-27, 28, 29 : Coding
+
: 2016-06-26 : Code tf's GRU and attention model
+
: 2016-06-25 : Read tf's source code rnn_cell.py and seq2seq.py
+
: 2016-06-24 :
+
    Code spearman correlation coefficient and experiment
+
    Read Li's paper "Neural Responding Machine for Short-Text Conversation"
+
: 2016-06-23 :
+
    Share paper "Learning Better Embeddings for Rare Words Using Distributional Representations"
+
    experiment and receive new task
+
: 2016-06-22 :
+
    Experiment on low-frequency words
+
    Roughly read "Online Learning of Interpretable Word Embeddings"
+
    Roughly read "Learning Better Embeddings for Rare Words Using Distributional Representations"
+
: 2016-06-21 : Experiment and calculate cosine distance between words
+
: 2016-06-20 : Something went wrong with my program; fixed it, so I have to start all over again
+
: 2016-06-04 : Experiment the semantic&syntactic analysis of retrained word vector
+
: 2016-06-03 : Complete coding retrain process of low-freq word and experiment the semantic&syntactic analysis
+
: 2016-06-02 : Complete coding predict process of low-freq word and experiment the semantic&syntactic analysis
+
: 2016-06-01 : Read "Distributed Representations of Words and Phrases and their Compositionality"
+
: 2016-05-31 :
+
    Read Mikolov's ppt about his word embedding papers
+
    Test the randomness of word2vec; rerunning the program in a single thread produces no differences
+
    Download dataset "microsoft syntactic test set", "wordsim353", and "simlex-999"
+
: 2016-05-30 : Read "Hierarchical Probabilistic Neural Network Language Model" and "word2vec Explained: Deriving Mikolov's Negative-Sampling Word-Embedding Method"
+
: 2016-05-27 : Reread word2vec paper and read C-version word2vec.
+
: 2016-05-24 : Understand word2vec in TensorFlow; because some functions are incomplete, I decided to adapt the source of the C version of word2vec.
+
: 2016-05-23 :
+
    Basic setup of TensorFlow
+
    Read code of word2vec in TensorFlow
+
: 2016-05-22 :
+
    Learn about algorithms in word2vec
+
    Read low-freq word paper and learn about 6 strategies
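The cosine-distance and Spearman-correlation experiments in the log above can be sketched in plain Python. This is an illustrative version only: the function names are hypothetical, and the log does not describe how ties or zero vectors were handled, so average ranks are assumed for ties and zero vectors are not guarded against.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
print(spearman([1, 2, 3, 4], [10, 20, 30, 40]))
```

On a set such as wordsim353, the model's cosine similarities for each word pair would be passed as one list and the human similarity scores as the other.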
+
  
[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/3/39/How_to_deal_with_low_frequency_words.pdf low_freq]
+
|-
 +
| rowspan="1"|2017/07/24
 +
|Jiayu Guo || 9:00|| 22:00 || 13 ||
 +
* read model code.
 +
|-
 +
| rowspan="1"|2017/07/25
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
* dataset: WMT2014 en-fr
 +
|-
 +
| rowspan="1"|2017/07/25
 +
|Jiayu Guo || 9:00|| 23:00 || 14 ||
 +
* process document
  
[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/2/2c/Lowv.pdf order_rep]
+
|-
 +
| rowspan="1"|2017/07/26
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* read papers about memory-augmented nmt
  
====Matrix Factorization(Ziwei)====
+
|-
[http://papers.nips.cc/paper/5477-neural-word-embedding-as-implicit-matrix-factorization.pdf matrix-factorization]
+
| rowspan="1"|2017/07/26
 +
|Jiayu Guo || 10:00|| 24:00 || 14 ||
 +
* process document
  
2016-06-23:
+
|-
          prepare for report
+
| rowspan="1"|2017/07/27
2016-05-28:
+
|Shipan Ren || 9:00 || 20:00 || 11 ||
          learn the code 'matrix-factorization.py','count_word_frequence.py',and 'reduce_rawtext_matrix_factorization.py'
+
* read papers about memory-augmented nmt
          problem: I have no idea how to run the program or where the data is.
+
  
2016-05-23:
+
|-
          read the code 'map_rawtext_matrix_factorization.py'
+
| rowspan="1"|2017/07/27
2016-05-22:
+
|Jiayu Guo || 10:00|| 24:00 || 14 ||
          learn the rest of  paper ‘Neural word Embedding as implicit matrix factorization’
+
* process document
2016-05-21:
+
          learn the ‘abstract’ and ‘introduction’ of paper ‘Neural word Embedding as implicit matrix factorization’
+
  
===Question answering system===
+
|-
 +
| rowspan="1"|2017/07/28
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* read memory-augmented nmt code
  
====Chao Xing====
+
|-
2016-05-30 ~ 2016-06-04 :
+
| rowspan="1"|2017/07/28
            Deliver CDSSM model to huilan.
+
|Jiayu Guo || 9:00|| 24:00 || 15 ||
2016-05-29 :
+
* process document
            Package chatting model in practice.
+
|
2016-05-28 :
+
            Modify bugs...
+
2016-05-27 :
+
            Train large scale model, find some problem.
+
2016-05-26 :
+
            Modify test program for large scale testing process.
+
2016-05-24 :  
+
            Build CDSSM model in huilan's machine.
+
2016-05-23 :
+
            Find three things to do.
+
            1. Cost function change to maximize QA+ - QA-.
+
            2. Different parameters space in Q space and A space.
+
            3. HRNN separate to two tricky things : use output layer or use hidden layer as decoder's softmax layer's input.
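Item 1 above (change the cost function to maximize QA+ - QA-) is commonly realized as a margin ranking loss. The sketch below is one possible form, not necessarily the one used here: the hinge formulation, the margin value, and the function name are assumptions, with QA+ and QA- taken to be the model's scores for a correct and an incorrect answer to the same question.

```python
def ranking_loss(score_pos: float, score_neg: float, margin: float = 0.5) -> float:
    """Hinge loss that is zero once QA+ exceeds QA- by at least the margin.

    Minimizing this loss pushes the gap (QA+ - QA-) upward, which matches the
    stated goal of maximizing QA+ - QA-; the margin is a hypothetical choice.
    """
    return max(0.0, margin - (score_pos - score_neg))


print(ranking_loss(0.9, 0.1))  # gap already larger than the margin
print(ranking_loss(0.2, 0.1))  # gap too small, so the loss is positive
```

Capping the loss at zero keeps pairs that are already well separated from dominating training.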
+
2016-05-22 :
+
            1. Investigate different loss functions in chatting model.
+
2016-05-21 :
+
            1. Hand out different research task to intern students.
+
2016-05-20 :
+
            1. Testing denoising RNN generation model.
+
2016-05-19 :
+
            1. Explore denoising RNN.
+
2016-05-18 :
+
            1. Modify model for crawler data.
+
2016-05-17 :
+
            1. Code & Test HRNN model.
+
2016-05-16 :
+
            1. Work done for CDSSM model.
+
2016-05-15 :
+
            1. Test CDSSM model package version.
+
2016-05-13 :
+
            1. Coding done CDSSM model package version. Wait to test.
+
2016-05-12 :
+
            1. Begin to package CDSSM model for huilan.
+
2016-05-11 :
+
            1. Prepare for paper sharing.
+
            2. Finish CDSSM model in chatting process.
+
            3. Start setup model & experiment in dialogue system.
+
2016-05-10 :
+
            1. Finished testing the CDSSM model in chatting; found that the original data has some problems.
+
            2. Read paper:
+
                    A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion
+
                    A Neural Network Approach to Context-Sensitive Generation of Conversational Responses
+
                    Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
+
                    Neural Responding Machine for Short-Text Conversation
+
2016-05-09 :
+
            1. Test CDSSM model in chatting model.
+
            2. Read paper :
+
                    Learning from Real Users Rating Dialogue Success with Neural Networks for Reinforcement Learning in Spoken Dialogue Systems
+
                    SimpleDS A Simple Deep Reinforcement Learning Dialogue System
+
            3. Code RNN by myself in tensorflow.
+
2016-05-08 :
+
            Fix some problems in the dialogue system team, and continue reading papers on dialogue systems.
+
2016-05-07 :
+
            Read some papers in dialogue system.
+
2016-05-06 :
+
            Tried to fix the RNN-DSSM model in tensorflow. Failed.
+
2016-05-05 :
+
            Coding RNN-DSSM in TensorFlow. Hit an error when running the RNN-DSSM model on CPU: memory keeps increasing.
+
            The TensorFlow version at huilan is 0.7.0, installed via pip; this causes an error when creating the GPU graph.
+
            One possible solution is to build TensorFlow from source.
+
  
====Aiting Liu====
+
|-
 +
| rowspan="1"|2017/07/31
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* read memory-augmented nmt code
 +
|-
 +
| rowspan="1"|2017/07/31
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* split ancient-language text into single words
 +
|
 +
|-
 +
| rowspan="1"|2017/08/1
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* tested these checkpoints of en-fr dataset
 +
* found the new version takes less time
 +
* found these two versions have similar complexity and bleu values
 +
|-
 +
| rowspan="1"|2017/08/1
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* run seq2seq_model
 +
|
 +
|-
 +
| rowspan="1"|2017/08/2
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* looked for the performance(the bleu value) of other models
 +
* datasets:WMT2014 en-de and en-fr
  
2016-07-05:   write lyrics spider, and get 56306 songs from http://music.baidu.com/
+
|-
 +
| rowspan="1"|2017/08/2
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* process document
 +
|-
 +
| rowspan="1"|2017/08/3
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* looked for the performance(the bleu value) of other seq2seq models
 +
* datasets:WMT2014 en-de and en-fr
  
2016-07-04:   learn tensorflow
+
|-
 +
| rowspan="1"|2017/08/3
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* process document
 +
|-
 +
| rowspan="1"|2017/08/4
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* learn moses
  
2016-07-01:   submit APSIPA2016  paper
+
|-
 +
| rowspan="1"|2017/08/4
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* search new data(Songshu)
  
2016-06-30:   polish the paper
+
|-
 +
| rowspan="1"|2017/08/7
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* installed and built Moses on the server
  
2016-06-29:   complete the ordered word embedding's paper
+
|-
 +
| rowspan="1"|2017/08/7
 +
|Jiayu Guo || 9:00|| 22:00 || 13 ||
 +
* process document
  
2016-06-26:   modify the ordered word embedding's paper
+
|-
 +
| rowspan="1"|2017/08/8
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* train statistical machine translation model and test it
 +
* dataset:zh-en small
 +
* test if moses can work normally
  
2016-06-25:   complete ordered word embedding experiment,get 54 figures
+
|-
 +
| rowspan="1"|2017/08/8
 +
|Jiayu Guo || 10:00|| 21:00 || 11 ||
 +
* read tensorflow
  
2016-06-23:   read Bengio's paper https://arxiv.org/pdf/1605.06069v3.pdf
+
|-
 +
| rowspan="1"|2017/08/9
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* code automation scripts to process data,train model and test model
 +
* toolkit: Moses
  
2016-06-22:   read Bengio's paper http://arxiv.org/pdf/1507.04808v3.pdf
+
|-
 +
| rowspan="1"|2017/08/9
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* run model with the data of which ancient content was split by single character.
  
2016-06-13:
+
|-
 +
| rowspan="1"|2017/08/10
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* train statistical machine translation models and test it
 +
* dataset:zh-en big,WMT2014 en-de,WMT2014 en-fr
  
    [[文件:Classification.jpg]]
+
|-
 +
| rowspan="1"|2017/08/10
 +
|Jiayu Guo || 9:00|| 23:00 || 13 ||
 +
* process data of Songshu
 +
* read papers of CNN
  
2016-06-12:
+
|-
 +
| rowspan="1"|2017/08/11
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* collate experimental results
 +
* compare our baseline model with Moses
  
    [[文件:Similarity.jpg]]
+
|-
 +
| rowspan="1"|2017/08/11
 +
|Jiayu Guo || 9:00|| 20:00 || 11 ||
 +
* test results.
  
2016-06-05:  complete the binary word embedding, find out that tensorflow does not provide logical derivation method.
+
|-
  
2016-06-04:   write the binary word embedding model
+
| rowspan="1"|2017/08/14
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* read paper about THUMT
 +
|-
 +
| rowspan="1"|2017/08/14
 +
|Jiayu Guo || 10:00|| 23:00 || 13 ||
 +
* learn about Graphic Model of LSTM-Projected BPTT
 +
* search for data available for translation (Twenty-four-Shi)
 +
|-
  
2016-06-01:
+
| rowspan="1"|2017/08/15
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* read THUMT manual and learn how to use it
 +
|-
 +
| rowspan="1"|2017/08/15
 +
|Jiayu Guo || 11:00|| 23:30 || 12 ||
 +
* run model with data including Shiji、Zizhitongjian.
 +
|-
 +
| rowspan="1"|2017/08/16
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* train translation models and test them
 +
* toolkit: THUMT
 +
* dataset:zh-en small
 +
* test if THUMT can work normally
  
        1.Record demo video of our Personalized Chatterbot
+
|-
        2.program the binary word embedding model
+
| rowspan="1"|2017/08/16
 +
|Jiayu Guo || 10:00|| 23:00 || 10||
 +
checkpoint-100000 translation model
 +
BLEU: 11.11
  
2016-05-31: debugging our Personalized Chatterbot
+
*source:在秦者名错,与张仪争论,於是惠王使错将伐蜀,遂拔,因而守之。
 +
*target:在秦国的名叫司马错,曾与张仪发生争论,秦惠王采纳了他的意见,于是司马错率军攻蜀国,攻取后,又让他做了蜀地郡守。
 +
*trans:当时秦国的人都很欣赏他的建议,与张仪一起商议,所以吴王派使者率军攻打蜀地,一举攻,接着又下令守城 。
 +
*source:神大用则竭,形大劳则敝,形神离则死 。
 +
*target:精神过度使用就会衰竭,形体过度劳累就会疲惫,神形分离就会死亡。
 +
*trans: 精神过度就可衰竭,身体过度劳累就会疲惫,地形也就会死。
 +
*source:今天子接千岁之统,封泰山,而余不得从行,是命也夫,命也夫!
 +
*target:现天子继承汉朝千年一统的大业,在泰山举行封禅典礼而我不能随行,这是命啊,是命啊!
 +
*trans: 现在天子可以继承帝位的成就爵位,爵位至泰山,而我却未能执行先帝的命运。
  
2016-05-30:  complete our Personalized Chatterbot
+
*1. Using Zizhitongjian data only (6,000 pairs), BLEU is at most 6.
 +
*2. Using Zizhitongjian data only (12,000 pairs), BLEU is at most 7.
 +
*3. Using Shiji and Zizhitongjian data (43,0000 pairs), BLEU is about 9.
 +
*4. Using Shiji and Zizhitongjian data (43,0000 pairs) with the ancient-language text split character by character, BLEU is at most 11.11.
 +
|-
  
2016-05-29:
+
| rowspan="1"|2017/08/17
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* code automation scripts to process data,train model and test model
 +
* train translation models and test them
 +
* toolkit: THUMT
 +
* dataset:zh-en big
  
        1.scan Chao's code and modify it
+
|-
        2.run the modified program to get the whole matrix for the eight hundred thousand sentences
+
| rowspan="1"|2017/08/17
 +
|Jiayu Guo || 13:00|| 23:00 || 10 ||
 +
* read source code.
 +
|-
 +
| rowspan="1"|2017/08/18
 +
|Shipan Ren || 9:00 || 20:00 || 11 ||
 +
* test translation models by using single reference and multiple reference
 +
* organize all the experimental results(our baseline system,Moses,THUMT)
  
2016-05-28:
+
|-
 +
| rowspan="1"|2017/08/18
 +
|Jiayu Guo || 13:00|| 22:00 || 9 ||
 +
* read source code.
 +
|-
 +
| rowspan="1"|2017/08/21
 +
|Shipan Ren || 10:00 || 22:00 || 12 ||
 +
* read the released information of other translation systems
 +
|-
 +
| rowspan="1"|2017/08/21
 +
|Jiayu Guo || 9:30 || 21:30 || 12 ||
 +
* read the source code and learn tensorflow
 +
|-
 +
| rowspan="1"|2017/08/22
 +
|Shipan Ren || 10:00 || 22:00 || 12 ||
 +
* cleaned up the code
 +
|-
 +
| rowspan="1"|2017/08/22
 +
|Jiayu Guo || 9:00 || 22:00 || 12 ||
 +
* read the source code
 +
|-
 +
| rowspan="1"|2017/08/23
 +
|Shipan Ren || 10:00 || 21:00 || 11 ||
 +
* wrote the documents
 +
|-
 +
| rowspan="1"|2017/08/23
 +
|Jiayu Guo || 9:00 || 22:00 || 11 ||
 +
* read the source code and learn tensorflow
 +
|-
 +
| rowspan="1"|2017/08/24
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* wrote the documents
 +
|-
 +
| rowspan="1"|2017/08/24
 +
|Jiayu Guo || 9:10 || 22:00 || 10.5 ||
 +
* read the source code and learn tensorflow
 +
|-
 +
| rowspan="1"|2017/08/25
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* check experimental results
 +
|-
 +
| rowspan="1"|2017/08/25
 +
|Jiayu Guo || 8:50 || 22:00 || 10.5 ||
 +
* read the source code and learn tensorflow
 +
|-
 +
| rowspan="1"|2017/08/28
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* wrote the paper of ViVi_NMT(version 1.0)
 +
|-
 +
| rowspan="1"|2017/08/28
 +
|Jiayu Guo || 8:10 || 21:00 || 11 ||
 +
* read the source code and learn tensorflow
 +
|-
 +
| rowspan="1"|2017/08/29
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* wrote the paper of ViVi_NMT(version 1.0)
 +
|-
 +
| rowspan="1"|2017/08/29
 +
|Jiayu Guo || 11:00 || 21:00 || 10 ||
 +
* read the source code and learn tensorflow
 +
|-
 +
| rowspan="1"|2017/08/30
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* wrote the paper of ViVi_NMT(version 1.0)
 +
|-
 +
| rowspan="1"|2017/08/30
 +
|Jiayu Guo || 11:30 || 21:00 || 9 ||
 +
* learn VV model
 +
|-
 +
| rowspan="1"|2017/08/31
 +
|Shipan Ren || 10:00 || 20:00 || 10 ||
 +
* wrote the paper of ViVi_NMT(version 1.0)
 +
|-
 +
| rowspan="1"|2017/08/31
 +
|Jiayu Guo || 10:00 || 20:00 || 10 ||
 +
* clean up the code
 +
|-
 +
|}
  
        1.complete the word2vec model in tensorflow
+
===Time Off Table===
        2.complete the first version of binary word embedding model
+
  
2016-05-25:  .write my own version of word2vec model
+
{| class="wikitable"
 +
! Date !! Yang Feng !! Jiyuan Zhang
 +
|-
 +
|}
  
2016-05-23:
+
==Past progress==
 +
[[nlp-progress 2017/03]]
  
        1.get tensorflow's word2vec model from(https://github.com/tensorflow/tensorflow/tree/master/tensorflow/models/embedding)
+
[[nlp-progress 2017/02]]
        2.learn word2vec_basic model
+
        3.run word2vec.py and word2vec_optimized.py,we need a Chinese evaluation dataset if we want to use it directly
+
  
2016-05-22:
+
[[nlp-progress 2017/01]]
  
        1.find the tf.logical_xor(x,y) method in tensorflow to compute Hamming distance.
+
[[nlp-progress 2016/12]]
        2.learn tensorflow's word2vec model
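
The XOR-based Hamming distance mentioned in the 2016-05-22 entry can be sketched in plain Python. This is only an illustration of the idea, not the code used in the project: the function name `hamming_distance` is hypothetical, and the inputs are assumed to be equal-length binary vectors. In TensorFlow the same idea would presumably sum the elementwise result of `tf.logical_xor(x, y)` after casting it to integers.

```python
def hamming_distance(x, y):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # XOR is True exactly where the two bits disagree; summing counts them.
    return sum(bool(a) ^ bool(b) for a, b in zip(x, y))


if __name__ == "__main__":
    print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
```
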
+
  
2016-05-21:
+
[[nlp-progress 2016/11]]
  
        1.read Lantian's paper 'Binary Speaker Embedding'
+
[[nlp-progress 2016/10]]
        2.try to find a formula in tensorflow to compute Hamming distance.
+
  
2016-05-18:
+
[[nlp-progress 2016/09]]
 
+
            Fetch American TV subtitles and process them into a specific format(12.6M)
+
          (1.Sex and the City 2.Gossip Girl 3.Desperate Housewives 4.The IT Crowd 5.Empire 6.2 Broke Girls)
+
 
+
2016-05-16:Process the data collected from the interview site,interview books and American TV subtitles(38.2M+23.2M)
+
 
+
2016-05-11:
+
 
+
            Fetch American TV subtitles
+
          (1.Friends 2.Big Bang Theory 3.The descendant of the Sun 4.Modern Family 5.House M.D. 6.Grey's Anatomy)
+
 
+
2016-05-08:Fetch data from 'http://news.ifeng.com/' and 'http://www.xinhuanet.com/'(13.4M)
+
 
+
2016-05-07:Fetch data from 'http://fangtan.china.com.cn/' and interview books (10M)
+
 
+
2016-05-04:Establish the overall framework of our chat robot,and continue to build database
+
 
+
====Ziwei Bai====
+
2016-07-01:
+
          the model updated yesterday does not converge; try to learn tf.nn.sampled_softmax_loss()
+
2016-06-30:
+
          convert our chatting model from negative sampling to softmax and convert the cost from cosine to cross-entropy
+
          tf.nn.softmax()
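
The move from a cosine-based cost to softmax with cross-entropy, described in the 2016-06-30 entry, can be illustrated with a small stand-alone sketch. It assumes that each candidate reply already has a similarity score (for example a cosine similarity) and that exactly one candidate is correct; the function names are hypothetical and this is not the project's TensorFlow code.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, target_index):
    """Negative log-probability assigned to the correct candidate."""
    return -math.log(softmax(scores)[target_index])


if __name__ == "__main__":
    # Three candidate replies; the first one is assumed to be correct.
    print(cross_entropy([0.9, 0.1, -0.3], 0))
```
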
+
2016-06-29:
+
          learn paper 'Neural Responding Machine for Short-Text Conversation'
+
2016-06-23:
+
          learn paper ‘Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models’
+
          http://arxiv.org/pdf/1507.04808v3.pdf
+
2016-06-22:
+
          1、construct vector for word cut by jieba
+
          2、retrain the cdssm model with new word vector(still run)
+
2016-06-04:
+
          1、modify the interface for QA system
+
          2、pull together the interface and QA system
+
2016-06-01:
+
          1、add  data source and Performance Test results in work report
+
          2、learn pyQt
+
 
+
2016-05-30:
+
            complete the work report
+
2016-05-29:
+
          write code that takes a question as input and returns the set of answers whose questions are most similar to it
+
2016-05-25:
+
          1、learn DSSM
+
          2、 complete the first edition of work report
+
          3、construct basic Q&A(name,age,job and so on)             
+
2016-05-23:
+
          write code for searching question in 'zhihu.sogou.com' and searching answer in zhihu
+
2016-05-21:
+
          learn the second half of paper 'A Neural Conversational Model'
+
2016-05-18:
+
          1、crawl QA pairs from http://www.chinalife.com.cn/publish/zhuzhan/index.html and http://www.pingan.com/
+
          2、find  paper 'A Neural Conversational Model' from google scholar and learn the first half of it.
+
2016-05-16:
+
            1、find datasets in paper 'Neural Responding Machine for Short-Text Conversation'
+
            2、convert 15 scripts into our expected format
+
2016-05-15:
+
            1、find 130 scripts
+
            2、convert 11 scripts into our expected format
+
            problem: in many files, the program cannot distinguish dialogue from scene descriptions.
+
 
+
2016-05-11:
+
            1、read paper“Movie-DiC: a Movie Dialogue Corpus for Research and Development”
+
            2、convert a new film script into our expected format
+
 
+
2016-05-08:  convert the PDFs we found yesterday into txt, and convert the data into our expected format
+
 
+
2016-05-07:  Finding 9 Drama scripts and 20 film scripts 
+
 
+
2016-05-04:Finding and dealing with the data for QA system
+
 
+
===Generation Model (Aodong li)===
+
 
+
: 2016-05-21 : Complete my biweekly report and take over new tasks -- low-frequency words
+
: 2016-05-20 :
+
    Optimize my code to speed up
+
    Train the models with GPU
+
    However, it does not converge :(
+
: 2016-05-19 : Code a simple version of keywords-to-sequence model and train the model
+
: 2016-05-18 : Debug keywords-to-sequence model and train the model
+
: 2016-05-17 : make technical details clear and code keywords-to-sequence model
+
: 2016-05-16 : Denoise and segment more lyrics and prepare for keywords to sequence model
+
: 2016-05-15 : Train some different models and analyze performance: song to song, paragraph to paragraph, etc.
+
: 2016-05-12 : complete sequence to sequence model's prediction process and the whole standard sequence to sequence lstm-based model v0.0
+
: 2016-05-11 : complete sequence to sequence model's training process in Theano
+
: 2016-05-10 : complete sequence to sequence lstm-based model in Theano
+
: 2016-05-09 : try to code sequence to sequence model
+
: 2016-05-08 :
+
    denoise and train word vectors of  Lijun Deng's lyrics (110+ pieces)
+
    decide on using raw sequence to sequence model
+
: 2016-05-07 :
+
    study attention-based model
+
    learn some details about the poem generation model
+
    change my focus onto lyrics generation model
+
: 2016-05-06 : read the paper about poem generation and learn about LSTM
+
: 2016-05-05 : check in and have an overview of generation model
+
 
+
===jiyuan zhang===
+
: 2016-05-01~06 :modify input format and run lstmrbm model (16-beat,32-beat,bar)
+
: 2016-05-09~13:
+
  Modify model parameters and run the model; the result is not ideal yet
+
  Following teacher Wang's suggestion, replace random generation with maximum-probability generation in the generation stage
+
 
+
: 2016-05-24~27 : check the blog's code and understand the model and input format details described on the blog
+
 
+
==Past progress==
+
  
 +
[[nlp-progress 2016/08]]
  
[[nlp-progress-2016-05]]
+
[[nlp-progress 2016/05/01 -- 08/16 | nlp-progress 2016/05-07]]
  
[[nlp-progress-2016-04]]
+
[[nlp-progress 2016/04]]

Latest revision as of 07:41, 4 September 2017 (Monday)

NLP Schedule

Members

Current Members

  • Yang Feng (冯洋)
  • Jiyuan Zhang (张记袁)
  • Aodong Li (李傲冬)
  • Andi Zhang (张安迪)
  • Shiyue Zhang (张诗悦)
  • Li Gu (古丽)
  • Peilun Xiao (肖培伦)
  • Shipan Ren (任师攀)
  • Jiayu Guo (郭佳雨)

Former Members

  • Chao Xing (邢超)  : FreeNeb
  • Rong Liu (刘荣)  : Youku
  • Xiaoxi Wang (王晓曦) : Turing Robot
  • Xi Ma (马习)  : graduate student at Tsinghua University
  • Tianyi Luo (骆天一) : phd candidate in University of California Santa Cruz
  • Qixin Wang (王琪鑫)  : MA candidate in University of California
  • DongXu Zhang (张东旭): --
  • Yiqiao Pan (潘一桥) : MA candidate in University of Sydney
  • Shiyao Li (李诗瑶) : BUPT
  • Aiting Liu (刘艾婷)  : BUPT

Work Progress

Daily Report


Time Off Table

Date Person start leave hours status
2017/04/02 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/03 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/04 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/05 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/06 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/07 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/08 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/09 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/10 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/11 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/12 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/13 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/14 Andy Zhang 9:30 18:30 8
  • preparing EMNLP
Peilun Xiao
2017/04/15 Andy Zhang 9:00 15:00 6
  • preparing EMNLP
Peilun Xiao
2017/04/18 Aodong Li 11:00 20:00 8
  • Pick up new task in news generation and do literature review
2017/04/19 Aodong Li 11:00 20:00 8
  • Literature review
2017/04/20 Aodong Li 12:00 20:00 8
  • Literature review
2017/04/21 Aodong Li 12:00 20:00 8
  • Literature review
2017/04/24 Aodong Li 11:00 20:00 8
  • Adjust literature review focus
2017/04/25 Aodong Li 11:00 20:00 8
  • Literature review
2017/04/26 Aodong Li 11:00 20:00 8
  • Literature review
2017/04/27 Aodong Li 11:00 20:00 8
  • Try to reproduce sc-lstm work
2017/04/28 Aodong Li 11:00 20:00 8
  • Transfer to new task in machine translation and do literature review
2017/04/30 Aodong Li 11:00 20:00 8
  • Literature review
2017/05/01 Aodong Li 11:00 20:00 8
  • Literature review
2017/05/02 Aodong Li 11:00 20:00 8
  • Literature review and code review
2017/05/06 Aodong Li 14:20 17:20 3
  • Code review
2017/05/07 Aodong Li 13:30 22:00 8
  • Code review and experiment started, but version discrepancy encountered
2017/05/08 Aodong Li 11:30 21:00 8
  • Code review and version discrepancy solved
2017/05/09 Aodong Li 13:00 22:00 9
  • Code review and experiment
  • details about experiment:
 small data, 
 1st and 2nd translator uses the same training data, 
 2nd translator uses random initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 42.56
2017/05/10 Shipan Ren 9:00 20:00 11
  • Entry procedures
  • Machine Translation paper reading
2017/05/10 Aodong Li 13:30 22:00 8
  • experiment setting:
 small data, 
 1st and 2nd translators use different training data, counting 22000 and 22017 respectively
 2nd translator uses random initialized embedding
  • results (BLEU):
 BASELINE: 36.67 (this is the model at 4750 updates; to avoid overfitting, the model at 3000 updates, whose BLEU is 34.96, is used to generate the 2nd translator's training data)
 best result of our model: 29.81
 This may suggest that it makes little difference to the 2nd translator's performance whether it uses the same training data as the 1st translator or different data; if anything, using the same data may be better, at least judging from these results. However, the training data here is smaller than for yesterday's model, which has to be taken into account.
  • code 2nd translator with constant embedding
2017/05/11 Shipan Ren 10:00 19:30 9.5
  • Configure environment
  • Run tf_translate code
  • Read Machine Translation paper
2017/05/11 Aodong Li 13:00 21:00 8
  • experiment setting:
 small data, 
 1st and 2nd translator uses the same training data, 
 2nd translator uses constant untrainable embedding imported from 1st translator's decoder
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 43.48
 Experiments show that this kind of series (cascade) model impairs the final performance because information is lost as it flows through the network from end to end. The decoder's vocabulary being smaller than the encoder's demonstrates this (9000+ -> 6000+).
 The intention of this experiment is to look for a mapping that solves meaning shift using the 2nd translator, but whether the mapping is learned or not is obscured by the smaller-vocabulary effect.
  • literature review on hierarchical machine translation
2017/05/12 Aodong Li 13:00 21:00 8
  • Code double decoding model and read multilingual MT paper
2017/05/13 Shipan Ren 10:00 19:00 9
  • read machine translation paper
  • learn the LSTM model and the seq2seq model
2017/05/14 Aodong Li 10:00 20:00 9
  • Code double decoding model and experiment
  • details about experiment:
 small data, 
 2nd translator uses as training data the concat(Chinese, machine translated English), 
 2nd translator uses random initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 43.53
  • NEXT: 2nd translator uses trained constant embedding
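
The input used for the 2nd translator in this experiment, concat(Chinese, machine translated English), can be sketched as follows. This is a minimal illustration only: the function name and the `<sep>` boundary token are hypothetical, since the log does not say whether any separator was used.

```python
def build_second_pass_source(zh_tokens, mt_en_tokens, sep="<sep>"):
    """Concatenate the Chinese source and the 1st translator's English output.

    The separator token is an assumption made for this sketch; it only marks
    where the Chinese part ends and the machine-translated part begins.
    """
    return list(zh_tokens) + [sep] + list(mt_en_tokens)


if __name__ == "__main__":
    zh = ["我", "喜欢", "音乐"]
    mt = ["i", "like", "music"]
    print(" ".join(build_second_pass_source(zh, mt)))
```
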
2017/05/15 Shipan Ren 9:30 19:00 9.5
  • understand the difference between lstm model and gru model
  • read the implementation code of the seq2seq model
2017/05/17 Shipan Ren 9:30 19:30 10
  • read neural machine translation paper
  • read tf_translate code
Aodong Li 13:30 24:00 9
  • code and debug double-decoder model
  • alter 2017/05/14 model's size and will try after nips
2017/05/18 Shipan Ren 10:00 19:00 9
  • read neural machine translation paper
  • read tf_translate code
Aodong Li 12:30 21:00 8
  • train double-decoder model on small data set but encounter decode bugs
2017/05/19 Aodong Li 12:30 20:30 8
  • debug double-decoder model
  • the model performs well on the development set, but performs badly on test data. I want to figure out the reason.
2017/05/21 Aodong Li 10:30 18:30 8
  • details about experiment:
 hidden_size = 700 (500 in prior)
 emb_size = 510 (310 in prior)
 small data, 
 2nd translator uses as training data the concat(Chinese, machine translated English), 
 2nd translator uses random initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 45.21
 But only one checkpoint outperforms the baseline; the other results are generally below 43.1
  • debug double-decoder model
2017/05/22 Aodong Li 14:00 22:00 8
  • double-decoder without joint loss generalizes very badly
  • I'm trying the double-decoder model with joint loss
2017/05/23 Aodong Li 13:00 21:30 8
  • details about experiment 1:
 hidden_size = 700
 emb_size = 510
 learning_rate = 0.0005 (0.001 in prior)
 small data, 
 2nd translator uses as training data the concat(Chinese, machine translated English), 
 2nd translator uses random initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 42.19
 Overfitting? Overall, the 2nd translator performs worse than the baseline
  • details about experiment 2:
 hidden_size = 500
 emb_size = 310
 learning_rate = 0.001
 small data, 
 double-decoder model with joint loss which means the final loss  = 1st decoder's loss + 2nd 
 decoder's loss
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 39.04
 The 1st decoder's output is generally better than 2nd decoder's output. The reason may be that 
 the second decoder only learns from the first decoder's hidden states because their states are 
 almost the same.
  • DISCOVERY:
 The reason why the double-decoder model without joint loss generalizes very badly is that the gap between
 teacher forcing (training process) and beam search (decoding process)
 propagates and amplifies the error toward the output end, which breaks the model when decoding.
  • next:
 Try to train double-decoder model without joint loss but with beam search on 1st decoder.
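
The joint loss used in experiment 2 (final loss = 1st decoder's loss + 2nd decoder's loss) can be written out as a small sketch. It assumes each decoder's loss is the summed negative log-likelihood of the correct tokens; the function names and the per-step probability lists are hypothetical and stand in for the real model outputs.

```python
import math


def sequence_nll(step_probs):
    """Sum of -log p(correct token) over the decoding steps of one decoder."""
    return -sum(math.log(p) for p in step_probs)


def joint_loss(first_decoder_probs, second_decoder_probs):
    """Joint loss as described in the log: the two decoder losses are added."""
    return sequence_nll(first_decoder_probs) + sequence_nll(second_decoder_probs)


if __name__ == "__main__":
    # Probabilities each decoder assigns to the correct token at each step.
    print(joint_loss([0.9, 0.8, 0.7], [0.6, 0.5, 0.4]))
```

Because both terms feed one objective, gradients from the 2nd decoder also reach the parameters shared with the 1st decoder, which is the difference from training the two decoders separately.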
2017/05/24 Aodong Li 13:00 21:30 8
  • code double-attention one-decoder model
  • code double-decoder model
2017/05/24 Shipan Ren 10:00 20:00 10
  • read neural machine translation paper
  • read tf_translate code
2017/05/25 Shipan Ren 9:30 18:30 9
  • write document of tf_translate project
  • read neural machine translation paper
  • read tf_translate code
Aodong Li 13:00 22:00 9
  • code and debug double attention model
2017/05/27 Shipan Ren 9:30 18:30 9
  • read tf_translate code
  • write document of tf_translate project
2017/05/28 Aodong Li 15:00 22:00 7
  • details about experiment:
 hidden_size = 500
 emb_size = 310
 learning_rate = 0.001
 small data, 
 2nd translator uses as training data both Chinese and machine translated English
 Chinese and English use different encoders and different attention
 final_attn = attn_1 + attn_2
 2nd translator uses random initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 when decoding:
   final_attn = attn_1 + attn_2 best result of our model: 43.50
   final_attn = 2/3attn_1 + 4/3attn_2 best result of our model: 41.22
   final_attn = 4/3attn_1 + 2/3attn_2 best result of our model: 43.58
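
The three decoding-time combinations above differ only in the weights applied to the two attention outputs. A minimal sketch, assuming `attn_1` and `attn_2` are context vectors of equal length (one from the Chinese encoder, one from the English encoder); the function name and the example values are hypothetical:

```python
def combine_attention(attn_1, attn_2, w1=1.0, w2=1.0):
    """Return the elementwise weighted sum w1 * attn_1 + w2 * attn_2."""
    if len(attn_1) != len(attn_2):
        raise ValueError("attention vectors must have the same length")
    return [w1 * a + w2 * b for a, b in zip(attn_1, attn_2)]


if __name__ == "__main__":
    attn_1 = [0.2, 0.4, 0.6]
    attn_2 = [0.1, 0.3, 0.5]
    print(combine_attention(attn_1, attn_2))                # attn_1 + attn_2
    print(combine_attention(attn_1, attn_2, 2 / 3, 4 / 3))  # 2/3attn_1 + 4/3attn_2
    print(combine_attention(attn_1, attn_2, 4 / 3, 2 / 3))  # 4/3attn_1 + 2/3attn_2
```

In all three settings the weights sum to 2, so the overall scale of `final_attn` stays comparable and only the balance between the two sources changes.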
2017/05/30 Aodong Li 15:00 21:00 6
  • details about experiment 1:
 hidden_size = 500
 emb_size = 310
 learning_rate = 0.001
 small data, 
 2nd translator uses as training data both Chinese and machine translated English
 Chinese and English use different encoders and different attention
 final_attn = 2/3attn_1 + 4/3attn_2
 2nd translator uses random initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 42.36
  • details about experiment 2:
 final_attn = 2/3attn_1 + 4/3attn_2
 2nd translator uses constant initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 45.32
  • details about experiment 3:
 final_attn = attn_1 + attn_2
 2nd translator uses constant initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 45.41 and it seems more stable
2017/05/31 Shipan Ren 10:00 19:30 9.5
  • run and test tf_translate code
  • write document of tf_translate project
Aodong Li 12:00 20:30 8.5
  • details about experiment 1:
 final_attn = 4/3attn_1 + 2/3attn_2
 2nd translator uses constant initialized embedding
  • results (BLEU):
 BASELINE: 43.87
 best result of our model: 45.79
  • Keeping only the English word embedding at the encoder constant while training all other embeddings and parameters achieves an even higher BLEU score of 45.98, and the results are stable.
  • The quality of the English embedding at the encoder plays a pivotal role in this model.
  • Preparation of big data.
2017/06/01 Aodong Li 13:00 24:00 11
  • Only make the English encoder's embedding constant -- 45.98
  • Only initialize the English encoder's embedding and then finetune it -- 46.06
  • Share the attention mechanism and then directly add them -- 46.20
  • Run double-attention model on large data
2017/06/02 Aodong Li 13:00 22:00 9
  • Baseline bleu on large data is 30.83 with 30000 output vocab
  • Our best result is 31.53 with 20000 output vocab
2017/06/03 Aodong Li 13:00 21:00 8
  • Train the model with 40 batch size and with concat(attn_1, attn_2)
  • the best result of model with 40 batch size and with add(attn_1, attn_2) is 30.52
2017/06/05 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/06 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/07 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/08 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/09 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/12 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/13 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/14 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
2017/06/15 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
  • Read paper about MT involving grammar
2017/06/16 Aodong Li 10:00 19:00 8
  • Prepare for APSIPA paper
  • Read paper about MT involving grammar
2017/06/19 Aodong Li 10:00 19:00 8
  • Completed APSIPA paper
  • Took new task in style translation
2017/06/20 Aodong Li 10:00 19:00 8
  • Tried synonyms substitution
2017/06/21 Aodong Li 10:00 19:00 8
  • Tried post edit like synonyms substitution but this didn't work
2017/06/22 Aodong Li 10:00 19:00 8
  • Trained a GRU language model to determine similar word
2017/06/23 Shipan Ren 10:00 21:00 11
  • read neural machine translation paper
  • read and run tf_translate code
Aodong Li 10:00 19:00 8
  • Trained a GRU language model to determine similar word
  • This didn't work because semantics is not captured
2017/06/26 Shipan Ren 10:00 21:00 11
  • read paper:LSTM Neural Networks for Language Modeling
  • read and run ViVi_NMT code
Aodong Li 10:00 19:00 8
  • Tried to figure out new ways to change the text style
2017/06/27 Shipan Ren 10:00 20:00 10
  • read the API of tensorflow
  • debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0
Aodong Li 10:00 19:00 8
  • Trained seq2seq model to solve this problem
  • An encoder stores the semantics in a fixed-length vector, and a decoder generates sequences from this vector
2017/06/28 Shipan Ren 10:00 19:00 9
  • debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
  • installed tensorflow0.1 and tensorflow1.0 on my pc and debugged ViVi_NMT
Aodong Li 10:00 19:00 8
  • Cross-domain seq2seq w/o attention and w/ attention models didn't work because of overfitting
2017/06/29 Shipan Ren 10:00 20:00 10
  • read the API of tensorflow
  • debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
Aodong Li 10:00 19:00 8
  • Read style transfer papers
2017/06/30 Shipan Ren 10:00 24:00 14
  • debugged ViVi_NMT and tried to upgrade code version to tensorflow1.0 (on server)
  • accomplished this task
  • found the new version saves more time,has lower complexity and better bleu than before
Aodong Li 10:00 19:00 8
  • Read style transfer papers
2017/07/03 Shipan Ren 9:00 21:00 12
  • run two versions of the code on small data sets (Chinese-English)
  • tested these checkpoints
2017/07/04 Shipan Ren 9:00 21:00 12
  • recorded experimental results
  • found that version 1.0 of the code saves more training time and has less complexity, and that the two versions of the code have similar BLEU values
  • found that the BLEU is still good when the model is overfitting
  • reason: the test set and training set are similar in content and style on small data set
2017/07/05 Shipan Ren 9:00 21:00 12
  • run two versions of the code on big data sets (Chinese-English)
  • read NMT papers
2017/07/06 Shipan Ren 9:00 21:00 12
  • out of memory(OOM) error occurred when version 0.1 of code was trained using large data set,but version 1.0 worked
  • reason: improper distribution of resources by the tensorflow0.1 version leads to exhaustion of memory resources
  • I've tried many times, and version 0.1 worked
2017/07/07 Shipan Ren 9:00 21:00 12
  • tested these checkpoints and recorded experimental results
  • the version 1.0 code takes 0.06 seconds less per step than the version 0.1 code
2017/07/08 Shipan Ren 9:00 21:00 12
  • downloaded the wmt2014 data set
  • used the English-French data set to run the code and found the translation is not good
  • reason:no data preprocessing is done
2017/07/10 Shipan Ren 9:00 20:00 11
  • trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
  • dataset:zh-en small
2017/07/11 Shipan Ren 9:00 20:00 11
  • tested these checkpoints
  • found the new version takes less time
  • found these two versions have similar complexity and bleu values
  • found that the bleu is still good when the model is overfitting.
  • (reason: the test set and the train set of small data set are similar in content and style)
2017/07/12 Shipan Ren 9:00 20:00 11
  • trained translation models using the tf1.0 baseline and the tf0.1 baseline respectively
  • dataset:zh-en big
2017/07/13 Shipan Ren 9:00 20:00 11
  • OOM(Out Of Memory) error occurred when version 0.1 was trained using large data set,but version 1.0 worked
   reason: improper allocation of resources by the tensorflow0.1 framework leads to exhaustion of memory resources 
  • I tried 4 times (entering the same command each time), and version 0.1 worked
2017/07/14 Shipan Ren 9:00 20:00 11
  • tested these checkpoints
  • found the new version takes less time
  • found that the two versions have similar complexity and BLEU values
2017/07/17 Shipan Ren 9:00 20:00 11
  • downloaded the WMT2014 data sets and processed them
2017/07/18 Shipan Ren 9:00 20:00 11
  • processed data
2017/07/18 Jiayu Guo 8:30 22:00 14
  • read model code.
2017/07/19 Shipan Ren 9:00 20:00 11
  • processed data
2017/07/19 Jiayu Guo 9:00 22:00 13
  • read papers on BLEU.
2017/07/20 Shipan Ren 9:00 20:00 11
  • processed data
2017/07/20 Jiayu Guo 9:00 22:00 13
  • read papers on the attention mechanism.
2017/07/21 Shipan Ren 9:00 20:00 11
  • trained translation models using the tf1.0 baseline and the tf0.1 baseline, respectively
  • dataset: WMT2014 en-de
2017/07/21 Jiayu Guo 10:00 23:00 13
  • process document
2017/07/24 Shipan Ren 9:00 20:00 11
  • tested these checkpoints of en-de dataset
  • found the new version takes less time
  • found that the two versions have similar complexity and BLEU values
2017/07/24 Jiayu Guo 9:00 22:00 13
  • read model code.
2017/07/25 Shipan Ren 9:00 20:00 11
  • trained translation models using the tf1.0 baseline and the tf0.1 baseline, respectively
  • dataset: WMT2014 en-fr
2017/07/25 Jiayu Guo 9:00 23:00 14
  • process document
2017/07/26 Shipan Ren 9:00 20:00 11
  • read papers about memory-augmented nmt
2017/07/26 Jiayu Guo 10:00 24:00 14
  • process document
2017/07/27 Shipan Ren 9:00 20:00 11
  • read papers about memory-augmented nmt
2017/07/27 Jiayu Guo 10:00 24:00 14
  • process document
2017/07/28 Shipan Ren 9:00 20:00 11
  • read memory-augmented nmt code
2017/07/28 Jiayu Guo 9:00 24:00 15
  • process document
2017/07/31 Shipan Ren 9:00 20:00 11
  • read memory-augmented nmt code
2017/07/31 Jiayu Guo 10:00 23:00 13
  • split the ancient-language text into single words
2017/08/01 Shipan Ren 9:00 20:00 11
  • tested these checkpoints of en-fr dataset
  • found the new version takes less time
  • found that the two versions have similar complexity and BLEU values
2017/08/01 Jiayu Guo 10:00 23:00 13
  • run seq2seq_model
2017/08/02 Shipan Ren 9:00 20:00 11
  • looked up the performance (BLEU scores) of other models
  • datasets: WMT2014 en-de and en-fr
2017/08/02 Jiayu Guo 10:00 23:00 13
  • process document
2017/08/03 Shipan Ren 9:00 20:00 11
  • looked up the performance (BLEU scores) of other seq2seq models
  • datasets: WMT2014 en-de and en-fr
2017/08/03 Jiayu Guo 10:00 23:00 13
  • process document
2017/08/04 Shipan Ren 9:00 20:00 11
  • learn Moses
2017/08/04 Jiayu Guo 10:00 23:00 13
  • search for new data (Songshu)
2017/08/07 Shipan Ren 9:00 20:00 11
  • installed and built Moses on the server
2017/08/07 Jiayu Guo 9:00 22:00 13
  • process document
2017/08/08 Shipan Ren 9:00 20:00 11
  • train a statistical machine translation model and test it
  • dataset: zh-en small
  • check that Moses works correctly
2017/08/08 Jiayu Guo 10:00 21:00 11
  • read about TensorFlow
2017/08/09 Shipan Ren 9:00 20:00 11
  • write automation scripts to process data, train models, and test models
  • toolkit: Moses
2017/08/09 Jiayu Guo 10:00 23:00 13
  • run the model on data whose ancient-language text is split into single characters.
2017/08/10 Shipan Ren 9:00 20:00 11
  • train statistical machine translation models and test them
  • dataset: zh-en big, WMT2014 en-de, WMT2014 en-fr
2017/08/10 Jiayu Guo 9:00 23:00 13
  • process the Songshu data
  • read papers on CNNs
2017/08/11 Shipan Ren 9:00 20:00 11
  • collate experimental results
  • compare our baseline model with Moses
2017/08/11 Jiayu Guo 9:00 20:00 11
  • test results.
2017/08/14 Shipan Ren 9:00 20:00 11
  • read paper about THUMT
2017/08/14 Jiayu Guo 10:00 23:00 13
  • learn about Graphic Model of LSTM-Projected BPTT
  • search for data available for translation (Twenty-four-Shi)
2017/08/15 Shipan Ren 9:00 20:00 11
  • read THUMT manual and learn how to use it
2017/08/15 Jiayu Guo 11:00 23:30 12
  • run the model with data including Shiji and Zizhitongjian.
2017/08/16 Shipan Ren 9:00 20:00 11
  • train translation models and test them
  • toolkit: THUMT
  • dataset: zh-en small
  • check that THUMT works correctly
2017/08/16 Jiayu Guo 10:00 23:00 10

checkpoint-100000 translation model BLEU: 11.11

  • source:在秦者名错,与张仪争论,於是惠王使错将伐蜀,遂拔,因而守之。
  • target:在秦国的名叫司马错,曾与张仪发生争论,秦惠王采纳了他的意见,于是司马错率军攻蜀国,攻取后,又让他做了蜀地郡守。
  • trans:当时秦国的人都很欣赏他的建议,与张仪一起商议,所以吴王派使者率军攻打蜀地,一举攻,接着又下令守城 。
  • source:神大用则竭,形大劳则敝,形神离则死 。
  • target:精神过度使用就会衰竭,形体过度劳累就会疲惫,神形分离就会死亡。
  • trans: 精神过度就可衰竭,身体过度劳累就会疲惫,地形也就会死。
  • source:今天子接千岁之统,封泰山,而余不得从行,是命也夫,命也夫!
  • target:现天子继承汉朝千年一统的大业,在泰山举行封禅典礼而我不能随行,这是命啊,是命啊!
  • trans: 现在天子可以继承帝位的成就爵位,爵位至泰山,而我却未能执行先帝的命运。
  • 1. Using Zizhitongjian data only (6,000 pairs), BLEU reaches 6 at most.
  • 2. Using Zizhitongjian data only (12,000 pairs), BLEU reaches 7 at most.
  • 3. Using Shiji and Zizhitongjian data (43,0000 pairs), BLEU reaches about 9.
  • 4. Using Shiji and Zizhitongjian data (43,0000 pairs) with the ancient-language text split character by character, BLEU reaches 11.11 at most.
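The character-by-character split of the ancient-language text mentioned in the results above can be sketched as follows. This is a minimal illustration, not the script actually used: the function name `split_into_characters` is hypothetical, and it assumes that punctuation is treated as an ordinary token and that existing whitespace is discarded.

```python
def split_into_characters(line: str) -> str:
    """Return the line with every non-whitespace character separated by one space.

    Assumption: punctuation marks are kept as standalone tokens and any
    whitespace already present in the line is dropped.
    """
    return " ".join(ch for ch in line if not ch.isspace())


if __name__ == "__main__":
    # Source sentence taken from the examples above.
    source = "神大用则竭,形大劳则敝,形神离则死 。"
    print(split_into_characters(source))
```

Splitting at the character level keeps the source vocabulary small, which may be why it helped on a corpus of this size; the log itself only records the BLEU difference, not the cause.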
2017/08/17 Shipan Ren 9:00 20:00 11
  • write automation scripts to process data, train models, and test models
  • train translation models and test them
  • toolkit: THUMT
  • dataset: zh-en big
2017/08/17 Jiayu Guo 13:00 23:00 10
  • read source code.
2017/08/18 Shipan Ren 9:00 20:00 11
  • test translation models using a single reference and multiple references
  • organize all the experimental results (our baseline system, Moses, THUMT)
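To illustrate why single-reference and multiple-reference tests give different scores, here is a small corpus-level BLEU sketch in plain Python. It is not the evaluation script used in these experiments; the function names `ngram_counts` and `corpus_bleu` are hypothetical, tokens are assumed to be pre-split, and no smoothing is applied. With several references, each hypothesis n-gram is clipped against the maximum count in any reference, and the brevity penalty uses the closest reference length.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references_list, max_n=4):
    """Unsmoothed corpus BLEU on a 0-100 scale.

    hypotheses: list of token lists.
    references_list: for each hypothesis, a list of one or more reference token lists.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, refs in zip(hypotheses, references_list):
        hyp_len += len(hyp)
        # Closest reference length; ties go to the shorter reference.
        ref_len += min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp, n)
            # Maximum count of each n-gram over all references.
            max_ref = Counter()
            for r in refs:
                for gram, count in ngram_counts(r, n).items():
                    max_ref[gram] = max(max_ref[gram], count)
            matches[n - 1] += sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if min(matches) == 0:
        return 0.0  # some n-gram order has no match; unsmoothed BLEU is zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)


if __name__ == "__main__":
    hyp = "the cat sat on the mat".split()
    ref_a = "a cat sat on the mat".split()
    ref_b = "the cat sat on the mat".split()
    print("single reference:", corpus_bleu([hyp], [[ref_a]]))
    print("two references:  ", corpus_bleu([hyp], [[ref_a, ref_b]]))
```

Because extra references can only add matching n-grams, the multi-reference score is never lower than the score against any one of those references when the hypothesis is at least as long as the reference it is compared with, so the two kinds of results should not be compared directly.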
2017/08/18 Jiayu Guo 13:00 22:00 9
  • read source code.
2017/08/21 Shipan Ren 10:00 22:00 12
  • read the released information of other translation systems
2017/08/21 Jiayu Guo 9:30 21:30 12
  • read the source code and learn tensorflow
2017/08/22 Shipan Ren 10:00 22:00 12
  • cleaned up the code
2017/08/22 Jiayu Guo 9:00 22:00 12
  • read the source code
2017/08/23 Shipan Ren 10:00 21:00 11
  • wrote the documents
2017/08/23 Jiayu Guo 9:00 22:00 11
  • read the source code and learn tensorflow
2017/08/24 Shipan Ren 10:00 20:00 10
  • wrote the documents
2017/08/24 Jiayu Guo 9:10 22:00 10.5
  • read the source code and learn tensorflow
2017/08/25 Shipan Ren 10:00 20:00 10
  • check experimental results
2017/08/25 Jiayu Guo 8:50 22:00 10.5
  • read the source code and learn tensorflow
2017/08/28 Shipan Ren 10:00 20:00 10
  • wrote the paper on ViVi_NMT (version 1.0)
2017/08/28 Jiayu Guo 8:10 21:00 11
  • read the source code and learn tensorflow
2017/08/29 Shipan Ren 10:00 20:00 10
  • wrote the paper on ViVi_NMT (version 1.0)
2017/08/29 Jiayu Guo 11:00 21:00 10
  • read the source code and learn tensorflow
2017/08/30 Shipan Ren 10:00 20:00 10
  • wrote the paper on ViVi_NMT (version 1.0)
2017/08/30 Jiayu Guo 11:30 21:00 9
  • learn the VV model
2017/08/31 Shipan Ren 10:00 20:00 10
  • wrote the paper on ViVi_NMT (version 1.0)
2017/08/31 Jiayu Guo 10:00 20:00 10
  • clean up the code
Date Yang Feng Jiyuan Zhang

Past progress

nlp-progress 2017/03

nlp-progress 2017/02

nlp-progress 2017/01

nlp-progress 2016/12

nlp-progress 2016/11

nlp-progress 2016/10

nlp-progress 2016/09

nlp-progress 2016/08

nlp-progress 2016/05-07

nlp-progress 2016/04