Difference between revisions of "NLP Status Report 2017-7-10"

From cslt Wiki
Latest revision as of 00:31, 21 August 2017 (Mon)

{| class="wikitable"
!Date !! People !! Last Week !! This Week
|-
| rowspan="6"|2017/7/10
|Jiyuan Zhang ||
* Reproduced the couplet model using Moses.
||
* Continue to modify the couplet model.
|-
|Aodong LI ||
* Tried a seq2seq model with a style code, but it did not work.
* Coded an attention-based seq2seq NMT model decoded in shallow fusion with a language model (see the shallow-fusion sketch after the table).
||
* Complete the coding and run a first experiment.
* Find more monolingual corpora and upgrade the model.
|-
|Shiyue Zhang ||
||
|-
|Shipan Ren ||
* Ran the two code versions on the small Chinese-English data set and tested the saved checkpoints (see the BLEU sketch after the table):
** version 1.0 saves about 0.03 s per step, and the two versions show similar complexity and BLEU values;
** BLEU remains good even when the model overfits (reason: the test and training sets of the small data set are similar in content and style).
* Ran the two code versions on the large Chinese-English data set:
** an OOM (out-of-memory) error occurred when version 0.1 was trained on the large data set, while version 1.0 worked (reason: improper resource allocation by the TensorFlow 0.1 framework exhausts memory; see the GPU-memory note after the table);
** after retrying four times (re-entering the same command), version 0.1 also worked;
** version 1.0 saves about 0.06 s per step, and the two versions show similar complexity and BLEU values.
* Downloaded the WMT 2014 data set, ran the code on the English-French portion, and found the translations are poor (reason: improper word segmentation).
||
* Do word segmentation (tokenization) on the WMT 2014 data set (see the tokenization sketch after the table).
* Run the two code versions on the WMT 2014 data set.
* Record the results and analyze them.
* Learn Moses and train it on the large Chinese-English data set.
|-
|}
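
Aodong LI's entry mentions decoding an attention-based seq2seq NMT model in shallow fusion with a language model. The report gives no implementation details, so the following is only a minimal sketch of the shallow-fusion scoring rule (translation-model and language-model log-probabilities combined with a weight during decoding); the callables <code>nmt_step</code> and <code>lm_step</code> and the weight value are illustrative assumptions, not part of ViVi_NMT.

<syntaxhighlight lang="python">
import numpy as np

def fuse(nmt_log_probs, lm_log_probs, lam=0.3):
    """Shallow fusion: score(y) = log p_nmt(y | x, y_<t) + lam * log p_lm(y | y_<t).
    Both arguments are 1-D arrays over the target vocabulary."""
    return nmt_log_probs + lam * lm_log_probs

def greedy_decode(nmt_step, lm_step, bos_id, eos_id, max_len=50, lam=0.3):
    """Greedy decoding with shallow fusion. `nmt_step` and `lm_step` map the
    current target prefix (a list of token ids) to log-probability arrays;
    they stand in for the trained NMT model and language model."""
    prefix = [bos_id]
    for _ in range(max_len):
        scores = fuse(nmt_step(prefix), lm_step(prefix), lam)
        next_id = int(np.argmax(scores))
        prefix.append(next_id)
        if next_id == eos_id:
            break
    return prefix
</syntaxhighlight>

In practice the same fused score is used inside beam search rather than greedy decoding, and the weight is tuned on a validation set.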

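Shipan Ren's entry compares the two ViVi_NMT versions by the BLEU of the checkpoints saved during training. The report does not say which scorer was used; the sketch below uses NLTK's <code>corpus_bleu</code> purely as an example of scoring a decoded test set against its references.

<syntaxhighlight lang="python">
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def checkpoint_bleu(hypotheses, references):
    """hypotheses: one tokenized system output per test sentence;
    references: a list of tokenized reference translations per test sentence."""
    smooth = SmoothingFunction().method1  # avoid zero scores on small test sets
    return corpus_bleu(references, hypotheses, smoothing_function=smooth)

# Toy example; in the experiments each saved checkpoint would decode the full
# test set, and the resulting scores would be compared across code versions.
refs = [[["the", "cat", "sat", "on", "the", "mat"]]]
hyps = [["the", "cat", "sat", "on", "a", "mat"]]
print(checkpoint_bleu(hyps, refs))
</syntaxhighlight>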
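The OOM error seen with version 0.1 on the large data set is attributed above to poor resource allocation in the old TensorFlow framework. One common mitigation in TensorFlow 1.x is to let the session grow GPU memory on demand instead of reserving it all at once; whether ViVi_NMT exposes this option is an assumption, and the snippet only shows the standard TF 1.x session configuration.

<syntaxhighlight lang="python">
import tensorflow as tf  # TensorFlow 1.x API

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
# Alternatively, cap the per-process share of GPU memory:
# config.gpu_options.per_process_gpu_memory_fraction = 0.9

with tf.Session(config=config) as sess:
    # Build or restore the NMT graph here and run training as usual.
    pass
</syntaxhighlight>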

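The plan item on word segmentation of the WMT 2014 data refers to tokenizing the raw English-French text before training (poor tokenization is named above as the reason for the poor translations). WMT data is usually preprocessed with the Moses <code>tokenizer.perl</code> script; the Python function below is only a rough stand-in for illustration.

<syntaxhighlight lang="python">
import re

def simple_tokenize(line):
    """Rough stand-in for a proper tokenizer such as Moses tokenizer.perl:
    lower-case, split punctuation off words, and normalize whitespace."""
    line = line.strip().lower()
    line = re.sub(r"([.,!?;:()\"'])", r" \1 ", line)  # detach punctuation
    line = re.sub(r"\s+", " ", line)                  # collapse spaces
    return line.split()

print(simple_tokenize("Hello, world! C'est l'été."))
# ['hello', ',', 'world', '!', 'c', "'", 'est', 'l', "'", 'été', '.']
</syntaxhighlight>

For the Chinese side of the Chinese-English data, a dedicated segmenter (for example jieba) would be used instead, since Chinese text has no spaces to split on.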