Difference between revisions of "NLP Status Report 2017-8-7"

From cslt Wiki
 
|-
|Aodong LI ||
* Got 55,000+ English poems and 260,000+ lines after preprocessing
* Added phrase separators as the style indicator, so every line has at least one separator
* Training loss didn't decrease very much, only from 440 to 50
* Translation quality deteriorated when the language model was added
||
* Try to use a larger language model to decrease the training loss
* Try to use character-based MT in English-Chinese translation (see the character-segmentation sketch after this table)
|-
|Shiyue Zhang ||
 
|-
|Shipan Ren ||
* looked for the performance (BLEU scores) of other models on the WMT2014 dataset in published papers, but did not find any
* installed and built Moses on the server
 
||
* train a statistical machine translation model and test it
   toolkit: Moses
   data sets: WMT2014 en-de and en-fr
* collate experimental results; compare our baseline model with Moses
   en-de dataset
   Moses: 15.4
   Baseline: 14.87
   en-fr dataset
   under training
||
* read the memory-augmented NMT code
* think about the next step of work
|-
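
Aodong LI's plan above mentions trying character-based MT for English-Chinese translation. The following is a minimal, illustrative sketch (not the group's actual pipeline) of how the Chinese side of a parallel corpus could be split into character tokens before training; the file names are assumptions.

<pre>
# Illustrative sketch only: split the Chinese side of a parallel corpus into
# character tokens for character-based MT. File names are hypothetical;
# the English side would normally stay word-tokenized.

def to_char_tokens(line):
    """Return a Chinese sentence as space-separated characters,
    keeping ASCII spans (numbers, Latin words) as single tokens."""
    tokens, ascii_buf = [], []
    for ch in line.strip():
        if ch.isascii() and not ch.isspace():
            ascii_buf.append(ch)            # keep Latin/number spans intact
        else:
            if ascii_buf:
                tokens.append("".join(ascii_buf))
                ascii_buf = []
            if not ch.isspace():
                tokens.append(ch)           # one token per CJK character
    if ascii_buf:
        tokens.append("".join(ascii_buf))
    return " ".join(tokens)

if __name__ == "__main__":
    # Hypothetical line-aligned corpus files.
    with open("train.zh", encoding="utf-8") as src, \
         open("train.char.zh", "w", encoding="utf-8") as out:
        for line in src:
            out.write(to_char_tokens(line) + "\n")
</pre>

A character-level Chinese vocabulary is only a few thousand types, which removes out-of-vocabulary words on the Chinese side; whether it helps translation quality here is exactly what the experiment would test.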
 

Revision as of 05:08, 7 August 2017 (Mon)

Date People Last Week This Week
2017/7/3 Jiyuan Zhang
  • generated a stream according to a couplet
  • almost completed the task of filling in the blanks of a couplet
  • continue to perfect the couplet model
Aodong LI
Shiyue Zhang
Shipan Ren
  • train a statistical machine translation model and test it
 toolkit: Moses
 data sets: WMT2014 en-de and en-fr
  • collate experimental results; compare our baseline model with Moses (see the BLEU-scoring sketch at the end of this page)
 en-de dataset
 Moses: 15.4
 Baseline: 14.87
 
 en-fr dataset
 under training
  • read the memory-augmented NMT code
  • think about the next step of work
Jiayu Guo
  • processed documents; until now, Shiji has been split into 24,000 sentence pairs
  • Zizhitongjian has been split into 16,000 pairs
  • adjusted the jieba source code to make word segmentation more accurate for ancient Chinese (see the jieba sketch at the end of this page)
  • read model source code
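
Shipan Ren's comparison of Moses against the baseline is reported as corpus-level BLEU (15.4 vs 14.87 on en-de). Below is a rough sketch of how two systems' outputs could be scored against the same references; the file names are placeholders, and in practice a standard tool such as Moses' multi-bleu.perl would produce the reported numbers.

<pre>
# Rough sketch (not the group's actual evaluation script): score two systems'
# outputs against the same references with corpus-level BLEU via NLTK.
# File names below are hypothetical placeholders.

from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def read_tokenized(path):
    """Read one tokenized sentence per line as a list of token lists."""
    with open(path, encoding="utf-8") as f:
        return [line.split() for line in f]

# corpus_bleu expects, per sentence, a list of references; here one each.
references = [[ref] for ref in read_tokenized("newstest2014.tok.de")]
moses_hyp = read_tokenized("moses.out.de")
baseline_hyp = read_tokenized("baseline.out.de")

smooth = SmoothingFunction().method1
print("Moses    BLEU: %.2f" % (100 * corpus_bleu(references, moses_hyp, smoothing_function=smooth)))
print("Baseline BLEU: %.2f" % (100 * corpus_bleu(references, baseline_hyp, smoothing_function=smooth)))
</pre>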
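
Jiayu Guo's item mentions adjusting jieba's source code for ancient-Chinese segmentation. As an illustration of a lighter-weight route, jieba's public user-dictionary API can bias segmentation without patching the source; the dictionary entries and file path below are made-up examples, not the actual adjustments.

<pre>
# Illustrative sketch: bias jieba toward classical-Chinese vocabulary through
# its public API instead of editing its source. Entries and paths are examples.

import jieba

# A user dictionary with one entry per line ("word [freq] [pos]");
# the path here is hypothetical.
# jieba.load_userdict("classical_terms.txt")

# Add individual terms that should stay as single tokens.
jieba.add_word("资治通鉴")   # Zizhitongjian, one of the texts being split
jieba.add_word("太史公")     # "the Grand Historian", frequent in Shiji

print("/".join(jieba.cut("太史公曰此天下之大事也")))
</pre>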