Difference between revisions of “NLP Status Report 2017-7-31”
From cslt Wiki
(process the ancient document) |
(16 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
− | Until now, Shiji has been split up to 2, | + | |
− | Zizhitongjian has been split up to | + | {| class="wikitable" |
+ | !Date !! People !! Last Week !! This Week | ||
+ | |- | ||
+ | | rowspan="6"|2017/7/31 | ||
+ | |Jiyuan Zhang || | ||
+ | *made the poster for ACL [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/9/95/Acl2017-poster.pdf] | ||
+ | *attempted to fix the repeated-word problem, but failed | ||
+ | *did some work on an n-gram model for couplets | ||
+ | || | ||
+ | *generate streame according to a couplet | ||
+ | *complete the task of filling in the blanks of a couplet | ||
+ | |||
+ | |- | ||
+ | |Aodong LI || | ||
+ | * Got 55,000+ English poems and 260,000+ lines after preprocessing | ||
+ | * Added phrase separators as the style indicator; every line has at least one separator | ||
+ | * Training loss didn't decrease very much, only from 440 to 50 | ||
+ | * The translation quality deteriorated when a language model was added | ||
+ | || | ||
+ | * Try to use a larger language model to decrease the training loss | ||
+ | * Try to use character-based MT in English-Chinese translation | ||
+ | |- | ||
+ | |Shiyue Zhang || | ||
+ | |||
+ | || | ||
+ | |||
+ | |- | ||
+ | |Shipan Ren || | ||
+ | * looked in published papers for the performance (BLEU scores) of other models | ||
+ | on the WMT2014 dataset, but found none. | ||
+ | * installed and built Moses on the server | ||
+ | || | ||
+ | * train a statistical machine translation model and test it | ||
+ | toolkit: Moses | ||
+ | data sets: WMT2014 en-de and en-fr | ||
+ | * collate experimental results and compare our baseline model with Moses | ||
+ | |- | ||
+ | |||
+ | |Jiayu Guo|| | ||
+ | *processed the documents: so far, Shiji has been split into 24,000 sentence pairs | ||
+ | *Zizhitongjian has been split into 16,000 sentence pairs | ||
+ | || | ||
+ | *adjust the jieba source code to make jieba's word segmentation more accurate for ancient-language text | ||
+ | *read model source code | ||
+ | |- | ||
+ | |} |
Latest revision as of 00:51, 21 August 2017 (Mon)
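Date | People | Last Week | This Week |
---|---|---|---|
2017/7/31 | Jiyuan Zhang | made the poster for ACL [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/9/95/Acl2017-poster.pdf]; attempted to fix the repeated-word problem, but failed; did some work on an n-gram model for couplets | generate streame according to a couplet; complete the task of filling in the blanks of a couplet |
2017/7/31 | Aodong LI | Got 55,000+ English poems and 260,000+ lines after preprocessing; added phrase separators as the style indicator, and every line has at least one separator; training loss didn't decrease very much, only from 440 to 50; the translation quality deteriorated when a language model was added | Try to use a larger language model to decrease the training loss; try to use character-based MT in English-Chinese translation |
2017/7/31 | Shiyue Zhang | | |
2017/7/31 | Shipan Ren | looked in published papers for the performance (BLEU scores) of other models on the WMT2014 dataset, but found none; installed and built Moses on the server | train a statistical machine translation model and test it (toolkit: Moses; data sets: WMT2014 en-de and en-fr); collate experimental results and compare our baseline model with Moses |
2017/7/31 | Jiayu Guo | processed the documents: so far, Shiji has been split into 24,000 sentence pairs; Zizhitongjian has been split into 16,000 sentence pairs | adjust the jieba source code to make jieba's word segmentation more accurate for ancient-language text; read model source code |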