Difference between revisions of "NLP Status Report 2017-7-31"
From cslt Wiki
(12 intermediate revisions by 4 users not shown) | |||
Line 3: | Line 3: | ||
!Date !! People !! Last Week !! This Week | !Date !! People !! Last Week !! This Week | ||
|- | |- | ||
− | | rowspan="6"|2017/7/ | + | | rowspan="6"|2017/7/31 |
|Jiyuan Zhang || | |Jiyuan Zhang || | ||
− | *made the poster for ACL | + | *made the poster for ACL [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/9/95/Acl2017-poster.pdf] |
*attempted to fix the repeated-word problem, but failed | *attempted to fix the repeated-word problem, but failed | ||
*did some work on an n-gram model for couplets | *did some work on an n-gram model for couplets | ||
|| | || | ||
− | *complete | + | *generate streame according to a couplet |
+ | *complete the task of filling in the blanks of a couplet | ||
+ | |||
|- | |- | ||
|Aodong LI || | |Aodong LI || | ||
− | + | * Got 55,000+ English poems and 260,000+ lines after preprocessing | |
+ | * Added phrase separators as the style indicator, so that every line has at least one separator | ||
+ | * Training loss did not decrease very much, only from 440 to 50 | ||
+ | * The translation quality deteriorated when the language model was added | ||
|| | || | ||
− | + | * Try to use a larger language model to decrease the training loss | |
+ | * Try to use character-based MT in English-Chinese translation | ||
|- | |- | ||
|Shiyue Zhang || | |Shiyue Zhang || | ||
Line 22: | Line 28: | ||
|- | |- | ||
|Shipan Ren || | |Shipan Ren || | ||
− | * | + | * looked for the performance (BLEU scores) of other models |
− | + | on the WMT2014 dataset in published papers, but found none. |
− | + | * installed and built Moses on the server | |
− | + | ||
− | * | + | |
|| | || | ||
− | * | + | * train a statistical machine translation model and test it |
− | * | + | toolkit: Moses |
+ | data sets: WMT2014 en-de and en-fr | ||
+ | * collate experimental results and compare our baseline model with Moses | ||
|- | |- | ||
|Jiayu Guo|| | |Jiayu Guo|| | ||
− | *Until now, Shiji has been split up to 2, | + | *processed documents. So far, Shiji has been split into 24,000 sentence pairs. |
− | + | *Zizhitongjian has been split into 16,000 sentence pairs. |
|| | || | ||
− | * | + | *adjust the jieba source code to make its word segmentation more accurate for the ancient language |
+ | *read model source code | ||
|- | |- | ||
|} | |} |
Latest revision as of 00:51, 21 August 2017 (Monday)
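Date | People | Last Week | This Week |
---|---|---|---|
2017/7/31 | Jiyuan Zhang | made the poster for ACL [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/9/95/Acl2017-poster.pdf]; attempted to fix the repeated-word problem, but failed; did some work on an n-gram model for couplets | generate streame according to a couplet; complete the task of filling in the blanks of a couplet |
 | Aodong LI | Got 55,000+ English poems and 260,000+ lines after preprocessing; added phrase separators as the style indicator, so that every line has at least one separator; training loss did not decrease very much, only from 440 to 50; the translation quality deteriorated when the language model was added | Try to use a larger language model to decrease the training loss; try to use character-based MT in English-Chinese translation |
 | Shiyue Zhang | | |
 | Shipan Ren | looked for the performance (BLEU scores) of other models on the WMT2014 dataset in published papers, but found none; installed and built Moses on the server | train a statistical machine translation model and test it (toolkit: Moses, data sets: WMT2014 en-de and en-fr); collate experimental results and compare our baseline model with Moses |
 | Jiayu Guo | processed documents. So far, Shiji has been split into 24,000 sentence pairs; Zizhitongjian has been split into 16,000 sentence pairs | adjust the jieba source code to make its word segmentation more accurate for the ancient language; read model source code |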