NLP Status Report 2017-5-31

From cslt Wiki

Date

People

Last Week

This Week

2017/5/31

Jiyuan Zhang

Aodong LI

coded a double-attention model with final_attn = alpha * attn_ch + beta * attn_en
baseline BLEU = 43.87
experiments with randomly initialized embeddings:

alpha	beta	BLEU
1	1	43.50
4/3	2/3	43.58 (without retraining)
2/3	4/3	41.22 (without retraining)
2/3	4/3	42.36 (with retraining)

experiments with constant-initialized embeddings:

alpha	beta	BLEU
1	1	45.41
4/3	2/3	45.79
2/3	4/3	45.32

1.4 to 1.9 BLEU improvement over the baseline (43.87) with constant-initialized embeddings
this model is similar to multi-source neural translation but uses fewer resources
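
The merge rule and the reported gain can be checked with a small sketch. It assumes attn_ch and attn_en are same-length vectors (for example, context vectors from the Chinese and English attention) and that the merge is element-wise; the report does not say which quantity is merged, and the function name and toy vectors here are hypothetical. The BLEU figures are copied from the tables above.

```python
def merge_attention(attn_ch, attn_en, alpha, beta):
    """Return alpha * attn_ch + beta * attn_en, element by element."""
    if len(attn_ch) != len(attn_en):
        raise ValueError("attention vectors must have the same length")
    return [alpha * c + beta * e for c, e in zip(attn_ch, attn_en)]


# Hypothetical toy vectors, not taken from the experiments.
attn_ch = [0.5, 0.25, 0.25]
attn_en = [0.25, 0.25, 0.5]

# The (alpha, beta) settings from the tables; each pair sums to 2.
settings = [(1.0, 1.0), (4 / 3, 2 / 3), (2 / 3, 4 / 3)]
for alpha, beta in settings:
    print(alpha, beta, merge_attention(attn_ch, attn_en, alpha, beta))

# Gain over the baseline for the constant-initialized runs.
BASELINE_BLEU = 43.87
constant_init_bleu = {"1, 1": 45.41, "4/3, 2/3": 45.79, "2/3, 4/3": 45.32}
gains = {k: round(v - BASELINE_BLEU, 2) for k, v in constant_init_bleu.items()}
print(gains)
```

The smallest and largest gains in this check correspond to the 1.4 to 1.9 range quoted above.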

Test the model on big data
Explore different attention merge strategies
Explore hierarchical model

Shiyue Zhang

found a dropout bug, fixed it, and reran the baseline: baseline 35.21, baseline(outproj=emb) 35.24
tried several embed set models; all failed
embedded other words into the model's embedding space (trained on the training data, not big data), then used them directly in baseline(outproj=emb)

vocab size	30000	50000	70000	90000
BLEU	35.24	34.52	33.73	33.16
	4564 (6666)	4535	4469	4426

m-nmt is running
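
A minimal sketch of why extra word vectors can be used directly in baseline(outproj=emb), assuming that setting means the output projection reuses the embedding matrix (weight tying). The report does not confirm this reading, and all names and numbers below are hypothetical. Under that assumption, appending rows for new words to the embedding matrix also extends the output vocabulary, with no retraining of the projection.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def tied_logits(hidden, embedding):
    """Score each word with the embedding matrix as the output projection:
    logit[w] = hidden . embedding[w]."""
    return [dot(hidden, row) for row in embedding]


# Hypothetical 2-dimensional embeddings for a 3-word vocabulary.
embedding = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical vectors for extra words already mapped into the same space.
extra_words = [[0.5, 0.5], [2.0, -1.0]]

hidden = [0.5, 2.0]  # hypothetical decoder state
base_logits = tied_logits(hidden, embedding)
expanded_logits = tied_logits(hidden, embedding + extra_words)

print(base_logits)
print(expanded_logits)
```

The scores for the original words are unchanged after the expansion; only new candidates are added, which is one way a larger vocabulary can shift the final ranking.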

train word2vec on big data and compare it with word2vec trained on the training data
test the m-nmt model, then increase the vocab size and test again
review zh-uy/uy-zh related work and start writing the paper

Shipan Ren

wrote documentation for the tf_translate project
read neural machine translation paper
read tf_translate code
ran and tested the tf_translate code

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=NLP_Status_Report_2017-5-31&oldid=27249"