Difference between revisions of "Jt-chinese"
From cslt Wiki
Latest revision as of 06:58, 1 December 2014 (Monday)
data and model
- train
  - size: 62M
  - 8k-sentence from jt (about dianxin, i.e. telecom)
- dev
  - 1000 rows taken from the training data
- dict
  - chn_150576.txt (15w, i.e. about 150k words)
- model (Train Set Environment)
Parameters | hidden | class | direct | bptt | bptt_block | threads | direct-order | rand_seed | nwords | time | iter |
---|---|---|---|---|---|---|---|---|---|---|---|
set1 | 320 | 300 | 2000 | 5 | 20 | 1 | 3 | 1 | 10000 | 31h | 8 |
- ppl
  - dev: 86-66
- learning rate
  - 0.1-0.00625
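The two learning-rate endpoints differ by exactly a factor of 16, which is what four successive halvings of 0.1 would give. The sketch below only illustrates that arithmetic; that training actually halved the rate step by step is an assumption, since the page records just the start and end values. The function name `halving_schedule` is hypothetical.

```python
def halving_schedule(start, end, tol=1e-12):
    """List the rates obtained by halving `start` until `end` is reached.

    Assumption: the rate is halved at each step; the page only gives
    the first and last values (0.1 and 0.00625).
    """
    rates = [start]
    while rates[-1] > end + tol:
        rates.append(rates[-1] / 2)
    return rates


schedule = halving_schedule(0.1, 0.00625)
for step, rate in enumerate(schedule):
    print(step, rate)
```

Under this assumption the schedule has five distinct rates, i.e. four halvings between 0.1 and 0.00625.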
sampling data from rnnlm
- different sizes of sampled data
  - socer_ngram: 88.147, rnnlm: 76.59, combine(0.5): 75.35
size | mix0 | mix0.3 | mix0.5 | mix0.7 | time |
---|---|---|---|---|---|
50M | 105.457 | 86.7 | 87.5 | 89.7 | 0.5h |
100M | 96.13 | 86.71 | 87.19 | 88.98 | 1.5h |
150M | 103.95 | 86.59 | 86.93 | 88.46 | 2h |
200M | 92.99 | 86.54 | 86.79 | 88.16 | 2.5h |
250M | 92.44 | 86.55 | 86.77 | 88.07 | 3h |
300M | 101.898 | 86.50 | 86.66 | 87.85 | 3.5h |
350M | 98.8898 | 86.417 | 86.52 | 87.63 | 4h |
500M | 98.21 | 86.17 | 86.119 | 86.99 | 6h |
1000M | 87.226 | 85.83 | 85.54 | 86.10 | 10h |
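The `combine(0.5)` figure and the `mix` columns read like linear interpolation of two language models followed by a perplexity measurement. A minimal sketch of that computation is below. The page does not say which model each mix weight applies to or how the scores were produced, so the pairing here is an assumption, the per-word probabilities are made-up toy values, and the names `perplexity` and `interpolate` are hypothetical.

```python
import math


def perplexity(probs):
    """Perplexity of a word sequence given the probability assigned to each word."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def interpolate(p_a, p_b, weight):
    """Per-word linear interpolation: weight * p_a + (1 - weight) * p_b."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_a, p_b)]


# Toy per-word probabilities, invented for illustration only;
# they are NOT taken from the experiments on this page.
p_ngram = [0.02, 0.10, 0.005, 0.03]
p_rnnlm = [0.03, 0.05, 0.010, 0.04]

for w in (0.0, 0.3, 0.5, 0.7):
    mixed = interpolate(p_rnnlm, p_ngram, w)
    print(f"weight={w:.1f} ppl={perplexity(mixed):.2f}")
```

With weight 0 the result is the perplexity of the second model alone and with weight 1 that of the first, so a mix column can be compared directly against the single-model scores.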