Jt-chinese: Difference between revisions
From cslt Wiki
Latest revision as of 06:58, 1 December 2014 (Monday)
data and model
- train
  - size: 62M
  - 8k-sentence from jt (about dianxin)
- dev
  - 1000 rows from the training data
- dict
  - chn_150576.txt (15w, i.e. about 150k words)
- model (Train Set Environment):

| Parameters | hidden | class | direct | bptt | bptt_block | threads | direct-order | rand_seed | nwords | time | iter |
|---|---|---|---|---|---|---|---|---|---|---|---|
| set1 | 320 | 300 | 2000 | 5 | 20 | 1 | 3 | 1 | 10000 | 31h | 8 |
- ppl
  - dev: 86-66 (ppl)
- learning rate
  - 0.1-0.00625
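The learning-rate range above is consistent with repeated halving: 0.1 halved four times gives 0.00625. The sketch below only illustrates that arithmetic. The halving rule is an assumption (the page does not state how the rate was reduced), and the function name `halving_schedule` is hypothetical.

```python
def halving_schedule(start, stop):
    """Return the learning rates obtained by halving `start` until `stop` is reached."""
    rates = [start]
    # Keep halving while the next value is still at or above the final rate.
    # The small tolerance guards against floating-point rounding.
    while rates[-1] / 2 >= stop * (1 - 1e-9):
        rates.append(rates[-1] / 2)
    return rates


rates = halving_schedule(0.1, 0.00625)
print(rates)
print(len(rates) - 1, "halvings")
```

Under this assumption the schedule has five distinct rates, i.e. four halvings between the first and last value.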
sampling data from rnnlm
- different sizes of sampled data
  - socer_ngram: 88.147, rnnlm: 76.59, combine(0.5): 75.35
| size | mix0 | mix0.3 | mix0.5 | mix0.7 | time |
|---|---|---|---|---|---|
| 50M | 105.457 | 86.7 | 87.5 | 89.7 | 0.5h |
| 100M | 96.13 | 86.71 | 87.19 | 88.98 | 1.5h |
| 150M | 103.95 | 86.59 | 86.93 | 88.46 | 2h |
| 200M | 92.99 | 86.54 | 86.79 | 88.16 | 2.5h |
| 250M | 92.44 | 86.55 | 86.77 | 88.07 | 3h |
| 300M | 101.898 | 86.50 | 86.66 | 87.85 | 3.5h |
| 350M | 98.8898 | 86.417 | 86.52 | 87.63 | 4h |
| 500M | 98.21 | 86.17 | 86.119 | 86.99 | 6h |
| 1000M | 87.226 | 85.83 | 85.54 | 86.10 | 10h |
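The `mix` columns and the `combine(0.5)` figure presumably refer to linear interpolation of two language models, with the number giving the interpolation weight. The following is a minimal sketch of how perplexity under such a mixture can be computed. It assumes per-word probabilities from both models are available. The page does not say which model the weight applies to, and the probabilities below are made up purely for illustration. The names `perplexity` and `interpolate` are hypothetical.

```python
import math


def perplexity(probs):
    """Perplexity of a word sequence given its per-word probabilities."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


def interpolate(probs_a, probs_b, weight):
    """Linear interpolation: weight * p_a + (1 - weight) * p_b for each word."""
    return [weight * a + (1 - weight) * b for a, b in zip(probs_a, probs_b)]


# Hypothetical per-word probabilities from two models on the same 4-word text.
ngram_probs = [0.02, 0.001, 0.05, 0.01]
rnnlm_probs = [0.01, 0.004, 0.03, 0.02]

for w in (0.0, 0.3, 0.5, 0.7, 1.0):
    mixed = interpolate(ngram_probs, rnnlm_probs, w)
    print(f"weight={w:.1f} ppl={perplexity(mixed):.2f}")
```

Because the logarithm is concave, the perplexity of an interpolated model never exceeds the worse of the two individual perplexities, which is why sweeping the weight as in the table is a cheap way to look for a gain.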