Difference between revisions of "Jt-chinese"

From cslt Wiki
(Created page with "{| border="2px" |+ Train Set Environment |- ! Parameters !! hidden !! class !! direct !! bbt !! bptt_block !! threads !!direct-order!!rand_seed!!nwords!!time(min) |...")

Lr (talk | contribs)
(sample data from rnnlm)

(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
+ =data and model=
+ * train
+ :* size: 62M
+ :* 8k sentences from jt (about dianxin)
+ * dev
+ :* 1000 rows from the train data
+ * dict
+ :* chn_150576.txt (15w, about 150k words)
+ * model
  {| border="2px"
  |+ Train Set Environment
  |-
- ! Parameters  !! hidden !! class !! direct !! bbt !! bptt_block !! threads !!direct-order!!rand_seed!!nwords!!time(min)
+ ! Parameters  !! hidden !! class !! direct !! bbt !! bptt_block !! threads !!direct-order!!rand_seed!!nwords!!time(min)!! iter
  |-
  !set1
- | 320 || 300 || 2000 || 4 || 20 || 1 || 3 || 1 || 10000||(31h)
+ | 320 || 300 || 2000 || 5 || 20 || 1 || 3 || 1 || 10000||(31h)||8
+ |-
+ |}
+ * ppl
+ :* dev: 86-66 (ppl)
+ * learning rate
+ :* 0.1-0.00625
+
+ =sampling data from rnnlm=
+ * different sizes of sampling data
+ :* socer_ngram: 88.147, rnnlm: 76.59, combine(0.5): 75.35
+ {| border="2px"
+ |+ different sizes of sampling data
+ |-
+ ! size  !! mix0 !! mix0.3 !! mix0.5 !! mix0.7 !! time
+ |-
+ !50M
+ | 105.457 || 86.7 || 87.5 || 89.7 ||0.5h
+ |-
+ !100M
+ |96.13 ||86.71||87.19||88.98||1.5h
+ |-
+ !150M
+ |103.95||86.59||86.93||88.46||2h
+ |-
+ !200M
+ |92.99||86.54||86.79||88.16||2.5h
+ |-
+ !250M
+ |92.44||86.55||86.77||88.07||3h
+ |-
+ !300M
+ |101.898||86.50||86.66||87.85||3.5h
+ |-
+ !350M
+ |98.8898||86.417||86.52||87.63||4h
+ |-
+ !500M
+ |98.21||86.17||86.119||86.99||6h
+ |-
+ !1000M
+ |87.226||85.83||85.54||86.10||10h
  |-
  |}

Latest revision as of 06:58, 1 December 2014 (Mon)

data and model

  • train
      • size: 62M
      • 8k sentences from jt (about dianxin)
  • dev
      • 1000 rows from the train data
  • dict
      • chn_150576.txt (15w, about 150k words)
  • model
Train Set Environment
Parameters  hidden  class  direct  bbt  bptt_block  threads  direct-order  rand_seed  nwords  time(min)  iter
set1        320     300    2000    5    20          1        3             1          10000   (31h)      8
  • ppl
      • dev: 86-66 (ppl)
  • learning rate
      • 0.1-0.00625
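
The learning-rate range above (0.1 down to 0.00625) is consistent with four successive halvings, since 0.1 / 2^4 = 0.00625. The sketch below only illustrates that arithmetic. The page does not say how the rate was actually decayed, so the halving rule and the name `halving_schedule` are assumptions made for illustration.

```python
def halving_schedule(start, end):
    """Return the rates from `start` down to `end`, halving at each step.

    Assumption: the rate is halved each time it is lowered. The source
    page only gives the two endpoints, 0.1 and 0.00625.
    """
    rates = [start]
    # Small tolerance so the loop stops once `end` has been reached.
    while rates[-1] > end * 1.0001:
        rates.append(rates[-1] / 2)
    return rates


rates = halving_schedule(0.1, 0.00625)
print(rates)
print("number of halvings:", len(rates) - 1)
```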

sampling data from rnnlm

  • different sizes of sampling data
      • socer_ngram: 88.147, rnnlm: 76.59, combine(0.5): 75.35
different sizes of sampling data
size    mix0      mix0.3   mix0.5   mix0.7   time
50M     105.457   86.7     87.5     89.7     0.5h
100M    96.13     86.71    87.19    88.98    1.5h
150M    103.95    86.59    86.93    88.46    2h
200M    92.99     86.54    86.79    88.16    2.5h
250M    92.44     86.55    86.77    88.07    3h
300M    101.898   86.50    86.66    87.85    3.5h
350M    98.8898   86.417   86.52    87.63    4h
500M    98.21     86.17    86.119   86.99    6h
1000M   87.226    85.83    85.54    86.10    10h
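
The `combine(0.5)` figure and the `mix0.3` / `mix0.5` / `mix0.7` columns suggest that two language models were linearly interpolated and the perplexity of the mixture was measured. The page does not say which tool was used or which model the weight applies to, so the following is only a minimal sketch of that computation. The per-word probabilities are made-up toy values, and the names `perplexity` and `interpolate` are hypothetical.

```python
import math


def perplexity(probs):
    """Perplexity = exp of the mean negative log-probability per word."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


def interpolate(p_a, p_b, weight):
    """Per-word linear interpolation: weight * p_a + (1 - weight) * p_b."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_a, p_b)]


# Toy per-word probabilities for the same four test words under two models.
# These numbers are invented for illustration and are NOT from the page.
p_ngram = [0.02, 0.10, 0.005, 0.03]
p_rnnlm = [0.01, 0.20, 0.010, 0.02]

# Assumption: `weight` is the share given to the first model (p_ngram here).
for weight in (0.0, 0.3, 0.5, 0.7, 1.0):
    ppl = perplexity(interpolate(p_ngram, p_rnnlm, weight))
    print(f"weight={weight:.1f}  ppl={ppl:.2f}")
```

Because the logarithm is concave, the interpolated model's perplexity can never exceed that of the worse of the two component models on the same test words.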