Difference between revisions of "Qixin Wang 2016-01-25"

From cslt Wiki
Latest revision as of 23:59, 31 January 2016 (Sunday)

Work done this week

word vector size:200

hidden size:500

mlp hidden size:400

maxout size:300

adadelta 0.3

---

fast mode, added cut, no global, no pz

zgt:song, si, giga, update: (grid-9, grid-9, grid-17, grid-17)

psm:song, si, giga, update: (grid-15, grid-15, grid-13, grid-11)

---

with dropout & without maxout:

batch_all(zgt): grid-12

batch_all_go(zgt): grid-11

---

batch training code:

debugging in progress

---

int32 * float32 -> float64

float32 * float32 -> float32
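The two promotion rules above can be reproduced with NumPy arrays. This is only a sketch: the notes do not name the framework in use, so NumPy is an assumption here, and the array names are made up for illustration. Casting the integer operand to float32 first is one way to avoid the silent upcast to float64.

```python
import numpy as np

# Hypothetical example arrays; the names are not from the original notes.
ints = np.arange(3, dtype=np.int32)
floats32 = np.ones(3, dtype=np.float32)

mixed = ints * floats32       # int32 * float32 is upcast to float64
same = floats32 * floats32    # float32 * float32 stays float32

# Casting the integer operand first keeps the result in float32.
fixed = ints.astype(np.float32) * floats32

print(mixed.dtype, same.dtype, fixed.dtype)
```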

---

fixed several bugs

added maxout

added update vectors

added dropout

deleted 224 long iambics (length > 120)
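For reference, the maxout and dropout changes listed above can be sketched in NumPy as follows. This is a generic illustration, not the code used in these experiments: the function names, the weight layout (d_in, d_out, k), the number of pieces k, and the use of inverted dropout are all assumptions.

```python
import numpy as np

def maxout(x, W, b):
    """Maxout layer: each output unit is the max over k affine pieces.

    x: (batch, d_in), W: (d_in, d_out, k), b: (d_out, k).
    The shapes are an assumed layout, not taken from the notes.
    """
    z = np.einsum("bi,iok->bok", x, W) + b  # (batch, d_out, k)
    return z.max(axis=-1)                   # (batch, d_out)

def dropout(x, rate, rng, train=True):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    if not train or rate == 0.0:
        return x                            # identity at test time
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep       # True where the unit is kept
    return (x * mask / keep).astype(x.dtype)
```

With inverted dropout the rescaling happens during training, so no extra scaling is needed at test time.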

---

mini_batch data parallel training

400k sentences per day (iambics are longer than QA questions), up to 50x faster (now as fast as Bengio's code)

but still fast enough for the 15k song iambics

finished some experiments (one cipai, two cipai)
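A minimal sketch of how the length cut (length > 120) and the mini-batching noted above might fit together. It is an illustration only: `make_batches`, `pad_id`, and the sort-by-length step are assumptions, and distributing batches across workers for data-parallel training is not shown.

```python
def make_batches(sentences, batch_size, max_len=120, pad_id=0):
    """Drop over-long sequences, then pad the rest into mini-batches.

    Returns a list of (tokens, mask) pairs; mask is 1 for real tokens
    and 0 for padding. All names here are hypothetical.
    """
    # Keep only sequences within the length limit (length > 120 is dropped).
    kept = [s for s in sentences if len(s) <= max_len]
    # Sorting by length keeps padding within each batch small.
    kept.sort(key=len)

    batches = []
    for start in range(0, len(kept), batch_size):
        chunk = kept[start:start + batch_size]
        width = max(len(s) for s in chunk)
        tokens = [list(s) + [pad_id] * (width - len(s)) for s in chunk]
        mask = [[1] * len(s) + [0] * (width - len(s)) for s in chunk]
        batches.append((tokens, mask))
    return batches
```

The mask lets the loss ignore padded positions, so sequences of different lengths can share one batch.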