Difference between revisions of “Qixin Wang 2016-01-25”

From cslt Wiki

Revision as of 01:11, 28 January 2016 (Thursday)

Work done this week

word vector size: 200

hidden size: 500

mlp hidden size: 400

maxout size: 300

adadelta 0.3
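
For reference, a minimal sketch of a single Adadelta update in NumPy. The notes do not say what the 0.3 refers to, so it is not used here; `rho=0.95` and `eps=1e-6` are placeholder values, and `adadelta_step` is a hypothetical helper, not the training code itself.

```python
import numpy as np

def adadelta_step(param, grad, acc_grad, acc_delta, rho=0.95, eps=1e-6):
    """One Adadelta update; returns the new (param, acc_grad, acc_delta).

    rho and eps are assumed placeholder values, not taken from the notes.
    """
    # Running average of squared gradients.
    acc_grad = rho * acc_grad + (1.0 - rho) * grad ** 2
    # Step scaled by RMS(previous updates) / RMS(gradients).
    delta = -np.sqrt(acc_delta + eps) / np.sqrt(acc_grad + eps) * grad
    # Running average of squared updates.
    acc_delta = rho * acc_delta + (1.0 - rho) * delta ** 2
    return param + delta, acc_grad, acc_delta

# Toy usage: minimise f(w) = w^2, whose gradient is 2w.
w = np.array([1.0])
acc_g = np.zeros_like(w)
acc_d = np.zeros_like(w)
for _ in range(500):
    w, acc_g, acc_d = adadelta_step(w, 2.0 * w, acc_g, acc_d)
```

Both accumulators start at zero and are carried from step to step, one pair per parameter array.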

---

fast mode, added cut, no global, no pz

zgt:song, si, giga, update: (grid-9, grid-9, grid-17, grid-17)

psm:song, si, giga, update: (grid-15, grid-15, grid-13, grid-11)

---

with dropout & without maxout:

batch_all(zgt): grid-12

batch_all_go(zgt): grid-11

---

batch training code:

debugging in progress

---

int32 * float32 -> float64

float32 * float32 -> float32
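
The two promotion rules above can be checked with a short script. NumPy is used here as a stand-in, since the notes do not name the library; the array names are made up for illustration.

```python
import numpy as np

# Hypothetical arrays: integer counts and float32 weights.
lengths = np.array([3, 5, 7], dtype=np.int32)
weights = np.array([0.5, 0.25, 0.125], dtype=np.float32)

mixed = lengths * weights                      # int32 * float32
same = weights * weights                       # float32 * float32
fixed = lengths.astype(np.float32) * weights   # cast first to stay in float32

print(mixed.dtype, same.dtype, fixed.dtype)    # float64 float32 float32
```

Casting the integer operand to float32 before multiplying is one way to keep the whole expression in float32.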




fixed several bugs

added maxout

added update vectors

added dropout

deleted some long 224 iambic (length > 120)
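
A rough NumPy sketch of three of the changes listed above: maxout, dropout, and the length filter. This is not the actual code. The function names, the number of maxout pieces (2), the drop rate, and the wiring of the 400-unit MLP layer into the 300-unit maxout layer are all assumptions, and the notes do not say whether length is counted in characters or tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

def maxout(x, W, b):
    """Maxout layer. W: (pieces, in_dim, out_dim), b: (pieces, out_dim)."""
    # Compute every linear piece, then keep the element-wise maximum.
    z = np.einsum("bi,pio->bpo", x, W) + b     # (batch, pieces, out_dim)
    return z.max(axis=1)                       # (batch, out_dim)

def dropout(x, p_drop, train=True):
    """Inverted dropout: zero units with probability p_drop, rescale the rest."""
    if not train or p_drop == 0.0:
        return x
    keep = rng.random(x.shape) >= p_drop
    return x * keep / (1.0 - p_drop)

def drop_long(sentences, max_len=120):
    """Discard sequences whose length exceeds max_len."""
    return [s for s in sentences if len(s) <= max_len]

# Toy usage with the sizes from the notes (400 in, 300 out); 2 pieces assumed.
x = rng.standard_normal((4, 400))
W = rng.standard_normal((2, 400, 300)) * 0.01
b = np.zeros((2, 300))
h = maxout(dropout(x, p_drop=0.5), W, b)
print(h.shape)                                 # (4, 300)
```

At test time `dropout` is called with `train=False`, which returns the input unchanged because the rescaling has already been applied during training.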


mini_batch data parallel training

400k sentences per day (iambics are longer than QA questions), up to 50x faster (now as fast as Bengio's code)

finished the iambic-format training code and the training run
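
A minimal sketch of the idea behind mini-batch data-parallel training, assuming the usual scheme where each worker computes a gradient on its own shard and the results are averaged. The linear model, `grad_mse`, and the worker count are illustrative only; the shards are processed sequentially here rather than on separate devices.

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y) ** 2) with respect to w."""
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 5))   # one mini-batch of 64 examples
y = rng.standard_normal(64)
w = np.zeros(5)

n_workers = 4                      # assumed number of workers
shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))

# Each worker would compute this on its own shard.
shard_grads = [grad_mse(w, Xs, ys) for Xs, ys in shards]
avg_grad = np.mean(shard_grads, axis=0)

# With equal-sized shards the average matches the full-batch gradient.
full_grad = grad_mse(w, X, y)
print(np.allclose(avg_grad, full_grad))   # True
```

For scale, 400k sentences per day is roughly 4.6 sentences per second (400000 / 86400).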



some results: