Difference between revisions of "Qixin Wang 2016-01-25"
Revision as of 01:11, 28 January 2016 (Thursday)
Work done this week
word vector size: 200
hidden size: 500
mlp hidden size: 400
maxout size: 300
adadelta 0.3
---
fast mode, added cut, no global, no pz
zgt:song, si, giga, update: (grid-9, grid-9, grid-17, grid-17)
psm:song, si, giga, update: (grid-15, grid-15, grid-13, grid-11)
---
with dropout & without maxout:
batch_all(zgt): grid-12
batch_all_go(zgt): grid-11
---
batch training code:
debugging in progress
---
int32 * float32 -> float64
float32 * float32 -> float32
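A likely reason for the first rule, assuming the library follows NumPy-style type promotion (the log does not name the library): float32 has only a 24-bit significand, so it cannot hold every int32 value exactly, and the result is widened to float64. The standard-library sketch below shows the precision loss; `to_float32` is a hypothetical helper, not part of the training code.

```python
import struct

def to_float32(x):
    """Round a Python number to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]

big_int = 2**24 + 1            # fits easily in int32
as_f32 = to_float32(big_int)   # float32 has a 24-bit significand, so this rounds
as_f64 = float(big_int)        # Python floats are float64 and keep the exact value

print(big_int, as_f32, as_f64)
```

If float32 results are wanted throughout, one common workaround is to cast integer inputs to float32 explicitly before multiplying.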
fixed several bugs
added maxout
added update vectors
added dropout
deleted some long 224 iambic (length > 120)
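For reference, a minimal sketch of the maxout and dropout layers mentioned above, in plain Python. It illustrates the general techniques only, not the code used here: the function names, the chunked grouping, and the inverted-dropout scaling are all assumptions.

```python
import random

def maxout(pre_activations, pieces):
    """Group consecutive values into chunks of `pieces` and keep each chunk's max."""
    if len(pre_activations) % pieces != 0:
        raise ValueError("length must be divisible by pieces")
    return [max(pre_activations[i:i + pieces])
            for i in range(0, len(pre_activations), pieces)]

def dropout(values, p_drop, rng, train=True):
    """Inverted dropout: zero each value with probability p_drop, rescale the rest."""
    if not train or p_drop == 0.0:
        return list(values)
    keep = 1.0 - p_drop
    return [v / keep if rng.random() < keep else 0.0 for v in values]

# Six pre-activations with 2 pieces per unit give 3 maxout outputs.
h = maxout([0.2, -1.0, 0.7, 0.1, 0.3, -0.4], pieces=2)
h_train = dropout(h, p_drop=0.5, rng=random.Random(0))               # training: random mask
h_test = dropout(h, p_drop=0.5, rng=random.Random(0), train=False)   # inference: unchanged
```

With inverted dropout the surviving activations are scaled up during training, so nothing needs rescaling at test time.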
mini_batch data parallel training
400k sentences per day (iambics are longer than QA questions), up to 50x faster (now as fast as Bengio's code)
finished the iambic format training code and the training run
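The idea behind mini-batch data parallel training can be sketched as follows: split each mini-batch into shards, let every worker compute gradients on its own shard, then combine them into one update. This toy version fits a one-parameter linear model and runs the "workers" sequentially in one process; the function names, the model, and the round-robin split are assumptions for illustration, not the actual training code.

```python
def grad_sum(w, shard):
    """Sum of squared-error gradients for y ~ w * x over one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard)

def split(batch, n_workers):
    """Deal examples round-robin into n_workers shards."""
    return [batch[i::n_workers] for i in range(n_workers)]

def data_parallel_step(w, batch, n_workers, lr):
    """One SGD step: per-shard gradient sums are combined into a batch mean."""
    shards = split(batch, n_workers)
    partial = [grad_sum(w, s) for s in shards]   # would run concurrently on real workers
    grad = sum(partial) / len(batch)             # same value as the full-batch mean gradient
    return w - lr * grad

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]   # generated from y = 2x
w = 0.0
for _ in range(50):
    w = data_parallel_step(w, batch, n_workers=2, lr=0.05)
print(w)
```

Because the per-shard sums add up to the full-batch sum, the update matches single-worker training on the same mini-batch; any speedup comes from running the shards concurrently.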
some results: