Qixin Wang 2016-01-25
Work done this week
word vector size:200
hidden size:500
mlp hidden size:400
maxout size:300
adadelta 0.3
---
fast mode, added cut, no global, no pz
zgt:song, si, giga, update: (grid-9, grid-9, grid-17, grid-17)
psm:song, si, giga, update: (grid-15, grid-15, grid-13, grid-11)
---
with dropout & without maxout:
batch_all(zgt): grid-12
batch_all_go(zgt): grid-11
---
batch training code:
debugging in progress
---
int32 * float32 -> float64
float32 * float32 -> float32
fixed n bugs
added maxout
added update vectors
added dropout
deleted 224 overly long iambics (length > 120)
mini_batch data parallel training
400k sentences per day (iambics are longer than QA questions), up to 50x faster! (now as fast as Bengio's code)
but still fast enough for the 15k song iambics...
finished some experiments (one cipai, two cipai)
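---
dtype note, minimal sketch:
The two promotion rules noted above (int32 * float32 -> float64, float32 * float32 -> float32) can be reproduced with plain NumPy arrays. This assumes the training code follows NumPy-style type promotion; the variable names are hypothetical and not taken from the actual code. Casting the integer operand first is one way to keep the result in float32.

```python
import numpy as np

# Hypothetical stand-ins: an integer array (e.g. a mask or counts) and float32 weights.
idx = np.array([1, 2, 3], dtype=np.int32)
w = np.array([0.5, 1.5, 2.5], dtype=np.float32)

mixed = idx * w                      # int32 * float32 is promoted to float64
same = w * w                         # float32 * float32 stays float32
fixed = idx.astype(np.float32) * w   # cast the integers first to stay in float32

print(mixed.dtype, same.dtype, fixed.dtype)
```

An unintended float64 result can matter when the rest of the model is float32, so checking `.dtype` after mixed-type operations is a cheap sanity test.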