Zhiyuan Tang 15-06-29
Last few weeks (lazy):
training an LSTM with MPE (failed);
tentatively concluded that randomizing the weights of the layer just before the DNN's softmax prior to MPE training may not be helpful;
using a language vector to pre-train the hidden layers of the DNN gives at least a small improvement, especially when 3 of the DNN's 4 hidden layers are pre-trained;
have not yet implemented the leaky Rectifier or the temporal Rectifier, as CUDA programming may be needed (a minimal forward-pass sketch follows this list);
paper reading.
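
The leaky Rectifier part is straightforward on the GPU. Below is a minimal, hypothetical sketch of its forward pass as a standalone CUDA program: the kernel name, the flat float-array layout, and the 0.01 leak slope are all assumptions for illustration, not any toolkit's actual API. The temporal Rectifier is not sketched, since its definition is not given in this note.

 // Hypothetical forward pass of the leaky Rectifier on a flat float array.
 #include <cstdio>
 #include <cuda_runtime.h>
 
 __global__ void leaky_relu_forward(const float *in, float *out,
                                    int n, float slope) {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if (i < n) {
         float x = in[i];
         // Positive inputs pass through; negative inputs are scaled by
         // `slope` instead of being zeroed, so their gradient never dies.
         out[i] = x > 0.0f ? x : slope * x;
     }
 }
 
 int main() {
     const int n = 8;
     float h_in[n] = {-2.f, -1.f, -0.5f, 0.f, 0.5f, 1.f, 2.f, 3.f};
     float h_out[n];
     float *d_in, *d_out;
     cudaMalloc(&d_in, n * sizeof(float));
     cudaMalloc(&d_out, n * sizeof(float));
     cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);
     leaky_relu_forward<<<1, 32>>>(d_in, d_out, n, 0.01f);  // leak slope is an assumed value
     cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
     for (int i = 0; i < n; ++i) printf("%g -> %g\n", h_in[i], h_out[i]);
     cudaFree(d_in);
     cudaFree(d_out);
     return 0;
 }

Compiled with nvcc, this prints each input beside its rectified value; in a real trainer the same kernel would run over a hidden layer's activation matrix for each mini-batch.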
This week:
implement the two kinds of Rectifiers (the backward pass is sketched after this list);
paper reading.
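
For the implementation planned above, the matching backward pass is equally short. This companion sketch reuses the flat layout and the assumed names of the forward sketch: the derivative of the forward map is 1 for positive inputs and the leak slope otherwise, so the incoming gradient is scaled rather than cut to zero.

 // Hypothetical backward pass matching leaky_relu_forward above.
 __global__ void leaky_relu_backward(const float *in, const float *grad_out,
                                     float *grad_in, int n, float slope) {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if (i < n) {
         // Derivative is 1 where the input was positive, `slope` elsewhere.
         grad_in[i] = grad_out[i] * (in[i] > 0.0f ? 1.0f : slope);
     }
 }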