Difference between revisions of “Zhiyuan Thang 15-06-29”
From cslt Wiki
Latest revision as of 14:47, 30 June 2015 (Tuesday)
Last few weeks (lazy):
trained an LSTM with MPE (failed);
tentatively concluded that randomizing the weights ahead of the softmax of the DNN before MPE training may not be helpful;
using a language vector to pre-train the hidden layers of the DNN gives at least a small improvement, especially when 3 hidden layers are pre-trained in a DNN with 4 hidden layers;
have not yet implemented the leaky Rectifier and temporal Rectifier, as CUDA programming may be needed;
paper reading.
This week:
implement the two kinds of Rectifiers;
paper reading.
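
For reference, the leaky Rectifier is commonly defined as f(x) = x for x > 0 and f(x) = alpha * x otherwise. Below is a minimal CPU-only sketch in plain Python, not the CUDA version these notes anticipate. The function names and the default slope alpha = 0.01 are illustrative assumptions, not taken from the notes. The temporal Rectifier is left out because the notes do not define it.

```python
def leaky_relu(xs, alpha=0.01):
    """Forward pass: keep positive inputs, scale the others by alpha.

    alpha is an assumed illustrative slope, not a value from the notes.
    """
    return [x if x > 0 else alpha * x for x in xs]


def leaky_relu_grad(xs, grad_out, alpha=0.01):
    """Backward pass: gradient w.r.t. the inputs, given the upstream gradient.

    The local derivative is 1 for positive inputs and alpha otherwise.
    """
    return [g if x > 0 else alpha * g for x, g in zip(xs, grad_out)]
```

A GPU version would typically apply the same element-wise rule in one kernel per pass, with one thread per activation.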