Sheng Su 2015-10-12

Four-GPU training:

  • Tried changing the learning rate, the mini-batch size, and the gap; training still diverges.
  • Tried updating asynchronously; training still diverges.
  • Next: keep investigating the cause of the divergence and try other methods.