14-10-19 Dongxu Zhang

Accomplished this week

  • Trained an LSTM-RNN LM on a 200 MB corpus (vocabulary 10k, 100 classes, i100*m100). With 2 CPU cores, training takes around 200 min per iteration; the class factorization that keeps this affordable is sketched after this list.
  • Trained a 5-gram LM on the Baiduzhidao_corpus (~30 GB after preprocessing) with the new lexicon. There was a mistake when computing probabilities after the merge; see the second sketch after this list.
  • Read the paper "Learning Long-Term Dependencies with Gradient Descent is Difficult". Still in progress.
  • An idea occurred to me that may improve word2vec by adding much richer semantic information, but its huge computational complexity bothers me; I hope we can discuss it.
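
The class-based softmax (100 classes over the 10k vocabulary) is what keeps the output layer affordable on CPUs. Below is a minimal NumPy sketch of the factorization only, not the toolkit's actual code; the modulo class assignment and all names are hypothetical stand-ins (rnnlm-style toolkits typically assign classes by word frequency):

import numpy as np

# Class-factorized softmax: P(w | h) = P(class(w) | h) * P(w | class(w), h).
# With V = 10000 words and C = 100 classes of ~100 words each, scoring one
# word touches about C + V/C = 200 output rows instead of V = 10000.

V, C, H = 10000, 100, 100            # vocab size, class count, hidden size
word2class = np.arange(V) % C        # hypothetical class assignment
class_members = [np.where(word2class == c)[0] for c in range(C)]

rng = np.random.default_rng(0)
W_class = rng.standard_normal((C, H)) * 0.1   # class-layer weights
W_word = rng.standard_normal((V, H)) * 0.1    # word-layer weights

def log_prob(word, h):
    """log P(word | h) under the class factorization."""
    c = word2class[word]
    class_logits = W_class @ h                        # C dot products
    class_logp = class_logits - np.logaddexp.reduce(class_logits)
    members = class_members[c]
    word_logits = W_word[members] @ h                 # ~V/C dot products
    word_logp = word_logits - np.logaddexp.reduce(word_logits)
    return class_logp[c] + word_logp[np.argmax(members == word)]

h = rng.standard_normal(H)           # stand-in for the LSTM hidden state
print(log_prob(1234, h))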

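The report does not say what the merge mistake was. A common pitfall when a corpus this large (~30 GB) is counted in shards is combining per-shard probabilities instead of summing the raw counts before estimating anything. A toy Python sketch of the count-merge path (unsmoothed MLE bigrams only, hypothetical names; a real 5-gram LM would also need smoothing such as Kneser-Ney):

from collections import Counter

# Raw n-gram counts must be summed across all shards *before* probabilities
# are estimated. Averaging per-shard probabilities is wrong because shards
# differ in size.

def merge_counts(shard_counts):
    """Sum raw n-gram counts from all shards."""
    total = Counter()
    for counts in shard_counts:
        total.update(counts)
    return total

def mle_prob(ngram_counts, history_counts, ngram):
    """Unsmoothed maximum-likelihood P(w | history) from merged counts."""
    return ngram_counts[ngram] / history_counts[ngram[:-1]]

# Toy shards: bigram counts keyed by (w1, w2).
shard1 = Counter({("the", "cat"): 2, ("the", "dog"): 1})
shard2 = Counter({("the", "cat"): 1, ("the", "dog"): 3})
bigrams = merge_counts([shard1, shard2])
histories = Counter()
for (w1, _), c in bigrams.items():
    histories[(w1,)] += c

print(mle_prob(bigrams, histories, ("the", "cat")))  # 3/7, not (2/3 + 1/4)/2
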
Next week

  • Test the LSTM-RNN LM.
  • Finish building the lexicon.
  • Understand the paper; its core argument is sketched after this list.
  • If time permits, implement my baseline idea on text8.
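
For reference while reading: the paper (Bengio, Simard & Frasconi, 1994) argues that the gradient of the loss at time t with respect to a state k steps earlier contains a product of Jacobians, which decays or explodes exponentially with the gap. A compact statement of that argument (notation mine, for a simple recurrent net h_i = \sigma(W h_{i-1} + U x_i)):

\[
\frac{\partial \mathcal{L}_t}{\partial h_k}
  = \frac{\partial \mathcal{L}_t}{\partial h_t}
    \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}},
\qquad
\left\lVert \frac{\partial h_i}{\partial h_{i-1}} \right\rVert
  \le \lVert W \rVert \, \max_z \lvert \sigma'(z) \rvert .
\]

When \lVert W \rVert \max_z |\sigma'(z)| < 1, the product shrinks exponentially in t - k, so long-range error signals vanish; this is the difficulty the LSTM architecture was designed to work around.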


Myself

  • Prepare for the TOEFL test.
  • Prepare for my master's thesis.