14-10-19 Dongxu Zhang
Accomplished this week
- Trained an LSTM-RNN LM on a 200 MB corpus (vocabulary 10k, 100 classes, i100*m100). With 2 CPU cores, one iteration takes around 200 min (see the class-factorization note after this list).
- Trained a 5-gram LM on the Baiduzhidao_corpus (~30 GB after preprocessing) with the new lexicon. There is a mistake in the probability computation after merging the counts (a count-merge sketch follows this list).
- Reading the paper "Learning Long-Term Dependencies with Gradient Descent is Difficult" (Bengio et al., 1994); still in progress (its key inequality is recalled after this list).
- An idea occurred to me that may improve word2vec by incorporating much more semantic information, but its computational complexity is a serious problem; I hope we can discuss it.
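For reference, a note on the output layer: with "classes 100" the LM presumably uses the standard class-based softmax factorization (an assumption on my part; the report does not spell it out), which predicts a word's class first and then the word within that class:

 P(w_t \mid h_t) = P(c(w_t) \mid h_t)\, P(w_t \mid c(w_t), h_t)

With |V| = 10k and |C| = 100 roughly balanced classes, each step normalizes over about |C| + |V|/|C| = 200 outputs instead of 10k, which is most of what makes training feasible on 2 CPU cores.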
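On the 5-gram merge bug, a minimal sketch of the invariant to check: raw n-gram counts from all shards must be summed before any probability is estimated, and the history totals must be re-aggregated from the merged counts; averaging per-shard probabilities (or reusing per-shard history totals) silently breaks normalization. Illustrative Python only; the tab-separated file format and the function names are assumptions, not our actual pipeline:

 from collections import Counter

 def merge_ngram_counts(shard_files):
     # Merge per-shard count files, one "n-gram<TAB>count" per line.
     # Sum raw counts first; never average or normalize per shard.
     total = Counter()
     for path in shard_files:
         with open(path, encoding="utf-8") as f:
             for line in f:
                 ngram, count = line.rstrip("\n").rsplit("\t", 1)
                 total[ngram] += int(count)
     return total

 def mle_probs(counts):
     # Unsmoothed ML estimate P(w | history) = c(history, w) / c(history).
     # c(history) must be re-aggregated from the MERGED counts.
     history_total = Counter()
     for ngram, c in counts.items():
         history = ngram.rsplit(" ", 1)[0] if " " in ngram else "<unigram>"
         history_total[history] += c
     probs = {}
     for ngram, c in counts.items():
         history = ngram.rsplit(" ", 1)[0] if " " in ngram else "<unigram>"
         probs[ngram] = c / history_total[history]
     return probs

A quick sanity check after the merge: for every history, the probabilities of its continuations should sum to 1 (before smoothing).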
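On the Bengio et al. paper: its central result, in the now-standard Jacobian-product formulation (the paper itself argues via attractor dynamics, but this is the inequality it boils down to), is that the gradient flowing back t - k steps is a product of t - k Jacobians, so it vanishes exponentially whenever their norms stay below 1:

 \frac{\partial \mathcal{L}_t}{\partial h_k}
   = \frac{\partial \mathcal{L}_t}{\partial h_t}
     \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}},
 \qquad
 \left\| \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}} \right\|
   \le \gamma^{t-k} \to 0
 \quad \text{if every } \left\| \frac{\partial h_i}{\partial h_{i-1}} \right\| \le \gamma < 1.

This is exactly the failure mode the LSTM's gated cell state was later designed to mitigate, which ties the paper to the LSTM-RNN LM experiments above.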
Next week
- Test the LSTM-RNN LM.
- Finish building the lexicon.
- Finish understanding the paper.
- If time permits, implement my baseline idea on text8.
Myself
- Prepare for the TOEFL test.
- Prepare for my master's thesis.