Difference between revisions of "Dongxu Zhang 2015-09-08"

From cslt Wiki

Latest revision as of 04:32, 10 September 2015 (Thursday)

Last Week:

1. Added a constraint on the topic distribution at the last layer and the first layer. Performance got worse.

2. Reproduced baselines for different topic model methods (TF-IDF, LSI, LDA, Replicated Softmax (RS), and DocNADE) on the data set used by "From Word Embeddings To Document Distances". LDA and LSI perform similarly. RS and DocNADE also perform well. However, TF-IDF also performs well, which is awkward (for example, the paper reports 71%, but 85% was actually obtained).
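For reference, the TF-IDF baseline can be sketched as follows. This is only a minimal illustration, not the code used for the reported numbers: the report does not say which TF-IDF variant was used, so the length-normalized term frequency, the unsmoothed `log(N / df)` inverse document frequency, the function name `tfidf`, and the two toy documents are all assumptions.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = {}
        for term, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


# Hypothetical toy corpus, not taken from the data set in the report.
docs = [
    "the match ended in a draw".split(),
    "the election ended in a recount".split(),
]
vectors = tfidf(docs)
```

Terms that occur in every document (such as "the" here) get weight zero under this variant, while terms unique to one document get the largest weights.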

3. In addition, TF-IDF features fed into two NN layers and a softmax layer outperform other topic features fed into the same two NN layers and softmax layer. This suggests that a supervised NN is a better feature extractor than an unsupervised topic model, and that the topic model itself loses useful information.
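The classifier described in item 3 (input features, two NN layers, softmax output) can be sketched as a forward pass. This is a rough illustration under assumptions: the report gives no layer sizes, activation function, or initialization, so the ReLU activation, the sizes, the random weights, and all function names below are hypothetical, and training is omitted.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    # Subtract the maximum for numerical stability.
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (assumed initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Two hidden layers followed by a softmax output layer."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = relu(dense(x, w1, b1))
    h2 = relu(dense(h1, w2, b2))
    return softmax(dense(h2, w3, b3))


rng = random.Random(0)
n_features, n_hidden, n_classes = 5, 4, 3  # hypothetical sizes
layers = [
    init_layer(rng, n_features, n_hidden),
    init_layer(rng, n_hidden, n_hidden),
    init_layer(rng, n_hidden, n_classes),
]
features = [0.0, 0.12, 0.0, 0.4, 0.05]  # stand-in for a TF-IDF or topic vector
probs = forward(features, layers)
```

The same network can take either TF-IDF vectors or topic-model features as `features`, which is what makes the comparison in item 3 a comparison of input representations rather than of classifiers.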

This Week: