Dongxu Zhang 2015-09-08
Last Week: 1. Added a topic-distribution constraint on the first and last layers. Performance got worse.
2. Reproduced baselines for different topic-model methods (TF-IDF, LSI, LDA, Replicated Softmax, DocNADE) on the data set used by "From Word Embeddings To Document Distances". LDA and LSI have similar performance. RS and DocNADE also perform well. However, TF-IDF also performs well, which is awkward (for example, the paper reports 71%, but the reproduction here reaches 85%).
3. In addition, TF-IDF features fed into two NN layers and a softmax layer outperform the other topic features fed into the same two NN layers and softmax layer. This suggests that a supervised NN is a better feature extractor than an unsupervised topic model, and that the topic models themselves lose useful information.
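For reference, the setup in item 3 (TF-IDF features, two NN layers, softmax output) can be sketched roughly as below. This is only an illustration, not the code used in the experiments: the smoothed IDF formula, the L2 normalization, the ReLU activation, the layer sizes, and the helper names (tfidf, dense, init_layer, forward) are all assumptions, and training is omitted.

```python
import math
import random

def tfidf(docs):
    """Return (vocab, vectors): one L2-normalized TF-IDF vector per tokenized doc."""
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    # Smoothed IDF: one common variant, assumed here for illustration.
    idf = {w: math.log((1 + n) / (1 + sum(w in d for d in docs))) + 1 for w in vocab}
    vectors = []
    for d in docs:
        vec = [d.count(w) * idf[w] for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors

def dense(x, w, b):
    # w holds one weight row per output unit.
    return [sum(xi * wi for xi, wi in zip(x, row)) + bj for row, bj in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init_layer(rng, n_in, n_out):
    # Small random weights and zero biases (hypothetical initialization).
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def forward(x, layers):
    """Two hidden layers followed by a softmax output layer."""
    *hidden, out = layers
    for w, b in hidden:
        x = relu(dense(x, w, b))
    return softmax(dense(x, *out))

# Toy usage: two tiny documents, two classes, untrained weights.
docs = [["topic", "model"], ["neural", "model", "model"]]
vocab, X = tfidf(docs)
rng = random.Random(0)
layers = [init_layer(rng, len(vocab), 4), init_layer(rng, 4, 4), init_layer(rng, 4, 2)]
probs = forward(X[0], layers)
```

Replacing the TF-IDF vectors with LDA, LSI, RS, or DocNADE document features while keeping the same network would give the comparison described in item 3.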
This Week: