"Xingchao work": Difference between revisions
From cslt Wiki
Revision as of 08:19, 30 October 2014 (Thursday)
Paper Recommendation
Pre-Trained Multi-View Word Embedding.[1]
Learning Word Representation Considering Proximity and Ambiguity.[2]
Continuous Distributed Representations of Words as Input of LSTM Network Language Model.[3]
WikiRelate! Computing Semantic Relatedness Using Wikipedia.[4]
Japanese-Spanish Thesaurus Construction Using English as a Pivot.[5]
Chaos Work
Temp Result Report
Result Report :
I have trained two sphere models. In the first model, the hierarchical softmax parameters are changed to the standard sphere. In the second model, only the word vectors are changed to the standard sphere. The model with changed hierarchical softmax parameters has a correct rate of almost 0%, so I will not include it in our result report.
Use the norm vector :
Linear Transform :
test : 10.25% (1 correct) 24.82% (5 correct)
train :
Sphere Transform :
test : 24.22% (1 correct) 41.01% (5 correct)
train :
Use original vector :
test : 26.83% (1 correct) 44.42% (5 correct)
train :
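The two ingredients of the report above, unit-length ("norm") vectors and the "1 correct" / "5 correct" rates, can be sketched as follows. This is only a minimal illustration under assumptions not stated on this page: that "1 correct" and "5 correct" mean the expected word appears among the top 1 or top 5 nearest vocabulary words, and that nearness is measured by cosine similarity. The function names are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length, i.e. place it on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)  # the zero vector has no direction; leave it unchanged
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def top_k_correct_rate(predicted, gold_words, vocab, k):
    """Fraction of predictions whose gold word is among the k nearest words.

    predicted  : list of transformed vectors, one per test item
    gold_words : list of expected words, aligned with `predicted`
    vocab      : dict mapping word -> vector
    """
    hits = 0
    for vec, gold in zip(predicted, gold_words):
        ranked = sorted(vocab, key=lambda w: cosine(vec, vocab[w]), reverse=True)
        if gold in ranked[:k]:
            hits += 1
    return hits / len(gold_words)
```

Calling `top_k_correct_rate(predicted, gold_words, vocab, 1)` and again with `k=5` would give the two columns; applying `normalize` to every vector first corresponds to the "norm vector" setting under these assumptions.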
SSA Model
Build 2-dimension SSA-Model. Start at : 2014-09-30 <--> End at : 2014-10-02 <--> Result is :
2 classes : 27.83% 46.53%

Test 25,50-dimension SSA-Model for transform. Start at : 2014-10-02 <--> End at : 2014-10-03 <--> Result is :
1 class : 27.9% 46.6%
2 classes : 27.83% 46.53%
3 classes : 27.43% 46.53%
4 classes : 25.52% 45.83%
5 classes : 25.62% 45.83%
6 classes : 22.81% 42.51%
50 classes : 11.96% 27.43%

Reason explain : Some points do not belong to the class that the training data belongs to, so they are not given the correct transform matrix. The update we want to try is to simply cluster the training data and then test the performance.
Simple clustering into 2 classes : 23.51% 43.21%

Train set as test set. Start at : 2014-10-06 <--> End at : 2014-10-08 <--> Result is :
Simple, 1 class : 56.91% 72.16%
Simple, 2 classes : 63.98% 77.57%
Simple, 4 classes : 68.49% 81.25%
Simple, 5 classes : 71.43% 83.21%
Simple, 6 classes : 76.71% 87.07%

Different compute state : Start at : 2014-10-10 <--> End at : 2014-10-10 <--> Result is :
7 classes : 23.51% 40.20%

Test All-Belong SSA model for transform. Start at : 2014-10-02
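The idea of one transform matrix per class can be illustrated roughly as below. The page does not define the SSA model, so this is only one possible reading: each vector is assigned to the class with the nearest centroid and then multiplied by that class's matrix. The centroids, the matrices, and all function names are hypothetical.

```python
def nearest_class(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))


def mat_vec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def class_transform(vec, centroids, matrices):
    """Apply the transform matrix of the class that vec falls into."""
    return mat_vec(matrices[nearest_class(vec, centroids)], vec)
```

Under this reading, a test point that lands in the wrong class is multiplied by the wrong matrix, which is consistent with the explanation given for the drop in correct rate as the number of classes grows.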
SEMPRE Research
Work Schedule
Download SEMPRE toolkit. Start at : 2014-09-30
Semantic Parsing via Paraphrasing [6]
Knowledge Vector
Pre-process corpus. Start at : 2014-09-30 <--> End at : 2014-10-03. Use toolkit Wikipedia_Extractor [7]. Result : The original corpus is about 47G; after preprocessing, the corpus is almost 17.8G.
Analyze corpus, and train word2vec on Wikipedia. Start at : 2014-10-03.
Design Data Structure : { title : "", content : {Abs : [[details],[related link]], h2 : []}, category : []}
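The designed data structure could be built along the lines below. The field names (title, content, Abs, h2, category) come from the design above; the helper name `make_record` and its parameter names are hypothetical, and how the fields are filled from the extracted corpus is not specified on this page.

```python
def make_record(title, abs_details, abs_links, h2_sections, categories):
    """Build one article record following the designed structure.

    Abs holds two lists: the abstract details and the related links.
    """
    return {
        "title": title,
        "content": {
            "Abs": [list(abs_details), list(abs_links)],
            "h2": list(h2_sections),
        },
        "category": list(categories),
    }
```

For example, `make_record("Example", ["first sentence"], ["Related page"], [], ["Sample category"])` returns a dictionary that can be serialized directly with the standard `json` module.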
Moses translation model
Pre-process corpus: remove sentences that contain rarely seen words. Start at : 2014-09-30 <--> End at : 2014-10-02 <--> Result : The original corpus has 8973724 lines; the clean corpus (after removing sentences containing words that occur fewer than 10 times) has 6033397 lines.
Train Model. Start at : 2014-10-02 <--> End at : 2014-10-05
Tuning Model. Start at : 2014-10-05 <--> End at : 2014-10-10
Result Report : 57G of phrases in the old translation system, 41G of phrases in the new system. Next, test the load speed.
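The cleaning step can be sketched as a two-pass filter. This assumes whitespace tokenization and that "rarely seen" means a corpus-wide count below a threshold (10 in the report); the function name and the `min_count` parameter are hypothetical, and a corpus of this size would normally be streamed from disk rather than held in a list.

```python
from collections import Counter


def remove_rare_word_sentences(sentences, min_count=10):
    """Keep only sentences in which every word occurs at least min_count times.

    First pass counts words over the whole corpus, second pass filters.
    """
    counts = Counter(word for sentence in sentences for word in sentence.split())
    return [
        sentence
        for sentence in sentences
        if all(counts[word] >= min_count for word in sentence.split())
    ]
```

Word counts are taken on the original corpus only, so a word that was frequent before filtering is not re-checked after some of its sentences are dropped.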
Non Linear Transform Testing
Work Schedule
Re-train best mse for test data. Start at : 2014-10-01 <--> End at : 2014-10-02 <--> Result : Performance is inconsistent with expectations. Best result for Non-Linear is 1e-2.
Hidden Layer :
400 15.57% 29.14% 995
600 19.99% 36.08% 995
800 23.32% 39.60% 995
1200 19.19% 35.08% 995
1400 17.09% 32.06% 995
Result : Based on these results, I will test hidden layers of size 800, 1200, 1400, and 1600.