Xingchao work

==Paper Recommendation==
Pre-Trained Multi-View Word Embedding. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/3/3c/Pre-Trained_Multi-View_Word_Embedding.pdf]

Learning Word Representation Considering Proximity and Ambiguity. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b0/Learning_Word_Representation_Considering_Proximity_and_Ambiguity.pdf]

Continuous Distributed Representations of Words as Input of LSTM Network Language Model. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/5/5a/Continuous_Distributed_Representations_of_Words.pdf]

WikiRelate! Computing Semantic Relatedness Using Wikipedia. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/c/cb/WikiRelate%21_Computing_Semantic_Relatedness_Using_Wikipedia.pdf]

Japanese-Spanish Thesaurus Construction Using English as a Pivot. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/e/e8/Japanese-Spanish_Thesaurus_Construction.pdf]

==Chaos Work==
===Temp Result Report===
Result report:

I have already trained two sphere models. For the first model I changed the hierarchical softmax parameters to the standard sphere; for the other model I only changed the word vectors to the standard sphere. The model with the changed hierarchical softmax parameters achieves an almost 0% correct rate, so I will not include it in the result report.
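As a rough sketch of what mapping vectors onto the "standard sphere" might look like (my assumption: plain L2 normalization of each row, applied either to the hierarchical softmax parameters or to the word vectors; this is numpy-based illustration, not the actual training code):

<pre>
import numpy as np

def to_unit_sphere(vectors, eps=1e-12):
    """Project each row vector onto the unit sphere (L2 normalization).

    `vectors` is an (n, dim) matrix, e.g. word vectors or hierarchical
    softmax parameters; `eps` avoids division by zero for all-zero rows.
    This is only an assumed sketch of the 'standard sphere' normalization.
    """
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, eps)
</pre>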
Use the normalized vectors:

  Linear Transform:
      test:    1 correct: 10.25%    5 correct: 24.82%
      train:
  Sphere Transform:
      test:    1 correct: 24.22%    5 correct: 41.01%
      train:

Use the original vectors:

  Linear Transform:
      test:    1 correct: 26.83%    5 correct: 44.42%
      train:
  Sphere Transform:
      test:    1 correct: 28.74%    5 correct: 46.73%
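For orientation, a minimal sketch of how a linear transform between two embedding spaces and its 1-correct / 5-correct test rates could be computed (assuming a least-squares mapping and nearest-neighbour retrieval by cosine similarity; all names here are illustrative, not the actual experiment code):

<pre>
import numpy as np

def train_linear_transform(X, Y):
    """Least-squares W such that X @ W approximates Y (rows are paired vectors)."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def topk_accuracy(X_test, target_idx, W, vocab_vectors, k=5):
    """Fraction of test items whose true target (an index into vocab_vectors)
    is among the k nearest neighbours of the mapped vector, by cosine similarity."""
    mapped = X_test @ W
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    vocab = vocab_vectors / np.linalg.norm(vocab_vectors, axis=1, keepdims=True)
    sims = mapped @ vocab.T                      # (n_test, vocab_size)
    topk = np.argsort(-sims, axis=1)[:, :k]
    return float(np.mean([t in row for t, row in zip(target_idx, topk)]))
</pre>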
===SSA Model===
  Build a 2-dimension SSA model.
      Start at: 2014-09-30 <--> End at: 2014-10-02 <--> Result (columns: 1 correct, 5 correct, number of classes):
        27.83%  46.53%    2 classes
  Test the 25- and 50-dimension SSA models for the transform.
      Start at: 2014-10-02 <--> End at: 2014-10-03 <--> Result:
        27.90%  46.60%    1  class
        27.83%  46.53%    2  classes
        27.43%  46.53%    3  classes
        25.52%  45.83%    4  classes
        25.62%  45.83%    5  classes
        22.81%  42.51%    6  classes
        11.96%  27.43%    50 classes
      Explanation: some test points do not belong to any class that appears in the training data, so those
                   points do not get the correct transform matrix. The planned update is to cluster only the
                   training data and then test the performance (a rough sketch of this per-cluster transform
                   idea is given after this list).
  Simple clustering into 2 classes.
        23.51%  43.21%    2  classes
  Use the training set as the test set.
      Start at: 2014-10-06 <--> End at: 2014-10-08 <--> Result:
        56.91%  72.16%    Simple, 1 class
        63.98%  77.57%    Simple, 2 classes
        68.49%  81.25%    Simple, 4 classes
        71.43%  83.21%    Simple, 5 classes
        76.71%  87.07%    Simple, 6 classes
  Different compute state:
      Start at: 2014-10-10 <--> End at: 2014-10-10 <--> Result:
        23.51%  40.20%    7 classes
  Test the All-Belong SSA model for the transform.
      Start at: 2014-10-02
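Below is a rough sketch of the per-cluster transform idea referred to above (my assumption: cluster the training source vectors, fit one least-squares transform per cluster, and at test time apply the transform of the nearest cluster; scikit-learn KMeans is used only for illustration):

<pre>
import numpy as np
from sklearn.cluster import KMeans

def train_per_cluster_transforms(X, Y, n_clusters=2):
    """Cluster training source vectors and fit one linear transform per cluster."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    transforms = {}
    for c in range(n_clusters):
        idx = km.labels_ == c
        W, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
        transforms[c] = W
    return km, transforms

def apply_per_cluster_transform(km, transforms, x):
    """Map a single test vector with the transform of its nearest cluster."""
    c = int(km.predict(x.reshape(1, -1))[0])
    return x @ transforms[c]
</pre>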
===SEMPRE Research===
====Work Schedule====
  Download the SEMPRE toolkit.
      Start at: 2014-09-30

====Related Papers====
  Semantic Parsing via Paraphrasing [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/8/85/Semantic_Parsing_via_Paraphrasing.pdf]

===Knowledge Vector===
  Pre-process the corpus.
      Start at: 2014-09-30.
        Using the Wikipedia_Extractor toolkit [http://medialab.di.unipi.it/wiki/Wikipedia_Extractor]; waiting for it to finish.
      End at: 2014-10-03. Result:
        The original corpus is about 47 GB; after preprocessing it is almost 17.8 GB.
  Analyze the corpus and train word2vec on Wikipedia.
      Start at: 2014-10-03.
      Designed data structure (a small construction sketch follows below):
        { title : "", content : { Abs : [[details], [related link]], h2 : [] }, category : [] }
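A small Python sketch of assembling one page record in the planned structure (field names follow the structure above; the actual Wikipedia parsing is assumed to happen elsewhere):

<pre>
def make_page_record(title, abstract_details, related_links, h2_sections, categories):
    """Build one record of the form
    { title : "", content : { Abs : [[details], [related link]], h2 : [] }, category : [] }."""
    return {
        "title": title,
        "content": {
            "Abs": [abstract_details, related_links],
            "h2": h2_sections,
        },
        "category": categories,
    }

# Illustrative usage with made-up values:
record = make_page_record(
    title="Example article",
    abstract_details=["First sentence of the abstract.", "Second sentence."],
    related_links=["Related article A", "Related article B"],
    h2_sections=["History", "Applications"],
    categories=["Example category"],
)
</pre>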
===Moses translation model===
  Pre-process the corpus: remove sentences that contain rarely seen words (a sketch of this filtering step follows below).
      Start at: 2014-09-30 <--> End at: 2014-10-02 <--> Result:
      The original corpus has 8,973,724 lines; after removing sentences that contain words occurring fewer than 10 times, the clean corpus has 6,033,397 lines.
  Train the model.
      Start at: 2014-10-02 <--> End at: 2014-10-05
  Tune the model.
      Start at: 2014-10-05 <--> End at: 2014-10-10
  Result report:
      The old translation system has a 57 GB phrase table; the new system has 41 GB. Next, test the loading speed.
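A minimal sketch of the assumed rare-word filtering step (interpreting "words less than 10" as words that occur fewer than 10 times in the whole corpus; file paths are placeholders):

<pre>
from collections import Counter

def filter_rare_word_sentences(in_path, out_path, min_count=10):
    """Drop every sentence (line) that contains a word occurring fewer than
    `min_count` times in the corpus; return (original_lines, kept_lines)."""
    with open(in_path, encoding="utf-8") as f:
        lines = f.readlines()
    counts = Counter(tok for line in lines for tok in line.split())
    kept = [line for line in lines
            if all(counts[tok] >= min_count for tok in line.split())]
    with open(out_path, "w", encoding="utf-8") as f:
        f.writelines(kept)
    return len(lines), len(kept)
</pre>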
===Non Linear Transform Testing===
====Work Schedule====
  Re-train for the best MSE on the test data.
      Start at: 2014-10-01 <--> End at: 2014-10-02 <--> Result:
      Performance is inconsistent with expectations. The best result for the non-linear transform is obtained at 1e-2.
      Hidden layer size    1 correct    5 correct
        400                15.57%       29.14%      995
        600                19.99%       36.08%      995
        800                23.32%       39.60%      995
       1200                19.19%       35.08%      995
       1400                17.09%       32.06%      995
      Result: according to these results, I will test hidden layer sizes of 800, 1200, 1400, and 1600 (a minimal sketch of the one-hidden-layer transform follows below).
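For reference, a minimal numpy sketch of a one-hidden-layer non-linear transform of the kind being swept above (tanh hidden layer, MSE loss, plain gradient descent; the hidden size and learning rate are the knobs mentioned in the log, everything else is an assumption):

<pre>
import numpy as np

def train_nonlinear_transform(X, Y, hidden=800, lr=1e-2, epochs=100, seed=0):
    """Train a one-hidden-layer tanh network mapping X -> Y with MSE loss."""
    rng = np.random.default_rng(seed)
    d_in, d_out = X.shape[1], Y.shape[1]
    W1 = rng.normal(scale=0.01, size=(d_in, hidden))
    W2 = rng.normal(scale=0.01, size=(hidden, d_out))
    for _ in range(epochs):
        H = np.tanh(X @ W1)                    # forward pass
        err = H @ W2 - Y                       # prediction error
        grad_W2 = H.T @ err / len(X)           # gradient of MSE w.r.t. W2
        grad_H = (err @ W2.T) * (1.0 - H**2)   # backprop through tanh
        grad_W1 = X.T @ grad_H / len(X)
        W2 -= lr * grad_W2
        W1 -= lr * grad_W1
    return W1, W2
</pre>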
===New Approach===
====Date-3-26====
  Note: run the Wiki vector training step.
  Pre-process the 20-Newsgroups and Reuters-21578 corpora.
      The tag-cleaning step of pre-processing is done.
====Date-3-27====
  Learn how to use the Reuters corpus.
  Note: read papers:
  1. Parallel Training of An Improved Neural Network for Text Categorization
  2. A discriminative and semantic feature selection method for text categorization
  3. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
====Date-3-31====
  Code the new edition of spherical word2vec.
  Begin coding the vMF-based clustering.
====Date-4-26====
====Experience for the orthogonal-weights CNN====
    dimension        alpha
      10             1e-4
      100            1e-2
====Experience for the basic CNN====
    dimension        alpha
      100            1e-4
==Binary Word Vector==
===Date-5-11===
====Hamming distance====
=====Definition=====
  In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. Put another way, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other.

=====Examples=====
  The Hamming distance between "karolin" and "kathrin" is 3.
  The Hamming distance between "karolin" and "kerstin" is 3.
  The Hamming distance between 1011101 and 1001001 is 2.
  The Hamming distance between 2173896 and 2233796 is 3.
  (From Wikipedia.)
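A small Python sketch of the definition, checked against the examples above:

<pre>
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance is only defined for equal-length inputs")
    return sum(x != y for x, y in zip(a, b))

assert hamming_distance("karolin", "kathrin") == 3
assert hamming_distance("karolin", "kerstin") == 3
assert hamming_distance("1011101", "1001001") == 2
assert hamming_distance("2173896", "2233796") == 3
</pre>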
===Date-5-12===
====Frobenius matrix norm====
=====Definition=====
The Frobenius norm, sometimes also called the Euclidean norm (which may cause confusion with the vector <math>L^2</math>-norm, which is also sometimes known as the Euclidean norm), is the matrix norm of an m×n matrix A defined as the square root of the sum of the absolute squares of its elements:

<math>\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2}</math>
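A quick numpy check of the definition, comparing the explicit sum with numpy's built-in Frobenius norm (the matrix here is just an example):

<pre>
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

explicit = np.sqrt(np.sum(np.abs(A) ** 2))   # sqrt of the sum of squared entries
builtin  = np.linalg.norm(A, ord="fro")      # numpy's Frobenius norm

assert np.isclose(explicit, builtin)         # both equal sqrt(1 + 4 + 9 + 16) = sqrt(30)
</pre>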