Difference between revisions of "Xingchao work"

Revision as of 23:48, 11 May 2015 (Monday)

Paper Recommendation

Pre-Trained Multi-View Word Embedding.[1]

Learning Word Representation Considering Proximity and Ambiguity.[2]

Continuous Distributed Representations of Words as Input of LSTM Network Language Model.[3]

WikiRelate! Computing Semantic Relatedness Using Wikipedia.[4]

Japanese-Spanish Thesaurus Construction Using English as a Pivot.[5]

Chaos Work

Temp Result Report

Result Report :

I have already trained two sphere models. In the first model, I changed the hierarchical softmax parameters to the standard sphere. In the second model, I changed only the word vectors to the standard sphere. The results show that changing the hierarchical softmax parameters gives a correct rate of almost 0%, so I will not include that model in our result report.
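As a rough illustration of the second model, the sketch below maps each word vector onto the unit sphere. It assumes that "standard sphere" means unit L2 norm; the function name `to_unit_sphere` and the toy vectors are hypothetical and are not taken from the actual training code.

```python
import math

def to_unit_sphere(vec):
    """Scale a vector to unit L2 length (assumed meaning of "standard sphere")."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        # A zero vector has no direction; return it unchanged.
        return list(vec)
    return [x / norm for x in vec]

# Toy word vectors, for illustration only.
word_vectors = {"king": [3.0, 4.0], "queen": [1.0, 0.0]}
normalized = {w: to_unit_sphere(v) for w, v in word_vectors.items()}
```

After this step every non-zero vector has length 1, so cosine similarity between two words reduces to a plain dot product.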

Use the normalized vector :

                        1 correct     5 correct
  Linear Transform :
     test :               10.25%        24.82%
     train :
  Sphere Transform :
     test :               24.22%        41.01%
     train :

Use the original vector :

                        1 correct     5 correct
  Linear Transform :
     test :               26.83%        44.42%
     train :
  Sphere Transform :
     test :               28.74%        46.73%

SSA Model

  Build 2-dimension SSA-Model.
     Start at : 2014-09-30 <--> End at : 2014-10-02 <--> Result is : 
        27.83%   46.53%     2  classify
  Test 25,50-dimension SSA-Model for transform
     Start at : 2014-10-02 <--> End at : 2014-10-03 <--> Result is : 
        27.9%    46.6%      1  classify
        27.83%   46.53%     2  classify
        27.43%   46.53%     3  classify
        25.52%   45.83%     4  classify
        25.62%   45.83%     5  classify
        22.81%   42.51%     6  classify
        11.96%   27.43%     50 classify
     Explanation : Some points do not belong to the class that the training data belongs to, so they
                      are not assigned the correct transform matrix.
                      The update we want to try is simply to cluster the training data and then test
                      the performance.
  Simple cluster by 2 class.
        23.51%   43.21%     2  classify
  Train set as test set      
     Start at : 2014-10-06 <--> End at : 2014-10-08 <--> Result is : 
        56.91%   72.16%     Simple 1 classify
        63.98%   77.57%     Simple 2 classify
        68.49%   81.25%     Simple 4 classify
        71.43%   83.21%     Simple 5 classify
        76.71%   87.07%     Simple 6 classify
  Different compute state :
     Start at : 2014-10-10 <--> End at : 2014-10-10 <--> Result is :
        23.51%   40.20%     7 classify
  Test All-Belong SSA model for transform
     Start at : 2014-10-02

SEMPRE Research

Work Schedule

  Download SEMPRE toolkit.
  Start at : 2014-09-30

Paper related

  Semantic Parsing via Paraphrasing [6]

Knowledge Vector

  Pre-process corpus.
     Start at : 2014-09-30.
        Use toolkit Wikipedia_Extractor [7] waiting
     End at : 2014-10-03  Result : 
        Original corpus is about 47G and after preprocessing the corpus is almost 17.8G
  Analyze corpus, and train word2vec on Wikipedia.
     Start at : 2014-10-03.
     Design Data Structure : 
        { title : "", content : {Abs : [[details],[related link]], h2 : []}, category : []}
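A minimal sketch of one record in the structure above, written as a Python dictionary. The field names come from the design line; the helper `make_page` and all sample values are hypothetical placeholders, and the reading of `h2` as a list of second-level sections is an assumption.

```python
import json

def make_page(title, details, related_links, sections, categories):
    """Build one page record following the designed layout (hypothetical helper)."""
    return {
        "title": title,
        "content": {
            # Abs holds two lists: abstract details and related links.
            "Abs": [details, related_links],
            # h2 is assumed to hold the second-level sections of the page.
            "h2": sections,
        },
        "category": categories,
    }

# Placeholder values, for illustration only.
page = make_page(
    title="Example page",
    details=["First sentence of the abstract."],
    related_links=["Related page"],
    sections=[],
    categories=["Example category"],
)
serialized = json.dumps(page, ensure_ascii=False)
```

Storing one JSON object per line would keep the 17.8G corpus streamable, although the log does not say which on-disk format was used.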

Moses translation model

  Pre-process corpus, remove sentences that contain rarely seen words.
      Start at : 2014-09-30 <--> End at : 2014-10-02  <--> Result : 
      The original corpus has 8973724 lines; the clean corpus (after removing sentences that contain words seen fewer than 10 times) has 6033397 lines.
  Train Model.
      Start at : 2014-10-02 <--> End at : 2014-10-05
  Tuning Model.
      Start at : 2014-10-05 <--> End at : 2014-10-10
  Result Report :
      The old translation system has 57G of phrases; the new system has 41G. Next step is to test the load speed.
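The pre-processing step in the schedule above can be sketched as follows. This assumes that "words less than 10" means words occurring fewer than 10 times in the corpus and that tokens are separated by whitespace; the function name `remove_rare_word_sentences` is hypothetical. It filters a single side only, whereas a parallel corpus for Moses would need both sides dropped together.

```python
from collections import Counter

def remove_rare_word_sentences(sentences, min_count=10):
    """Drop every sentence containing a word seen fewer than min_count times."""
    # First pass: count how often each whitespace-separated token occurs.
    counts = Counter(word for s in sentences for word in s.split())
    # Second pass: keep a sentence only if all of its words are frequent enough.
    return [s for s in sentences
            if all(counts[word] >= min_count for word in s.split())]

# Tiny example with a lowered threshold so the effect is visible.
corpus = ["the cat sat", "the cat ran", "the zyzzyva sat"]
clean = remove_rare_word_sentences(corpus, min_count=2)
```

Counts are taken once from the unfiltered corpus, so a word is not re-evaluated after sentences have been removed.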

Non Linear Transform Testing

Work Schedule

  Re-train best mse for test data.
      Start at : 2014-10-01 <-->  End at : 2014-10-02 <--> Result : 
      Performance is inconsistent with expectations. Best result for Non-Linear is 1e-2.
  Hidden Layer : 400                      15.57%                    29.14%              995
                 600                      19.99%                    36.08%              995
                 800                      23.32%                    39.60%              995
                1200                      19.19%                    35.08%              995
                1400                      17.09%                    32.06%              995
      Result : According to the results, I will test hidden layer sizes of 800, 1200, 1400, and 1600.

New Approach

Date-3-26

 Note: Run Wiki Vector Training Step.
 Pre-processing corpus 20-Newsgroups & Reuters-21578
     Pre-processing clean tag step done.

Date-3-27

 Learn how to use the Reuters corpus.
 Note: Read Papers :
 1. Parallel Training of An Improved Neural Network for Text Categorization
 2. A discriminative and semantic feature selection method for text categorization
 3. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks

Date-3-31

 Code a new version of spherical word2vec.
 Begin to code VMF-based clustering.

Date-4-26

Experiment for orthogonal weights CNN.

   dimension        alpha
      10            1e-4
      100           1e-2

Experiment for basic CNN.

   dimension        alpha
      100           1e-4

Binary Word Vector

Date-5-11

Hamming distance

Define

  In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. Put another way, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other.

Examples

  "karolin" and "kathrin" is 3.
  "karolin" and "kerstin" is 3.
  1011101 and 1001001 is 2.
  2173896 and 2233796 is 3.

From Wiki
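The definition and examples above can be checked with a few lines of Python. This is a minimal sketch for equal-length strings; the function name `hamming_distance` is chosen here for illustration.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))

# The four examples listed above.
d1 = hamming_distance("karolin", "kathrin")
d2 = hamming_distance("karolin", "kerstin")
d3 = hamming_distance("1011101", "1001001")
d4 = hamming_distance("2173896", "2233796")
```

The same function works on lists of 0/1 values, which is the case of interest for binary word vectors.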

Date-5-12

Frobenius matrix norm

Define

The Frobenius norm, sometimes also called the Euclidean norm (which may cause confusion with the vector L^2-norm, which is also sometimes known as the Euclidean norm), is the matrix norm of an m×n matrix A defined as the square root of the sum of the absolute squares of its elements:

\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2}
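As a small worked example of this definition, the sketch below computes the norm for a matrix stored as a list of rows. The function name `frobenius_norm` is chosen here for illustration; no external library is assumed.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of the absolute squares of all entries."""
    return math.sqrt(sum(abs(a) ** 2 for row in matrix for a in row))

# 1^2 + 2^2 + 3^2 + 4^2 = 30, so the norm is sqrt(30).
norm_a = frobenius_norm([[1, 2], [3, 4]])
# A single row behaves like a vector: sqrt(3^2 + 4^2) = 5.
norm_b = frobenius_norm([[3, 4]])
```

For a matrix with one row or one column the result coincides with the vector L^2-norm, which is the source of the naming confusion mentioned in the definition.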

