Xingchao work

From cslt Wiki
Revision as of 06:28, 3 October 2014 (Friday) by Xingchao

Paper Recommendation

Pre-Trained Multi-View Word Embedding.[1]

Learning Word Representation Considering Proximity and Ambiguity.[2]

Continuous Distributed Representations of Words as Input of LSTM Network Language Model.[3]

WikiRelate! Computing Semantic Relatedness Using Wikipedia.[4]

Japanese-Spanish Thesaurus Construction Using English as a Pivot[5]

Chaos Work

SSA Model

  Build 2-dimension SSA-Model.
     Start at : 2014-09-30 <--> End at : 2014-10-02 <--> Result is : 
        27.83%   46.53%     2  classify
  Test 25- and 50-dimension SSA-Model for transform
     Start at : 2014-10-02 <--> End at : 2014-10-03 <--> Result is : 
        11.96%   27.43%     50 classify
  Test All-Belong SSA model for transform
     Start at : 2014-10-02

SEMPRE Research

Work Schedule

  Download SEMPRE toolkit.
  Start at : 2014-09-30

Paper related

  Semantic Parsing via Paraphrasing [6]

Knowledge Vector

  Pre-process corpus.
     Start at : 2014-09-30.
        Use toolkit Wikipedia_Extractor [7] waiting
     End at : 2014-10-03  Result : 
        The original corpus is about 47 GB; after preprocessing it is about 17.8 GB
  Analyze the corpus and train word2vec on Wikipedia.
     Start at : 2014-10-03.

Moses translation model

  Pre-process corpus: remove sentences that contain rarely seen words.
      Start at : 2014-09-30 <--> End at : 2014-10-02  <--> Result : 
      The original corpus has 8973724 lines; the clean corpus (after removing sentences containing words that occur fewer than 10 times) has 6033397 lines
  Train Model.
      Start at : 2014-10-02
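
The rare-word filtering step above can be sketched roughly as follows. This is only an illustration under assumptions not stated in the log: the threshold of 10 is read as a corpus-wide word frequency, tokens are split on whitespace, and the function name `filter_rare_sentences` is hypothetical. For a parallel corpus such as the one Moses needs, both language sides would have to be filtered together to stay aligned, which this sketch does not do.

```python
from collections import Counter


def filter_rare_sentences(sentences, min_count=10):
    """Keep sentences in which every token occurs at least min_count times.

    Hypothetical helper: assumes whitespace tokenization and that
    "rare" means a corpus-wide frequency below min_count.
    """
    # First pass: count how often each token appears in the whole corpus.
    counts = Counter(tok for s in sentences for tok in s.split())
    # Second pass: drop any sentence that contains at least one rare token.
    return [
        s for s in sentences
        if all(counts[tok] >= min_count for tok in s.split())
    ]


if __name__ == "__main__":
    corpus = ["a b"] * 10 + ["a c"]
    clean = filter_rare_sentences(corpus, min_count=10)
    print(len(corpus), "->", len(clean))
```

For a corpus of several million lines, the same two passes would more likely be run by streaming the file twice instead of holding all sentences in memory.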
  

Non Linear Transform Testing

Work Schedule

  Re-train for the best MSE on the test data.
      Start at : 2014-10-01 <-->  End at : 2014-10-02 <--> Result : 
      Performance is inconsistent with expectations. The best result for Non-Linear is 1e-2.