Difference between revisions of "Xingchao work"

From cslt Wiki
 
(58 intermediate revisions by the same user not shown)
==Paper Recommendation==
Pre-Trained Multi-View Word Embedding.[http://cslt.riit.tsinghua.edu.cn/mediawiki/images/3/3c/Pre-Trained_Multi-View_Word_Embedding.pdf]
 
Learning Word Representation Considering Proximity and Ambiguity. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b0/Learning_Word_Representation_Considering_Proximity_and_Ambiguity.pdf]

Continuous Distributed Representations of Words as Input of LSTM Network Language Model. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/5/5a/Continuous_Distributed_Representations_of_Words.pdf]

WikiRelate! Computing Semantic Relatedness Using Wikipedia. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/c/cb/WikiRelate%21_Computing_Semantic_Relatedness_Using_Wikipedia.pdf]

Japanese-Spanish Thesaurus Construction Using English as a Pivot. [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/e/e8/Japanese-Spanish_Thesaurus_Construction.pdf]

==Chaos Work==

===SSA Model===

  Build a 2-dimensional SSA model.
      Start at: 2014-09-30 <--> End at: 2014-10-02 <--> Result:
        27.83%  46.53%    2 classes
  Test the 25- and 50-dimensional SSA models for the transform.
      Start at: 2014-10-02 <--> End at: 2014-10-03 <--> Result:
        11.96%  27.43%    50 classes
  Test the All-Belong SSA model for the transform.
      Start at: 2014-10-02

===SEMPRE Research===
====Work Schedule====
  Download the SEMPRE toolkit.
  Start at: 2014-09-30

====Related Papers====
  Semantic Parsing via Paraphrasing [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/8/85/Semantic_Parsing_via_Paraphrasing.pdf]

===Knowledge Vector===

  Pre-process the corpus.
      Start at: 2014-09-30
        Use the Wikipedia_Extractor toolkit [http://medialab.di.unipi.it/wiki/Wikipedia_Extractor] (status: waiting)
      End at: 2014-10-03 <--> Result:
        The original corpus is about 47 GB; after preprocessing it is about 17.8 GB.
  Analyze the corpus and train word2vec on Wikipedia.
      Start at: 2014-10-03

===Moses translation model===

  Pre-process the corpus: remove sentences that contain rarely seen words.
      Start at: 2014-09-30 <--> End at: 2014-10-02 <--> Result:
      The original corpus has 8973724 lines; the cleaned corpus (after removing sentences that contain words occurring fewer than 10 times) has 6033397 lines, about 67.2% of the original.
  Train the model.
      Start at: 2014-10-02

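The corpus-cleaning step in the Moses section can be sketched roughly as follows. This is only an illustration, not the script that was actually used: the function name filter_rare_word_sentences, whitespace tokenization, and holding the corpus in memory as a list are all assumptions. The toy example uses a threshold of 2 instead of 10 so that the effect is visible on three sentences.

```python
from collections import Counter


def filter_rare_word_sentences(sentences, min_count=10):
    """Keep only sentences in which every token occurs at least min_count times.

    Hypothetical helper: token counts are taken over the whole input corpus,
    and tokens are obtained by splitting on whitespace (an assumption).
    """
    # First pass: count every whitespace-separated token in the corpus.
    counts = Counter(token for sentence in sentences for token in sentence.split())
    # Second pass: drop any sentence that contains a token below the threshold.
    return [
        sentence
        for sentence in sentences
        if all(counts[token] >= min_count for token in sentence.split())
    ]


# Toy corpus; "dog", "ran" and "quickly" each occur only once.
corpus = ["the cat sat", "the dog sat", "the cat ran quickly"]
kept = filter_rare_word_sentences(corpus, min_count=2)
print(len(corpus), "->", len(kept))
```

For a parallel corpus, the same filter would presumably have to drop the aligned sentence on the other side as well; the log does not say how that was handled.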
===Non Linear Transform Testing===
====Work Schedule====
  Re-train for the best MSE on the test data.
      Start at: 2014-10-01 <--> End at: 2014-10-02 <--> Result:
      Performance is inconsistent with expectations. Best result for Non-Linear is 1e-2.
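For reference, a minimal sketch of the error measure, assuming "mse" in the schedule above means the mean squared error between a transformed vector and its target. The log does not describe the model or the data, so the function name mean_squared_error and the sample vectors are purely illustrative.

```python
def mean_squared_error(predicted, target):
    """Average of the squared element-wise differences of two equal-length vectors."""
    if len(predicted) != len(target):
        raise ValueError("vectors must have the same length")
    if not predicted:
        raise ValueError("vectors must be non-empty")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical transformed vector compared with its target vector.
mse = mean_squared_error([0.1, 0.2, 0.4], [0.0, 0.2, 0.5])
print(mse)
```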

Latest revision as of 04:44, 8 April 2016 (Friday)

Chaos Work

SLT