=Text Processing Team Schedule=

==Members==

===Former Members===
* Rong Liu (刘荣) : Youku
* Xiaoxi Wang (王晓曦) : Turing Robot
* Xi Ma (马习) : Graduate student at Tsinghua University
* DongXu Zhang (张东旭) : --

===Current Members===
* Tianyi Luo (骆天一)
* Chao Xing (邢超)
* Qixin Wang (王琪鑫)
* Yiqiao Pan (潘一桥)
* Aodong Li (李傲冬)
* Ziwei Bai (白子薇)
* Aiting Liu (刘艾婷)

==Work Process==
===Similar-question sentence vector model training with RNN/LSTM and attention RNN/LSTM chatting model training (Tianyi Luo)===
 
--------------------2016-04-22
 
* Speed up the performance test of the Theano version of similar-question vector generation based on RNN.
 
--------------------2016-04-21
 
* Finish helping Prof. Wang prepare the text group's presentation (Tang poetry and Song ci generation, and the intelligent QA system) for Tsinghua University's 105th anniversary.
 
* Submit our IJCAI paper to arXiv. (Solved a major problem with submitting LaTeX source that includes Chinese characters: [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/How_to_submit_the_latex_files_including_Chinese_characters_to_arxiv Solution])
 
* Optimize the Theano version of similar-question vector generation based on RNN.
 
--------------------2016-04-20
 
* Finish submitting the camera-ready version of the IJCAI 2016 paper.
 
* Update the technical report on Chinese Song Iambics generation.
 
--------------------2016-04-19
 
* Optimize the Theano version of similar-question vector generation based on RNN.
 
--------------------2016-04-18
 
* Optimize the Theano version of similar-question vector generation based on RNN.
 
* Finish implementing the Theano version of LSTM max-margin vector training.
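
A minimal sketch of the max-margin objective behind this training, assuming cosine scores between encoded sentence vectors; the LSTM encoder is abstracted away, and the variable names and margin value are illustrative, not the project code:

<pre>
# Hedged sketch: a max-margin (hinge) loss over sentence vectors in Theano.
# q / pos / neg stand for an encoded question, a similar question and a
# dissimilar one; the LSTM encoder that produces them is omitted here.
import numpy as np
import theano
import theano.tensor as T

q = T.vector('q')
pos = T.vector('pos')
neg = T.vector('neg')
margin = 0.5  # illustrative value

def cosine(a, b):
    return T.dot(a, b) / (T.sqrt(T.sum(a ** 2)) * T.sqrt(T.sum(b ** 2)) + 1e-8)

# Push the similar pair's score above the dissimilar pair's by at least the margin.
loss = T.maximum(0.0, margin - cosine(q, pos) + cosine(q, neg))
f_loss = theano.function([q, pos, neg], loss)

v = lambda: np.random.randn(16).astype(theano.config.floatX)
print(f_loss(v(), v(), v()))
</pre>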
 
 
===Reproduce DSSM Baseline (Chao Xing)===
 
: 2016-04-28 : Gave a talk to the text team on some recent papers:
 
              Knowledge Base Completion via Search-Based Question Answering : [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b1/Knowledge_Base_Completion_via_Search-Based_Question_Answering_-_Report.pdf pdf]
 
              Open Domain Question Answering via Semantic Enrichment  : [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/15/Open_Domain_Question_Answering_via_Semantic_Enrichment_-_Report.pdf pdf]
 
              A Neural Conversational Model : [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/15/A_Neural_Conversational_Model_-_Report.pdf pdf]
 
              Also presented preliminary CNN-DSSM results in Huilan's weekly report.
 
: 2016-04-27 : Coded the multi-layer CNN but hit a GPU out-of-memory error in TensorFlow.

              So I am running the test on the CPU instead, which will be slow (see the sketch below for pinning the graph to the CPU).
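
For reference, a generic way to force the graph onto the CPU in TensorFlow's 0.x graph API (which is what pip 0.8.0 provides); the matmul below is only a stand-in for the real model:

<pre>
# Generic TensorFlow (0.x graph API) pattern for forcing ops onto the CPU
# when the GPU runs out of memory; the matmul is just a placeholder op.
import tensorflow as tf

with tf.device('/cpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0], [1.0]])
    y = tf.matmul(a, b)

# log_device_placement prints which device each op actually landed on.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(y))
</pre>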
 
: 2016-04-26 : Finished coding the trick and analyzed it.
 
: 2016-04-25 : Found a trick, suggested by Tianyi, to improve accuracy.

            : Coded this trick.
 
: 2016-04-23 : Set up a series of experiments:

              1. Try a deep CNN-DSSM: the current model follows the originally proposed model and contains only one convolution layer; the number of layers should become a tunable parameter (see the sketch after this list).

              2. Test whether the mixture data is effective for the current model and the deep CDSSM.

              3. Code a recurrent CNN-DSSM (a new approach).
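
A rough sketch of what a tunable-depth convolutional encoder could look like in the 0.x graph API; the layer sizes, names and the num_layers parameter are illustrative assumptions, not the actual CDSSM code:

<pre>
# Illustrative sketch: a stack of convolution + ReLU layers whose depth is a
# parameter, followed by max-pooling over the sequence, instead of the single
# fixed convolution layer of the original C-DSSM. Shapes are made up.
import tensorflow as tf

def conv_encoder(x, num_layers=2, filters=300, width=3):
    """x: [batch, seq_len, 1, embed_dim] -> [batch, filters] semantic vector."""
    h = x
    for i in range(num_layers):
        in_dim = int(h.get_shape()[-1])
        w = tf.Variable(tf.truncated_normal([width, 1, in_dim, filters], stddev=0.1),
                        name='conv_w_%d' % i)
        b = tf.Variable(tf.zeros([filters]), name='conv_b_%d' % i)
        h = tf.nn.relu(tf.nn.conv2d(h, w, strides=[1, 1, 1, 1], padding='SAME') + b)
    # Max-pool over the whole sequence to get one fixed-dimension vector.
    return tf.reduce_max(h, reduction_indices=[1, 2])

x = tf.placeholder(tf.float32, [None, 20, 1, 200])  # e.g. 20 words, 200-dim embeddings
vec = conv_encoder(x, num_layers=3)
</pre>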
 
: 2016-04-22 : Found a problem: on the lab's GPU machine (970), one iteration takes 1537 seconds, while on Huilan's server it takes only 7 seconds.

              Achieved reasonable results when applying the max-margin method to the CNN-DSSM model.
 
: 2016-04-21 : The faithful DSSM model doesn't work well; analysis below:

                1. It is not an exact reproduction of the DSSM model: the original works on English, and I adapted it to Chinese after word segmentation, so the input is word tri-grams rather than letter tri-grams (letter tri-grams are illustrated in the sketch after this entry).

                2. Our dataset is far from rich, and since we do not use pre-trained word vectors as initial vectors, we can hardly achieve good performance.

            : Suggestions

                1. Since we have rich pre-trained word vectors, CDSSM or RDSSM may be better suited to our task.

                2. Variable-length sequences need to be mapped to fixed-dimension vectors; only CNN and RNN can do this, while a DNN cannot when it uses fixed-length word vectors.

            : Finished coding CDSSM; testing its performance.

                One problem: if you install TensorFlow 0.8.0 via pip and want to use the conv2d function on the GPU, make sure the installed cuDNN version is 4.0, not the latest 5.0.
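
For comparison, the original DSSM's word hashing splits each English word into letter tri-grams; a toy illustration in plain Python (not the project code):

<pre>
# Toy illustration of DSSM-style letter tri-gram word hashing: wrap the word
# in boundary marks and take every overlapping three-character slice.
def letter_trigrams(word):
    marked = '#' + word + '#'
    return [marked[i:i + 3] for i in range(len(marked) - 2)]

print(letter_trigrams('good'))   # ['#go', 'goo', 'ood', 'od#']
</pre>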
 
: 2016-04-20 : Found a bug in the reproduced DSSM model and fixed it.
 
: 2016-04-19 : Finished coding the mixture data model with lower memory usage; testing its performance.
 
: 2016-04-18 : Code mixture data model.
 
: 2016-04-16 : Coded the mixture data model but ran into a memory error; Dr. Wang helped me fix it.
 
: 2016-04-15 : Shared papers: investigated a series of DSSM papers for future work, and showed our intern students how to do research.
 
            : Original DSSM model : Learning Deep Structured Semantic Models for Web Search using Clickthrough Data [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/4/45/2013_-_Learning_Deep_Structured_Semantic_Models_for_Web_Search_using_Clickthrough_Data_-_Report.pdf pdf]
 
            : CNN based DSSM model : A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b7/2014_-_A_Latent_Semantic_Model_with_Convolutional-Pooling_Structure_for_Information_Retrieval_-_Report.pdf pdf]
 
            : Use DSSM model for a new area : Modeling Interestingness with Deep Neural Networks [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/1f/2014_-_Modeling_Interestingness_with_Deep_Neural_Networks_-_Report.pdf pdf]
 
            : Latest approach for LSTM + RNN DSSM model : SEMANTIC MODELLING WITH LONG-SHORT-TERM MEMORY FOR INFORMATION RETRIEVAL [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/2/24/2015_-_SEMANTIC_MODELLING_WITH_LONG-SHORT-TERM_MEMORY_FOR_INFORMATION_RETRIEVAL_-_Report.pdf pdf]
 
 
: 2016-04-14 : Tested the DSSM-DNN model; coded the DSSM-CNN model.

              Continued investigating the deep neural question answering system.
 
: 2016-04-13 : Tested the DSSM model; investigated the deep neural question answering system.
 
            : Share theano ppt [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:Theano-RBM.pptx theano]
 
            : Share tensorflow ppt [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:Tensorflow.pptx tensorflow]
 
: 2016-04-12 : Finished writing the TensorFlow version of DSSM (see the sketch below).
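
Roughly, the DSSM relevance objective is a softmax over cosine similarities between the query vector and one clicked plus several negative document vectors; a hedged sketch in the 0.x graph API, with the DNN towers omitted and the sizes and gamma value made up for illustration:

<pre>
# Hedged sketch of the DSSM relevance objective: cosine similarity between a
# query vector and its candidate document vectors, a softmax over the
# candidates, and the negative log-likelihood of the clicked (first) one.
# The DNN towers that would produce q and d are omitted; sizes are made up.
import tensorflow as tf

DIM, NEG = 128, 4                                       # semantic dim, negatives per query

q = tf.placeholder(tf.float32, [None, DIM])             # query semantic vectors
d = tf.placeholder(tf.float32, [None, NEG + 1, DIM])    # 1 clicked + NEG negative docs

q_norm = tf.nn.l2_normalize(q, 1)
d_norm = tf.nn.l2_normalize(d, 2)
cos = tf.reduce_sum(tf.expand_dims(q_norm, 1) * d_norm, 2)   # [batch, NEG + 1]

gamma = 10.0                                            # smoothing factor
prob = tf.nn.softmax(gamma * cos)
pos_prob = tf.slice(prob, [0, 0], [-1, 1])              # probability of the clicked doc
loss = -tf.reduce_mean(tf.log(pos_prob + 1e-12))        # minimized w.r.t. the omitted towers
</pre>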
 
: 2016-04-11 : Wrote a TensorFlow toolkit ppt for the intern students.
 
: 2016-04-10 : Learn tensorflow toolkit.
 
: 2016-04-09 : Learn tensorflow toolkit.
 
: 2016-04-08 : Finish theano version.
 
 
===Deep Poem Processing with Images (Ziwei Bai)===
 
: 2016-04-20 : Combine my program with Qixin Wang's.

: 2016-04-17 : Train the convolutional neural network.

: 2016-04-16 : Modify the CNN and spider code.

: 2016-04-15 : Use the web spider to catch 30 thousand images and store them into a matrix (see the sketch after this list).

: 2016-04-13 : 1. Download Theano for Python 2.7.  2. Debug cnn.py.

: 2016-04-10 : Use the web spider to catch a thousand images.
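
A hedged sketch of the catch-and-stack step mentioned above, assuming the requests, PIL and numpy libraries; the URL list, image size and grayscale conversion are illustrative choices, not the actual spider:

<pre>
# Illustrative sketch: download images, resize them to a fixed shape and
# stack them into one numpy matrix (one flattened image per row).
import io
import numpy as np
import requests
from PIL import Image

def images_to_matrix(urls, size=(64, 64)):
    rows = []
    for url in urls:
        resp = requests.get(url, timeout=10)
        img = Image.open(io.BytesIO(resp.content)).convert('L')   # grayscale
        rows.append(np.asarray(img.resize(size), dtype=np.float32).ravel())
    return np.vstack(rows)   # shape: [num_images, 64 * 64]

# matrix = images_to_matrix(list_of_image_urls)
</pre>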
 
 
===RNN Piano Processing (Jiyuan Zhang)===
 
: 2016-04-12 : Select appropriate MIDIs and run the RNN-RBM model.

: 2016-04-13 : Read through the RNN-RBM model's code.

: 2016-04-14~15 : Write code to select MIDIs in 4/4 time (see the sketch after this list).

: 2016-04-17~22 : Run the data; after several failed attempts, modify the code and review the RNN-RBM model's code.

: 2016-04-25~29 : Replace the RNN-RBM with an LSTM-RBM and run the LSTM-RBM model.
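
A possible way to do the 4/4 filtering mentioned above, assuming the mido library; the directory layout and helper name are illustrative, not the code used here:

<pre>
# Illustrative helper: keep only MIDI files whose time-signature meta
# messages are all 4/4; files that fail to parse are skipped.
import glob
import mido

def is_four_four(path):
    try:
        mid = mido.MidiFile(path)
    except Exception:
        return False
    sigs = [(m.numerator, m.denominator)
            for track in mid.tracks for m in track
            if m.type == 'time_signature']
    # Files with no time-signature message are treated as unknown, not 4/4.
    return bool(sigs) and all(sig == (4, 4) for sig in sigs)

selected = [p for p in glob.glob('midis/*.mid') if is_four_four(p)]
print('%d files in 4/4' % len(selected))
</pre>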
 
 
 
===Question & Answering (Aiting Liu)===

: 2016-04-24 : Make my biweekly report.

: 2016-04-23 : Read Fader's paper (2011).

: 2016-04-20 : Read Fader's paper (2013).

: 2016-04-17 : Download the PARALAX dataset and try to turn it into the form we want.

: 2016-04-16 : Try to figure out how the PARALAX dataset is constructed.

: 2016-04-15 : Learn DSSM and sent2vec.
 
===Generation Model (Aodong Li)===
 
: 2016-05-05 : check in
 
