|
|
(32 intermediate revisions by 7 users not shown) |
Line 1: |
Line 1: |
− | =Text Processing Team Schedule= | + | =Speech Processing Team Schedule= |
| | | |
| ==Members== | | ==Members== |
− | ===Former Members===
| |
− | * Rong Liu (刘荣) : Youku
| |
− | * Xiaoxi Wang (王晓曦) : Turing Robot
| |
− | * Xi Ma (马习) : graduate student at Tsinghua University
| |
− | * DongXu Zhang (张东旭) : --
| |
| | | |
| ===Current Members=== | | ===Current Members=== |
− | * Tianyi Luo (骆天一) | + | * Zhiyuan Tang |
− | * Chao Xing (邢超) | + | * Lantian Li |
− | * Qixin Wang (王琪鑫) | + | * Ying Shi |
− | * Yiqiao Pan (潘一桥) | + | * Yunqi Cai |
− | * Aodong Li (李傲冬) | + | * Wenqiang Du |
− | * Ziwei Bai (白子薇) | + | * Yue Fan |
− | * Aiting Liu (刘艾婷) | + | * Jiawen Kang |
| + | * Ruiqi Liu |
| + | * Yang Zhang |
| + | |
| + | ===Former Members=== |
| + | * Chao Liu: ChangTing Technology |
| + | * Xiangtao Meng: China Construction Bank |
| + | * Shi Yin: Huawei |
| + | * Yiye Lin: University of Southern California |
| + | * Sheng Su: Student of Beijing University of Posts and Telecommunications |
| + | * Xuewei Zhang: Baidu |
| + | * Xiangyu Zeng: Columbia University |
| + | * Jingyi Lin |
| + | * Yixiang Chen |
| + | * Hang Luo |
| + | * Yanqing Wang |
| + | * Zhiyong Zhang |
| + | * Mengyuan Zhao |
| | | |
| ==Work Process== | | ==Work Process== |
− | ===Similar-question sentence vector model training with RNN/LSTM and attention RNN/LSTM chatting model training (Tianyi Luo)===
| |
− | --------------------2016-04-22
| |
− | * Speed up performance testing of the Theano version of RNN-based similar-question vector generation.
| |
− | --------------------2016-04-21
| |
− | * Finish helping Teacher Wang prepare the text group's presentation (Tang poetry and Songci generation, and the intelligent QA system) for Tsinghua University's 105th anniversary.
| |
− | * Submit our IJCAI paper to arXiv. (Solved a big problem with submitting a paper that includes Chinese characters. [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/How_to_submit_the_latex_files_including_Chinese_characters_to_arxiv Solution])
| |
− | * Optimize the Theano version of RNN-based similar-question vector generation.
| |
− | --------------------2016-04-20
| |
− | * Finish submitting the camera-ready version of the IJCAI 2016 paper.
| |
− | * Update the technical report on Chinese Song iambics generation.
| |
− | --------------------2016-04-19
| |
− | * Optimize the Theano version of RNN-based similar-question vector generation.
| |
− | --------------------2016-04-18
| |
− | * Optimize the Theano version of RNN-based similar-question vector generation.
| |
− | * Finish implementing the Theano version of LSTM max-margin vector training.
| |
| | | |
− | ===Reproduce DSSM Baseline (Chao Xing)===
| + | [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/Status_report Latest] |
− | : 2016-04-28 : Gave a talk to the text team on some recent papers.
| + | |
− | Knowledge Base Completion via Search-Based Question Answering : [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b1/Knowledge_Base_Completion_via_Search-Based_Question_Answering_-_Report.pdf pdf]
| + | [[asr-progress 2017.08]] |
− | Open Domain Question Answering via Semantic Enrichment : [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/15/Open_Domain_Question_Answering_via_Semantic_Enrichment_-_Report.pdf pdf]
| + | |
− | A Neural Conversational Model : [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/15/A_Neural_Conversational_Model_-_Report.pdf pdf]
| + | [[asr-progress 2017.07]] |
− | Also gave a small set of CNN-DSSM results in Huilan's weekly report.
| + | |
− | : 2016-04-27 : Coded a multi-layer CNN; hit a GPU memory error in TensorFlow.
| + | [[asr-progress 2017.06]] |
− | So I ran the test on CPU, which should be slow.
| + | |
− | : 2016-04-26 : Finished coding the trick and analyzed it.
| + | [[asr-progress 2017.04]] |
− | : 2016-04-25 : Found a trick, given by Tianyi, to improve accuracy.
| + | |
− | : Coded this trick.
| + | [[asr-progress 2017.01]] |
− | : 2016-04-23 : Planned a series of experiments.
| + | |
− | 1. Try a deep CNN-DSSM: the current model follows the proposed model and contains only one convolution layer; the number of layers should be a tunable parameter.
| + | [[asr-progress 2016.12]] |
− | 2. Test whether mixture data is effective for the current model and deep CDSSM.
| + | |
− | 3. Code Recurrent CNN-DSSM (new approach.)
| + | [[asr-progress 2016.11]] |
− | : 2016-04-22 : Found a problem: on the lab's GPU machine (970), one iteration takes 1537 seconds, but on Huilan's server it takes just 7 seconds.
| + | |
− | Achieved reasonable results when applying the max-margin method to the CNN-DSSM model.
| + | |
− | : 2016-04-21 : The true DSSM model doesn't work well; analysis below:
| + | |
− | 1. The DSSM model is not reproduced exactly: the original is for English, and I adapted it to Chinese after word segmentation.
| + | |
− | So the input is word tri-grams, not letter tri-grams.
| + | |
− | 2. Our dataset is far from rich, and because we do not use pre-trained word vectors as initial vectors, we can hardly achieve good performance.
| + | |
− | : Request
| + | |
− | 1. As we have rich pre-trained word vectors, maybe CDSSM or RDSSM is suited to our task.
| + | |
− | 2. Sequences of different lengths need to be mapped to fixed-dimension vectors; only CNN and RNN can do this, while DNN cannot do it by using
| + | |
− | fixed-length word vectors
| + | |
− | : Finished coding CDSSM. Testing its performance.
| + | |
− | One problem: when you install TensorFlow 0.8.0 via pip and want to use the conv2d function on GPU, you need to make sure you have already
| + | |
− | installed cuDNN version 4.0, not the latest 5.0.
| + | |
− | : 2016-04-20 : Found a bug in the reproduced DSSM model and fixed it.
| + | |
− | : 2016-04-19 : Finished coding the mixture data model with lower memory dependency. Testing its performance.
| + | |
− | : 2016-04-18 : Code mixture data model.
| + | |
− | : 2016-04-16 : Coded the mixture data model but hit a memory error. Dr. Wang helped me fix it.
| + | |
− | : 2016-04-15 : Shared papers. Investigated a series of DSSM papers for future work, and showed our intern students how to do research.
| + | |
− | : Original DSSM model : Learning Deep Structured Semantic Models for Web Search using Clickthrough Data [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/4/45/2013_-_Learning_Deep_Structured_Semantic_Models_for_Web_Search_using_Clickthrough_Data_-_Report.pdf pdf]
| + | |
− | : CNN based DSSM model : A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/b/b7/2014_-_A_Latent_Semantic_Model_with_Convolutional-Pooling_Structure_for_Information_Retrieval_-_Report.pdf pdf]
| + | |
− | : Use DSSM model for a new area : Modeling Interestingness with Deep Neural Networks [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/1/1f/2014_-_Modeling_Interestingness_with_Deep_Neural_Networks_-_Report.pdf pdf]
| + | |
− | : Latest approach for LSTM + RNN DSSM model : SEMANTIC MODELLING WITH LONG-SHORT-TERM MEMORY FOR INFORMATION RETRIEVAL [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/2/24/2015_-_SEMANTIC_MODELLING_WITH_LONG-SHORT-TERM_MEMORY_FOR_INFORMATION_RETRIEVAL_-_Report.pdf pdf]
| + | |
| | | |
− | : 2016-04-14 : Test dssm-dnn model, code dssm-cnn model.
| + | [[asr-progress 2016.10]] |
− | Continue investigating deep neural question answering systems.
| + | |
− | : 2016-04-13 : test dssm model, investigate deep neural question answering system.
| + | |
− | : Share theano ppt [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:Theano-RBM.pptx theano]
| + | |
− | : Share tensorflow ppt [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:Tensorflow.pptx tensorflow]
| + | |
− | : 2016-04-12 : Finished writing the TensorFlow version of DSSM.
| + | |
− | : 2016-04-11 : Write tensorflow toolkit ppt for intern student.
| + | |
− | : 2016-04-10 : Learn tensorflow toolkit.
| + | |
− | : 2016-04-09 : Learn tensorflow toolkit.
| + | |
− | : 2016-04-08 : Finish theano version.
| + | |
| | | |
− | ===Deep Poem Processing With Image (Ziwei Bai)===
| + | [[asr-progress 2016.09]] |
− | : 2016-04-20 :combine my program with Qixin Wang's
| + | |
− | : 2016-04-10 : use a web spider to collect a thousand images.
| + | |
− | : 2016-04-13 : 1. download theano for python2.7. 2. debug cnn.py
| + | |
− | : 2016-04-15 : use a web spider to collect 30 thousand images and store them in a matrix
| + | |
− | : 2016-04-16 :modify the code of CNN and spider
| + | |
− | : 2016-04-17 : train convolutional neural network
| + | |
| | | |
− | ===RNN Piano Processing (Jiyuan Zhang)===
| + | [[asr-progress ...2016.08|asr-progress 2016.08]] |
− | :2016-4-12:select appropriate midis and run rnnrbm model
| + | |
− | :2016-4-13:view rnnrbm model's code
| + | |
− | :2016-4-14~15:coding to select 4/4 beat of midis
| + | |
− | :2016-4-17~22:run data, failed several times, then modified code and reviewed rnnrbm model's code
| + | |
− | :2016-4-25~29:replace rnnrbm with lstmrbm, then run lstmrbm's model
| + | |
| | | |
− | ===Question & Answering (Aiting Liu)=== | + | ==Holiday plan== |
− | : 2016-04-24 : make my biweekly report
| + | [[2017 Spring Festival]] |
− | : 2016-04-23 : read Fader's paper (2011)
| + | |
− | : 2016-04-20 : read Fader's paper (2013)
| + | |
− | : 2016-04-15 : learn dssm and sent2vec
| + | |
− | : 2016-04-16 : try to figure out how the PARALEX dataset is constructed
| + | |
− | : 2016-04-17 : download the PARALEX dataset and try to convert it into the format we need
| + | |
| | | |
− | ===Generation Model (Aodong Li)===
| + | [[2020 Spring Festival]] |
− | : 2016-05-05 : check in
| + | |