* [[Opensource: Natural Language Process]]
* Data set: http://cogcomp.cs.illinois.edu/Data/QA/QC/
* [[open system]]
* Method: vector space model (VSM) with TF-IDF weighting; no classifier
* Classes: 9 big classes, 48 small classes
* Result:
{| border="2px"
|+ Classification result
|-
! Training set !! 1000 !! 2000 !! 3000 !! 4000 !! 5500
|-
! bigclass
| 0.678 || 0.718 || 0.708 || 0.708 || 0.73
|-
! smallclass
| 0.58 || 0.606 || 0.606 || 0.616 || 0.628
|}
* SEMPRE (QA toolkit) [http://www-nlp.stanford.edu/software/sempre/]
* Z-MERT [http://www.cs.jhu.edu/~ozaidan/zmert/]
* templatemaker [https://github.com/paulsmith/templatemaker]
* SPMF: A Java Open-Source Pattern Mining Library
:* SPMF is a cross-platform library implemented in Java, specialized for discovering patterns in transaction and sequence databases, such as frequent itemsets, association rules, sequential patterns, and clustering.
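* Search in ML
:* ML for Search and Ads (Tie-Yan Liu (刘铁岩), NLPCC 2014) [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:L06-ML_for_Search_and_Ads_-_ADL52.pdf]
:* Semantic Matching in Search (ADL) [http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/%E6%96%87%E4%BB%B6:L04-Semantic_Matching_in_Search_ADL_Jun_XU_final.pdf]
* Knowledge graphs
:* Constructing and Mining Web-scale Knowledge Graphs (KDD 2014) [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/4/4c/Kdd2014_gabrilovich_bordes_knowledge_graphs.pdf]
:* Vertical Knowledge Graph Tools and Applications (垂直知识图谱工具与应用) [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/6/6f/%E5%9E%82%E7%9B%B4%E7%9F%A5%E8%AF%86%E5%9B%BE%E8%B0%B1%E5%B7%A5%E5%85%B7%E4%B8%8E%E5%BA%94%E7%94%A810%E6%9C%8816%E6%97%A5.pdf]
:* Knowledge Graphs: The Cornerstone of Semantic Linking for Big Data (知识图谱:大数据语义链接的基石) [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/c/c6/%E7%9F%A5%E8%AF%86%E5%9B%BE%E8%B0%B1%EF%BC%9A%E5%A4%A7%E6%95%B0%E6%8D%AE%E8%AF%AD%E4%B9%89%E9%93%BE%E6%8E%A5%E7%9A%84%E5%9F%BA%E7%9F%B3-%E6%9D%8E%E6%B6%93%E5%AD%90_%281%29.pdf]
:* Ontology Reasoning for the Semantic Web and Its Application to Knowledge Graph[]
===Data Set===
* Big classes: education (教育), social security (社保), employment (就业), medical care (医疗), housing (住房), marriage, childbirth and adoption (婚育收养), certificates and documents (证件办理), qualification accreditation (资质认定), business start-up (企业开办), business operation and taxation (经营纳税), public utilities (公用事业)
* Small classes:
:* Education (教育): preschool education, primary education, junior secondary education, senior secondary education, vocational education, continuing education, special education, education assistance
:* Social security (社保): social security contribution collection, pension insurance, medical insurance, work injury insurance, unemployment insurance, maternity medical insurance, elderly welfare, disability welfare, child welfare, minimum living allowance, special assistance, temporary assistance, preferential treatment and pensions, job placement
:* Employment (就业): civil service recruitment exams, graduate employment, talent recruitment, employment of migrant workers in Shenzhen, re-employment of the unemployed, veteran resettlement, skills training, skills certification, labor rights, self-employment and entrepreneurship
:* Medical care (医疗): medical institutions, outpatient and inpatient care, medicines and pharmacies, disease prevention, food and drug safety, health inspection, medical insurance, medical assistance
:* Housing (住房): renting, selling, monetary subsidies, buying and selling commercial housing, second-hand housing transactions, housing leases, service agencies and personnel, housing provident fund account opening, housing provident fund deposits, housing provident fund loans
:* Marriage, childbirth and adoption (婚育收养): marriage, divorce, annulment, maternity services, family planning incentives, family planning technical services, adoption services
:* Certificates and documents (证件办理): household registration and identity, exit and entry, driving licenses, education and training, medical and health, judicial and legal practice, transport and tourism, engineering and construction, other
:* Qualification accreditation (资质认定): educational institutions, food businesses, medical institutions, employment service agencies, tourism service agencies, transport operators, real estate agencies, engineering and construction firms, other institutions
:* Business start-up (企业开办): name pre-approval, pre-approval licensing, commercial entity registration, regulatory approval, fire safety certificates, organization code certificate application, establishment and change of foreign-invested enterprises, tax registration
:* Business operation and taxation (经营纳税): annual enterprise reports, intellectual property, advertising, credit and contracts, tax registration, invoicing, tax filing and payment
:* Public utilities (公用事业): water supply, electricity supply, gas, sewage and waste treatment, culture, sports and leisure, parks and greening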
* Test set
:* About 1,000 queries from nanshandata were labeled with their big class.
:* Big-class accuracy on this test set is 0.355 (395/1112) using the title, and 0.3444 (383/1112) using title + description.
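
The method listed at the top of this page (VSM with TF-IDF, no classifier) can be read as nearest-class matching: each class is represented by one TF-IDF vector, and a query is assigned to the class with the highest cosine similarity. The sketch below is only an illustration under assumptions, not the code used for the numbers on this page: the toy class texts, the whitespace tokenizer, the smoothed IDF formula, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy class descriptions; the real class texts are not given on this page.
CLASS_DOCS = {
    "education": "school primary secondary vocational education tuition",
    "housing": "rent lease housing provident fund loan",
    "employment": "job recruitment training unemployment skills",
}

def tokenize(text):
    # Assumption: whitespace tokenization; Chinese text would need a word segmenter.
    return text.lower().split()

def weight(tokens, idf):
    # TF-IDF vector as a sparse dict; terms unseen in the class texts are dropped.
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}

def build_tfidf(class_docs):
    # One "document" per class; document frequency counts classes containing the term.
    tokenized = {label: tokenize(text) for label, text in class_docs.items()}
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))
    # Assumption: smoothed IDF, so terms shared by all classes keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {label: weight(tokens, idf) for label, tokens in tokenized.items()}
    return idf, vectors

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(query, idf, vectors):
    # No trained classifier: pick the class whose vector is closest to the query.
    q = weight(tokenize(query), idf)
    return max(vectors, key=lambda label: cosine(q, vectors[label]))

def accuracy(predictions, gold):
    # Fraction of queries whose predicted class equals the labeled class.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

idf, vectors = build_tfidf(CLASS_DOCS)
queries = ["housing provident fund loan", "vocational school tuition", "skills training for a new job"]
gold = ["housing", "education", "employment"]
predictions = [classify(q, idf, vectors) for q in queries]
print(predictions, accuracy(predictions, gold))
```

The accuracy figures reported for the test set follow the same definition: correct predictions divided by the number of labeled queries, for example 395/1112 for the title-only run.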
{| border="2px"
|+ Accuracy of query classification
|-
! colspan="2" | Parameters !! rowspan="2" | Accuracy
|-
! keyword_beta !! keyword_init
|-
| 0 || 0 || 0.355
|-
| 0 || 0 || 0
|-
| 0 || 0 || 0.344
|}

