Difference between revisions of "Hulan-2014-11-06"

From cslt Wiki
Edited by Lr
Revision as of 08:55, 6 November 2014 (Thursday)

Dialog system

Algorithm

Spelling mistakes

  • retrain the n-gram model (caoli)

improve fuzzy match

  • add synonym similarity using the MERT-4 method

improve Lucene search

  • our VSM method
Results of the different methods in Lucene:
method    lucene  vsm_idf(haiguan)  vsm_idf(baidu)  vsm_idf(train)  vsm_idf(calculate)
Accuracy  0.6628  0.6228            0.6197          0.5827          0.5426
  • Lucene top-N results: top10 (82.95%), top20 (86.34%), top50 (90.23%), top100 (94.11%), top200 (96.18%), top1000 (97.31%), top2000 (97.87%), top5000 (98.75%), top10000 (99.06%)
  • Lucene optimization (liurong)
  • rewrite the method for selecting the 50 standard questions so that they do not come from the same template (liurong)
  • check the word segmentation for templates (liurong)
  • boost the query keywords using IDF
Keyword boosting in Lucene:
method    Default  idf_train  idf_train_norm  idf_baidu  idf_baidu_norm
Accuracy  0.66228  0.651629   0.57644         0.647869   0.65288
  • use the MERT-4 method to find good weights for multiple features, such as IDF, NER, baidu_weight, keyword, etc. (liurong**)
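
The IDF keyword boosting item above can be illustrated with a minimal sketch. It assumes a smoothed IDF formula and uses the classic Lucene query parser syntax `term^boost` to attach a weight to each keyword. The function names `compute_idf` and `boosted_query` and the toy corpus are hypothetical and are not taken from the project code; the actual idf_train and idf_baidu weights may be computed differently.

```python
import math
from collections import Counter


def compute_idf(documents):
    """Return a smoothed IDF weight for every term.

    `documents` is a list of token lists (already word-segmented).
    The smoothing formula is an assumption chosen for this sketch.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    return {
        term: math.log((n_docs + 1) / (df + 1)) + 1.0
        for term, df in doc_freq.items()
    }


def boosted_query(tokens, idf, default_idf=1.0):
    """Build a Lucene query string in which each keyword is boosted by its IDF.

    Terms missing from `idf` get `default_idf`. Tokens are assumed to
    contain no Lucene special characters; real input would need escaping.
    """
    return " ".join(f"{t}^{idf.get(t, default_idf):.2f}" for t in tokens)


if __name__ == "__main__":
    corpus = [
        ["how", "reset", "password"],
        ["how", "open", "account"],
        ["how", "close", "account"],
    ]
    idf = compute_idf(corpus)
    print(boosted_query(["how", "reset", "password"], idf))
    # prints: how^1.00 reset^1.69 password^1.69
```

Rare keywords receive a larger boost than terms that occur in every question, which is the intended effect of IDF weighting. Whether normalizing the weights helps is an empirical question, as the idf_train_norm and idf_baidu_norm columns above suggest.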

Multi-Scene Recognition

  • add the triples search to QA engine (liurong*)
  • discuss the details and write a report.

knowledge structure

Knowledge Management and labeling system

  • continue coding.

plan to do

  • prepare the

plan to discuss

  • add the triples search to QA engine