Difference between revisions of "Hulan-2014-10-31"

From cslt Wiki
Lr (talk | contribs)

Revision as of 04:50, 31 October 2014 (Friday)

Dialog system

Algorithm

Spelling mistakes

  • Use n-grams to get candidate sentences.
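The note above does not say how the n-grams are used, so the following is only a minimal sketch of one possible reading: retrieve candidate sentences for a misspelled input by character-bigram overlap. The function names (`char_ngrams`, `dice`, `candidate_sentences`), the Dice coefficient, and the example sentences are all assumptions made for illustration, not the project's actual method.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Return the multiset of character n-grams in text."""
    text = text.strip()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def dice(a, b):
    """Dice coefficient between two n-gram multisets (Counters)."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    overlap = sum((a & b).values())  # multiset intersection
    return 2.0 * overlap / total


def candidate_sentences(query, known_sentences, n=2, top_k=3):
    """Return up to top_k known sentences ranked by n-gram overlap with query.

    Sentences with no shared n-gram are dropped.
    """
    q = char_ngrams(query, n)
    scored = [(dice(q, char_ngrams(s, n)), s) for s in known_sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:top_k] if score > 0]


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    known = [
        "how to reset my password",
        "how to open an account",
        "what is the exchange rate",
    ]
    print(candidate_sentences("how to resett my pasword", known, top_k=1))
```

Character n-grams need no word segmentation, so the same sketch also applies to Chinese input; a word-level n-gram language model would be an alternative reading of the note.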

Improve Lucene search

  • Lucene similarity methods
Results with different Lucene similarity methods:
Method     Default   BM25      LMDirichlet   DFR       LMJelinekMercer   IB
Accuracy   0.66228   0.66228   0.4091        0.65476   0.65476           0.6666
  • Our VSM method
  • Re-ranking with our VSM method (54%) vs. Lucene (67%)
  • Lucene top-50 (caoli)
  • top-10 (82.95%), top-20 (86.34%), top-50 (90.22%)
  • Need to check the remaining 10% of errors.
  • Lucene optimization (liurong)
  • Rewrite the selection method so that the 50 selected standard questions do not share the same template.
  • Test boosting keyword weights and extracting synonyms.
  • Check the word segmentation for templates.
  • The min-segment method improves accuracy.
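The re-ranking and top-N figures in the list above can be illustrated with a small sketch. It assumes that queries and standard questions are already word-segmented, that the VSM uses raw term-frequency vectors with cosine similarity, and that the top-N figures are the share of queries whose correct standard question appears in the first N results. The notes do not confirm any of these assumptions, and the names `cosine`, `rerank`, and `top_n_accuracy` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def rerank(query_tokens, candidates):
    """Re-order (question_id, tokens) candidates by similarity to the query."""
    q = Counter(query_tokens)
    return sorted(candidates, key=lambda c: cosine(q, Counter(c[1])), reverse=True)


def top_n_accuracy(ranked_ids_per_query, gold_ids, n):
    """Fraction of queries whose gold id is among the first n ranked ids."""
    if not gold_ids:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(ranked_ids_per_query, gold_ids)
        if gold in ranked[:n]
    )
    return hits / len(gold_ids)


if __name__ == "__main__":
    # Hypothetical retrieval output, for illustration only.
    candidates = [
        ("a", ["reset", "password"]),
        ("b", ["open", "an", "account"]),
    ]
    ranked = rerank(["open", "account"], candidates)
    print([qid for qid, _ in ranked])
    print(top_n_accuracy([["q1", "q2"], ["q5", "q4"]], ["q1", "q4"], 1))
```

Measuring the hit rate at several cutoffs (10, 20, 50) shows how much a re-ranker could recover at best: a correct answer that is missing from the retrieved list cannot be fixed by re-ordering.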

Plan to discuss