"Search method": Difference between revisions
From cslt Wiki
Revision as of 12:56, 5 November 2014 (Wednesday)
lucene method
- data set
- jiangkaipeng:
- results of different methods
method | Default | BM25 | LMDirichlet | DFR | LMJelinekMercer | IB |
---|---|---|---|---|---|---|
Accuracy | 0.66228 | 0.66228 | 0.4091 | 0.65476 | 0.65476 | 0.6666 |
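
To make one of the compared scoring functions concrete, here is a minimal pure-Python sketch of BM25 ranking. It is not Lucene's code: the function name `bm25_scores`, the toy documents, and the whitespace tokenization are assumptions for illustration, and the parameters `k1=1.2`, `b=0.75` are only commonly used defaults.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a larger weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy data: three "standard questions" and one user query.
docs = [
    "how to reset my password".split(),
    "how to change my email address".split(),
    "where is the nearest branch".split(),
]
scores = bm25_scores("reset password".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Accuracy as in the table would then be the fraction of test queries whose top-ranked standard question is the correct one.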
- add boost keyword
method | Default | idf_train | idf_train_norm | idf_baidu | idf_baidu_norm |
---|---|---|---|---|---|
Accuracy | 0.66228 | 0.651629 | 0.57644 | 0.647869 | 0.65288 |
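
The page does not define the column names, so the following is only a sketch under assumptions: that `idf_train` means per-term IDF weights estimated from the training questions, that `_norm` means those weights scaled so the largest is 1, and that the weights are attached to query terms as boosts. The helper names `idf_weights` and `boosted_query`, the smoothing formula, and the toy corpus are all hypothetical; the `term^boost` form follows the Lucene classic query syntax.

```python
import math
from collections import Counter


def idf_weights(corpus, normalize=False):
    """Return {term: idf}; with normalize=True the largest weight becomes 1.0."""
    n_docs = len(corpus)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in corpus for term in set(doc))
    # Smoothed IDF (an assumed formula, chosen so every weight is positive).
    idf = {t: math.log((n_docs + 1) / (n + 1)) + 1.0 for t, n in df.items()}
    if normalize:
        top = max(idf.values())
        idf = {t: w / top for t, w in idf.items()}
    return idf


def boosted_query(tokens, idf, default=1.0):
    """Render tokens as 'term^boost'; terms without a weight get `default`."""
    return " ".join(f"{t}^{idf.get(t, default):.2f}" for t in tokens)


# Hypothetical training questions, already tokenized.
corpus = [
    ["how", "to", "reset", "password"],
    ["how", "to", "change", "email"],
    ["how", "to", "find", "branch"],
]
weights = idf_weights(corpus, normalize=True)
query = boosted_query(["how", "reset", "password"], weights)
print(query)
```

Terms that occur in every question ("how") end up with a smaller boost than rare, more discriminative keywords ("reset").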
our method
method | lucene | BM25 | VSM |
---|---|---|---|
Accuracy | 0.6184 | 0.614 | 0.377 |
synonyms method
- fuzzy match
- lucene
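- Lucene 4.6 already provides a synonyms method (org.apache.lucene.analysis.synonym [1], http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/synonym/package-summary.html#package_description)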
- (a -> x) (a b -> y) (b c d -> z)
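
The three rules above overlap, so the order of matching matters. The sketch below assumes greedy matching: at each position the longest rule wins and the matched tokens are consumed. It is a simplified, replacement-only illustration in Python, not Lucene's filter (which works on a token stream and can also keep the original tokens); the function name `apply_synonyms` is hypothetical.

```python
def apply_synonyms(tokens, rules):
    """Rewrite `tokens` using `rules`, a dict mapping token tuples to token lists.

    At each position the longest matching rule wins; matched tokens are consumed.
    """
    max_len = max((len(key) for key in rules), default=0)
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest possible match first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in rules:
                out.extend(rules[key])
                i += n
                break
        else:
            # No rule starts here: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


rules = {
    ("a",): ["x"],
    ("a", "b"): ["y"],
    ("b", "c", "d"): ["z"],
}
print(apply_synonyms("a b c d e".split(), rules))  # ['y', 'c', 'd', 'e']
```

With input `a b c d e`, the rule `a b -> y` wins over `a -> x` because it matches more tokens from the same start, and `b c d -> z` never fires because `b` has already been consumed.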
find
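- Using the finest-grained word segmentation when building the index over the standard questions (but not for the templates) improves accuracy: 61 => 66.
- Do not apply fine-grained segmentation to the input question (59% with fine-grained segmentation, 66% without).
- Lucene 4.6 already includes synonym expansion [2].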