"Lucene": difference between revisions
From cslt Wiki
Revision as of 08:13, 28 November 2014 (Friday)
test idf and tf
- data
 [{如何,怎么}} {办理,办} {户口,户口本} # 到当地派出所办理 # 如何办理户口
 {办理,办} {户口,户口本} [{流程,步骤}] # 到当地派出所办理 # 如何办理户口
 [{如何,怎么}} {办理,办} {身份证,身份} # 到当地派出所办理 # 如何办理身份证
 {办理,办} {身份证} [{流程,步骤}] # 到当地派出所办理 # 如何办理身份证
- search
query:"如何办理户口" => question:如何 question:办理户口
- result
 doc=0 score=0.11657263 shardIndex=-1
 0.11657263 = (MATCH) product of:
   0.23314527 = (MATCH) sum of:
     0.23314527 = (MATCH) weight(question:如何 in 0) [DefaultSimilarity], result of:
       0.23314527 = score(doc=0,freq=1.0 = termFreq=1.0), product of:
         0.40397802 = queryWeight, product of:
           1.5389965 = idf(docFreq=6, maxDocs=12)
           0.26249444 = queryNorm
         0.57712364 = fieldWeight in 0, product of:
           1.0 = tf(freq=1.0), with freq of:
             1.0 = termFreq=1.0
           1.5389965 = idf(docFreq=6, maxDocs=12)
           0.375 = fieldNorm(doc=0)
   0.5 = coord(1/2)
- detailed calculation process
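The numbers in the result above can be reproduced by hand. The following is a minimal Python sketch, not Lucene code: it assumes the classic DefaultSimilarity formulas (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq)) and a query boost of 1.0. The docFreq of the second query term, question:办理户口, does not appear in the log; the sketch assumes it is 0 (the term matches no document), an assumption that happens to reproduce the logged queryNorm. fieldNorm is copied from the log rather than derived, because it depends on the indexed field length and Lucene's lossy norm encoding. All variable names are illustrative.

```python
import math

def idf(doc_freq, max_docs):
    # Classic DefaultSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Classic DefaultSimilarity tf: square root of the term frequency
    return math.sqrt(freq)

MAX_DOCS = 12

# question:如何 -> docFreq=6, as reported in the explain output
idf_matched = idf(6, MAX_DOCS)

# question:办理户口 -> ASSUMPTION: docFreq=0 (not shown in the log)
idf_unmatched = idf(0, MAX_DOCS)

# queryNorm = 1 / sqrt(sum of squared term weights), with boost assumed 1.0
query_norm = 1.0 / math.sqrt(idf_matched ** 2 + idf_unmatched ** 2)

# queryWeight = idf * queryNorm
query_weight = idf_matched * query_norm

# fieldNorm is taken directly from the explain output, not derived here
field_norm = 0.375

# fieldWeight = tf * idf * fieldNorm
field_weight = tf(1.0) * idf_matched * field_norm

# Score of the single matching term
term_score = query_weight * field_weight

# coord = matched query terms / total query terms = 1/2
coord = 1 / 2

final_score = term_score * coord

print(f"idf          = {idf_matched:.7f}")
print(f"queryNorm    = {query_norm:.8f}")
print(f"queryWeight  = {query_weight:.8f}")
print(f"fieldWeight  = {field_weight:.8f}")
print(f"term score   = {term_score:.8f}")
print(f"final score  = {final_score:.8f}")
```

Lucene does this arithmetic in 32-bit floats, so the last digits can differ slightly from the double-precision values printed here.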