Hulan-2014-11-06
From cslt Wiki
Revision as of 08:55, 6 November 2014 (Thursday)
by Lr
Dialog system
Algorithm
Spelling mistakes
- retrain the n-gram model (caoli)
improve fuzzy match
- add synonym similarity using the MERT-4 method
improve lucene search
different result in lucene
method | lucene | vsm_idf(haiguan) | vsm_idf(baidu) | vsm_idf(tain) | vsm_idf(calculate)
Accuracy | 0.6628 | 0.6228 | 0.6197 | 0.5827 | 0.5426
- top10 (82.95%), top20 (86.34%), top50 (90.23%), top100 (94.11%), top200 (96.18%), top1000 (97.31%), top2000 (97.87%), top5000 (98.75%), top10000 (99.06%)
- Lucene optimization (liurong)
- rewrite the method that selects the 50 standard questions so that they do not share the same template (liurong)
- check the word segmentation for templates (liurong)
- boost the query keywords using IDF
boost keyword in lucene
method | Default | idf_train | idf_train_norm | idf_baidu | idf_baidu_norm
Accuracy | 0.66228 | 0.651629 | 0.57644 | 0.647869 | 0.65288
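The keyword boosting compared above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the experiments: the function names (compute_idf, normalize, boosted_query), the smoothed IDF formula, and the max-scaling used for the "_norm" variants are all hypothetical. The only Lucene feature relied on is the classic query parser's term^boost syntax.

```python
import math
from collections import Counter


def compute_idf(docs):
    """Smoothed IDF per term: log((N + 1) / (df + 1)) + 1.

    docs is a list of token lists (already word-segmented).
    The smoothing formula is an assumption, not taken from the page.
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {t: math.log((n_docs + 1) / (c + 1)) + 1.0 for t, c in df.items()}


def normalize(idf):
    """Scale IDF values so the largest is 1.0.

    One possible reading of the *_norm variants; the page does not
    say how normalization was done.
    """
    top = max(idf.values())
    return {t: v / top for t, v in idf.items()}


def boosted_query(tokens, idf, default=1.0):
    """Build a Lucene classic-syntax query string with term^boost.

    Tokens are assumed to contain no Lucene special characters;
    real input would need escaping first.
    """
    return " ".join(f"{t}^{idf.get(t, default):.2f}" for t in tokens)
```

With a toy corpus in which "account" occurs in every document and "open" in one of three, boosted_query(["open", "account"], idf) gives the rarer word the larger boost, which is the intended effect of IDF weighting.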
- use the MERT-4 method to find good weights for multiple features, such as IDF, NER, baidu_weight, keyword, etc. (liurong**)
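As a rough illustration of tuning several feature weights against accuracy directly, the sketch below uses a plain coordinate-wise grid search. It is a simplified stand-in in the spirit of minimum error rate training, not the MERT-4 method itself, whose details are not given here. The function names, the grid values, and the data layout (each candidate answer as a list of feature values) are assumptions made for the example.

```python
def top1_accuracy(weights, examples):
    """Fraction of examples whose highest-scoring candidate is the gold one.

    examples: list of (candidates, gold_index), where each candidate
    is a list of feature values (e.g. IDF score, NER match, ...).
    """
    correct = 0
    for candidates, gold in examples:
        scores = [sum(w * f for w, f in zip(weights, feats))
                  for feats in candidates]
        if scores.index(max(scores)) == gold:
            correct += 1
    return correct / len(examples)


def tune_weights(examples, n_features, grid=(0.0, 0.5, 1.0, 2.0), rounds=3):
    """Coordinate-wise search: vary one weight at a time over a fixed grid
    and keep a change only if top-1 accuracy strictly improves."""
    weights = [1.0] * n_features
    best = top1_accuracy(weights, examples)
    for _ in range(rounds):
        for i in range(n_features):
            for value in grid:
                trial = weights[:i] + [value] + weights[i + 1:]
                acc = top1_accuracy(trial, examples)
                if acc > best:
                    best, weights = acc, trial
    return weights, best
```

Optimizing accuracy directly, rather than a smooth surrogate loss, is the idea this sketch shares with MERT; a real setup would tune on a held-out set of questions and use a finer search than this fixed grid.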
Multi-Scene Recognition
- add the triples search to QA engine (liurong*)
- discuss the detail and give a report.
knowledge structure
Knowledge Management and labeling system
plan to do
plan to discuss
- add the triples search to QA engine