Latest revision as of 09:07, 6 November 2014 (Thursday)

Dialog system

Algorithm

Spelling mistakes

retrain the n-gram model (caoli)
prepare the test and development sets (caoli)
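
As an illustration of what retraining the n-gram model for spelling correction might involve, here is a minimal sketch of a bigram model with add-one smoothing that ranks candidate corrections. The function names (`train_bigram`, `log_prob`, `best_candidate`), the choice of bigrams, and the smoothing scheme are assumptions for illustration; the report does not say which order or smoothing the team uses.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + list(tokens) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total

def best_candidate(candidates, unigrams, bigrams):
    """Return the candidate sentence the model considers most likely."""
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))

# Hypothetical training data and candidates, for illustration only.
corpus = [["check", "my", "balance"], ["check", "my", "account"]]
uni, bi = train_bigram(corpus)
choice = best_candidate(
    [["chek", "my", "balance"], ["check", "my", "balance"]], uni, bi
)
```

A held-out development set, as planned in the item above, would be the natural place to compare model orders and smoothing choices.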

improve fuzzy match

add synonym similarity using the MERT-4 method

improve Lucene search

our VSM method

different results in Lucene
method	lucene	vsm_idf(haiguan)	vsm_idf(baidu)	vsm_idf(train)	vsm_idf(calculate)
Accuracy	0.6628	0.6228	0.6197	0.5827	0.5426
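
The VSM variants in the table appear to differ mainly in where the IDF statistics come from. A minimal sketch of IDF-weighted vector-space ranking follows; it assumes cosine similarity over TF-IDF vectors and computes IDF from the candidate questions themselves, neither of which is confirmed by the report. All names (`idf_table`, `tfidf`, `cosine`, `rank`) are hypothetical.

```python
import math
from collections import Counter

def idf_table(docs):
    """Smoothed inverse document frequency for every term in docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}

def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms missing from the IDF table get weight 0."""
    tf = Counter(tokens)
    return {t: count * idf.get(t, 0.0) for t, count in tf.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(query, docs, idf):
    """Return document indices sorted by decreasing similarity to query."""
    q = tfidf(query, idf)
    scored = [(cosine(q, tfidf(d, idf)), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]

# Hypothetical standard questions, already word-segmented.
questions = [["reset", "password"], ["check", "balance"], ["open", "account"]]
idf = idf_table(questions)
order = rank(["password", "reset"], questions, idf)
```

Swapping `idf_table` for a table loaded from an external corpus would reproduce the kind of comparison shown in the columns above.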

Lucene top-k

top10 (82.95%), top20 (86.34%), top50 (90.23%), top100 (94.11%), top200 (96.18%), top1000 (97.31%), top2000 (97.87%), top5000 (98.75%), top10000 (99.06%)
test the top-100, top-200, and top-1000 results in the full QA pipeline (Lucene + fuzzy match) (caoli)
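
The top-k figures above read as the share of test questions whose correct standard question appears among the first k retrieved candidates. A minimal sketch of that measurement, assuming one gold answer ID per query (the function name and data layout are hypothetical):

```python
def top_k_recall(ranked_ids, gold_ids, k):
    """Fraction of queries whose gold ID appears in the first k results.

    ranked_ids: one ranked list of candidate IDs per query.
    gold_ids:   the single correct ID for each query, in the same order.
    """
    if not gold_ids:
        return 0.0
    hits = sum(
        1 for ranked, gold in zip(ranked_ids, gold_ids) if gold in ranked[:k]
    )
    return hits / len(gold_ids)

# Hypothetical retrieval output for three queries.
ranked = [[3, 1, 2], [5, 4, 6], [7, 8, 9]]
gold = [1, 6, 0]
recall_at_2 = top_k_recall(ranked, gold, 2)
```

Running the same function on the output of the full pipeline for k = 100, 200, and 1000 would show how much of the retrieval headroom the fuzzy-match stage actually converts into correct answers.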

Lucene optimization (liurong)

rewrite the method that selects the 50 standard questions so that they do not come from the same template (liurong)
check the word segmentation for templates (liurong)
boost the query keywords using IDF

boost keywords in Lucene
method	Default	idf_train	idf_train_norm	idf_baidu	idf_baidu_norm
Accuracy	0.66228	0.651629	0.57644	0.647869	0.65288

use the MERT-4 method to find good weights for multiple features such as IDF, NER, baidu_weight, and keyword (liurong, this month)
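
The IDF-based keyword boosting listed above can be sketched as building a query string in Lucene's classic query-parser syntax, where `term^boost` raises the weight of a term. The function name, the default weight for unseen terms, and the normalization (dividing by the largest weight) are assumptions; the report does not say how the `_norm` variants in the table were normalized. The sketch also assumes tokens contain no query-syntax special characters.

```python
def boosted_query(tokens, idf, default_idf=1.0, normalize=False):
    """Build a Lucene query string with per-term boosts taken from IDF.

    tokens:      segmented query words (assumed free of special characters)
    idf:         dict mapping a term to its IDF weight
    default_idf: weight used for terms missing from the table (assumption)
    normalize:   if True, scale weights so the largest boost is 1.0
    """
    weights = [idf.get(t, default_idf) for t in tokens]
    if normalize and weights:
        top = max(weights)
        if top > 0:
            weights = [w / top for w in weights]
    return " ".join(f"{t}^{w:.2f}" for t, w in zip(tokens, weights))

# Hypothetical IDF table for illustration.
example_idf = {"reset": 2.0, "password": 4.0}
raw = boosted_query(["reset", "password"], example_idf)
normed = boosted_query(["reset", "password"], example_idf, normalize=True)
```

The resulting string can be handed to a Lucene query parser in place of the unweighted query, which makes it straightforward to compare IDF sources as in the table above.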

Multi-Scene Recognition

add triple search to the QA engine

discuss the details and give a report (liurong)

demo (liurong, two weeks)

knowledge structure

Knowledge Management and labeling system

continue coding.

Patent

the GA method to improve QA (liurong, this month)

plan to discuss

how to add the spell-check method to the QA engine.

@@ Line 2: / Line 2: @@
 ==Algorithm==
 ===Spell mistake===
-:* retrain the ngram model(caoli)
+:* retrain the ngram model('''caoli''')
+:* prepare the test and development set('''caoli''')
+===improve fuzzy match===
+* add Synonyms similarity using MERT-4 method
 ===improve lucene search===
@@ Line 17: / Line 21: @@
 * lucene top
 :* top10(82.95%),top20(86.34),top50(90.23%),top100(94.11%),top200(96.18%),top1000(97.31%),top2000(97.87%),top5000(98.75%),top10000(99.06)
+:* test the result of top(100,200,1000) in full qa(lucene+fuzzymatch)('''caoli''')
 * lucene Optimization(liurong)
-:* rewrite the method to select the 50 standard question not same template.
+:* rewrite the method to select the 50 standard question not same template.(liurong)
-:* check the word segment for template.
+:* check the word segment for template.(liurong)
 :* boost the query keyword using IDF
 {| border="2px"
@@ Line 31: / Line 36: @@
 |-
 |}
-:* using MERT-4 method to get good value of multi-feature.like IDF,NER,baidu_weight,keyword etc.
+:* using MERT-4 method to get good value of multi-feature.like IDF,NER,baidu_weight,keyword etc.('''liurong this month''')
 ===Multi-Scene Recognition===
 * add the triples search to QA engine
-:* discuss the detail and give a report.
+:* discuss the detail and give a report.('''liurong''')
+* demo ('''liurong two week''')
 ==knowledge structure==
-* structure the default answer using attributes of the entity.
 ==Knowledge Management and labeling system==
-* prepare the interface and function.
+* continue coding.
-==plan to do==
+==Patent==
+* the GA method to improve QA .(liurong this month)
 ==plan to discuss==
-* add the triples search to QA engine
+* how to add the spell check method to QA engine.

Difference between revisions of "Hulan-2014-11-06"
