Hulan-2013-09-27

From cslt Wiki

Revision as of 01:25, 27 September 2013 (Friday) by Cslt

Contents

1 ASR
- 1.1 ASR Kernel development
- 1.2 TTS
2 Dialog system

ASR

ASR Kernel development

[ASR group weekly report]

TTS

EST format learned.
Check details of each option.

Dialog system

MH and RY helped compose 60 questions, which are being used for testing.
Conducting the initial experiment:

Using 9k-dimensional TF/IDF, compose a feature vector for each query and each answer. Match the TF/IDF vector of each new query against the TF/IDF vectors of the stored queries and answers, and add the cosine scores of the query match and the answer match directly.

CER: 8.3%
Query time: very slow
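The baseline above can be sketched roughly as follows. This is a minimal illustration, not the code used in the experiment: the function names, the smoothed IDF formula, the whitespace tokenization, and the toy query/answer pairs are all assumptions.

```python
import math
from collections import Counter

def build_idf(docs):
    """Smoothed IDF over tokenized documents (the formula is an assumption)."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}

def tfidf(tokens, idf):
    """Sparse TF/IDF vector as a dict; out-of-vocabulary words are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {w: c * idf[w] for w, c in tf.items()}

def cosine(a, b):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(v * b[w] for w, v in a.items() if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_match(new_query, pairs, idf):
    """Index of the stored pair with the highest
    cosine(new, query) + cosine(new, answer) score."""
    q = tfidf(new_query.split(), idf)
    scores = [
        cosine(q, tfidf(query.split(), idf)) + cosine(q, tfidf(answer.split(), idf))
        for query, answer in pairs
    ]
    return max(range(len(pairs)), key=scores.__getitem__)

# Hypothetical toy data; the real system uses a 9k-word vocabulary.
pairs = [
    ("how do i reset my password", "open settings and choose reset password"),
    ("what are the opening hours", "we are open from nine to five"),
]
idf = build_idf([text.split() for pair in pairs for text in pair])
matched = best_match("reset password please", pairs, idf)
```

Scoring every stored pair against every new query in this way is linear in the number of pairs, which is consistent with the slow query time reported above.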

Removed 506 stop words. No significant change in CER or query time.

Fast match: use only the words that actually occur in the query and answer as the feature list, and match the lists in sorted order to speed up the score calculation.
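One plausible reading of this fast match is a sparse representation with a two-pointer merge over sorted word ids. The sketch below assumes integer word ids and hypothetical helper names (to_sparse, sparse_dot); it is not taken from the actual system.

```python
def to_sparse(weights):
    """Sorted list of (word_id, weight) pairs for the non-zero entries."""
    return sorted((i, w) for i, w in weights.items() if w != 0.0)

def sparse_dot(a, b):
    """Dot product of two sorted sparse vectors via a two-pointer merge.

    Cost is O(len(a) + len(b)) instead of O(vocabulary size).
    """
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        id_a, id_b = a[i][0], b[j][0]
        if id_a == id_b:
            total += a[i][1] * b[j][1]
            i += 1
            j += 1
        elif id_a < id_b:
            i += 1
        else:
            j += 1
    return total

# Hypothetical word ids and weights.
query_vec = to_sparse({3: 1.0, 17: 2.0, 8000: 0.5})
answer_vec = to_sparse({17: 1.5, 42: 1.0, 8000: 2.0})
score = sparse_dot(query_vec, answer_vec)  # 2.0 * 1.5 + 0.5 * 2.0
```

Dividing the result by the two vector norms (which can be precomputed for the stored entries) gives the cosine score.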

Hierarchical matching: first split all the answers into 11 top-level categories, and then split those into 1030 second-level categories. Scoring uses the query+answer TF/IDF score.

CER: top-level category: 6.7%, second-level category: 18.3%
Query time: 6/sec

Keep the two best top-level categories, to try to reduce top-level errors:

CER: top two categories: no errors. Second-level category: still ongoing
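The two-stage search, including the option of keeping more than one top-level candidate, might look like the sketch below. The tree layout, the keep parameter, the category names, and the toy vectors are all hypothetical; how the real category vectors are built is not stated in this report.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(v * b[w] for w, v in a.items() if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hierarchical_match(q, tree, keep=1):
    """tree maps top_name -> (top_vector, {sub_name: sub_vector}).

    Rank top-level categories first, then search second-level
    categories only inside the `keep` best top-level ones.
    """
    ranked = sorted(tree, key=lambda t: cosine(q, tree[t][0]), reverse=True)
    best = None
    for top in ranked[:keep]:
        for sub, vec in tree[top][1].items():
            s = cosine(q, vec)
            if best is None or s > best[0]:
                best = (s, top, sub)
    return best[1], best[2]

# Hypothetical toy tree, built so the best top-level category is the wrong one.
tree = {
    "account": (
        {"password": 1.0, "login": 1.0, "card": 1.0},
        {"reset_password": {"password": 1.0, "reset": 1.0},
         "lost_card": {"lost": 1.0, "card": 1.0}},
    ),
    "payment": (
        {"card": 1.0, "fee": 1.0},
        {"card_fee": {"card": 1.0, "fee": 1.0},
         "refund": {"refund": 1.0}},
    ),
}
q = {"lost": 1.0, "card": 1.0}
top1 = hierarchical_match(q, tree, keep=1)
top2 = hierarchical_match(q, tree, keep=2)
```

In this toy case keep=1 commits to "payment" and returns the wrong second-level category, while keep=2 also searches "account" and recovers "lost_card". The cost is that more second-level categories are scored per query.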

Next week:

Inverted-index-based fast match, ongoing.
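Assuming this refers to a standard inverted index (word to posting list), the idea can be sketched as below. The function names and toy documents are illustrative only, since the planned implementation is not described here.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to a posting list of (doc_id, weight)."""
    index = defaultdict(list)
    for doc_id, vec in enumerate(docs):
        for word, weight in vec.items():
            index[word].append((doc_id, weight))
    return index

def search(q, index, top_n=3):
    """Accumulate dot-product scores only for documents that share
    at least one word with the query, then return the best top_n."""
    scores = defaultdict(float)
    for word, q_weight in q.items():
        for doc_id, d_weight in index.get(word, ()):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical sparse TF/IDF vectors for three stored entries.
docs = [
    {"reset": 1.0, "password": 1.0},
    {"opening": 1.0, "hours": 1.0},
    {"password": 0.5, "login": 1.0},
]
index = build_index(docs)
hits = search({"password": 1.0, "reset": 1.0}, index)
```

Entries with no word in common with the query (entry 1 here) are never visited, which is where the speed-up over scoring every entry would come from.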

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=Hulan-2013-09-27&oldid=8230"