Difference between revisions of "Hulan-2013-09-27"
From cslt Wiki
Revision as of 01:25, 27 September 2013 (Fri)
ASR
ASR Kernel development
- ASR group weekly report: http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/2013-09-27
TTS
- Learned the EST format.
- Check details of each option.
Dialog system
- MH and RY helped compose 60 questions, which are being used for testing.
- Conducting the initial experiment:
  - Using 9k-dimensional TF/IDF, compose a feature vector for each query and each answer. Match the TF/IDF vector of a new query against the TF/IDF vectors of the stored queries and answers, and add the two cosine scores (match with the query, match with the answer) directly.
    - CER: 8.3%
    - Query time: very slow
  - Removed 506 stop words. No significant change in CER or query time.
  - Fast match: use only the words that occur in the query and answer as features, and match them in order to speed up the score calculation.
  - Hierarchical matching: first split all the answers into 11 top-level categories, then split them into 1030 second-level categories. Scoring uses the query+answer TF/IDF score.
    - CER: top-level category: 6.7%; second-level category: 18.3%
    - Query time: 6/sec
  - Keep the two best top-level categories to try to reduce top-level errors:
    - CER: with two top-level categories, no errors; second-level category: still ongoing
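
The scoring and the two-stage search in the experiment notes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the experiment: the smoothed IDF formula, representing a top-level category by the TF/IDF vector of all its member texts, and every function name here are hypothetical choices. Vectors are stored sparsely (only the words present), in the spirit of the fast match item.

```python
import math
from collections import Counter

def build_idf(docs):
    """IDF table from tokenized documents (smoothed variant, an assumption)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(tokens, idf):
    """Sparse, L2-normalized TF/IDF vector holding only the words present."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(a, b):
    """Cosine of two normalized sparse vectors; iterate over the shorter one."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b[t] for t, w in a.items() if t in b)

def pair_score(new_vec, q_vec, a_vec):
    """Add the query-side and answer-side cosine scores directly."""
    return cosine(new_vec, q_vec) + cosine(new_vec, a_vec)

def build_index(pairs, idf):
    """pairs: list of (top_category, query_tokens, answer_tokens, label)."""
    members, cat_tokens = {}, {}
    for cat, q, a, label in pairs:
        members.setdefault(cat, []).append((tfidf(q, idf), tfidf(a, idf), label))
        cat_tokens.setdefault(cat, []).extend(q + a)
    # Assumption: one vector per top-level category, built from all its texts.
    cat_vecs = {cat: tfidf(toks, idf) for cat, toks in cat_tokens.items()}
    return cat_vecs, members

def hierarchical_match(new_tokens, cat_vecs, members, idf, keep=2):
    """Stage 1 keeps the `keep` best top-level categories;
    stage 2 scores only the query/answer pairs inside them."""
    new_vec = tfidf(new_tokens, idf)
    kept = sorted(cat_vecs, key=lambda c: cosine(new_vec, cat_vecs[c]),
                  reverse=True)[:keep]
    candidates = [(pair_score(new_vec, q, a), label)
                  for cat in kept for q, a, label in members[cat]]
    return max(candidates)[1]
```

With `keep=1` the search commits to a single top-level category, so any top-level error is unrecoverable; `keep=2` trades some second-stage work for a chance to recover, which is the idea behind keeping two top-level categories.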
Next week:
- Reverse-index-based fast match (ongoing).
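
The report does not describe how the reverse index will be built, so the following is only a sketch of one common approach (an inverted index over TF/IDF weights). The class name `InvertedIndex`, the smoothed IDF formula, and the choice to index one token list per stored entry are all assumptions made for illustration. The point is that a new query touches only the entries sharing at least one word with it, instead of being compared with every stored vector.

```python
import math
from collections import Counter, defaultdict

def normalize(weights):
    """L2-normalize a sparse weight dict; an empty dict stays empty."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

class InvertedIndex:
    """Hypothetical inverted index: term -> list of (doc_id, weight)."""

    def __init__(self, docs):
        # docs: list of token lists, e.g. one per stored query+answer entry.
        n = len(docs)
        df = Counter(t for d in docs for t in set(d))
        self.idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
        self.postings = defaultdict(list)
        for doc_id, tokens in enumerate(docs):
            for t, w in self._vector(tokens).items():
                self.postings[t].append((doc_id, w))

    def _vector(self, tokens):
        """Normalized TF/IDF vector over the known vocabulary."""
        tf = Counter(t for t in tokens if t in self.idf)
        return normalize({t: c * self.idf[t] for t, c in tf.items()})

    def search(self, tokens, top_k=1):
        """Accumulate cosine scores only for entries sharing a query term."""
        scores = defaultdict(float)
        for t, qw in self._vector(tokens).items():
            for doc_id, dw in self.postings[t]:
                scores[doc_id] += qw * dw
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

Calling `index.search(tokens, top_k=5)` returns up to five `(doc_id, score)` pairs, best first; a query with no known words returns an empty list.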