Hulan-2013-09-27

From cslt Wiki
Revision as of 01:25, 27 September 2013 (Friday) by Cslt

ASR

ASR Kernel development

[ASR group weekly report]

TTS

  • EST format learned.
  • Check details of each option.

Dialog system

  • MH and RY helped compose 60 questions, which are now being used for testing.
  • Conducting the initial experiments:
  1. Using 9k-dimensional TF/IDF, compose a feature vector for each query and each answer. Match the TF/IDF vector of a new query against the TF/IDF vectors of each stored query and its answer, and add the two cosine scores directly.
    • CER: 8.3%
    • Query time: very slow
  2. Remove 506 stop words. No significant change in CER or query time.
  3. Fast match: use only the words that occur in the query and answer as features, and match them in order to speed up the score calculation.
  4. Hierarchical matching: first split all the answers into 11 top-level categories, then split them into 1030 second-level categories. Scoring uses the query+answer TF/IDF score.
    • CER: top-level category: 6.7%; second-level category: 18.3%
    • Query time: 6/sec
  5. Keep the two best top-level categories to reduce top-level errors:
    • CER: with two top-level categories, no errors; second-level category: still ongoing
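
The query+answer scoring used in these experiments can be sketched roughly as follows. This is a minimal illustration, not the code used in the experiments: the IDF smoothing, the function names (build_idf, tfidf, cosine, best_match) and the toy data are all assumptions. Vectors are stored sparsely, keeping only the words that actually occur, which is the same idea as the fast match step.

```python
import math
from collections import Counter

def build_idf(docs):
    """docs: list of token lists. Returns a smoothed IDF weight per word (assumed formula)."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}

def tfidf(tokens, idf):
    """Sparse, L2-normalised TF/IDF vector; words outside the vocabulary are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {w: c * idf[w] for w, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {w: v / norm for w, v in vec.items()} if norm else {}

def cosine(a, b):
    """Cosine of two normalised sparse vectors; only the shorter one is iterated."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def best_match(new_query, pairs, idf):
    """Index of the stored (query, answer) pair with the highest summed cosine score."""
    q = tfidf(new_query, idf)
    scores = [cosine(q, tfidf(sq, idf)) + cosine(q, tfidf(sa, idf))
              for sq, sa in pairs]
    return max(range(len(pairs)), key=scores.__getitem__)

# Hypothetical toy data: two stored (query, answer) pairs, already tokenised.
PAIRS = [
    (["how", "reset", "password"], ["click", "forgot", "password", "link"]),
    (["what", "opening", "hours"], ["open", "nine", "to", "five"]),
]
IDF = build_idf([tokens for pair in PAIRS for tokens in pair])
```

With this toy data, best_match(["reset", "my", "password"], PAIRS, IDF) selects the first pair, since the second pair shares no words with the new query. A real system would precompute the stored vectors once instead of rebuilding them per query.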

Next week:

  • Reverse (inverted) index-based fast match, ongoing.
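
As a rough sketch of what an index-based fast match might look like: an index from each word to the entries containing it lets the matcher score only the entries that share at least one word with the new query. The names build_inverted_index and candidates and the toy entries are hypothetical, and the design actually planned may differ.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to the set of ids of the entries that contain it."""
    index = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for w in tokens:
            index[w].add(doc_id)
    return index

def candidates(query_tokens, index):
    """Ids of all entries sharing at least one word with the query, sorted."""
    hits = set()
    for w in set(query_tokens):
        hits |= index.get(w, set())
    return sorted(hits)

# Hypothetical toy entries (tokenised query+answer text).
DOCS = [
    ["how", "reset", "password", "click", "forgot", "link"],
    ["what", "opening", "hours", "open", "nine", "to", "five"],
    ["change", "password", "settings", "page"],
]
INDEX = build_inverted_index(DOCS)
```

Only the ids returned by candidates need a full cosine score, so the per-query cost depends on how many entries share words with the query, not on the total number of entries.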