Hulan-2013-10-11

From cslt Wiki
Revision as of 01:41, 11 October 2013 (Fri) by Cslt


ASR

ASR Kernel development

[ASR group weekly report]

TTS

  • Lab format learned
  • All the details of the label format are clear
  • Construct label files from word/pinyin/phone transcriptions. The csep word-segmentation tool is used to obtain these transcriptions from the original text.
  • Monophone and triphone Chinese prototype systems are ready. 500 sentences from the 863 data were used for training. Only trivial questions were used for clustering. 16 kHz signals with a 256-point FFT. A GV (global variance) model is used.
  • The synthesized voice still sounds funny.

Next week:

  • Keep on collecting context-dependent labels


Dialog system

  • Conducting the initial experiment:
  1. Using 9k-dimensional TF-IDF, compose a feature vector for each query and each answer. Match the TF-IDF vector of a new query against the TF-IDF vectors of both the stored query and its answer, and add the two cosine scores directly.
  2. Keep two top-level categories to try to reduce top-level errors.
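The matching in step 1 can be sketched roughly as follows. This is a minimal illustration only, not the group's actual code: the function names, the smoothed IDF formula, and the assumption that text is already tokenized are all hypothetical, and sparse dictionaries stand in for the 9k-dimensional vectors.

```python
import math
from collections import Counter

def build_idf(docs):
    """Smoothed IDF over a list of token lists (the formula is an assumption)."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms outside the vocabulary are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    if not u or not v:
        return 0.0
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)

def best_match(new_query, pairs, idf):
    """Score each stored (query, answer) pair as the sum of two cosine
    scores (new query vs. stored query, new query vs. stored answer) and
    return the index and score of the best pair."""
    q = tfidf(new_query, idf)
    scores = [cosine(q, tfidf(sq, idf)) + cosine(q, tfidf(sa, idf))
              for sq, sa in pairs]
    best = max(range(len(pairs)), key=scores.__getitem__)
    return best, scores[best]
```

Adding the two cosine scores with equal weight follows the description in step 1; a weighted sum would be a natural variation if the answer side turns out to be noisier than the query side.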

Next week:

  • Reverse (inverted) index-based fast match
  1. Code done in Python
  2. CER 7/60, speed 1 query/second
  • Use the new data set to verify the program.
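For the index-based fast match, one possible shape of the idea is sketched below, assuming the index maps each term to the set of stored queries containing it, so that only entries sharing at least one term with the new query need full scoring. The function names and the ranking by shared-term count are assumptions made for illustration; the report does not describe the actual implementation.

```python
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for term in tokens:
            index[term].add(doc_id)
    return index

def candidates(query_tokens, index, top_n=10):
    """Return up to top_n document ids, ranked by the number of distinct
    query terms they share with the query. Documents with no shared term
    are never visited, which is where the speed-up comes from."""
    hits = Counter()
    for term in set(query_tokens):
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    return [doc_id for doc_id, _ in hits.most_common(top_n)]
```

The short candidate list returned here could then be re-scored with the full TF-IDF cosine match rather than comparing the new query against every stored entry.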