Hulan-2013-10-11

From cslt Wiki

Revision as of 01:41, 11 October 2013 (Fri) by Cslt (talk | contribs)

Contents

1 ASR
- 1.1 ASR Kernel development
- 1.2 TTS
2 Dialog system

ASR

ASR Kernel development

[ASR group weekly report]

TTS

Lab (label) format learned.
All the details of the label format are now clear.
Construct label files from word/pinyin/phone transcriptions, using the csep word-segmentation tool to obtain these transcriptions from the original text.
Monophone and triphone Chinese prototype systems are ready. Training used 500 sentences from the 863 data. Only trivial questions were used for clustering. Signals are sampled at 16 kHz and analyzed with a 256-point FFT. A GV model is used.
The synthesized voice still sounds funny.
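
As an illustration of the step from a phone transcription to context-dependent labels, the sketch below expands a monophone sequence into triphone names in the common `left-center+right` notation. The function name `to_triphone_labels`, the `sil` boundary symbol, and the example phone sequence are assumptions made for illustration; the report does not specify the actual label layout, and full-context labels normally carry far more information than the two neighbouring phones.

```python
def to_triphone_labels(phones, boundary="sil"):
    """Expand a monophone sequence into left-center+right triphone labels.

    `boundary` is an assumed silence symbol used to pad both ends.
    """
    padded = [boundary] + list(phones) + [boundary]
    labels = []
    for i in range(1, len(padded) - 1):
        left, center, right = padded[i - 1], padded[i], padded[i + 1]
        labels.append(f"{left}-{center}+{right}")
    return labels


# Hypothetical phone sequence for the two syllables "ni hao".
phones = ["n", "i", "h", "ao"]
print(to_triphone_labels(phones))
# ['sil-n+i', 'n-i+h', 'i-h+ao', 'h-ao+sil']
```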

Next week:

Continue collecting context-dependent labels.

Dialog system

Conducting the initial experiment:

Using 9k-dimensional TF-IDF, build a feature vector for each query and each answer. Match the TF-IDF vector of each new query against the TF-IDF vectors of the stored queries and answers. The cosine score against the stored query and the cosine score against its answer are added directly.
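
The matching scheme described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code used in the experiment: the helper names (`build_idf`, `tfidf`, `cosine`, `best_match`), the smoothed IDF formula, and the toy question/answer pairs are all assumptions, and the vocabulary here is tiny rather than 9k-dimensional.

```python
import math
from collections import Counter


def build_idf(docs):
    """Compute smoothed IDF weights from a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; out-of-vocabulary terms are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def best_match(new_query, pairs, idf):
    """Score each stored (query, answer) pair as cos(new, query) + cos(new, answer)."""
    q = tfidf(new_query, idf)
    scores = [cosine(q, tfidf(sq, idf)) + cosine(q, tfidf(sa, idf))
              for sq, sa in pairs]
    best = max(range(len(pairs)), key=scores.__getitem__)
    return best, scores


# Toy stored pairs (hypothetical data, already tokenized).
pairs = [
    (["how", "reset", "password"], ["click", "forgot", "password", "link"]),
    (["what", "opening", "hours"], ["open", "nine", "to", "five"]),
]
idf = build_idf([tokens for pair in pairs for tokens in pair])
best, scores = best_match(["reset", "my", "password"], pairs, idf)
print(best, scores)
```

Adding the two cosine scores gives the stored query and its answer equal weight; a weighted sum would be a natural variation, but the report only mentions direct addition.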

Keep two top-level categories and try to reduce top-level errors.

Next week:

Inverted-index-based fast match

Code done in Python.
CER 7/60; speed 1 query/second.
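
A rough sketch of how an index-based fast match might work, assuming the index meant here is the usual inverted index (a map from each term to the documents that contain it). The function names and the toy documents are hypothetical; the idea is only that a new query is compared in full against the documents sharing at least one term with it, instead of against every stored entry.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for term in tokens:
            index[term].add(doc_id)
    return index


def candidates(query_tokens, index):
    """Ids of documents sharing at least one term with the query."""
    found = set()
    for term in query_tokens:
        # .get avoids creating empty entries for unseen terms.
        found |= index.get(term, set())
    return sorted(found)


# Toy stored queries (hypothetical data, already tokenized).
docs = [
    ["reset", "password"],
    ["opening", "hours"],
    ["change", "password"],
]
index = build_inverted_index(docs)
print(candidates(["forgot", "password"], index))  # [0, 2]
print(candidates(["refund"], index))              # []
```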

Use the new data set to verify the program.

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=Hulan-2013-10-11&oldid=8290"