Latest revision as of 03:12, 11 October 2013 (Friday)

ASR

ASR Kernel development

ASR group weekly report: http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/2013-10-11

TTS

Context-dependent (CD) label files are done. Refining the script.
Training toolkit is cleaned up. No alignment is required now. Parallel training is done.
Tried a syllable-based system instead of a phone-based one.
Collected a reading of an online novel.

Next week:

Refine the script
Clean up the online reading.

Dialog system

Conducting the initial experiment:

Using 9k-dimensional TF/IDF, compose feature vectors for each query and each answer. Match the TF/IDF vectors of the stored query+answer pairs against the TF/IDF vectors of new queries.

Keep two top-level categories, try to reduce top-level errors:

Sum the cosine scores of the matches against queries and against answers directly:
top-level errors: 2/60, all errors: 13/60
Use the cosine scores of the match against queries only:
top-level errors: 0/60, all errors: 7/60
speed: 2 queries/second
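
The two scoring variants compared above can be sketched as follows. This is a minimal illustration, not the group's actual code: the whitespace tokenizer, the smoothed IDF formula, the helper names (`build_idf`, `tfidf`, `cosine`, `best_match`) and the toy query/answer pairs are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: whitespace tokenization; the report does not say how text is segmented.
    return text.lower().split()

def build_idf(docs):
    # Document frequency counts each term once per document.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    # Assumption: smoothed IDF, which keeps every weight positive.
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(text, idf):
    # Sparse TF/IDF vector as {term: weight}; terms unseen in the corpus are dropped.
    tf = Counter(tokenize(text))
    return {t: f * idf[t] for t, f in tf.items() if t in idf}

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def best_match(new_query, pairs, idf, use_answers):
    # Score every stored (query, answer) pair against the new query.
    q = tfidf(new_query, idf)
    best_i, best_s = -1, -1.0
    for i, (query, answer) in enumerate(pairs):
        s = cosine(q, tfidf(query, idf))
        if use_answers:
            # Variant 1: add the answer-side cosine score directly.
            s += cosine(q, tfidf(answer, idf))
        if s > best_s:
            best_i, best_s = i, s
    return best_i, best_s

# Hypothetical toy data, only for illustration.
pairs = [
    ("how do I reset my password", "open settings and choose reset password"),
    ("what are the opening hours", "we are open from nine to five"),
]
idf = build_idf([text for pair in pairs for text in pair])

idx_sum, score_sum = best_match("reset password please", pairs, idf, use_answers=True)
idx_q, score_q = best_match("reset password please", pairs, idf, use_answers=False)
```

On the toy data both variants select the same pair; the error counts reported above come from the group's own 60-query test set and cannot be reproduced from this sketch.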

Next week:

Reverse (inverted) index-based fast match (matching against queries only)

Code done in Python
CER 7/60, speed 1 query/second
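
One way such a reverse-index match could work is sketched below, assuming the stored queries are already available as sparse TF/IDF vectors. The class name `InvertedIndex`, its methods, and the toy vectors are hypothetical and are not taken from the group's Python code. The idea is that only stored queries sharing at least one term with the new query are scored at all.

```python
import math
from collections import defaultdict

class InvertedIndex:
    def __init__(self, vectors):
        # vectors: list of sparse vectors {term: weight}, one per stored query.
        self.postings = defaultdict(list)  # term -> [(doc_id, weight), ...]
        self.norms = []                    # Euclidean norm of each stored vector
        for doc_id, vec in enumerate(vectors):
            self.norms.append(math.sqrt(sum(w * w for w in vec.values())))
            for term, w in vec.items():
                self.postings[term].append((doc_id, w))

    def search(self, query_vec, top_k=1):
        # Accumulate dot products only for stored queries that share a term.
        dots = defaultdict(float)
        for term, qw in query_vec.items():
            for doc_id, w in self.postings.get(term, ()):
                dots[doc_id] += qw * w
        qnorm = math.sqrt(sum(w * w for w in query_vec.values()))
        if qnorm == 0.0:
            return []
        scored = [
            (dot / (qnorm * self.norms[d]), d)
            for d, dot in dots.items()
            if self.norms[d] > 0.0
        ]
        # Highest cosine first; ties broken by the lower doc id.
        scored.sort(key=lambda x: (-x[0], x[1]))
        return [(d, s) for s, d in scored[:top_k]]

# Hypothetical toy vectors standing in for TF/IDF vectors of stored queries.
stored = [
    {"reset": 1.0, "password": 2.0},
    {"opening": 1.5, "hours": 1.0},
    {"password": 1.0, "change": 1.0},
]
index = InvertedIndex(stored)
hits = index.search({"password": 1.0, "reset": 1.0}, top_k=2)
```

The result is the same cosine ranking as a full scan restricted to candidates with a shared term, so any speed gain depends on how sparse the vectors are.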

Use the new data set to verify the program.


Difference between revisions of "Hulan-2013-10-11"
