Difference between revisions of "Hulan-2013-10-11"

From cslt Wiki
(Created new page with content "=ASR= ==ASR Kernel development== http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/2013-09-27 ASR group weekly report ==TTS== * Lab format learned * All the...")
 
TTS
Line 7:
  ==TTS==
- * Lab format learned
+ * CD lab files done. Refining the script.
- * All the details of the label format are clear
+ * Training toolkit is cleaned up. Now no alignment is required. Parallel training is done.
- * Construct label files from word/pingyin/phone transcription. Use csep word-segmentation tool to obtain these transcriptions from the original text.
+ * Tried syllable based system instead of phones.
- * Monophone, Triphone Chinese prototype system is ready. 500 sentences from 863 data are used for training. The trivial questions were used for clustering. 16k signals with 256 FFT transform. GV model used.
+ * Collected an online-novel reading.
- * The voice is funny.

  Next week:
- * Keep on collecting context-dependent labels
+ * Refine the script
+ * Clean up the online reading.

  =Dialog system=

Revision as of 01:47, 11 October 2013 (Friday)

ASR

ASR Kernel development

[ASR group weekly report]

TTS

  • Context-dependent (CD) label files are done. The script is being refined.
  • The training toolkit is cleaned up. Alignment is no longer required, and parallel training is done.
  • Tried a syllable-based system instead of a phone-based one.
  • Collected a reading of an online novel.

Next week:

  • Refine the script
  • Clean up the online reading.

Dialog system

  • Conducting the initial experiment:
  1. Using 9k-dimensional TF/IDF, compose feature vectors for each query and each answer. Match the TF/IDF vector of a new query against the TF/IDF vectors of each stored query and its answer, and add the two cosine scores (query match and answer match) directly.
  2. Keep two top-level categories and try to reduce top-level errors:
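
The query+answer matching described in the first item can be sketched roughly as follows. This is a minimal illustration only, not the group's actual code: the tokenization, the smoothed IDF formula, the toy data, and all function names (build_idf, tfidf, cosine, best_match) are assumptions made for the example, and the real system uses a 9k-dimensional vocabulary.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed IDF over tokenized documents (an assumed formula)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(tokens, idf):
    """Sparse TF/IDF vector as a dict; out-of-vocabulary tokens are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    if not u or not v:
        return 0.0
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)


def best_match(new_query, pairs, idf):
    """Score each stored (query, answer) pair as cos(new, query) + cos(new, answer)."""
    q = tfidf(new_query, idf)
    scores = [cosine(q, tfidf(qu, idf)) + cosine(q, tfidf(an, idf))
              for qu, an in pairs]
    best = max(range(len(pairs)), key=scores.__getitem__)
    return best, scores[best]


# Toy stored pairs (hypothetical data, already tokenized).
pairs = [
    (["how", "reset", "password"], ["click", "forgot", "password", "link"]),
    (["what", "time", "library", "open"], ["library", "opens", "at", "nine"]),
]
idf = build_idf([tokens for pair in pairs for tokens in pair])
index, score = best_match(["reset", "my", "password"], pairs, idf)
print(index, round(score, 3))
```

Adding the two cosine scores lets an answer that shares vocabulary with the new query raise the rank of its pair even when the stored query is worded differently.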

Next week:

  • Reverse (inverted) index-based fast match
  1. Code done in Python
  2. CER 7/60, speed 1 query/second
  • Use the new data set to verify the program.
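
One way the reverse index-based fast match listed above might work is sketched below. It is an assumption-laden illustration, not the code mentioned in the list: the names build_inverted_index and candidates, the shared-token count used for shortlisting, and the toy data are all hypothetical. The idea is to look up only the stored entries that share at least one token with the new query, so that a full similarity score needs to be computed for a short candidate list instead of the whole collection.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each token to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for tok in tokens:
            index[tok].add(doc_id)
    return index


def candidates(query_tokens, index, top_n=10):
    """Return up to top_n document ids, ranked by number of shared distinct tokens."""
    hits = Counter()
    for tok in set(query_tokens):
        # .get avoids inserting empty postings for unseen tokens
        for doc_id in index.get(tok, ()):
            hits[doc_id] += 1
    return [doc_id for doc_id, _ in hits.most_common(top_n)]


# Toy collection (hypothetical data, already tokenized).
docs = [
    ["reset", "password"],
    ["library", "open", "time"],
    ["password", "expired", "reset", "link"],
]
index = build_inverted_index(docs)
shortlist = candidates(["reset", "password", "now"], index)
print(sorted(shortlist))
```

The shortlist can then be rescored with a more expensive measure, such as a TF/IDF cosine score, which keeps the per-query cost tied to the number of candidates.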