Difference between revisions of "Hulan-2013-10-11"

From cslt Wiki
(Created page with "=ASR= ==ASR Kernel development== http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/2013-09-27 ASR group weekly report ==TTS== * Lab format learned * All the...")
 
Dialog system
 
(2 intermediate revisions by one user not shown)
Line 3: Line 3:
 
==ASR Kernel development==
 
==ASR Kernel development==
  
[[http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/2013-09-27 ASR group weekly report]]
+
[[http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/2013-10-11 ASR group weekly report]]
  
 
==TTS==
 
==TTS==
  
* Lab format learned
+
* CD lab files done. Refining the script.  
* All the details of the label format are clear
+
* Training toolkit is cleaned up. Now no alignment is required. Parallel training is done.
* Construct label files from word/pinyin/phone transcriptions. Use the csep word-segmentation tool to obtain these transcriptions from the original text.
+
* Tried syllable based system instead of phones.
* Monophone and triphone Chinese prototype systems are ready. 500 sentences from the 863 data are used for training. Trivial questions were used for clustering. 16k signals with 256 FFT transform. GV model used.
+
* Collected an online-novel reading.  
* The voice is funny.  
+
  
 
Next week:
 
Next week:
* Keep on collecting context-dependent labels
 
 
  
 +
* Refine the script
 +
* Clean up the online reading.
  
 
=Dialog system=
 
=Dialog system=
Line 22: Line 21:
 
* Conducting the initial experiment:
 
* Conducting the initial experiment:
  
#Using 9k-dimensional TF/IDF, compose feature vectors for each query and each answer. Match the TF/IDF of query+answer against the TF/IDF of new queries. Add the cosine scores of the matches with queries and with answers directly.
+
#Using 9k-dimensional TF/IDF, compose feature vectors for each query and each answer. Match the TF/IDF of query+answer against the TF/IDF of new queries.
  
 
# Keep two top-level categories, try to reduce top-level errors:
 
# Keep two top-level categories, try to reduce top-level errors:
 +
Add the cosine scores of the matches with queries and with answers directly:
 +
top-level errors: 2/60, all errors: 13/60
 +
Use the cosine score of the match with queries only:
 +
top-level errors: 0/60, all errors: 7/60
 +
speed: 2 queries/second
  
 
Next week:  
 
Next week:  
  
* Reverse index-based fast match
+
* Reverse index-based fast match (only match with queries)
  
 
# code done in python
 
# code done in python
# CER 7/60, speed 1 query/second
+
# CER 7/60, speed 1 query/second  
  
 
* Use the new data set to verify the program.
 
* Use the new data set to verify the program.

Latest revision as of 03:12, 11 October 2013 (Friday)

ASR

ASR Kernel development

[ASR group weekly report]

TTS

  • CD lab files done. Refining the script.
  • Training toolkit is cleaned up. Now no alignment is required. Parallel training is done.
  • Tried syllable based system instead of phones.
  • Collected an online-novel reading.

Next week:

  • Refine the script
  • Clean up the online reading.

Dialog system

  • Conducting the initial experiment:
  1. Using 9k-dimensional TF/IDF, compose feature vectors for each query and each answer. Match the TF/IDF of query+answer against the TF/IDF of new queries.
  2. Keep two top-level categories, try to reduce top-level errors:
Add the cosine scores of the matches with queries and with answers directly:
top-level errors: 2/60, all errors: 13/60
Use the cosine score of the match with queries only:
top-level errors: 0/60, all errors: 7/60
speed: 2 queries/second
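
The two scoring variants compared above can be sketched as follows. This is a minimal illustration only, not the group's actual code: the whitespace tokenizer, the smoothed IDF formula, the function names (`build_idf`, `tfidf`, `cosine`, `best_match`) and the toy `PAIRS` data are all assumptions made for the example, and the real system uses a 9k-dimensional vocabulary.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; the real system may segment differently.
    return text.lower().split()


def build_idf(docs):
    # Smoothed inverse document frequency over all stored queries and answers.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(text, idf):
    # Sparse TF/IDF vector as a dict; terms outside the vocabulary are ignored.
    tf = Counter(t for t in tokenize(text) if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(new_query, pairs, idf, use_answers=False):
    # use_answers=False: score against stored queries only.
    # use_answers=True: add the query score and the answer score directly.
    q_vec = tfidf(new_query, idf)
    best_index, best_score = -1, -1.0
    for i, (query, answer) in enumerate(pairs):
        score = cosine(q_vec, tfidf(query, idf))
        if use_answers:
            score += cosine(q_vec, tfidf(answer, idf))
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Hypothetical toy data standing in for the real query/answer set.
PAIRS = [
    ("how do i reset my password", "open settings and choose reset password"),
    ("what are the opening hours", "we are open from nine to five"),
]
IDF = build_idf([q for q, _ in PAIRS] + [a for _, a in PAIRS])
```

Calling `best_match("reset password please", PAIRS, IDF)` scores against stored queries only, while passing `use_answers=True` adds the answer-side cosine score, mirroring the two configurations whose error counts are reported above.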

Next week:

  • Reverse index-based fast match (only match with queries)
  1. code done in python
  2. CER 7/60, speed 1 query/second
  • Use the new data set to verify the program.
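
The fast-match idea can be sketched as below, assuming that the "reverse index" in these notes means an inverted index from terms to the stored queries that contain them. Only stored queries sharing at least one term with the new query are returned as candidates, so the cosine scoring has fewer vectors to compare. The function names and the `STORED` sample data are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(queries):
    # Map each term to the set of stored-query ids that contain it.
    index = defaultdict(set)
    for query_id, query in enumerate(queries):
        for term in set(query.lower().split()):
            index[term].add(query_id)
    return index


def candidates(new_query, index):
    # Union of the posting sets for every term in the new query.
    found = set()
    for term in set(new_query.lower().split()):
        found |= index.get(term, set())
    return sorted(found)


# Hypothetical stored queries.
STORED = [
    "how do i reset my password",
    "what are the opening hours",
    "reset the router",
]
INDEX = build_inverted_index(STORED)
```

Here `candidates("reset password", INDEX)` returns only the ids of stored queries containing "reset" or "password", and a query with no known terms returns an empty list, so it can be rejected without any scoring.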