Revision as of 03:58, 29 November 2013 (Fri)

ASR

Training on 5000 utterances is done.
Recording of 500 TN utterances is done. Quality control was poor, and the resulting synthesis is unsatisfactory.
Recording of 41 WD utterances is done. Quality control was fine. Adaptation is done, and the result sounds OK.
The buzzy sound was investigated. The main cause is the source (excitation) model. STRAIGHT sounds better.

There are 2000 errors in total; 600 of them have been investigated.

NULL query. 1.4%
English upper/lower case mismatch. 1.6%
Traditional/Simplified Chinese mismatch. 2.2%
High frequency of less important words, like taxing. 1.3%
Database labeling error (the matched query is better than the query labeled as correct). 21.8%
The standard query or the user query involves many unimportant words, which lowers the TF/IDF score. Stop words still have an impact. 10.7%
TF/IDF weighted the matched terms incorrectly. 3.9%
Synonyms cannot be matched. 36.5%
Category words cannot be matched. 13.5%
Answer label is incorrect; the semantic relationship is missing. 6.8%
Word segmentation hides keywords. 4%
Vague query: no discriminative words remain after stop-word purging. 1.6%
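
Several of the categories above (case mismatch, unimportant words lowering the TF/IDF score, and vague queries that are empty after stop-word purging) can be reproduced on a toy example. The following is only a minimal sketch, not the system's actual matcher: the standard queries, the stop-word list, and all function names are hypothetical, and the smoothed IDF used here is just one common variant.

```python
import math
from collections import Counter

# Hypothetical toy data, for illustration only (not from the real database).
STANDARD_QUERIES = [
    "reset password",
    "change phone number",
    "check account balance",
]
STOP_WORDS = {"i", "to", "my", "the", "please", "how", "do", "want"}


def tokenize(text, stop_words=frozenset()):
    """Lowercase (removes case mismatch), split on whitespace, drop stop words."""
    return [w for w in text.lower().split() if w not in stop_words]


def idf_table(docs):
    """Smoothed IDF over tokenized documents: ln((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf, default_idf):
    """Sparse TF-IDF vector; terms unseen in the documents get default_idf."""
    return {t: c * idf.get(t, default_idf) for t, c in Counter(tokens).items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def best_match(query, stop_words=frozenset()):
    """Return (best standard query, cosine score) for a user query."""
    docs = [tokenize(q, stop_words) for q in STANDARD_QUERIES]
    idf = idf_table(docs)
    default_idf = math.log(1 + len(docs)) + 1.0  # IDF of a term with df = 0
    qvec = tfidf(tokenize(query, stop_words), idf, default_idf)
    scores = [cosine(qvec, tfidf(d, idf, default_idf)) for d in docs]
    i = max(range(len(scores)), key=scores.__getitem__)
    return STANDARD_QUERIES[i], scores[i]


if __name__ == "__main__":
    q = "Please how do I reset my PASSWORD"
    print(best_match(q))               # filler words dilute the score
    print(best_match(q, STOP_WORDS))   # purging stop words restores it
    print(best_match("please how do I", STOP_WORDS))  # vague query: score 0.0
```

In this sketch the filler words still match nothing, yet they enlarge the query vector's norm and so pull the cosine score down, which is one way stop words can "still impact" a match. A query made only of stop words becomes empty after purging and scores 0.0 against every standard query. Synonym and category mismatches are not addressed here, since exact term matching cannot capture them.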

@@ Line 8: / Line 8: @@
 * This week
-#* Training on 5000 utterances is done.
+:* Training on 5000 utterances is done.
-#* Recording of 500 TN utterances is done. Quality control was poor, and the resulting synthesis is unsatisfactory.
+:* Recording of 500 TN utterances is done. Quality control was poor, and the resulting synthesis is unsatisfactory.
-#* Recording of 41 WD utterances is done. Quality control was fine. Adaptation is done, and the result sounds OK.
+:* Recording of 41 WD utterances is done. Quality control was fine. Adaptation is done, and the result sounds OK.
-#* The buzzy sound was investigated. The main cause is the source (excitation) model. STRAIGHT sounds better.
+:* The buzzy sound was investigated. The main cause is the source (excitation) model. STRAIGHT sounds better.
 * Next week
-#* Develop the CGI service.
+:* Develop the CGI service.
-#* Prepare for the 2000-utterance female recording.
+:* Prepare for the 2000-utterance female recording.
 =Dialog system=