Difference between revisions of "ASR-events-OC16-details"

From cslt Wiki
Revision as of 03:46, 13 June 2016 (Mon)

OC16 MixASR-CHEN Challenge

The OC16 MixASR-CHEN challenge is part of the special session "mixlingual speech processing" at O-COCOSDA 2016. The challenge is a Chinese-English mixed speech recognition task, in which Chinese is the host language and English is the embedded language.

Data

The challenge requires three resources:

OC16-CE80

OC16-CE80 is a speech database provided by SpeechOcean (http://www.speechocean.com) for this challenge. Its main features are:

  • 140 speakers
  • Mobile channel
  • 80 hours of speech signals
  • Transcriptions are provided
  • The licence file is available on the OC16-CE80 wiki page

THCHS30

THCHS30 is a pure Chinese speech database provided by CSLT@Tsinghua University. All the resources of THCHS30 can be used to improve the system, especially the lexicon and the language model (LM). The data is available at:

http://www.openslr.org/18/

CMU English dictionary

To recognize English words, participants may use the CMU English dictionary, version 0.7b:

http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/cmudict-0.7b
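
For participants who want to merge the English entries into their own lexicon, the following is a minimal sketch of a parser for the dictionary's plain-text format. It assumes the usual CMUdict layout: comment lines starting with ";;;", one "WORD PHONE PHONE ..." entry per line, and alternative pronunciations marked as "WORD(1)". The function name parse_cmudict is hypothetical and is not part of any challenge toolkit.

```python
import re
from collections import defaultdict

# Matches the "(1)", "(2)", ... suffix that marks an alternative pronunciation.
_VARIANT = re.compile(r"\(\d+\)$")

def parse_cmudict(lines):
    """Map each word to a list of pronunciations (each a list of phones)."""
    lexicon = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and ";;;" comment lines.
        if not line or line.startswith(";;;"):
            continue
        word, *phones = line.split()
        # Fold "HELLO(1)" into the base entry "HELLO".
        word = _VARIANT.sub("", word)
        lexicon[word].append(phones)
    return dict(lexicon)

sample = [
    ";;; a comment line",
    "HELLO  HH AH0 L OW1",
    "HELLO(1)  HH EH0 L OW1",
]
lexicon = parse_cmudict(sample)
```

The function takes any iterable of lines, so it can be fed an open file handle. When reading the 0.7b file from disk, opening it with encoding="latin-1" is a reasonable assumption, because a few entries contain non-ASCII characters.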


Participation rules

  • Participants of both the special session and the OC16 MixASR-CHEN challenge can apply for OC16-CE80 by emailing the organizers (see below).
  • The usage agreement for OC16-CE80 must be signed and returned to the organizers before the data can be downloaded.
  • Publications based on OC16-CE80 should cite the following paper: "Dong Wang, Difei Tang, Qing Chen, OC16-CE80: a Chinese-English Mixlingual database and an ASR baseline"

Challenge procedure

  • June 13: OC16-CE80 is ready and registration requests are accepted.
  • July 15-17: the OC16-CE80 test set is released. Participants can respond with their decoding results before July 17, 12:00PM, Beijing time.
  • July 20: participants can obtain their own WER.
  • OC16: a summary is given at the special session.
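
Participants who want to check their decoding results before submission can estimate the word error rate (WER) themselves. The sketch below is not the organizers' scoring tool. It assumes that each Chinese character counts as one token, that each English word counts as one token, and that case is ignored. The names tokenize, edit_distance and wer are hypothetical.

```python
import re

# Assumption: one token per Chinese character, one token per English word.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z']+")

def tokenize(text):
    """Split mixed Chinese-English text into scoring tokens (case-insensitive)."""
    return [tok.upper() for tok in _TOKEN.findall(text)]

def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists, using a rolling row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]

def wer(references, hypotheses):
    """Total edit errors divided by the total number of reference tokens."""
    errors = sum(edit_distance(tokenize(r), tokenize(h))
                 for r, h in zip(references, hypotheses))
    total = sum(len(tokenize(r)) for r in references)
    if total == 0:
        raise ValueError("references contain no tokens")
    return errors / total

score = wer(["我想听 yesterday once more"], ["我想听 yesterday one more"])
```

Here the reference has six tokens and the hypothesis contains one substitution ("one" for "once"), so the score is 1/6. The official WER may differ if the organizers tokenize or normalize the text differently.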

Registration

If you are interested in participating in the challenge, or have any other questions, comments, or suggestions about it, please email the organizers:

  • Dr. Dong Wang (wangdong99@mails.tsinghua.edu.cn)
  • Mr. Difei Tang (tangdifei@speechocean.com)
  • Ms. Chen Qing (chenqing@speechocean.com)