Difference between revisions of "ASR-events-OC17"
From cslt Wiki
Revision as of 12:14, 20 April 2017 (Thursday)
Introduction
Modern society shows clear mutual influence among languages, e.g., Mandarin on minority languages in China, and English on other languages around the world. This leads to a mixlingual phenomenon, in which words of a foreign (or target, embedded) language are embedded in a host (or source, matrix) language. This phenomenon poses a serious problem for automatic speech recognition (ASR).
Building on the success of the first MixASR-CHEN 2016 challenge, the MixASR-CHEN 2017 challenge follows the same theme of mixlingual ASR. The task is more challenging in several ways:
- There are more utterances that involve multiple English words in the test data
- There are more English words that are not in the CMU dictionary
- There are more English phrases that involve multiple English words
Challenge details
- The database information is here.
- The challenge plan is here.
- The tools that can be used are here.
- The Kaldi baseline is here.
- Registration and result submission are here.
- Participants from both academia and industry are welcome.
- As with the first challenge, we will base the challenge on the O-COCOSDA forum and will release the results at O-COCOSDA 2017. Challenge participants are strongly encouraged to submit their systems as papers to the conference.
Important dates
- May 1: training/dev dataset release
- July 15: OC16-CE80 test data release
- July 18: OC16-CE80 test data release
- July 29: Paper submission deadline
- OC2017: challenge result release
Organizers
- Dong Wang (Tsinghua University)
- Zhiyuan Tang (Tsinghua University)
- Qing Chen (Speech Ocean), chenqing@speechocean.com