ASR-events-OC16

Call for Papers for Special Session “Mixlingual Speech Processing”

Introduction

Languages in modern society clearly influence one another, e.g., Mandarin influences minority languages in China, and English influences other languages worldwide. This leads to a mixlingual phenomenon, in which words of a foreign (or target, embedded) language are embedded in a host (or source, matrix) language. This mixlingual effect causes significant problems in various speech processing tasks. This special session invites papers on topics including, but not limited to, the following:

 Mixlingual phonetic and phonological analysis
 Mixlingual speech recognition
 Mixlingual speech synthesis
 Language turn detection
 Mixlingual language understanding

To support research on mixlingual speech processing, the special session offers a large Chinese-English mixlingual speech database, OC16-CE80 (provided by Speechocean), which contains 80 hours of speech data and the associated resources. Participants in the special session can apply for OC16-CE80 free of charge if they need data to evaluate their research.

Important dates

 June 13 OC16-CE80 training/dev data release
 July 31 Paper submission deadline
 Aug. 15 Paper acceptance notification

Call for Submissions to the OC16 Chinese-English MixASR (OC16 MixASR-CHEN) Challenge

Introduction

The OC16-CE80 database contains 80 hours of Chinese-English mixlingual data, in which English words are embedded in host Chinese sentences. This special session hosts a Chinese-English MixASR challenge based on this database.

Participants in the challenge are encouraged, but not required, to submit their system designs and results to the special session “Mixlingual Speech Processing”.
Since the test set is released close to the paper submission deadline, you may report results on the dev set in your paper.
More details about the data and the challenge are available here.

Important dates

 June 13 OC16-CE80 training/dev data release
 July 15 OC16-CE80 test data release
 July 29 Paper submission deadline
 Sept. 30 OC16-CE80 extended submission deadline
 OC2016: challenge result release

Extended submission

The 'official submission' deadline has passed, and we received a number of good submissions. The WER results have been returned to the participants individually.

We now accept 'extended submissions'. Any participant may submit results (including new results from participants who have already made an official submission) until Sept. 30. We are happy to help evaluate your submissions and, if you agree, report them as 'the results of extended submission' at the OC16 special session.

Many thanks for your participation. We look forward to your new submissions and to discussing this interesting topic at OC16.

Challenge results

Primary submission

Order	Institute	Chinese WER	English WER	Overall WER
Baseline	Tsinghua, CSLT	19.00	43.67	20.09
1	Samsung R&D Institute of China - Beijing (SRC-B)	14.53	26.78	14.75
2	Shanghai Normal University	15.98	28.28	16.11
3	Academia Sinica, Taiwan + ASUS	19.42	28.20	19.05
4	Rokid	22.44	37.02	21.84
5	National Taipei University of Technology	29.14	39.24	28.18
6	Anonymous company	30.76	75.65	29.16

Extended submission

Order	Institute	Chinese WER	English WER	Overall WER
Baseline	Tsinghua, CSLT	19.00	43.67	20.09
1	National Taipei University of Technology	15.92	24.47	15.89
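
For readers who want to reproduce figures of this kind, the following is a minimal Python sketch of a WER computation. It assumes WER is the usual token-level edit distance (substitutions, deletions, insertions) divided by the number of reference tokens, expressed as a percentage. How the organizers tokenized the Chinese text is not stated here, and the function names `edit_distance` and `wer` are illustrative only; this is not the official scoring code.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 ref tokens
    # and the first j hyp tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference token
                cur[j - 1] + 1,          # insertion of a hypothesis token
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def wer(ref_tokens: Sequence[str], hyp_tokens: Sequence[str]) -> float:
    """Word error rate in percent (assumed definition, see lead-in)."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)
```

For example, with the made-up reference `我 喜欢 apple pie` and hypothesis `我 喜欢 apply`, there is one substitution and one deletion against four reference tokens.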

Slides for result announcement at OCOCOSDA-16

Ground Truth and Scripts for evaluation

1. Ground Truth provided by SpeechOcean.

2. Scripts for evaluation (for reference only):

English_word_continuous.py: Some English words contain blanks between their letters, like ' w o r d '; this script joins them into ' word '.
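
The idea behind this step can be sketched in a few lines of Python. This is not the contents of English_word_continuous.py; it is a minimal illustration that assumes a spaced-out word is a run of two or more single ASCII letters separated by whitespace. The function name `join_spaced_letters` is hypothetical.

```python
import re

# A run of two or more single ASCII letters separated by whitespace,
# not directly adjacent to another letter or digit.
_SPACED_LETTERS = re.compile(
    r"(?<![A-Za-z0-9])(?:[A-Za-z]\s+)+[A-Za-z](?![A-Za-z0-9])"
)


def join_spaced_letters(line: str) -> str:
    """Collapse ' w o r d ' into ' word ', leaving other text untouched."""
    return _SPACED_LETTERS.sub(
        lambda m: re.sub(r"\s+", "", m.group(0)),  # drop the inner blanks
        line,
    )
```

Note that this heuristic would also merge genuine sequences of one-letter words, so it only suits transcripts where single letters are known to be fragments of one word.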

wer_output_filter_ch: used as ref_filtering_cmd and hyp_filtering_cmd in local/score.sh (Kaldi) for evaluating only Chinese text.

wer_output_filter_ch_en: used as ref_filtering_cmd and hyp_filtering_cmd in local/score.sh (Kaldi) for evaluating mixed Chinese and English text.

wer_output_filter_en: used as ref_filtering_cmd and hyp_filtering_cmd in local/score.sh (Kaldi) for evaluating only English text.
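
The role of the three filters can be illustrated with a short Python sketch. It does not reproduce the actual filter scripts. It assumes each line is an utterance ID followed by whitespace-separated tokens, that a token is Chinese if it contains a character in the range U+4E00 to U+9FFF, and that a token is English if it consists of ASCII letters and apostrophes. The function name `filter_line`, the `mode` values, and the `<noise>` token in the example are hypothetical.

```python
import re

_CHINESE = re.compile(r"[\u4e00-\u9fff]")     # any CJK unified ideograph
_ENGLISH = re.compile(r"[A-Za-z][A-Za-z']*")  # ASCII word, apostrophes allowed


def filter_line(line: str, mode: str = "ch_en") -> str:
    """Keep the utterance ID and only the tokens selected by `mode`.

    mode: "ch" (Chinese only), "en" (English only) or "ch_en" (both).
    """
    if mode not in ("ch", "en", "ch_en"):
        raise ValueError(f"unknown mode: {mode}")
    fields = line.split()
    if not fields:
        return ""
    utt_id, tokens = fields[0], fields[1:]  # assumed layout: ID first
    kept = []
    for tok in tokens:
        is_ch = _CHINESE.search(tok) is not None
        is_en = _ENGLISH.fullmatch(tok) is not None
        if (mode in ("ch", "ch_en") and is_ch) or (mode in ("en", "ch_en") and is_en):
            kept.append(tok)
    return " ".join([utt_id] + kept)
```

Applying the same filter to both the reference and the hypothesis, as ref_filtering_cmd and hyp_filtering_cmd do, keeps the two sides comparable when scoring a single language.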
