Latest revision as of 01:37, 4 April 2019

Oriental Language Recognition (OLR) 2018 Challenge

Oriental languages present interesting characteristics. The OLR challenge series aims at boosting language recognition technology for oriental languages. Following the success of OLR Challenge 2017 and OLR Challenge 2016, the 2018 challenge follows the same theme but sets up more challenging tasks:

  • Short-utterance identification task: a closed-set identification task, meaning the language of each utterance is among the 10 known target languages. The utterances are as short as 1 second.
  • Confusing-language identification task: identify the language of utterances from 3 highly confusable languages (Cantonese, Korean and Mandarin).
  • Open-set recognition task: the test utterance may belong to none of the 10 target languages.
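As an illustration of how the tasks above differ at decision time, here is a minimal sketch (not part of the challenge kit; the scores and threshold are made-up examples) contrasting closed-set identification with open-set recognition:

```python
# Illustrative sketch only: given per-language scores for one utterance,
# the closed-set tasks pick the best-scoring target language, while the
# open-set task may also reject the utterance as out-of-set.

def closed_set_decision(scores):
    """Tasks 1/2: the utterance is known to be one of the target
    languages, so simply pick the highest-scoring one."""
    return max(scores, key=scores.get)

def open_set_decision(scores, threshold):
    """Task 3: the utterance may be in none of the target languages;
    reject as out-of-set when the best score falls below a threshold."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "out-of-set"

scores = {"Cantonese": 0.62, "Korean": 0.21, "Mandarin": 0.17}
print(closed_set_decision(scores))      # Cantonese
print(open_set_decision(scores, 0.7))   # out-of-set (0.62 < 0.7)
```

The threshold here is a hypothetical calibration parameter; real systems derive it from development data.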

We will publish the results on a special session of APSIPA ASC 2018.


News

  • Ground truth for the test data has been released; download here.
  • Test data for the 3 tasks has been released; download here.

Data

The challenge is based on two multilingual databases: AP16-OL7, designed for OLR Challenge 2016, and AP17-OL3, designed for OLR Challenge 2017.

AP16-OL7 is provided by SpeechOcean (www.speechocean.com), and AP17-OL3 is provided by Tsinghua University, Northwest Minzu University and Xinjiang University, under the M2ASR project (http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/Asr-project-nsfc) supported by NSFC.


AP16-OL7 has the following features:

  • Mobile channel
  • 7 languages in total
  • 71 hours of speech signals in total
  • Transcriptions and lexica are provided
  • The data profile is here
  • The License for the data is here

AP17-OL3 has the following features:

  • Mobile channel
  • 3 languages in total
  • Tibetan, provided by Prof. Guanyu Li (Northwest Minzu University)
  • Uyghur and Kazak, provided by Prof. Askar Hamdulla (Xinjiang University)
  • 35 hours of speech signals in total
  • Transcriptions and lexica are provided
  • The data profile is here
  • The License for the data is here

Evaluation plan

Refer to the following paper:

Zhiyuan Tang, Dong Wang, Qing Chen: AP18-OLR Challenge: Three Tasks and Their Baselines, submitted to APSIPA ASC 2018 (arXiv: https://arxiv.org/abs/1806.00616)
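The paper above specifies the scoring details; equal error rate (EER) is among the metrics reported in the OLR challenge series. As an illustration only (not the official scoring tool), here is a minimal sketch of computing an approximate EER from detection scores and target labels:

```python
# Minimal EER sketch: `scores` are detection scores for a set of trials,
# `labels` are True for target trials. The EER is the operating point
# where the false-alarm rate equals the miss rate; this sweep over
# thresholds returns a simple min-max approximation of it.

def eer(scores, labels):
    trials = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_tgt = sum(labels)
    n_non = len(labels) - n_tgt
    misses = n_tgt        # threshold above all scores: every target missed
    false_alarms = 0
    best = 1.0
    for _, is_target in trials:   # lower the threshold one trial at a time
        if is_target:
            misses -= 1
        else:
            false_alarms += 1
        p_miss = misses / n_tgt
        p_fa = false_alarms / n_non
        best = min(best, max(p_miss, p_fa))
    return best

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]
print(eer(scores, labels))   # ≈ 0.333
```

The official challenge scoring also uses additional metrics defined in the paper; this sketch covers EER only.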

Evaluation tools

  • The Kaldi-based baseline scripts: https://github.com/tzyll/kaldi/tree/caser/egs/cslt_cases/lre_baseline

Participation rules

  • Participants from both academia and industry are welcome
  • Publications based on the data provided by the challenge should cite the following papers:

Dong Wang, Lantian Li, Difei Tang, Qing Chen, AP16-OL7: a multilingual database for oriental languages and a language recognition baseline, APSIPA ASC 2016. pdf: http://wangd.cslt.org/public/pdf/ole.pdf

Zhiyuan Tang, Dong Wang, Yixiang Chen, Qing Chen: AP17-OLR Challenge: Data, Plan, and Baseline, APSIPA ASC 2017. pdf: http://cslt.riit.tsinghua.edu.cn/mediawiki/images/d/d6/AP17-OLR.pdf

Zhiyuan Tang, Dong Wang, Qing Chen: AP18-OLR Challenge: Three Tasks and Their Baselines, submitted to APSIPA ASC 2018. pdf: https://arxiv.org/pdf/1806.00616.pdf

Important dates

  • May 1: AP18-OLR training/dev data release.
  • Sep. 1: registration deadline.
  • Oct. 8: test data release; download here.
  • Oct. 15, 24:00, Beijing time: submission deadline.
  • APSIPA ASC 2018: results announcement.

Registration procedure

If you intend to participate in the challenge, or if you have any questions, comments or suggestions about the challenge, please email the organizers:

  • Prof. Dong Wang (wangdong99@mails.tsinghua.edu.cn)
  • Dr. Zhiyuan Tang (tangzhiyuan12@mails.ucas.ac.cn)
  • Ms. Qing Chen (chenqing@speechocean.com)

Organizers

  • Dong Wang, Tsinghua University [home]
  • Zhiyuan Tang, Tsinghua University [home]
  • Qing Chen, SpeechOcean


Ranking list

The Oriental Language Recognition (OLR) Challenge 2018, co-organized by CSLT@Tsinghua University and SpeechOcean, was completed with great success. The results were presented at APSIPA ASC 2018, Dec 12-15, 2018, Hawaii, USA.

Overview

In total, 25 teams registered for the challenge. By the submission deadline, 17 teams had submitted their results. The submissions were ranked separately for the 3 language recognition tasks: short-utterance identification, confusing-language identification, and open-set identification. We present the results and details of the top 10 teams only.

Task 1

Olr18-1-overview.png

Olr18-1-10.png

Olr18-1-10-detail.png


Task 2

Olr18-2-overview.png

Olr18-2-10.png

Olr18-2-10-detail.png


Task 3

Olr18-3-overview.png

Olr18-3-10.png

Olr18-3-10-detail.png


OLR 2018 Workshop

The OLR 2018 workshop was successfully held at Tsinghua University on March 24, 2019.

workshop link