OC17-plan

From cslt Wiki
Revision as of 08:14, 20 April 2017 (Thu) by Cslt (talk | contribs)


Constraint on resource

The only constraint is that system development may use only the resources listed in the data profile.

Constraint on technology

We do not want to place many limitations on the techniques participants can use. Any technology for the front-end, acoustic model, language model, or decoding is permitted. For example, one participant may want to use speech enhancement methods to improve the quality of the signal, while another may want to train a high-order LM.

Constraint on tools

Any tools can be used; however, we highly recommend publicly available tools. The tools (internal or external) should not use any heavy models trained on extra data. The only exception is G2P conversion for the English words in OC17-EnWord. You can use online services or third-party G2P tools that have been trained on their own databases, provided that they are publicly available. We think this simulates the real scenario in which new English words are encountered and must be handled.

Constraint on system

Although any open-source tools can be used to construct the system, participants are encouraged to use the Kaldi baseline provided by the organizer and augment it with their own novel techniques, so that the community gains more insight into which techniques really help.