CN-CVS

From cslt Wiki

Version as of 11:44, 30 October 2022

Introduction

  • Mandarin Visual Speech: a large-scale Chinese Mandarin audio-visual dataset published by the Center for Speech and Language Technology (CSLT) at Tsinghua University.

Members

  • Current: Dong Wang, Chen Chen

Description

  • Collect audio and video data from more than 2,500 Mandarin speakers.
  • Automatically clip videos through a pipeline of shot detection, VAD, face detection, face tracking, and audio-visual synchronization detection.
  • Manually annotate speaker identity and manually check data quality.
  • Create a benchmark database for the video-to-speech synthesis task.
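The clipping steps above can be sketched as a cascade of filters over candidate segments: each stage either passes a segment on or discards it. The stage bodies below are placeholder stubs with illustrative thresholds and function names (assumptions for this sketch, not the project's actual code):

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds

def detect_shots(duration):
    # Stub: split the video into fixed 10-second shots.
    bounds = [float(t) for t in range(0, int(duration), 10)] + [duration]
    return [Segment(a, b) for a, b in zip(bounds, bounds[1:])]

def voice_active(seg):
    # Stub VAD: treat every segment longer than 1 second as containing speech.
    return seg.end - seg.start > 1.0

def face_tracked(seg):
    # Stub: pretend a face was detected and tracked through the whole segment.
    return True

def av_synchronized(seg):
    # Stub: pretend audio and lip motion passed the synchronization check.
    return True

def clip_pipeline(duration):
    """Keep only the segments that survive every stage of the cascade."""
    return [s for s in detect_shots(duration)
            if voice_active(s) and face_tracked(s) and av_synchronized(s)]

clips = clip_pipeline(25.0)
```

In the real pipeline each stub would call the corresponding tool (shot detector, VAD, face tracker, SyncNet), but the cascade structure stays the same.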

Basic Methods

  • Environments: PyTorch, OpenCV, FFmpeg
  • Shot detection: FFmpeg
  • VAD: pydub
  • Face detection and tracking: dlib
  • Audio-visual synchronization detection: SyncNet model
  • Input: JSON files of video information
  • Output: video clips and WAV files, as well as metadata JSON files
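Since both the input and the output of the pipeline are JSON metadata files, the round trip can be sketched with the standard library alone. The field names below (speaker_id, clip, sync_confidence, etc.) are hypothetical; this page does not specify the actual schema:

```python
import json

# Hypothetical metadata record for one extracted clip; the real
# CN-CVS field names may differ.
record = {
    "speaker_id": "S0001",
    "source_video": "video_0001.mp4",
    "clip": "S0001_0003.mp4",
    "audio": "S0001_0003.wav",
    "start": 12.48,              # clip start in the source video, seconds
    "end": 15.92,                # clip end, seconds
    "sync_confidence": 0.87,     # audio-visual synchronization score
}

# Write the per-clip records as a JSON metadata file ...
with open("metadata.json", "w", encoding="utf-8") as f:
    json.dump([record], f, ensure_ascii=False, indent=2)

# ... and read them back, e.g. when building the benchmark splits.
with open("metadata.json", encoding="utf-8") as f:
    loaded = json.load(f)
```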

Reports

Publications


Source Code

Download

  • Public (recommended)

TODO

  • Local (not recommended)

TODO

Future Plans

  • Extract text transcriptions via OCR, ASR, and human checking

License

  • All the resources contained in the database are free for research institutes and individuals.
  • No commercial usage is permitted.

References