2024-02-05

People	This Week	Next Week	Task Tracking (Deadline)
Dong Wang	Keep working on the NeuralMag paper and refine the complexity theory; design an AI course for primary school.
Lantian Li
Ying Shi	INTERSPEECH paper: keyword-attributed overlapping ASR: SOTA model training (done); SOT model training (done); test (in progress). Cohort overlapping ASR: one fixed cohort, 2-mix, recognizes ONE: WER 8.90%; one fixed cohort, 2-mix, recognizes TWO: WER 9.30%; one fixed cohort, 3-mix, recognizes THREE: WER 37.83%; apply number-of-speakers prior: WER 30.58%.
Zhenghai You
Junming Yuan
Chen Chen	DeepFake (by Xiaolou, Zehua): SyncNet- and WER-based experiments on noisy audio/video input; noise does not seem to be the reason these methods failed. VTS: fine-tune a HuBERT with a HiFiGAN for the "audio feature to speech" system (both single-speaker and multi-speaker are OK); train a VTS (ResNet Conformer encoder) for the "video to audio feature" system (works well to some degree for a single speaker); try training a multi-speaker video-to-audio-feature system; try jointly training the video encoder and HiFiGAN.
Xiaolou Li
Zehua Liu
Pengqi Li
Wan Lin
Tianhao Wang
Zhenyu Zhou
Junhui Chen
Jiaying Wang
Yu Zhang
Wenqiang Du
Yang Wei	Prepare data backup for corpus disk.
Lily
