Difference between revisions of "2024-02-05"

From cslt Wiki
Line 6: Line 6:
 
|Dong Wang
 
|Dong Wang
 
||  
 
||  
* Continue work on the NeuralMag paper; refine the complexity theory
+
*  
* Design an AI course for primary school.
+
 
||
 
||
 
*
 
*
Line 29: Line 28:
 
|Ying Shi
 
|Ying Shi
 
||  
 
||  
* INTERSPEECH Paper: Keyword attributed Overlapping ASR
+
*
**  SOTA model training (done)
+
**  SOT model training (done)
+
**  test (in progress)
+
* Cohort Overlapping ASR
+
** one fixed cohort: 2-mix recognizes ONE, WER 8.90%
+
** one fixed cohort: 2-mix recognizes TWO, WER 9.30%
+
** one fixed cohort: 3-mix recognizes THREE, WER 37.83%; with a speaker-number prior applied, WER 30.58%
+
 
||
 
||
 
*  
 
*  

Revision as of 11:09, 5 February 2024 (Monday)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
Lantian Li
Ying Shi
Zhenghai You
Junming Yuan
Chen Chen
  • DeepFake
    • by Xiaolou and Zehua
    • SyncNet- and WER-based experiments on noisy audio/video input
    • noise does not seem to be the reason these methods failed
  • VTS
    • Fine-tune HuBERT with HiFiGAN for an "audio feature to speech" system (both single-speaker and multi-speaker work)
    • Train a VTS model (ResNet Conformer encoder) for a "video to audio feature" system (works reasonably well for a single speaker)
    • Try training a multi-speaker video-to-audio-feature system
    • Try jointly training the video encoder and HiFiGAN
Xiaolou Li
Zehua Liu
Pengqi Li
Wan Lin
Tianhao Wang
Zhenyu Zhou
Junhui Chen
Jiaying Wang
Yu Zhang
Wenqiang Du
Yang Wei
  • Prepare data backup for corpus disk.
Lily