Difference between revisions of "2024-01-29"

From cslt Wiki
(Reverted edits by Lilt (talk) to the last revision by Liuzehua)

(13 intermediate revisions by 12 users not shown)

Latest revision as of 10:50, Monday, 19 February 2024

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • MicroMagnetic paper, the first pass completed.
Lantian Li
  • GPU status [1]
  • ASIP-BUPT (Neural Scoring)
  • ASIP Annual report
Ying Shi
  • Keyword-Attributed Overlap ASR
    • Fix test dataset: LibriMix-Espnet & LibriMix-Official (2-mix clean)
    • Finish model training: KA-ASR-Full, KA-ASR-Oracle, SOT-Our
  • Cohort Overlap ASR
    • Finish first step: recognize one source from the mixture by employing a speaker embedding
  • group work (https://z1et6d3xtb.feishu.cn/wiki/QoNWwCs9QibHt7k670hcxZBYncb?from=from_copylink)
Zhenghai You
  • Replace the SpeakerBeam speaker embedding with a cohort embedding
Junming Yuan
  • Check and organize the mix-training pretraining experiment project
    • Solve the MFA error on dragon03 (done)
    • Extend the pretraining data (done)
    • Explore the effect of BN in few-shot finetuning (in progress)
Chen Chen
  • CNCVS data collection
    • Finished testing phase with support from sunyiwei, shuyanzhi, mengshuaiming
  • Child Record Website
    • Finished phoneme annotation phase
    • Get some statistics
  • DeepFake
    • Human Test on DFDC [2] (https://z1et6d3xtb.feishu.cn/docx/N5u6dSmgNoYHT2xzP2IcymxFn6d?from=from_copylink)
    • Zehua & Xiaolou Report
Xiaolou Li
  • test on LAV-DF dataset
  • dataset survey
  • weekly report
Zehua Liu
  • weekly report
  • AV-Hubert test
Pengqi Li
  • Duration mismatch with XueYing [3] (https://z1et6d3xtb.feishu.cn/docx/CDcxdX5BcomHlCx2So5cWxL8nVg)
    • Compare pre-TDNN and post-TDNN
Wan Lin
Tianhao Wang
  • IS24 paper writing (English version & LaTeX)
Zhenyu Zhou
  • Signal-level Speaker Augmentation Plan [4] (https://z1et6d3xtb.feishu.cn/docx/DViBdvm8KoQMMXxMXC0cWp2vnPf):
    • Transformation (random-based & knowledge-based)
    • Speaker-characteristics-guided voice conversion
Junhui Chen
Jiaying Wang
  • speaker encoder preparation (ResNet34_ASP_AAMSoftmax-LMFT)
  • gender-divided test on SpeakerBeam
  • cohort with min SNR loss pb
Yu Zhang
  • Financial Pipeline
    • adapt portfolio policy to position changes
Wenqiang Du
  • Diting Project
    • data aug
    • add Gaussian noise to control FA
Yang Wei
  • Huilan stuff
    • Develop a streaming-mode ASR interface for the ASR service
    • Address the latency problem with long text input for the TTS service
Lily
  • update statistical results[5]