2024-08-19
Revision as of 10:43, 19 August 2024 (Mon)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • AI primary (middle-school) 1-6
Lantian Li
  • GPU status [1]
  • AI primary
    • High school handbook (30/40)
  • High school handbook (40/40)
Ying Shi
Zhenghai You
Jiaying Wang
  • Reproduced the conditional chain code
    • On both LibriSpeech and WSJ, the training loss is hard to reduce (it plateaus around -3) and the corresponding test SI-SDR is positive (see the SI-SDR sketch below)
  • Rewriting the code (preserving the original model)
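  A minimal sketch of the SI-SDR measure behind these numbers, assuming PyTorch and the common negative-SI-SDR training loss; the function names are illustrative, not the project's actual code:

    import torch

    def si_sdr(estimate, target, eps=1e-8):
        # SI-SDR in dB for (batch, samples) tensors.
        estimate = estimate - estimate.mean(dim=-1, keepdim=True)
        target = target - target.mean(dim=-1, keepdim=True)
        # Project the estimate onto the target to get the scaled reference.
        scale = (estimate * target).sum(-1, keepdim=True) / (target.pow(2).sum(-1, keepdim=True) + eps)
        s_target = scale * target
        e_noise = estimate - s_target
        return 10 * torch.log10(s_target.pow(2).sum(-1) / (e_noise.pow(2).sum(-1) + eps))

    def si_sdr_loss(estimate, target):
        # Training minimizes the negative SI-SDR, so a loss around -3
        # corresponds to roughly +3 dB SI-SDR on the separated output.
        return -si_sdr(estimate, target).mean()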
Junming Yuan
  • Verified two parameters in the HuBERT pretraining config file that were inconsistent with the original paper. [2]
    • Confirmed that in the second iteration of pretraining, features should be extracted from the 6th transformer layer, not the 9th layer (see the feature-extraction sketch below).
      • At 175k steps, the 6th layer gives 71.55/9.39, while the 9th layer gives 37.31/16.72.
    • Basically confirmed the setting of the parameter 'untie_final_proj' for the two iterations of pretraining.
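  A hedged sketch of extracting 6th-layer features for the second-iteration clustering targets; the actual recipe presumably uses fairseq, so the HuggingFace interface and model name below are stand-ins for illustration:

    import torch
    from transformers import HubertModel

    model = HubertModel.from_pretrained("facebook/hubert-base-ls960")
    model.eval()

    wav = torch.randn(1, 16000)  # 1 second of 16 kHz audio as a stand-in input
    with torch.no_grad():
        out = model(wav, output_hidden_states=True)

    # hidden_states[0] is the pre-transformer (CNN feature projection) output,
    # so index 6 is the output of the 6th transformer layer, the one the note
    # above found should feed the iteration-2 clustering, rather than layer 9.
    layer6_features = out.hidden_states[6]
    print(layer6_features.shape)  # (batch, frames, 768)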
Xiaolou Li
Zehua Liu
Pengqi Li
  • Investigating extremely short utterances in speaker recognition [3]
Tianhao Wang
  • Reproducing CLIPSep on two datasets, MUSIC and VGGSound [4] (see the query-embedding sketch below)
    • MUSIC: text query 10.06 dB SDR, image query 12.13 dB SDR
    • VGGSound: text query 2.78 dB SDR, image query 5.01 dB SDR
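  A rough sketch of the query side of CLIPSep as reproduced above: the same separation model is conditioned on a CLIP embedding that can come from either a text prompt or a video frame. The model name and helper functions below are illustrative assumptions, not the paper's code:

    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def text_query(prompt):
        # Embed a text prompt such as "the sound of a violin".
        inputs = proc(text=[prompt], return_tensors="pt", padding=True)
        return clip.get_text_features(**inputs)   # (1, 512) query embedding

    def image_query(frame: Image.Image):
        # Embed a video frame showing the sounding object.
        inputs = proc(images=frame, return_tensors="pt")
        return clip.get_image_features(**inputs)  # (1, 512) query embedding

    # Either embedding conditions the separation network; the SDR gap above
    # suggests image queries are the stronger condition on these datasets.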
Zhenyu Zhou
Wan Lin
  • VoxBlink1
    • Data processing
    • Baseline (ResNet34) training and NS training [5]
Junhui Chen
  • VoxBlink1
    • Data processing
    • Baseline (ResNet34 ASP) training and NS training [6] (see the pooling sketch below)
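  A minimal sketch of the attentive statistics pooling (ASP) layer named in this baseline; the dimensions and the module itself are illustrative, not the project's code:

    import torch
    import torch.nn as nn

    class AttentiveStatsPool(nn.Module):
        def __init__(self, feat_dim, bottleneck=128):
            super().__init__()
            self.attention = nn.Sequential(
                nn.Conv1d(feat_dim, bottleneck, kernel_size=1),
                nn.Tanh(),
                nn.Conv1d(bottleneck, feat_dim, kernel_size=1),
            )

        def forward(self, x):
            # x: (batch, feat_dim, frames) frame-level features from the ResNet34 trunk
            w = torch.softmax(self.attention(x), dim=-1)   # per-frame attention weights
            mu = (x * w).sum(dim=-1)                       # weighted mean
            var = (x.pow(2) * w).sum(dim=-1) - mu.pow(2)
            std = var.clamp(min=1e-6).sqrt()               # weighted standard deviation
            return torch.cat([mu, std], dim=-1)            # (batch, 2 * feat_dim)

    pool = AttentiveStatsPool(256)
    print(pool(torch.randn(4, 256, 100)).shape)  # torch.Size([4, 512])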
Yu Zhang
Wenqiang Du
  • Complete the primary school handbook draft (45 + 8)
  • Modify the format, expression, and distribution of knowledge points in the draft (40%)
Yang Wei
Lily
Turi
Yue Gu
  • Modify the introduction
  • Complete the Interspeech poster and open-source the paper code
  • Rest for two days; next I will focus on my new work
Qi Qu
  • Inactive due to absence.
  • KWS:
    • zh48 test dataset to be updated: ~30 speakers in 3 locations.
    • yue10 (Cantonese, 10 keywords) train dataset to be updated: ~120 speakers verified, more to come.
    • Try to find suitable per-keyword thresholds based on the recall vs. false-alarm (FA) relation (see the threshold-sweep sketch below).
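  An illustrative sketch of picking a per-keyword threshold from the recall / false-alarm trade-off; the data layout and the FA budget below are assumptions, not the project's actual tooling:

    import numpy as np

    def pick_threshold(scores, labels, max_fa_rate=0.01):
        # scores: detector scores for one keyword; labels: 1 = keyword present, 0 = absent.
        # Return the most lenient threshold whose false-alarm rate stays within the
        # budget, i.e. the one with the highest recall at that FA level.
        candidates = np.unique(scores)[::-1]   # sweep from strict to lenient
        best = candidates[0]
        for thr in candidates:
            detected = scores >= thr
            fa_rate = detected[labels == 0].mean() if (labels == 0).any() else 0.0
            if fa_rate > max_fa_rate:
                break
            best = thr
        return float(best)

    # Example with synthetic scores for a single keyword:
    rng = np.random.default_rng(0)
    scores = np.concatenate([rng.normal(0.8, 0.1, 50), rng.normal(0.3, 0.1, 500)])
    labels = np.concatenate([np.ones(50), np.zeros(500)])
    print(pick_threshold(scores, labels))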