2024-08-26

Latest revision as of 11:03, 26 August 2024

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • Primary school book (17)
  • College AI education
Lantian Li
  • GPU status: https://z1et6d3xtb.feishu.cn/wiki/XGcGwRK5viJmpRkjH9AczIhynCh
  • AI primary
    • High school handbook (40/40)
Ying Shi
  • Fenyinta-related work
  • Reproduced cohort-SOT overlapped ASR and did some analysis
  • Text-enrolled keyword spotting with intermediate PIT-SOT CTC + high-layer cross-attention is in progress: https://z1et6d3xtb.feishu.cn/docx/DI3UdF496ojxCQxTqUycsjDQnxf?from=from_copylink
Zhenghai You
  • Speaker augmentation: completed experiments on Libri2Mix; on the low-SI-SDR test set the speaker confusion rate is lower (the SI-SDR metric is sketched below)
  • ExFormer: consistently inferior to the SOTA: https://z1et6d3xtb.feishu.cn/docx/ZbtsdTGuQo4IXnxuxHXcpvBynoe
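Since the Libri2Mix results above are discussed in terms of SI-SDR, here is a minimal sketch of how that metric is commonly computed; the function and the synthetic test signals are illustrative placeholders, not part of the reported experiments.

 import numpy as np

 def si_sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
     """Scale-invariant SDR (in dB) between a reference and an estimated signal."""
     reference = reference - reference.mean()
     estimate = estimate - estimate.mean()
     # Project the estimate onto the reference to obtain the scaled target.
     alpha = np.dot(estimate, reference) / np.dot(reference, reference)
     target = alpha * reference
     noise = estimate - target
     return 10 * np.log10(np.sum(target ** 2) / np.sum(noise ** 2))

 # Illustrative usage with synthetic signals.
 rng = np.random.default_rng(0)
 ref = rng.standard_normal(16000)
 est = ref + 0.1 * rng.standard_normal(16000)
 print(f"SI-SDR: {si_sdr(ref, est):.2f} dB")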
Junming Yuan
  • Confirmed that the ~10% performance gap is caused by the difference in GPUs: https://z1et6d3xtb.feishu.cn/docx/PaATdHi26oEc0Pxovd4cSyp0nQ2
    • To fully reproduce the official model, it would take approximately 32 days.
  • Investigating how to train HuBERT with mixed speech (in progress; see the mixing sketch below)
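For the mixed-speech investigation above, the snippet below is a minimal, hypothetical sketch of mixing two utterances at a target SNR; it is not the actual training pipeline, and the signals and SNR value are placeholders.

 import numpy as np

 def mix_at_snr(speech: np.ndarray, interferer: np.ndarray, snr_db: float) -> np.ndarray:
     """Mix two equal-length waveforms so the speech-to-interference ratio equals snr_db."""
     p_speech = np.mean(speech ** 2)
     p_interf = np.mean(interferer ** 2)
     # Scale the interferer so that 10*log10(p_speech / p_scaled) == snr_db.
     scale = np.sqrt(p_speech / (p_interf * 10 ** (snr_db / 10)))
     return speech + scale * interferer

 # Illustrative usage with synthetic waveforms.
 rng = np.random.default_rng(0)
 mixture = mix_at_snr(rng.standard_normal(16000), rng.standard_normal(16000), snr_db=5.0)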
Xiaolou Li
  • LLM long-context test
  • Poster for IS24
  • Paper reading
Zehua Liu
  • CNVSRC 2024 Website
  • Data transfer to HUAWEI
  • LLM in Chinese VSR (in-context learning)
Pengqi Li
  • Extended the proposal for 'How phonemes contribute to deep speaker models?': https://z1et6d3xtb.feishu.cn/docx/NSjYdWC6JorHKBxoUdxcMzman3f
    • Reviewing code and papers.
    • Analyzing di-phones in AudioMNIST.
    • Started experiments with the TIMIT dataset.
  • Deadline: 9.20 (one month)
Wan Lin
  • Neural Scoring: Vox2 + VoxBlink1: https://z1et6d3xtb.feishu.cn/docx/BywjdkGvNou12sxQ4dAcxYa9noh?from=from_copylink
Tianhao Wang
  • Reproducing AudioSep
  • IS24 poster
Zhenyu Zhou
  • Some thoughts on ONNX quantization: https://z1et6d3xtb.feishu.cn/docx/S9ChdyH7go490txZ2ZHcNjXTn2b (see the sketch below)
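As a concrete reference for the ONNX quantization notes above, here is a minimal sketch using onnxruntime's dynamic quantization API; the model file names are placeholders, and the models under discussion may well require a different scheme (e.g. static/QDQ quantization with calibration).

 from onnxruntime.quantization import quantize_dynamic, QuantType

 # Dynamically quantize the weights to INT8; activations remain in float.
 quantize_dynamic(
     model_input="model_fp32.onnx",   # placeholder path
     model_output="model_int8.onnx",  # placeholder path
     weight_type=QuantType.QInt8,
 )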
Junhui Chen
  • Neural Scoring:
    • Vox2 + VoxBlink-clean test: https://z1et6d3xtb.feishu.cn/docx/BywjdkGvNou12sxQ4dAcxYa9noh#Ro69dyERUoN0HvxOGMWcOJjrnuf
Jiaying Wang
  • Rewriting the conditional chain code (can be finished this week)
  • Checking the WSJ data
Yu Zhang
  • Assisting with an AED engineering problem
  • Preparing the report
Wenqiang Du
  • Completed the unified formatting and recheck of the primary school handbook
  • Writing the middle school handbook (29-41)
  • Training Chinese and Cantonese KWS models
Yang Wei
  • Checked the bad cases from the KWS model test.
Lily
Turi
  • Added more sections to the draft paper
    • Need to refine and do more experiments
Yue Gu
  • Wrote the introduction
  • Tested the adaptation model on the same-accent data: https://www.yuque.com/g/shibeiing/angax2/efpdhqvsxdi4phua/collaborator/join?token=ysvAeigXC9KFi4CF&source=doc_collaborator
  • (Got sick today)
Qi Qu
  • KWS:
    • zh48 test dataset updated: 29 speakers in 3 locations, ~600 utterances per keyword.
    • Recall vs. false-alarm (FA) relations plotted (see the sketch below).
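For the recall vs. false-alarm plot above, the snippet below is a minimal sketch of sweeping a detection threshold to obtain such a curve; the score and label arrays are synthetic placeholders, not the zh48 results.

 import numpy as np
 import matplotlib.pyplot as plt

 # Synthetic keyword-detection scores: label 1 = keyword present, 0 = absent.
 rng = np.random.default_rng(0)
 labels = rng.integers(0, 2, size=2000)
 scores = labels + rng.normal(0, 0.7, size=2000)

 # Sweep thresholds and record recall and false-alarm rate at each point.
 thresholds = np.linspace(scores.min(), scores.max(), 200)
 recall = [(scores[labels == 1] >= t).mean() for t in thresholds]
 false_alarm = [(scores[labels == 0] >= t).mean() for t in thresholds]

 plt.plot(false_alarm, recall)
 plt.xlabel("False-alarm rate")
 plt.ylabel("Recall")
 plt.title("Recall vs. false alarm across thresholds")
 plt.show()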