Difference between revisions of "2024-09-30"
From cslt Wiki
(15 intermediate revisions by 11 users not shown) | |||
Line 6: | Line 6: | ||
|Dong Wang | |Dong Wang | ||
|| | || | ||
− | * | + | * AI graph (higher education version) |
|| | || | ||
* | * | ||
Line 47: | Line 47: | ||
|Zhenghai You | |Zhenghai You | ||
|| | || | ||
− | * | + | * Exploring the role of speaker encoder in TSE[https://z1et6d3xtb.feishu.cn/docx/GHF8doRjDo50ihxGUPpcsZgLncb] |
+ | ** Joint training of the Spk Enc gives better separation, but the EER is poor | ||
+ | ** A pretrained and frozen Spk Enc gives a good EER, but the SI-SDR is poor | ||
+ | ** Further explore the different impacts of using spk aug on different tasks | ||
+ | * The generality of SPK-AUG | ||
+ | ** The refactored DPRNN-TSE gives reliable results and has been accelerated from 87 hours to 32 hours | ||
|| | || | ||
* | * | ||
Line 79: | Line 84: | ||
|Xiaolou Li | |Xiaolou Li | ||
|| | || | ||
− | * | + | * Use MFA on LRS3 to cut it into small segments |
+ | * Use discrete embeddings from avhubert in vsp-llm training (still training) | ||
+ | * Some ideas for aligning video features with the LLM (Dense Connector, CL methods) | ||
+ | * Data collection handover; get familiar with the process | ||
+ | * Data Collection: 3138 h (need to re-check, DDL: 10.15) | ||
|| | || | ||
* | * | ||
Line 90: | Line 99: | ||
|Zehua Liu | |Zehua Liu | ||
|| | || | ||
− | * | + | *Baseline System VSP-LLM |
+ | *Try Qwen2.5-14B[https://z1et6d3xtb.feishu.cn/docx/JBsidACDVojhCaxFQLbcCVbsnAc?from=from_copylink] | ||
|| | || | ||
* | * | ||
Line 122: | Line 132: | ||
|- | |- | ||
|Tianhao Wang | |Tianhao Wang | ||
+ | || | ||
+ | * AudioSep reproduction | ||
+ | ** Problem: LAION CLAP needs 48 kHz audio, so the data needs to be upsampled | ||
|| | || | ||
* | * | ||
+ | || | ||
+ | * | ||
+ | |- | ||
+ | |||
+ | |||
+ | |- | ||
+ | |Xiaoxue Luo | ||
+ | || | ||
+ | *AI-Graph High school handbook(v0.1) | ||
|| | || | ||
* | * | ||
Line 134: | Line 156: | ||
|Zhenyu Zhou | |Zhenyu Zhou | ||
|| | || | ||
− | * | + | * Submitted the model quantization document |
+ | * Review conditional chain code | ||
|| | || | ||
* | * | ||
Line 145: | Line 168: | ||
|Junhui Chen | |Junhui Chen | ||
|| | || | ||
− | * | + | * Voxblink1 model training and testing |
+ | ** Writing test code for NS in ossi test. | ||
|| | || | ||
* | * | ||
Line 167: | Line 191: | ||
|Yu Zhang | |Yu Zhang | ||
|| | || | ||
− | * | + | * Fri Report |
+ | * Change SocioDojo Agent from ChatGPT-3.5-Turbo to Llama-3.1-8B (still working) | ||
|| | || | ||
* | * | ||
Line 178: | Line 203: | ||
|Wenqiang Du | |Wenqiang Du | ||
|| | || | ||
− | * | + | *Check primary school handbook (43/45) |
+ | *Release Chinese and Haining KWS models | ||
|| | || | ||
* | * | ||
Line 199: | Line 225: | ||
|Lily | |Lily | ||
|| | || | ||
− | * | + | * APSIPA workshop in Tianjin; prepare Friday's report |
+ | * Prepare for the online course | ||
+ | * AI radiance's daily work | ||
|| | || | ||
* | * | ||
Line 209: | Line 237: | ||
|Turi | |Turi | ||
|| | || | ||
− | * | + | * Segmented the audio in the dataset into individual words. |
+ | * Paper reading | ||
|| | || | ||
* | * | ||
Line 226: | Line 255: | ||
|Qi Qu | |Qi Qu | ||
|| | || | ||
− | * | + | * KWS |
+ | ** Testing zh48 models on a dataset of Mandarin Chinese with a Guangdong accent: recall drops significantly. | ||
+ | * AED | ||
+ | ** Evaluating a third-party solution for baby-cry detection. | ||
+ | * Misc. | ||
+ | ** Preparing for live talk. | ||
|| | || | ||
* | * |
Latest revision as of 08:53, 7 October 2024 (Monday)
People | This Week | Next Week | Task Tracking (DeadLine) |
---|---|---|---|
Dong Wang | | | |
Lantian Li | | | |
Ying Shi | | | |
Zhenghai You | | | |
Junming Yuan | | | |
Chen Chen | | | |
Xiaolou Li | | | |
Zehua Liu | | | |
Pengqi Li | | | |
Wan Lin | | | |
Tianhao Wang | | | |
Xiaoxue Luo | | | |
Zhenyu Zhou | | | |
Junhui Chen | | | |
Jiaying Wang | | | |
Yu Zhang | | | |
Wenqiang Du | | | |
Yang Wei | | | |
Lily | | | |
Turi | | | |
Yue Gu | | | |
Qi Qu | | | |