Differences between revisions of “2024-09-30”
From cslt Wiki
Duwenqiang (talk | contribs) (Created page with “{| class="wikitable" !People !! This Week !! Next Week !! Task Tracking (<font color="red">DeadLine</font>) |- |- |Dong Wang || * || * || * |- |- |Lantian Li || *...”) |
|||
(19 intermediate revisions by 15 users not shown) | |||
Line 6: | Line 6: | ||
|Dong Wang | |Dong Wang | ||
|| | || | ||
− | * | + | * AI graph (higher education version) |
|| | || | ||
* | * | ||
Line 17: | Line 17: | ||
|Lantian Li | |Lantian Li | ||
|| | || | ||
− | * | + | * AI-Graph handbook v0.1 |
+ | * AI-Graph EN (12/50) | ||
+ | * Huawei TiDing 3.0 - Model Quantization | ||
+ | * BUPT/AI-Radiance trivial things | ||
|| | || | ||
* | * | ||
Line 28: | Line 31: | ||
|Ying Shi | |Ying Shi | ||
|| | || | ||
− | * | + | * Optimized the Text-enroll KWS code: added 4 kinds of negative sampling strategies |
+ | ** The strategies are deletion, substitution, insertion, and shuffle; verified them to ensure there are no bugs. | ||
+ | ** Found that the new negative sampling makes training harder, which indicates that relying only on positional embedding is not enough. | ||
+ | * Reproduce conditional chain overlap ASR (Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals) | ||
+ | ** According to Jiaying's work, the code released with the published paper does not work | ||
+ | ** Writing a dominance-based conditional chain overlap ASR myself (in progress) | ||
|| | || | ||
* | * | ||
Line 39: | Line 47: | ||
|Zhenghai You | |Zhenghai You | ||
|| | || | ||
− | * | + | * Exploring the role of the speaker encoder in TSE [https://z1et6d3xtb.feishu.cn/docx/GHF8doRjDo50ihxGUPpcsZgLncb] |
+ | ** Jointly training the Spk Enc gives better separation, but the EER is poor | ||
+ | ** A pretrained and frozen Spk Enc gives good EER, but the SI-SDR is poor | ||
+ | ** Further explore the different impacts of using spk aug on different tasks | ||
+ | * The generality of SPK-AUG | ||
+ | ** The refactored DPRNN-TSE gives reliable results and has been accelerated from 87 hours to 32 hours | ||
|| | || | ||
* | * | ||
Line 71: | Line 84: | ||
|Xiaolou Li | |Xiaolou Li | ||
|| | || | ||
− | * | + | * Use MFA on LRS3 to cut it into small segments |
+ | * Use discrete embeddings of avhubert in VSP-LLM training (still training) | ||
+ | * Some ideas for aligning video features with the LLM (Dense Connector, CL methods) | ||
+ | * Hand over the data collection and get familiar with the process | ||
+ | * Data Collection: 3138 h (need to re-check, DDL: 10.15) | ||
|| | || | ||
* | * | ||
Line 82: | Line 99: | ||
|Zehua Liu | |Zehua Liu | ||
|| | || | ||
− | * | + | *Baseline System VSP-LLM |
+ | *Try Qwen2.5-14B[https://z1et6d3xtb.feishu.cn/docx/JBsidACDVojhCaxFQLbcCVbsnAc?from=from_copylink] | ||
|| | || | ||
* | * | ||
Line 104: | Line 122: | ||
|Wan Lin | |Wan Lin | ||
|| | || | ||
− | * | + | * Voxblink1 model training and testing [https://z1et6d3xtb.feishu.cn/docx/BywjdkGvNou12sxQ4dAcxYa9noh?from=from_copylink] |
|| | || | ||
* | * | ||
Line 114: | Line 132: | ||
|- | |- | ||
|Tianhao Wang | |Tianhao Wang | ||
+ | || | ||
+ | * AudioSep reproduction | ||
+ | ** Problem: LAION CLAP needs 48 kHz audio, so the data needs to be upsampled | ||
|| | || | ||
* | * | ||
+ | || | ||
+ | * | ||
+ | |- | ||
+ | |||
+ | |||
+ | |- | ||
+ | |Xiaoxue Luo | ||
+ | || | ||
+ | *AI-Graph high school handbook (v0.1) | ||
|| | || | ||
* | * | ||
Line 126: | Line 156: | ||
|Zhenyu Zhou | |Zhenyu Zhou | ||
|| | || | ||
− | * | + | * Submitted the Model Quantization document |
+ | * Review conditional chain code | ||
|| | || | ||
* | * | ||
Line 137: | Line 168: | ||
|Junhui Chen | |Junhui Chen | ||
|| | || | ||
− | * | + | * Voxblink1 model training and testing |
+ | ** Writing test code for NS in ossi test. | ||
|| | || | ||
* | * | ||
Line 159: | Line 191: | ||
|Yu Zhang | |Yu Zhang | ||
|| | || | ||
− | * | + | * Fri Report |
+ | * Change SocioDojo Agent from ChatGPT-3.5-Turbo to Llama-3.1-8B (still working) | ||
|| | || | ||
* | * | ||
Line 170: | Line 203: | ||
|Wenqiang Du | |Wenqiang Du | ||
|| | || | ||
− | * | + | *Check primary school handbook (43/45) |
+ | *Release Chinese and Haining KWS models | ||
|| | || | ||
* | * | ||
Line 191: | Line 225: | ||
|Lily | |Lily | ||
|| | || | ||
− | * | + | * APSIPA workshop in Tianjin and prepare Friday's report |
+ | * Prepare for the online course | ||
+ | * AI-Radiance daily work | ||
|| | || | ||
* | * | ||
Line 201: | Line 237: | ||
|Turi | |Turi | ||
|| | || | ||
− | * | + | * Segmented the audio in the dataset into individual words. |
+ | * Paper reading | ||
|| | || | ||
* | * | ||
Line 209: | Line 246: | ||
|Yue Gu | |Yue Gu | ||
|| | || | ||
− | * | + | * Almost completed the revisions of my journal paper |
|| | || | ||
* | * | ||
Line 218: | Line 255: | ||
|Qi Qu | |Qi Qu | ||
|| | || | ||
− | * | + | * KWS |
+ | ** Testing zh48 models on a dataset of Mandarin Chinese with a Guangdong accent: recall drops significantly. | ||
+ | * AED | ||
+ | ** Evaluating a third-party solution for baby crying detection. | ||
+ | * Misc. | ||
+ | ** Preparing for a live talk. | ||
|| | || | ||
* | * |
Latest revision as of 08:53, 7 October 2024 (Mon)
People | This Week | Next Week | Task Tracking (DeadLine) |
---|---|---|---|
Dong Wang |
|
|
|
Lantian Li |
|
|
|
Ying Shi |
|
|
|
Zhenghai You |
|
|
|
Junming Yuan |
|
|
|
Chen Chen |
|
|
|
Xiaolou Li |
|
|
|
Zehua Liu |
|
|
|
Pengqi Li |
|
|
|
Wan Lin |
|
|
|
Tianhao Wang |
|
|
|
Xiaoxue Luo |
|
|
|
Zhenyu Zhou |
|
|
|
Junhui Chen |
|
|
|
Jiaying Wang |
|
|
|
Yu Zhang |
|
|
|
Wenqiang Du |
|
|
|
Yang Wei |
|
|
|
Lily |
|
|
|
Turi |
|
|
|
Yue Gu |
|
|
|
Qi Qu |
|
|
|