Differences between revisions of "2025-03-10"
From cslt Wiki
Yuanjunming (talk | contribs) |
Luoxiaoxue (talk | contribs) |
||
| (18 intermediate revisions by 15 users not shown) | |||
| Line 6: | Line 6: | ||
|Dong Wang | |Dong Wang | ||
|| | || | ||
| − | * | + | |
| + | * Revise the college version of the AI textbook | ||
|| | || | ||
| Line 18: | Line 19: | ||
|Lantian Li | |Lantian Li | ||
|| | || | ||
| − | * | + | * Submit the high school textbook |
| + | * Proofreading of the EN book (3/4) | ||
|| | || | ||
* | * | ||
| Line 43: | Line 45: | ||
|Zhenghai You | |Zhenghai You | ||
|| | || | ||
| − | * | + | * Training IRA TSE for the noisy-enrollment situation [https://z1et6d3xtb.feishu.cn/wiki/OXubwl2fIip91vkYsgMc1duhnLd] |
|| | || | ||
* | * | ||
| Line 55: | Line 57: | ||
* Pretraining work: | * Pretraining work: | ||
** MT-HuBERT & Cocktail-HuBERT will be finished next week. | ** MT-HuBERT & Cocktail-HuBERT will be finished next week. | ||
| − | ** Get a set of comparable finetuning results for each pretrain model at the 400K training step.[https://z1et6d3xtb.feishu.cn/docx/ElAKdh07GoD8qKxGFLfc3seAnOh] | + | ** Get a set of comparable finetuning results(15/5/3-shot) for each pretrain model at the 400K training step.[https://z1et6d3xtb.feishu.cn/docx/ElAKdh07GoD8qKxGFLfc3seAnOh] |
* Check and add reference for AI junior high school handbook(1/2).(Done) | * Check and add reference for AI junior high school handbook(1/2).(Done) | ||
|| | || | ||
| Line 67: | Line 69: | ||
|Xiaolou Li | |Xiaolou Li | ||
|| | || | ||
| − | * | + | * Writing NFSC document |
| + | * VSR training (1500 h) already has some results | ||
| + | ** cnvsrc-single valid 300: 29.47% | ||
| + | ** cnvsrc-multi valid: 31.60% | ||
| + | ** webVideo valid: 15.54% | ||
| + | * Finished producing pseudo-labels for CVS3 (4000 h) | ||
|| | || | ||
* | * | ||
| Line 78: | Line 85: | ||
|Zehua Liu | |Zehua Liu | ||
|| | || | ||
| − | * | + | *Writing NFSC document |
| + | *LoRA fine-tuning of the VLM (both encoder and LLM decoder) does not yet give good results (may need parameter adjustment) | ||
| + | *Pretrained VSR encoder + VLM (decoder) seems better than a normal LM | ||
|| | || | ||
| − | * | + | *Design VTS architecture and implement it |
|| | || | ||
* | * | ||
| Line 101: | Line 110: | ||
|Wan Lin | |Wan Lin | ||
|| | || | ||
| − | * | + | * Supply NS experiments [https://z1et6d3xtb.feishu.cn/docx/MxBNdPbLao0tsoxkBVCcUgUoneh?from=from_copylink] |
| + | * Help xiaochen reproduce the diarization SV method | ||
|| | || | ||
* | * | ||
| Line 112: | Line 122: | ||
|Tianhao Wang | |Tianhao Wang | ||
|| | || | ||
| − | * | + | * 3-mix training: CLAPSep baseline: SDR=5.560; Ours: SDR=6.574. |
| + | * subset data training (in progress) | ||
|| | || | ||
* | * | ||
| Line 123: | Line 134: | ||
|Xiaoxue Luo | |Xiaoxue Luo | ||
|| | || | ||
| − | * | + | * Sound separation |
| + | ** baseline: change the code of AudioSep so that its audio mixing method during training is the same as our method | ||
| + | * Paper reading and sharing last Friday | ||
|| | || | ||
* | * | ||
| Line 145: | Line 158: | ||
|Junhui Chen | |Junhui Chen | ||
|| | || | ||
| − | * | + | * Speaker diarization baseline for NS (mix test: baseline EER 15.972% -> 12.983%); other tests still running
| + | * Make a PPT about the scaling law on speaker volume. | ||
|| | || | ||
* | * | ||
| Line 166: | Line 180: | ||
|- | |- | ||
|Yu Zhang | |Yu Zhang | ||
| + | || | ||
| + | * Multi Agent Investment | ||
| + | ** Use the top 31 stocks in 11 sectors to build the portfolio for better correlation with input news (no excess return) | ||
| + | ** Analyze the trading decisions | ||
| + | * Huawei AED | ||
| + | ** Smallest model that keeps AUC above 0.9 | ||
| + | ** Split inference into two phases (Phase 1: human voice vs. non-human voice; Phase 2: speech vs. other human voice) with two smaller models | ||
|| | || | ||
* | * | ||
| − | |||
| − | |||
|| | || | ||
* | * | ||
| Line 190: | Line 209: | ||
|Yang Wei | |Yang Wei | ||
|| | || | ||
| − | * | + | * Adapt the text-enrollment KWS model with synthesized dialect data (recall: 83% -> 94%) [https://z1et6d3xtb.feishu.cn/docx/WFBJdF3D0o6w6bxHCJBcn9DIndg] |
|| | || | ||
* | * | ||
| Line 200: | Line 219: | ||
|Turi | |Turi | ||
|| | || | ||
| − | * | + | * Finetuned Llama3 on Oromo text (pretrain) |
| + | * Experiment using it as the LM for ASR failed (WER above 100%) | ||
|| | || | ||
* | * | ||
| Line 208: | Line 228: | ||
|Yue Gu | |Yue Gu | ||
|| | || | ||
| − | * | + | * A 0.4% CER reduction was achieved for one speaker, but no improvement was found for other speakers. I'm still running some experiments.
| + | * Restart the synthetic-data experiments and try to close the gap between synthetic and real data in the model's output distribution. | ||
|| | || | ||
* | * | ||
| Line 217: | Line 238: | ||
|Qi Qu | |Qi Qu | ||
|| | || | ||
| − | * | + | * Technical investigation on Visual Event Detection. |
| + | * Experiment on annotating and auditing audio with Audio LLM: insufficient VRAM; poor I/O in CPU/GPU hybrid mode. | ||
|| | || | ||
* | * | ||
Latest revision as of 10:59, 10 March 2025 (Mon)
| People | This Week | Next Week | Task Tracking (DeadLine) |
|---|---|---|---|
| Dong Wang |
|
|
|
| Lantian Li |
|
|
|
| Ying Shi |
|
|
|
| Zhenghai You |
|
|
|
| Junming Yuan |
|
|
|
| Xiaolou Li |
|
|
|
| Zehua Liu |
|
|
|
| Pengqi Li |
|
|
|
| Wan Lin |
|
|
|
| Tianhao Wang |
|
|
|
| Xiaoxue Luo |
|
|
|
| Zhenyu Zhou |
|
|
|
| Junhui Chen |
|
|
|
| Jiaying Wang |
|
|
|
| Yu Zhang |
|
|
|
| Wenqiang Du |
|
|
|
| Yang Wei |
|
|
|
| Turi |
|
|
|
| Yue Gu |
|
|
|
| Qi Qu |
|
|
|