People |
This Week |
Next Week |
Task Tracking (Deadline)
|
Dong Wang
|
- Revise the college version of the AI textbook
|
|
|
Lantian Li
|
- Submit the high school textbook
- Proofreading of the EN book (3/4)
|
|
|
Ying Shi
|
- Compare Ascend and Nvidia
- Performance: clean ASR task, 20 epochs, WER 6.91% vs. 7.02% (Ascend vs. Nvidia)
- Speed: Nvidia is about twice as fast as Ascend
- Start thinking about my thesis
|
|
|
Zhenghai You
|
- Training IRA TSE for the noisy-enrollment scenario [1]
|
|
|
Junming Yuan
|
- Pretraining work:
- MT-HuBERT & Cocktail-HuBERT will be finished next week.
- Obtained a set of comparable fine-tuning results (15/5/3-shot) for each pretrained model at the 400K training step [2].
- Check and add references for the AI junior high school handbook (1/2). (Done)
|
|
|
Xiaolou Li
|
- Writing NFSC document
- VSR training (1500 h) already has some results:
- cnvsrc-single valid 300: 29.47%
- cnvsrc-multi valid: 31.60%
- webVideo valid: 15.54%
- Finished producing pseudo-labels for CVS3 (4000 h)
|
|
|
Zehua Liu
|
- Writing NFSC document
- LoRA fine-tuning of the VLM (both encoder and LLM decoder) does not yet give good results (may need parameter adjustment)
- Pretrained VSR encoder + VLM (decoder) seems better than a normal LM
|
- Design VTS architecture and implement it
|
|
Pengqi Li
|
- Prepare the AI course for Tsinghua University Junior High School.
- Add references to the handbook (junior high school version, 1/2). (Done)
|
|
|
Wan Lin
|
- Add supplementary NS experiments [3]
- Help Xiaochen reproduce the diarization SV method
|
|
|
Tianhao Wang
|
- 3-mix training: CLAPSep baseline: SDR=5.560; Ours: SDR=6.574.
- Subset data training (in progress)
|
|
|
Xiaoxue Luo
|
|
|
|
Zhenyu Zhou
|
|
|
|
Junhui Chen
|
- Speaker diarization baseline for NS (mix test: baseline EER 15.972% -> 12.983%); other tests are still running.
- Make a PPT about the scaling law on speaker volume.
|
|
|
Jiaying Wang
|
|
|
|
Yu Zhang
|
- Multi-Agent Investment
- Use the top 31 stocks across 11 sectors to build the portfolio, for better correlation with the input news (no excess return)
- Analyze the trading decisions
- Huawei AED
- Find the smallest model that keeps AUC above 0.9
- Split inference into two phases (Phase 1: human voice vs. non-human voice; Phase 2: speech vs. other human voice) with two smaller models
|
|
|
Wenqiang Du
|
- Check Primary handbook V3.0 (Done)
|
|
|
Yang Wei
|
- Adapt the text-enrollment KWS model with synthesized dialect data (recall: 83% -> 94%) [4]
|
|
|
Turi
|
- Fine-tuned Llama3 on Oromo text (pretrain)
- The experiment using it as the LM for ASR failed (WER above 100%)
|
|
|
Yue Gu
|
- A 0.4% CER reduction has been achieved for one speaker, but no improvement was found for the other speakers. I am still running some experiments.
- Restart the synthetic-data experiments, trying to close the gap between synthetic and real data in the model's output distribution.
|
|
|
Qi Qu
|
- Technical investigation on Visual Event Detection.
- Experiment on annotating and auditing audio with Audio LLM: insufficient VRAM; poor I/O in CPU/GPU hybrid mode.
|
|
|