| People | This Week | Next Week | Task Tracking (Deadline) |
|
| Dong Wang
|
- AI handbook, higher-education version; experiment booklet
- Check the AI primary school handbook (1-20)
|
|
|
| Lantian Li
|
|
|
|
| Ying Shi
|
- Finish text-enroll keyword spotting code & documentation and deliver to Wei & Du
- Cohort overlap ASR code v0.0
  - code is finished and training is done
- Cohort speech separation code v0.0
  - code is finished; training is in progress
|
|
|
| Zhenghai You
|
- Exploring the role of speaker encoder in TSE and generality of SPK-AUG[1]
|
|
|
| Junming Yuan
|
- MT-HuBERT exp[2]:
  - codebook set + infoNCE ---> FC+softmax+CE / FC+sigmoid+BCE
  - Reducing the learning rate works.
  - verified the feat-mask MT-HuBERT with different learning rates
  - time-mask MT-HuBERT verification (in progress)
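The head swap noted above (a codebook set with an infoNCE objective replaced by a plain FC+softmax+CE or FC+sigmoid+BCE head) can be sketched with toy NumPy losses. The dimensions, weights, and targets below are made-up placeholders for illustration, not the actual MT-HuBERT code:

```python
import numpy as np

rng = np.random.default_rng(0)
D, C = 16, 8                            # toy feature dim and number of codebook classes
feat = rng.standard_normal(D)           # one masked-frame feature from the encoder
W = rng.standard_normal((C, D)) * 0.1   # hypothetical FC head weights
logits = W @ feat

# Variant 1: FC + softmax + CE against a single codebook target.
target = 3
log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
ce_loss = -log_probs[target]

# Variant 2: FC + sigmoid + BCE against a multi-hot target.
multi_hot = np.zeros(C)
multi_hot[target] = 1.0
probs = 1.0 / (1.0 + np.exp(-logits))   # element-wise sigmoid
bce_loss = -np.mean(multi_hot * np.log(probs)
                    + (1 - multi_hot) * np.log(1 - probs))

print(f"CE loss = {ce_loss:.4f}, BCE loss = {bce_loss:.4f}")
```

Either head replaces the contrastive codebook lookup with an ordinary classification layer, which is what makes the learning-rate sensitivity worth re-checking.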
|
|
|
| Chen Chen
|
|
|
|
| Xiaolou Li
|
|
|
|
| Zehua Liu
|
- AV-HuBERT (frozen) as encoder performs very poorly (CER: 80%)[3]
  - may improve after fine-tuning, but still poor
- Qwen-14B performs better (47%) than Qwen-7B (50%)
- Finished the in-context learning code; training is running
  - results expected very soon
|
- Verify the collected data with Xiaolou
- Finish the VTS data acceptance report
|
|
| Pengqi Li
|
- Evaluate the reliability of TAO and LayerCAM (verification).
- Explore the consistency of TAO and LayerCAM results across different models and datasets.
|
|
|
| Wan Lin
|
|
|
|
| Tianhao Wang
|
- CLIPSep experiments for 2-mix and 5-mix
  - 2-mix (whole VGGSound, 300 classes): SDR-mix = -1.1748, SDR-separate = 5.0145
  - 5-mix (50 classes of VGGSound): SDR-mix = -11.4529, SDR-separate = -0.4764
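The SDR-mix vs. SDR-separate comparison above can be illustrated with a minimal sketch. The signals here are toy stand-ins (not VGGSound), and the simple non-permutation-invariant SDR definition is assumed:

```python
import numpy as np

def sdr(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Signal-to-distortion ratio in dB (simple, non-permutation-invariant form)."""
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(noise**2))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 8000)
source = np.sin(2 * np.pi * 440 * t)              # target source (toy tone)
interference = 0.5 * rng.standard_normal(t.size)  # competing source
mixture = source + interference

sdr_mix = sdr(source, mixture)            # score of the unprocessed mixture
estimate = source + 0.05 * interference   # mock separator output (residual interference)
sdr_sep = sdr(source, estimate)           # score after "separation"
print(f"SDR-mix = {sdr_mix:.2f} dB, SDR-separate = {sdr_sep:.2f} dB")
```

SDR-mix is the baseline score of the raw mixture against the target, so the SDR-separate minus SDR-mix gap is the improvement attributable to the separator, which is the comparison the numbers above are making.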
|
|
|
| Xiaoxue Luo
|
- Paper reading on sound separation
- AudioSep reproduction
  - Training time is too long -> replaced with a small dataset (training in progress)
|
|
|
| Zhenyu Zhou
|
- Model quantization, version 2
- Multi-talker mixed-data preparation
|
|
|
| Junhui Chen
|
|
|
|
| Jiaying Wang
|
|
|
|
| Yu Zhang
|
- SocioDojo Llama version
  - news integration is adjusted to once every 12 hours
  - Wikipedia & Google search are disabled
|
|
|
| Wenqiang Du
|
- Check the data from previously trained models and update the KWS model again (model testing)
  - Chinese, Cantonese, Minnan, Haining, and Uyghur
|
|
|
| Yang Wei
|
- Train the text-enroll KWS model with the updated code (in progress)
|
|
|
| Lily
|
|
|
|
| Turi
|
- Whisper model fine-tuning[4]
|
|
|
| Yue Gu
|
- Revise the TASLP paper
- Read several papers on accent and prosody
|
|
|
| Qi Qu
|
- AED: classifiers retrained with the new method (suppression of negative stimuli); improvement confirmed.
|
|
|