2026-01-12

From cslt Wiki

Latest revision as of 10:58, 12 January 2026

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • Double-check the English version of the high-school handbook.
  • Check the English version of the middle-school handbook.
Lantian Li
  • Final review of my MLA book (6/10)
  • MoE daily work
Wenqiang Du
  • Completed recording of the high school AI courses (14/14)
  • Creating PPT slides for the high school AI course (3/14)
Yang Wei
  • Tested audio-visual speech separation + ASR (offline script) on 200 2-mix speech videos; CER: 20%.
  • Chinese mispronunciation detection experiment using Chinese HuBERT features (precision: 0.18, recall: 0.72).
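  For reference, a minimal sketch of how the reported precision/recall numbers can be computed from per-phone detection decisions (plain scikit-learn; the label arrays below are hypothetical placeholders, not the actual experiment outputs):

    # Sketch only: precision/recall for mispronunciation detection from
    # per-phone binary decisions (1 = mispronounced). Labels are placeholders.
    from sklearn.metrics import precision_score, recall_score

    y_true = [0, 1, 1, 0, 1, 0, 0, 1]   # reference annotations (hypothetical)
    y_pred = [1, 1, 0, 1, 1, 0, 1, 1]   # detector outputs (hypothetical)

    print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
    print("recall:", recall_score(y_true, y_pred))        # TP / (TP + FN)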
Ying Shi
  • Thesis
  • Some work on the HUAWEI project
Yue Gu
  • Revise the PhD thesis structure
  • Seminar: in an anonymous vote of 20 people, 75% chose free discussion.
Lily
  • Check the English version of the middle-school handbook
  • Check the English version of the high-school handbook
  • Organized course material production (separate primary, middle, and high school volumes)
Pengqi Li
  • Drafting the method section of the paper.
  • Identified bugs in the reproduction code. Re-ran experiments and confirmed that conclusions remain consistent.
  • Assisting with the revision of the middle school handbook.
Junming Yuan
  • Aug-MT-HuBERT:
    • Continued pre-training for 300K steps based on last week's best configuration.
      • No improvement was observed on clean-speech tasks.
    • After inspection, found that the pre-training had inherited a low learning rate from the previous model (already decayed after 1.6M steps).
      • After increasing the learning rate and retraining for 200K steps, there was still no improvement on clean-speech tasks (see the resume-with-reset sketch after this list).
      • At 200K steps: PR (PER) 8.14, ASR (WER) 8.93.
  • SS Adaptation:
    • Following the SA-WavLM strategy, further evaluated MT-HuBERT under low-resource settings (10% and 1% of the training data).
      • 10% data: Cocktail (11.29) > MT-HuBERT (11.07) > WavLM (10.81)
      • 1% data: Cocktail (8.56) > MT-HuBERT (8.43) > WavLM (8.11)
  • Draft paper writing (English version almost done)
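  Regarding the inherited low learning rate noted above, a minimal sketch (plain PyTorch rather than the actual fairseq/HuBERT recipe; the path, dimensions, and values are hypothetical) of resuming continued pre-training while keeping the weights and optimizer statistics but resetting the learning rate and schedule:

    import torch
    import torch.nn as nn

    model = nn.Linear(768, 768)                       # stand-in for the real encoder
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

    # Simulate the state after 1.6M steps: the decay schedule has pushed the lr
    # near its floor, and that value is stored in the checkpoint.
    for g in optimizer.param_groups:
        g["lr"] = 1e-6
    torch.save({"model": model.state_dict(), "optimizer": optimizer.state_dict()},
               "checkpoint_1600k.pt")                 # hypothetical checkpoint

    # Continued pre-training run: reload weights and Adam statistics ...
    ckpt = torch.load("checkpoint_1600k.pt", map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])      # ... which also restores the decayed lr,
    for g in optimizer.param_groups:                  # so override it explicitly
        g["lr"] = 5e-4                                # hypothetical new value

    # Build a fresh schedule over the new 200K-step horizon instead of restoring
    # the exhausted one from the old run.
    scheduler = torch.optim.lr_scheduler.LinearLR(
        optimizer, start_factor=1.0, end_factor=0.05, total_iters=200_000)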
Yu Zhang
  • GPU Util: [1]
  • Finished the final exam
  • Writing code to analyze LLM Swarm metrics, mainly looking at how ECS/PKS correlate with the optimized edge probability.
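  As a reference for the metric-correlation analysis mentioned above, a minimal sketch (not the actual analysis code; the ECS/PKS values and edge probabilities are replaced by random placeholders):

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    rng = np.random.default_rng(0)
    metric = rng.random(200)       # placeholder for a per-edge ECS or PKS value
    edge_prob = rng.random(200)    # placeholder for the optimized edge probability

    r, p = pearsonr(metric, edge_prob)            # linear correlation
    rho, p_rank = spearmanr(metric, edge_prob)    # rank (monotonic) correlation
    print(f"Pearson r={r:.3f} (p={p:.3g}); Spearman rho={rho:.3f} (p={p_rank:.3g})")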
Junhui Chen
  • Finished the final exam
  • Debugged code for the swarm minicrossword test
Xiaolou Li
Jiaying Wang
  • Speaker model training (130/300): reconstructed the data; the training data is now aligned with the separation data
    • recall@k at epoch 130: 2-mix recall@2 = 0.9799, 3-mix recall@3 = 0.9101, 4-mix recall@4 = 0.8247
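  For reference, a minimal sketch of one common recall@k definition consistent with numbers like those above: the average fraction of a mixture's true speakers retrieved among the top-k candidates ranked by similarity (the score matrix and labels below are random placeholders, and the project's exact definition may differ):

    import numpy as np

    def recall_at_k(scores, true_ids, k):
        """scores: [num_mixtures, num_candidates] similarity matrix;
        true_ids: list of sets of ground-truth speaker indices per mixture."""
        hits, total = 0, 0
        for mix_scores, truth in zip(scores, true_ids):
            topk = set(np.argsort(mix_scores)[::-1][:k])   # top-k candidate indices
            hits += len(topk & truth)
            total += len(truth)
        return hits / total

    rng = np.random.default_rng(0)
    scores = rng.random((5, 100))                          # 5 mixtures, 100 candidates
    true_ids = [{1, 7}, {3, 4}, {0, 9}, {5, 6}, {2, 8}]    # 2-mix ground truth (placeholder)
    print("recall@2 =", round(recall_at_k(scores, true_ids, k=2), 4))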
Tianhao Wang
  • Revised the ChainSep paper
Xiaoxue Luo
  • 2-5 mix multi-head separation model for the Huawei project
    • Wrote code for the multi-speaker and multi-sound-event separation task; completed data preparation and feature extraction
    • Adjusted the model structure to three heads (speech, music, and others); the model is still training, and the current val_sisdr is 11.38
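  To illustrate the three-head structure mentioned above, a minimal sketch (an assumed layout, not the project's actual model: a shared separator backbone feeding one mask head each for speech, music, and others; all dimensions are placeholders):

    import torch
    import torch.nn as nn

    class ThreeHeadSeparator(nn.Module):
        def __init__(self, feat_dim=256):
            super().__init__()
            self.body = nn.Sequential(            # stand-in for the real separator backbone
                nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            )
            # One mask head per target stream.
            self.heads = nn.ModuleDict({
                name: nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.Sigmoid())
                for name in ("speech", "music", "others")
            })

        def forward(self, feats):
            h = self.body(feats)                  # shared representation
            # Each head predicts a mask applied to the mixture features.
            return {name: head(h) * feats for name, head in self.heads.items()}

    model = ThreeHeadSeparator()
    mix = torch.randn(4, 100, 256)                # (batch, frames, feat_dim) placeholder
    outputs = model(mix)
    print({name: tuple(t.shape) for name, t in outputs.items()})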
Bochao Hu
  • Finished the final exam
  • Debugged and trained the VSR E2E model; still in training
  • Recent results: [https://z1et6d3xtb.feishu.cn/wiki/MQS9wt7gYikboBkJ09ScovGxnmg]
Hongcheng Zhang
  • Finished the final exam
  • Debugged the asu-llm code and trained the audio captioning task on WavCaps (1/20 epochs)