Difference between revisions of "2024-11-11"

From cslt Wiki

Latest revision as of 11:05, 11 November 2024 (Mon)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • Tianjian AI book (done)
Lantian Li
  • Complete all the scripts for the 2025 AI calendar
  • AI-Graph EN (32/50)
Ying Shi
Zhenghai You
  • Huawei project with IRA-TSE [https://z1et6d3xtb.feishu.cn/docx/R05DdrPVqoSzQYxNlhicedxenkd]
Junming Yuan
  • Re-checked some details of the Cocktail HuBERT paper and prepared the code.
    • pseudo-label preparation finished.
  • paper reading
Xiaolou Li
  • Finish VTS documents with Zehua
  • Process the CVS3 data
  • Inherit the AV-HuBERT training code and debug it
Zehua Liu
  • Finish 2 VTS documents with Xiaolou
    • Financial Document
    • Technical Document
  • Paper reading last Friday
Pengqi Li
  • Analyze the distribution of phoneme importance (PID) in the TIMIT dataset based on more SOTA models (TDNN: 4.4%, ECAPA: 2.8%).
    • Conclusions still need further analysis in conjunction with other databases. [https://z1et6d3xtb.feishu.cn/docx/VtlIdFxdRodp8Nx8oQjcVLC4nCd]
Wan Lin
  • NS: detection
    • clean: 1.479% EER vs. 1.239% EER
    • multi: in training
Tianhao Wang
  • Ablation study of some new approaches for sound separation [https://z1et6d3xtb.feishu.cn/docx/NLlsdyLtuoptjYxjcX0cwlVbnXc]
Xiaoxue Luo
  • Paper reading to investigate some new approaches for sound separation
  • Retrain AudioSep with a DPRNN block (AudioSep-DP)
Zhenyu Zhou
  • Attempted to add a silence loss during training (it seems to be of no use)
  • Conditional Chain 2-5 mix results (still some bugs; the speaker-count accuracy is poor) [https://z1et6d3xtb.feishu.cn/docx/D2UQdxMBvojkF9xCXGfcFBLGned]
Junhui Chen
  • VAD frame-level detection loss
    • Loss decreases faster in the early stages of training
  • Change test encoder: from ResNet34 to a Transformer encoder (coding...)
Jiaying Wang
Yu Zhang
  • SocioDojo
    • Single stock (TSLA) investment (still running)
  • Investigate some text-guided, LLM-centric time-series forecasters and reproduce some of them (Time-LLM, LLM-Process, AutoTimes); run some toy experiments on how the prompt prefix influences the forecast result
Wenqiang Du
  • Training of new language models (Cantonese)
  • Prepare the PPT for the competition
Yang Wei
  • Train text-enroll KWS model with 7000 h of data
Lily
Turi
  • KWS data preparation and checking some implementations
  • Paper reading about KWS
Yue Gu
  • Use the CosyVoice model to synthesize target-speaker utterances, which serve as supplementary data for target-speaker adaptation. The adaptation experiment is running.
  • ICASSP 2025 paper review
  • paper writing
Qi Qu
  • KWS:
    • Yi (Liangshan, Sichuan) test dataset annotated and finalized. Optimal thresholds for predefined scenes. Cloud model service deployed.
    • Quantization for NPU with more calibration data (6k): mean_loss=1.3e-4, max_loss=6.2e-2.
    • NPU demo: feature extraction + model inference.
    • Text-enroll method: Android demo benchmark.
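
Note on the EER figures in Wan Lin's entry: for readers unfamiliar with the metric, the following is a minimal sketch of how an equal error rate can be computed from trial scores. It is not the scoring pipeline used for the numbers above. The function name `compute_eer` and the score convention (a higher score means "more likely a target trial") are assumptions made for illustration only.

```python
def compute_eer(target_scores, nontarget_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumption: higher score = more likely a target trial.
    This is a simple O(n^2) threshold sweep, meant for illustration,
    not for large trial lists.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = best_eer = best_thr = None
    # Every observed score is a candidate decision threshold.
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance rate: non-target trials accepted at this threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # False rejection rate: target trials rejected at this threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2, thr
    return best_eer, best_thr


if __name__ == "__main__":
    # Toy scores, invented purely for the example.
    targets = [0.9, 0.8, 0.7, 0.4]
    nontargets = [0.6, 0.3, 0.2, 0.1]
    eer, thr = compute_eer(targets, nontargets)
    print(f"EER = {eer:.3%} at threshold {thr}")
```

With a finite trial list FAR and FRR rarely cross exactly, so the sketch reports the average of the two at the closest point; toolkits often interpolate instead, which can give slightly different values.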