Difference between revisions of "2024-11-11"

Latest revision as of 11:05, 11 November 2024 (Mon)

People	This Week	Next Week	Task Tracking (DeadLine)
Dong Wang	Tianjian AI book (done)
Lantian Li	Complete all the scripts for the 2025 AI calendar; AI-Graph EN (32/50)
Ying Shi
Zhenghai You	Huawei project with IRA-TSE[1]
Junming Yuan	Re-check some details from the Cocktail HuBERT paper and prepare the code (pseudo-label preparation finished); paper reading
Xiaolou Li	Finish VTS documents with Zehua; process the CVS3 data; inherit the AV-HuBERT training code and debug
Zehua Liu	Finish 2 VTS documents with Xiaolou (financial document, technical document); paper reading last Friday
Pengqi Li	Analyze the distribution of phoneme importance (PID) in the TIMIT dataset based on more SOTA models (TDNN: 4.4%, ECAPA: 2.8%). Conclusions still need further analysis in conjunction with other databases.[2]
Wan Lin	NS: detection; clean: 1.479% EER vs. 1.239% EER; multi: in training
Tianhao Wang	ablation study about some new approach for sound separation [3]
Xiaoxue Luo	Paper reading to investigate new approaches for sound separation; retrain AudioSep with a DPRNN block (AudioSep-DP)
Zhenyu Zhou	Attempt to add silence loss during training (seems useless); Conditional Chain 2-5 mix results (still some bugs; speaker-count accuracy is poor)[4]
Junhui Chen	VAD frame-level detection loss (loss decreases faster in the early stages of training); change test encoder from resnet34 to transformer encoder (coding...)
Jiaying Wang
Yu Zhang	SocioDojo: single stock (TSLA) investment (still running); investigate some text-guided, LLM-centric time-series forecasters and reproduce some of them (Time-LLM, LLM-Process, AutoTimes), plus toy experiments on how the prompt prefix influences the forecast result
Wenqiang Du	Training of new language models (Cantonese); prepare the PPT for the competition
Yang Wei	Train text enroll KWS model with 7000h data
Lily
Turi	KWS data preparation and checking some implementations; paper reading about KWS
Yue Gu	Use the CosyVoice model to synthesize target-speaker utterances as a supplement for target speaker adaptation (the adaptation experiment is running); ICASSP 2025 paper review; paper writing
Qi Qu	KWS: Yi (Liangshan, Sichuan) test dataset annotated and finalized. Optimal thresholds for predefined scenes. Cloud model service deployed. Quantization for NPU with more calibration data (6k): mean_loss=1.3e-4, max_loss=6.2e-2. NPU demo: feature extraction + model inference. Text-enroll method: android demo benchmark.

@@ Line 6: / Line 6: @@
 |Dong Wang
 ||
-*
+* Tianjian AI book (done)
 ||
 *
@@ Line 17: / Line 17: @@
 |Lantian Li
 ||
-*
+* Complete all the script for the 2025 AI calendar
+* AI-Graph EN (32/50)
 ||
 *
@@ Line 39: / Line 40: @@
 |Zhenghai You
 ||
-*
+* Huawei project with IRA-TSE[https://z1et6d3xtb.feishu.cn/docx/R05DdrPVqoSzQYxNlhicedxenkd]
 ||
 *
@@ Line 49: / Line 50: @@
 |Junming Yuan
 ||
-*
+* re-check some details from Cocktail HuBERT paper and prepared the code.
+**pseudo-label preparation finished.
+* paper reading
 ||
 *
@@ Line 58: / Line 61: @@
 |-
-|Chen Chen
+|Xiaolou Li
 ||
-*
+* Finish VTS documents with Zehua
+* Process the CVS3 data
+* Inherit the AV-HuBERT training code and debug
 ||
 *
@@ Line 69: / Line 74: @@
 |-
-|Xiaolou Li
+|Zehua Liu
 ||
-*
+*Finish 2 VTS documents with Xiaolou
+**Financial Document
+**Technical Document
+*Paper Reading on last Friday
 ||
 *
@@ Line 80: / Line 88: @@
 |-
-|Zehua Liu
+|Pengqi Li
 ||
-*
+* Analyze the distribution of phoneme importance(PID) in the TIMIT dataset based on more SOTA models(TDNN 4.4% , ECAPA:2.8%).
+** Conclusions still need to be further analyzed in conjunction with other databases.[https://z1et6d3xtb.feishu.cn/docx/VtlIdFxdRodp8Nx8oQjcVLC4nCd]
 ||
 *
@@ Line 91: / Line 100: @@
 |-
-|Pengqi Li
+|Wan Lin
 ||
-*
+* NS: detection
+** clean: 1.479% EER vs. 1.239% EER
+** multi: in training
 ||
 *
@@ Line 102: / Line 113: @@
 |-
-|Wan Lin
+|Tianhao Wang
 ||
-*
+* ablation study about some new approach for sound separation [https://z1et6d3xtb.feishu.cn/docx/NLlsdyLtuoptjYxjcX0cwlVbnXc]
 ||
 *
@@ Line 113: / Line 124: @@
 |-
-|Tianhao Wang
+|Xiaoxue Luo
 ||
-*
+* paper reading to investigate some new approach for sound separation
+* retrain AudioSep with a DPRNN block(AudioSep-DP)
 ||
 *
@@ Line 126: / Line 138: @@
 |Zhenyu Zhou
 ||
-*
+*Attempt to add silence loss during training (seems useless)
+*Conditional Chain 2-5 mix results (still some bugs; speaker-count accuracy is poor)[https://z1et6d3xtb.feishu.cn/docx/D2UQdxMBvojkF9xCXGfcFBLGned]
 ||
 *
@@ Line 137: / Line 150: @@
 |Junhui Chen
 ||
-*
+* VAD frame level detection loss
+** Loss decreases faster in the early stages of training
+* Change test encoder: from resnet34 to transformer encoder (coding...)
 ||
 *
@@ Line 159: / Line 174: @@
 |Yu Zhang
 ||
-*
+* SocioDojo
+** Single stock (TSLA) investment (still running)
+* Investigate some text-guided, LLM-centric time-series forecasters and reproduce some of them (Time-LLM, LLM-Process, AutoTimes), plus toy experiments on how the prompt prefix influences the forecast result
 ||
 *
@@ Line 170: / Line 187: @@
 |Wenqiang Du
 ||
-*
+* Training of New language Models(Cantonese)
+* Prepare the PPT for the competition
 ||
 *
@@ Line 181: / Line 199: @@
 |Yang Wei
 ||
-*
+* Train text enroll KWS model with 7000h data
 ||
 *
@@ Line 201: / Line 219: @@
 |Turi
 ||
-*
+* kws data preparation and checking some implementations
+* Paper Reading about kws
 ||
 *
@@ Line 209: / Line 228: @@
 |Yue Gu
 ||
-*
+* use CosyVoice model to synthesize the target speaker utterance, which is employed as the supplement for target speaker adaptation. The adaptation exp is running.
+* icassp 2025 paper review
+* paper writing
 ||
 *
@@ Line 218: / Line 239: @@
 |Qi Qu
 ||
-*
+* KWS:
+** Yi (Liangshan, Sichuan) test dataset annotated and finalized. Optimal thresholds for predefined scenes. Cloud model service deployed.
+** Quantization for NPU with more calibration data (6k): mean_loss=1.3e-4, max_loss=6.2e-2.
+** NPU demo: feature extraction + model inference.
+** Text-enroll method: android demo benchmark.
 ||
 *
