"2024-09-23"
From cslt Wiki

Latest revision as of 10:58, 23 September 2024 (Mon)

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • AIGraph higher-education version
  • Prepare AIGraph Large Model version
  • NMI paper publication work
Lantian Li
  • AI-Graph EN (1/4)
  • Huawei Project Proposal v1.0
  • First Lesson on 24-fall AI Undergraduates
Ying Shi
  • Huawei project proposal
  • Optimize the text-enroll KWS code
    • Improve readability
    • Remove redundant code
Zhenghai You
  • Exploring the generality of speaker augmentation on different data and model structures [https://z1et6d3xtb.feishu.cn/docx/XXjPdjTjho7qwNxzEr7cTVjjnse]
Junming Yuan
  • Double-check mixed HuBERT code:
    • Fix some bugs (time-mask, etc.)
    • feat-mask vs. time-mask (Top-1 acc, EER): (27.98%, 23.17%) vs. (23.19%, 25.99%)
    • Switching from softmax+CE to sigmoid+BCE still has problems
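For context on the softmax+CE vs. sigmoid+BCE comparison above, here is a minimal NumPy sketch of the two losses on toy logits. This is illustrative only (assumed toy values), not the project's actual HuBERT training code:

```python
import numpy as np

# softmax + cross-entropy: classes compete, probabilities sum to 1
# (single-label). sigmoid + binary cross-entropy: each class is scored
# independently (multi-label), which changes the training signal.

def softmax_ce(logits, label):
    """Softmax + CE loss for a single correct class."""
    z = logits - logits.max()              # stabilize the exponentials
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

def sigmoid_bce(logits, targets):
    """Sigmoid + BCE loss with independent per-class targets."""
    p = 1.0 / (1.0 + np.exp(-logits))
    return -(targets * np.log(p) + (1 - targets) * np.log(1 - p)).mean()

logits = np.array([2.0, -1.0, 0.5])
ce = softmax_ce(logits, 0)                         # class 0 is correct
bce = sigmoid_bce(logits, np.array([1.0, 0.0, 0.0]))
```

Note that BCE averages over all classes, so even confident negatives contribute to the loss; that difference in gradient behavior is one common reason the two heads train differently.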
Xiaolou Li
  • Writing VTS documents
  • Paper Reading & Preparing for Report
  • Experiments on LRS3
    • LLM: LLaMA2 -> LLaMA3.1 (30h ↓0.4%)
    • Grouping LLaMA2: (443h ↑0.5%, 30h ↓2.5%)
  • Rethinking the method to inject information (ablation study first)
Zehua Liu
  • Finish VTS document with Xiaolou
  • Reorganize my previous code about VSR-LLM
  • Run some experiments (still training)
Pengqi Li
  • Implement TAO and LayerCAM on the verification task [https://z1et6d3xtb.feishu.cn/docx/EBB3dcGzioCEoaxh8vUchVPgn9c]
  • Evaluate its reliability
Wan Lin
  • VC2 pre-train; VB1+VC2 mix-tuning
    • Data filter in VB1: 1.25% EER in vox1-o
  • VB1 pre-train; VC2 fine-tuning
    • VB1 pre-train: 2.61% EER in vox1-o
    • VC2 fine-tuning: may not reach better performance
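A note on the EER figures quoted above (1.25% / 2.61% on vox1-o): EER is the operating point where the false-accept rate on impostor trials equals the false-reject rate on genuine trials. A minimal threshold-scan sketch on toy scores (assuming higher score means more likely same speaker):

```python
import numpy as np

def eer(genuine, impostor):
    """Equal error rate: scan candidate thresholds and return the
    point where false-accept and false-reject rates are closest."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, best = 1.0, 1.0
    for t in thresholds:
        far = np.mean(impostor >= t)       # impostors accepted
        frr = np.mean(genuine < t)         # genuines rejected
        if abs(far - frr) < best_gap:
            best_gap, best = abs(far - frr), (far + frr) / 2.0
    return best

genuine = np.array([0.9, 0.8, 0.7, 0.3])   # toy same-speaker scores
impostor = np.array([0.6, 0.4, 0.2, 0.1])  # toy different-speaker scores
rate = eer(genuine, impostor)              # 0.25 on this toy data
```

Production evaluation code usually interpolates the ROC curve rather than scanning discrete thresholds, but the definition is the same.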
Tianhao Wang
  • IS24 paper reading & weekly report
  • Sound separation project proposal
  • AudioSep reproduction
Zhenyu Zhou
  • Model quantization phase document
  • Paper reading
Junhui Chen
  • Some VB1 filter experiments with NS, as Wan Lin noted
  • Prepare VB1 test data and code, ready for VB2 training
Jiaying Wang
Yu Zhang
  • Dataset collection from THS
  • Retraining the R^2 SAC paper's model; with the same environment it still failed (TCN ACC: 0.708, RECALL: 0.183); will check with Han this week
  • Paper reading and some plan (report this Fri)
Wenqiang Du
  • Optimize primary school handbook (14/45)
  • Some of the company's work
    • Training of New Dialect Models
    • Project application
Yang Wei
  • Train text-enroll KWS model with AISHELL and KeSpeech data
  • Prepare live broadcast
Lily
  • Prepare for holiday courses (October 2nd and 3rd) and the online course
  • AI Radiance's daily work
Turi
  • Trained a Conformer on Sagalee data, excluding utterances containing digits
    • Achieved 21.28% WER, a 2.65-point WER reduction
  • Preparing KWS data from Sagalee dataset using MFA
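For reference on the WER numbers above (21.28%, a 2.65-point reduction): WER is word-level Levenshtein edit distance divided by reference length. A self-contained sketch (toy sentences; real scoring toolkits also report the substitution/deletion/insertion breakdown):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: (substitutions + deletions + insertions)
    divided by the number of reference words, via classic
    Levenshtein dynamic programming over words."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edits to turn the first i ref words into first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                        # i deletions
    for j in range(len(h) + 1):
        d[0][j] = j                        # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

score = wer("a b c d", "a x c")            # 1 sub + 1 del = 2/4 = 0.5
```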
Yue Gu
  • Paper writing
  • Open-source the code
  • Prepare for the presentation
Qi Qu
  • KWS:
    • Finding ideal thresholds for b0-models in predefined scenes: Chinese Mandarin, Cantonese, Uyghur and Kazakh.
    • Finding ideal thresholds for b6-models with fixed b0-model thresholds.
  • AED:
    • Fixing parameters of Fbank feature extraction for CED and retraining classifiers.
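On the KWS threshold-finding task above: one common selection criterion is to cap the false-alarm rate on negative (non-keyword) audio and take the corresponding score quantile as the threshold. This is a hypothetical sketch of that idea on toy scores; the actual criterion used for the b0/b6 models is not described in this report:

```python
import numpy as np

def threshold_for_fa(negative_scores, max_fa_rate):
    """Threshold whose false-alarm rate on negative scores stays at
    or below max_fa_rate (a detection fires when score >= threshold)."""
    return np.quantile(negative_scores, 1.0 - max_fa_rate)

neg = np.arange(100) / 100.0               # toy non-keyword scores
t = threshold_for_fa(neg, 0.05)            # cap false alarms at 5%
fa = np.mean(neg >= t)                     # observed false-alarm rate
```

Per-language and per-model thresholds (as in the Mandarin/Cantonese/Uyghur/Kazakh scenes listed) would simply repeat this on each model's own negative-score distribution.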