Difference between revisions of "2024-11-25"
From cslt Wiki
| (8 intermediate revisions by 8 users not shown) | |||
| Line 34: | Line 34: | ||
|| | || | ||
* Design cohort-conditional chain multi-talker ASR with round-RNN | * Design cohort-conditional chain multi-talker ASR with round-RNN | ||
| − | ** WER result : round-1 32.15% , round-2: 69.69 round-3: 92.33% | + | ** WER result : round-1 32.15% , round-2: 69.69% round-3: 92.33% |
** On the 500-utterance sub-test set, only 28% of the sentences have a recognition order that matches the order given by cosine distance. | ** On the 500-utterance sub-test set, only 28% of the sentences have a recognition order that matches the order given by cosine distance. | ||
* Prepare for Huawei's interview. | * Prepare for Huawei's interview. | ||
| Line 80: | Line 80: | ||
|Xiaolou Li | |Xiaolou Li | ||
|| | || | ||
| − | * | + | * Data processing |
| + | ** 1/4 of CVS3 has been cut from the original video and is waiting for pre-processing | ||
| + | ** Copying pre-processed GongAn video data from gonganbu | ||
| + | * VSR contrastive loss experiment | ||
| + | ** Inspired by paper [https://arxiv.org/abs/2408.11813] | ||
| + | ** Main idea: to better align the visual features with the LLM input, compute the cosine similarity between the target and video features, and take the pair with the highest similarity as the positive pair. | ||
| + | ** Result: training in progress | ||
| + | * Paper Reading | ||
|| | || | ||
* | * | ||
| Line 117: | Line 124: | ||
|Wan Lin | |Wan Lin | ||
|| | || | ||
| − | * | + | * NS: all transformer |
| + | ** 6k spk: EER 2.6% | ||
| + | ** 20k spk: EER 2.3% | ||
| + | ** 20k spk+multi-enroll: EER 1.9% | ||
|| | || | ||
* | * | ||
| Line 127: | Line 137: | ||
|- | |- | ||
|Tianhao Wang | |Tianhao Wang | ||
| + | || | ||
| + | * Experiments on query-embedding conditioning approaches: | ||
| + | ** SDR: FiLM (7.492) > self-attention (6.573) | ||
|| | || | ||
* | * | ||
| + | || | ||
| + | * | ||
| + | |- | ||
| + | |||
| + | |||
| + | |- | ||
| + | |Xiaoxue Luo | ||
| + | || | ||
| + | * Training the USS (CED+AudioSep) model | ||
| + | ** Adjusted the audio format to meet the model's requirements (training in progress) | ||
| + | * Production of the 2025 Daily Sign (March) | ||
|| | || | ||
* | * | ||
| Line 151: | Line 175: | ||
|Junhui Chen | |Junhui Chen | ||
|| | || | ||
| − | * | + | * Read papers (the ICCIP keynote speech paper and some others) |
| + | * NS | ||
| + | ** Some tests on the transformer feature extractor | ||
|| | || | ||
* | * | ||
| Line 200: | Line 226: | ||
|Yang Wei | |Yang Wei | ||
|| | || | ||
| − | * | + | * Fixed some bugs in keyword sampling in the text-enroll KWS training code. |
| + | * Added spec augmentation to text-enroll KWS training. | ||
|| | || | ||
* | * | ||
| Line 229: | Line 256: | ||
|Yue Gu | |Yue Gu | ||
|| | || | ||
| − | * | + | * Synthesized about 1 h of data for each target speaker, then used the data to train the adapter module. [https://z1et6d3xtb.feishu.cn/wiki/VPZfwx53ei2zkgkSvPtcCiDSnVh?from=from_copylink] |
| + | * Writing the TASLP paper | ||
|| | || | ||
* | * | ||
Latest revision as of 11:04, 25 November 2024 (Mon)
| People | This Week | Next Week | Task Tracking (DeadLine) |
|---|---|---|---|
| Dong Wang | | | |
| Lantian Li | | | |
| Ying Shi | | | |
| Zhenghai You | | | |
| Junming Yuan | | | |
| Chen Chen | | | |
| Xiaolou Li | | | |
| Zehua Liu | | | |
| Pengqi Li | | | |
| Wan Lin | | | |
| Tianhao Wang | | | |
| Xiaoxue Luo | | | |
| Zhenyu Zhou | | | |
| Junhui Chen | | | |
| Jiaying Wang | | | |
| Yu Zhang | | | |
| Wenqiang Du | | | |
| Yang Wei | | | |
| Lily | | | |
| Turi | | | |
| Yue Gu | | | |
| Qi Qu | | | |
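
The "main idea" noted for the VSR contrastive loss experiment (compute the cosine similarity between target and video features and treat the most similar pair as the positive) can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the experiment: the function names, the InfoNCE-style loss, the temperature of 0.1, and the toy vectors are all hypothetical, and the direction of the match (one positive target per video feature) is assumed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pick_positives(video_feats, target_feats):
    """For each video feature, return the index of the most similar target.

    Assumption: the positive pair is chosen per video feature; the report
    does not say in which direction the maximum is taken.
    """
    sims = [[cosine(v, t) for t in target_feats] for v in video_feats]
    positives = [max(range(len(row)), key=row.__getitem__) for row in sims]
    return sims, positives


def contrastive_loss(sims, positives, temperature=0.1):
    """InfoNCE-style loss (an assumed form): the chosen positive competes
    against all other targets, which act as negatives."""
    total = 0.0
    for row, pos in zip(sims, positives):
        logits = [s / temperature for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[pos]
    return total / len(sims)


# Toy data: two video features and three candidate target embeddings.
video = [[1.0, 0.0], [0.0, 1.0]]
target = [[0.9, 0.1], [0.2, 0.8], [-1.0, 0.0]]

sims, positives = pick_positives(video, target)
loss = contrastive_loss(sims, positives)
print(positives, loss)
```

In a real training setup the same selection would be done on batched tensors, and the loss would be back-propagated through the visual encoder; the pure-Python version above only shows the selection and scoring logic.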