| People |
This Week |
Next Week |
Task Tracking (Deadline)
|
| Dong Wang
|
|
|
|
| Lantian Li
|
|
|
|
| Ying Shi
|
- Revise the code for cohort-overlap ASR (training in progress)
- Support training with arbitrary source mixing
- Use the real hypothesis, scored by token error rate
|
|
|
| Zhenghai You
|
- Introduce more hard samples to improve model performance [1]
- SPK-AUG with the same length: there is an improvement, but SI-SDR decreases as the hard-sample rate increases
- Design more hard samples
|
|
|
| Junming Yuan
|
- Results of time-mask MT-HuBERT [2]
|
|
|
| Chen Chen
|
|
|
|
| Xiaolou Li
|
|
|
|
| Zehua Liu
|
- Read papers about in-context learning in ASR
- Train the model with adaptive time mask
- Try in-context learning with only the previous sentence [3]
- Start the VTS project report
|
|
|
| Pengqi Li
|
- Consistency of TAO and LayerCAM
- Change TAO from the input to the final conv layer and obtain more consistency (AISHELL: 0.93 in every model)
|
|
|
| Wan Lin
|
|
|
|
| Tianhao Wang
|
- AudioSep (CLAP) 5-mix exps:
- text-query: SDR=4.978, SI-SDR=1.972
- audio-query: SDR=6.907, SI-SDR=5.058
- These results are with the loudness limitation
|
- AudioSep (CLAP) without loudness limitation
- Project things
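For reference, the SI-SDR metric reported above can be computed with a small function. This is a minimal sketch in plain PyTorch (zero-mean, 1-D signals assumed), not the exact evaluation code used in these experiments:

```python
import torch

def si_sdr(est, ref, eps=1e-8):
    """Scale-invariant SDR (in dB) between an estimate and a reference signal."""
    est = est - est.mean()
    ref = ref - ref.mean()
    # Project the estimate onto the reference (optimal scaling of the target).
    target = (torch.dot(est, ref) / (torch.dot(ref, ref) + eps)) * ref
    noise = est - target
    return 10 * torch.log10(torch.dot(target, target) / (torch.dot(noise, noise) + eps))
```

Because of the projection step, rescaling the estimate leaves the score essentially unchanged, which is what distinguishes SI-SDR from plain SDR.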
|
|
| Xiaoxue Luo
|
- Comparative experiment between AudioSep and the baseline system (CLIPSep)
- Prepare the report
|
|
|
| Zhenyu Zhou
|
- Reproduce 5-mix speech separation results:
- PIT: 2-mix: 16.04; 5-mix: 6.87
- Conditional: 5-mix: 5.38 (40 epochs)
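As a reference for the PIT numbers above, a permutation-invariant loss can be sketched as follows. This is a minimal MSE-based illustration; the actual experiments likely optimize an SI-SNR-style objective, and for 5-mix this brute-force version enumerates 5! = 120 permutations:

```python
from itertools import permutations

import torch

def pit_mse(est, ref):
    """Permutation-invariant MSE: try every speaker assignment and
    keep the lowest loss. est, ref: (num_spk, T) tensors."""
    n = est.shape[0]
    losses = []
    for perm in permutations(range(n)):
        # Mean pairwise loss under this particular speaker assignment.
        pair_losses = [((est[i] - ref[p]) ** 2).mean() for i, p in enumerate(perm)]
        losses.append(torch.stack(pair_losses).mean())
    return torch.stack(losses).min()
```

The minimum over permutations is what makes training insensitive to the arbitrary ordering of output channels.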
|
|
|
| Junhui Chen
|
|
|
|
| Jiaying Wang
|
|
|
|
| Yu Zhang
|
- SocioDojo (still worse than Nasdaq100 baseline)
- Changed information sources; judging from the reports generated by the LLM, more new information sources are now being referenced.
- Prompt the Actuator to consider the current cash ratio before investing (without this, the asset ratio rises to 100%, which leads to high risk; still running)
- Read some papers about integrating time series into LLM
|
|
|
| Wenqiang Du
|
- Prepare data, code, and environment for Prof. Mijiti
|
|
|
| Yang Wei
|
- Train the text-enroll KWS model with Aibabel training data. It did not work.
|
|
|
| Lily
|
|
|
|
| Turi
|
- Whisper-large-v3 fine-tuning
- Freezing 20 encoder layers achieved 9.75 WER; vanilla fine-tuning achieved 8.02 WER
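Freezing part of the encoder, as described above, can be sketched like this. The 32-layer `ModuleList` here is a toy stand-in for the Whisper-large-v3 encoder (the real model would come from a library such as Hugging Face transformers); only the freezing pattern is the point:

```python
import torch.nn as nn

# Toy stand-in for a 32-layer Whisper encoder (layer type and sizes are
# placeholders; Whisper-large-v3 actually uses transformer blocks).
encoder_layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(32)])

def freeze_first_n(layers, n):
    """Disable gradient updates for the first n layers; the rest stay trainable."""
    for layer in list(layers)[:n]:
        for p in layer.parameters():
            p.requires_grad = False

freeze_first_n(encoder_layers, 20)
```

With the first 20 layers frozen, the optimizer only updates the remaining 12 encoder layers (plus the decoder, in the full model), which reduces memory and compute at some cost in WER, as the numbers above show.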
|
|
|
| Yue Gu
|
- Sought suggestions from the other authors. Many suggestions conflict, so I'm trying to figure out the reasons and fix these issues.
|
|
|
| Qi Qu
|
- KWS:
- Text-enroll models exported to ONNX.
- C/JNI libs built on the ONNX models and ready for on-device testing.
|
|
|