People |
This Week |
Next Week |
Task Tracking (Deadline)
|
Dong Wang
|
|
|
|
Lantian Li
|
|
|
|
Ying Shi
|
- Revise the cohort-overlap ASR code [training is in progress]
- Support arbitrary source mixing training
- Use the real hypothesis by Token error rate
- Design stop criterion
|
|
|
Zhenghai You
|
- Introduce more hard samples to improve model performance [1]
- SPK-AUG with same length: performance improves, but SI-SDR decreases as the hard-sample rate increases
- Design more hard samples
|
|
|
Junming Yuan
|
- The result of time-mask MT-HuBERT [2]
|
|
|
Chen Chen
|
|
|
|
Xiaolou Li
|
- VTS with LLM structure design and baseline code writing [3]
|
|
|
Zehua Liu
|
- Read papers about in-context learning in ASR
- Train the model with Adaptive Time Mask
- Try in-context learning with only the previous sentence [4]
- Started the VTS project report
|
|
|
Pengqi Li
|
- Consistency of TAO and LayerCAM
- Moving TAO from the input to the final conv layer gives higher consistency (Aishell: 0.93 for every model)
|
|
|
Wan Lin
|
- NS: downsampling is not useful
- Share at the speaker meeting on Friday
|
|
|
Tianhao Wang
|
- AudioSep (CLAP) 5-mix exps:
- text-query: SDR=4.978, SI-SDR=1.972
- audio-query: SDR=6.907, SI-SDR=5.058
- These results are with the loudness limitation
|
- AudioSep (CLAP) without loudness limitation
- Project things
|
|
Xiaoxue Luo
|
- Comparative experiment between AudioSep and the baseline system (CLIPSep)
- Prepare the report
|
|
|
Zhenyu Zhou
|
- Reproduce 5-mix speech separation results:
- PIT: 2-mix: 16.04; 5-mix: 6.87
- Conditional: 5-mix: 5.38 (40 epochs)
|
|
|
Junhui Chen
|
- NS: speaker detection (method survey and debugging)
- Got sick
|
|
|
Jiaying Wang
|
|
|
|
Yu Zhang
|
- SocioDojo (still worse than Nasdaq100 baseline)
- Changed the information sources; judging from the reports generated by the LLM, more new information sources are now referenced.
- Prompt the Actuator to consider the current cash ratio before investing (without this, the asset ratio goes up to 100%, which leads to high risk; still running)
- Read some papers about integrating time series into LLMs
|
|
|
Wenqiang Du
|
- Prepare data, code, and environment for Prof. Mijiti
|
|
|
Yang Wei
|
- Trained a text-enroll KWS model with Aibabel training data; it did not work.
|
|
|
Lily
|
|
|
|
Turi
|
- Whisper-large-v3 fine-tuning
- Freezing 20 encoder layers achieved 9.75 WER; vanilla fine-tuning achieved 8.02 WER
|
|
|
Yue Gu
|
- Sought suggestions from the other authors. Many of the suggestions conflict, so I'm trying to figure out the reasons and fix these issues.
|
|
|
Qi Qu
|
- KWS:
- Text-enroll models exported to ONNX.
- C/JNI libs built based on ONNX models and ready for on-device test.
|
|
|