2024-10-21

From cslt Wiki
Revision as of 11:01, 21 October 2024 (Mon) by Chenjh

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • Primary School AI handbook (20-30)
Lantian Li
  • AI-Graph EN (25/50)
  • Complete CSTR intro report (11.18)
Ying Shi
  • Cohort-Overlap ASR
    • condition on real decode result
    • Design stop criterion
  • Cohort-Speech separation
    • several configs for Dual-path model
  • group work
Zhenghai You
  • Weekly report
Junming Yuan
  • The result of feat-mask/time-mask MT-HuBERT [1]
Xiaolou Li
  • AVHuBERT unit exp
    • dc connector (↑0.8% vs. discrete unit)
    • concat feature and embedding (↑2% vs. discrete unit, ↓0.3% vs. baseline)
  • CVS3 quality check (30h in total) [2]
  • This work was helped by Zehua, Linwan, Tianhao
  • MLLM system with audio output design
Zehua Liu
  • Verify VSR data
  • Finish Data Verification Report
  • ICL work (CER: 47.87% < CER: 51.08%)
  • Time Mask matters [3]
Pengqi Li
  • Complete the final report of the doctoral innovation project (School)
  • Exploring the consistency of TAO and LayerCAM results on different models and datasets
    • Conclusion and hypothesis [4]
Wan Lin
  • Help with VSR data verification
  • Experiments on VoxBlink2 [5]
Tianhao Wang
  • adjust the code of AudioSep (CLAP) to support multi-mix and audio-query (in training)
  • some project testing
Xiaoxue Luo
  • AudioSep reproduction
    • evaluate the performance of AudioSep
    • comparative experiment between AudioSep and the baseline system (CLIPSep)
      • adjusting the code
Zhenyu Zhou
  • conditional chain 2-mix results reproduction (SI-SDR: 10.714 -> 15.6)
  • model quantization final version submission
Junhui Chen
  • Experiments for NS
  • Look for a speaker detection model with ResNet34 for frame labels
Jiaying Wang
Yu Zhang
  • SocioDojo Llama 3.1 8B investment task
    • accumulated return is about 10% below the Nasdaq-100 index
  • add more professional information sources, such as WSJ (the current source is Tweets Trending, which is too entertainment-oriented)
  • control the BUY/SELL amount of the Actuator (the current investment ratio is too high)
  • reproduce other multi-agent investment pipelines such as FinAgent or FinRobot
Wenqiang Du
  • Participated in an AI competition
Yang Wei
  • Train a text-enrollment KWS model and test it with Aibabel dialect data.
Lily
Turi
  • Whisper finetuning on sagalee
    • with encoder frozen, whisper-large-v3 (20.5 WER)
  • Finetuning LLM
    • Finetuned Qwen2.5-0.5B on conversation dataset translated from English to Oromo
Yue Gu
  • write the cover letter
  • design a new speaker adaptation framework
Qi Qu
  • AED:
    • New CED-based classifiers deployed onto devices, yielding acceptable performance.
  • KWS:
    • Quantization and format conversion of production models for deployment on embedded device w/ NPU. Default quantization mode leads to unacceptable loss of precision. Will try hybrid quantization.
    • Text-enrollment KWS: some dynamic dimensions are misinterpreted as constants during export to ONNX.
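
The quantization note above (default mode loses too much precision, hybrid quantization to be tried) can be illustrated with a minimal sketch. It assumes "hybrid" means quantizing only the layers that tolerate int8 and keeping the rest in floating point; the report does not name the toolkit or its exact scheme. The function names, layer names, weights, and tolerance are all hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization; returns (ints, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    ints = [max(-127, min(127, round(v / scale))) for v in values]
    return ints, scale


def max_abs_error(values):
    """Largest absolute error after an int8 quantize/dequantize round trip."""
    ints, scale = quantize_int8(values)
    return max(abs(v - q * scale) for v, q in zip(values, ints))


def choose_precision(layers, tolerance):
    """Hypothetical hybrid plan: int8 where the error fits, float elsewhere."""
    return {
        name: "int8" if max_abs_error(weights) <= tolerance else "float"
        for name, weights in layers.items()
    }


# Toy weights (made up): one well-behaved layer, one with a large outlier
# that stretches the quantization scale and crushes the small values.
layers = {
    "smooth": [0.1, -0.2, 0.3],
    "outlier": [0.3, -0.4, 100.0],
}
plan = choose_precision(layers, tolerance=0.01)
print(plan)
```

In this toy case the outlier layer is the one left in floating point, because a single large value widens the int8 step size for every other weight in the tensor.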