ASR Status Report 2017-9-11

From cslt Wiki
Date People Last Week This Week
2017.9.4


Jiayin Cai
  • Got phonetic features from a stronger phonetic network.
  • Finished part of the experiment using the stronger phonetic features.
  • Will be absent for school.
  • But I will finish the remaining experiment.
Xiaofei Kang
  • Improve the human test website: save the test recordings, reduce the positive samples
  • Recording and cutting the audio, 12 groups in total
  • Continue to record audio with Miao Zhang
  • Continue to ask people to do human test
Miao Zhang
  • Perform human test
  • Record some other people and do the experiments again
  • Continue to ask people to do human test
  • Recording (the goal is to record 400 to 500 people), see here
Yanqing Wang
  • Absent
Ying Shi
  • Multi-decoding ASR model with more pdfs. Performance is better than before but still not good enough
  • Add a separate symbol to distinguish the Kazakh and Uyghur word sets
  • Group-based softmax (in progress)
  • Finish group-based softmax and test the performance
Yixiang Chen
  • Absent
Lantian Li
  • Continue the speaker segmentation tasks, see here
    • Complete the phonetic-aware speaker segmentation.
      • Word-level boundaries from the ASR.
      • Word-level d-vector and clustering.
  • Try some smoothing tricks.
Zhiyuan Tang
  • Organized the code and documentation of the Parrot system [1]
  • Theoretical study of pronunciation detection

Date People Last Week This Week
2017.9.4


Jiayin Cai
  • Finished the phonetic i-vector experiment.
  • Get BN features and train i-vector LID.
  • Get phonetic features from a stronger phonetic network
  • Combine PTN and phonetic i-vector.
Xiaofei Kang
  • Cutting and labeling audio: 21 speakers, 1050 sentences in total
  • Finish the new speaker recognition using the two recordings.
  • Improve the human test website
Miao Zhang
  • Absent
  • Perform human test on 21-style speech (add the disguise)
  • Draw spectra and t-SNE plots for comparison with the experiment results
Yanqing Wang
  • Absent.
Ying Shi
  • Multi-decoding ASR model
  • Multi-decoding with fake LID, see here
  • Read code about TTS
  • Employ group softmax to train the multi-decoding ASR model
  • Synthesize one 'real' speech sample
Yixiang Chen
  • Absent.
Lantian Li
  • Continue the speaker segmentation tasks, see here
    • Dimensionality reduction.
    • Clustering.
    • Visualization.
  • Phonetic-aware speaker segmentation.
Zhiyuan Tang
  • More indicators for the VV scoring system, see [2].
  • More indicators, a demo with Shuai.
  • Toolbook writing.