Difference between revisions of "ASR Status Report 2017-8-21"

From cslt Wiki
Line 148:
  |Zhiyuan Tang
  ||
- * reorganize auto-scoring system, next ???
+ * 1. align the candidate speech (fbank) with phone labels using nnet3-align-compiled (almost finished); 2.analyse the alignment with rhythm, tone, tune, for Parrot system, (revised goodness of pronunciation), to be done.
  * collecting material (PPT) for Kaldi toolbook.
  ||
- * prefer to rewrite the scoring part.
+ * analyse the alignment with rhythm, tone, tune, (revised goodness of pronunciation).
  * toolbook writing
  |-
  |}

Revision as of 04:44, 21 August 2017 (Monday)

Date People Last Week This Week
2017.8.21 Xiaofei Kang
Miao Zhang
  • Prepare the data and finish experiments on 5 speech recordings.
  • Finish the human test website (including 20 styles); many thanks to Sister Shuai!
Yanqing Wang
Ying Shi
Yixiang Chen
Lantian Li
Zhiyuan Tang




Date People Last Week This Week
2017.8.14 Xiaofei Kang
  • Recorded audio from 35 people, stored in /work7/zhangmiao/speaker/wavdata/data_new
  • Learn the new test website from Miao Zhang
  • Go home with my mom, and come back on Friday night.
Miao Zhang
  • Recording work
  • Test website's data preparation
  • check the linear chapter
  • Continue to record
  • do experiments on recorded speech if possible
  • check the NN chapter
Yanqing Wang
  • TRP uploaded.
  • explore the importance of the sparse structure:
    • After pruning, randomly re-initialize the non-zero values, then train.
    • train an nnet with a 177-dimensional hidden layer.
    • result
  • continue exploring the values of trained nnet.
Ying Shi
  • general codeMap finished (Kazakh)
  • crawler program delayed (most Kazakh websites are down; I will crawl data from overseas websites)
  • collect Unicode for more scripts, such as Tibetan and Mongolian.
  • crawl Kazakh data from overseas websites.
Yixiang Chen
  • Study English and help Lantian with some experiments.
Lantian Li
  • Visualization and quantification for d-vector [1].
    • phone-aware and phone-blind.
    • within-speaker and between-speaker variation.
  • Speaker segmentation experiments.
  • Finish the speaker segmentation experiment.
  • Prepare IS17 presentation.
Zhiyuan Tang
  • 1. align the candidate speech (fbank) with phone labels using nnet3-align-compiled (almost finished); 2. analyse the alignment in terms of rhythm, tone, and tune for the Parrot system (revised goodness of pronunciation), to be done.
  • collecting material (PPT) for Kaldi toolbook.
  • analyse the alignment in terms of rhythm, tone, and tune (revised goodness of pronunciation).
  • toolbook writing