Difference between revisions of “ASR Status Report 2017-9-11”

From cslt Wiki
 
(15 intermediate revisions by 4 users not shown)
Line 8: Line 8:
 
|Jiayin Cai
 
|Jiayin Cai
 
||
 
||
*
+
*Got phonetic features from a stronger phonetic network
 +
*Finished part of the experiment using the stronger phonetic features.
 
||
 
||
*
+
*Will be absent for school.
 +
*But I will finish the remaining experiment.
 
|-
 
|-
  
Line 17: Line 19:
 
|Xiaofei Kang
 
|Xiaofei Kang
 
||  
 
||  
*  
+
* Improve the human test website: save the test recordings, reduce the number of positive samples
 +
* Recorded and cut the audio: 12 groups in total
 
||  
 
||  
*  
+
* Continue recording audio with Miao Zhang
 +
* Continue to ask people to do human test
 
|-
 
|-
  
Line 30: Line 34:
 
||  
 
||  
 
* Continue to ask people to do human test
 
* Continue to ask people to do human test
* Recording (the goal is to record 400 to 500 people)
+
* Recording (the goal is to record 400 to 500 people) [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/c/cc/录音说明.pdf here]
 
|-
 
|-
  
Line 37: Line 41:
 
|Yanqing Wang
 
|Yanqing Wang
 
||  
 
||  
*
+
* Absent
 
||
 
||
 
*   
 
*   
Line 46: Line 50:
 
|Ying Shi   
 
|Ying Shi   
 
||  
 
||  
*
+
* Multi-decoding ASR model with more pdfs: performance is better than before but still not good enough
 +
* Add separate symbols to discriminate the Kazakh and Uyghur word sets
 +
* Group-based softmax (in progress)
 
||  
 
||  
*  
+
* Finish group-based softmax and test its performance
 
|-
 
|-
  
Line 55: Line 61:
 
|Yixiang Chen   
 
|Yixiang Chen   
 
||  
 
||  
*
+
* Absent
 
||  
 
||  
 
*  
 
*  
Line 64: Line 70:
 
|Lantian Li   
 
|Lantian Li   
 
||  
 
||  
*  
+
* Continue the speaker segmentation tasks, see [http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=lilt&step=view_request&cvssid=615 here]
 +
** Complete the phonetic-aware speaker segmentation.
 +
*** Word-level boundaries from the ASR.
 +
*** Word-level d-vector and clustering.
 
||
 
||
*  
+
* Try some smoothing tricks.
 
|-
 
|-
  
Line 73: Line 82:
 
|Zhiyuan Tang  
 
|Zhiyuan Tang  
 
||  
 
||  
*  
+
* Organized the code and documentation of the Parrot system [http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=tangzy&step=view_request&cvssid=635]
 
||
 
||
*  
+
* Theoretical study of pronunciation detection
*
+
 
|-
 
|-
  

Latest revision as of 00:45, 13 September 2017 (Wed)

Date People Last Week This Week
2017.9.4


Jiayin Cai
  Last Week:
  • Got phonetic features from a stronger phonetic network
  • Finished part of the experiment using the stronger phonetic features.
  This Week:
  • Will be absent for school.
  • But I will finish the remaining experiment.
Xiaofei Kang
  Last Week:
  • Improve the human test website: save the test recordings, reduce the number of positive samples
  • Recorded and cut the audio: 12 groups in total
  This Week:
  • Continue recording audio with Miao Zhang
  • Continue to ask people to do human test
Miao Zhang
  Last Week:
  • Perform human test
  • Record some other people and do the experiments again
  This Week:
  • Continue to ask people to do human test
  • Recording (the goal is to record 400 to 500 people) here
Yanqing Wang
  Last Week:
  • Absent
Ying Shi
  Last Week:
  • Multi-decoding ASR model with more pdfs: performance is better than before but still not good enough
  • Add separate symbols to discriminate the Kazakh and Uyghur word sets
  • Group-based softmax (in progress)
  This Week:
  • Finish group-based softmax and test its performance
Yixiang Chen
  Last Week:
  • Absent
Lantian Li
  Last Week:
  • Continue the speaker segmentation tasks, see here
    • Complete the phonetic-aware speaker segmentation.
      • Word-level boundaries from the ASR.
      • Word-level d-vector and clustering.
  This Week:
  • Try some smoothing tricks.
Zhiyuan Tang
  Last Week:
  • Organized the code and documentation of the Parrot system [1]
  This Week:
  • Theoretical study of pronunciation detection

Date People Last Week This Week
2017.9.4


Jiayin Cai
  • Finished the phonetic i-vector experiment.
  • Get BN features and train the i-vector LID.
  • Get phonetic features from a stronger phonetic network
  • Combine PTN and phonetic i-vector.
Xiaofei Kang
  • Cut and labeled audio: 21 speakers, 1050 sentences in total
  • Finish the new speaker recognition using the two recordings.
  • Improve the human test website
Miao Zhang
  • Absent
  • Perform human test on 21-style speech (add the disguise)
  • Draw spectra and t-SNE plots to compare with the experiment results
Yanqing Wang
  • Absent.
Ying Shi
  • Multi-decoding ASR model
  • Multi-decoding with fake LID, see here
  • Read code about TTS
  • Employ group softmax to train the multi-decoding ASR model
  • Synthesize one 'real' speech sample
Yixiang Chen
  • Absent.
Lantian Li
  • Go on speaker segmentation tasks, see here
    • Dimensionality reduction.
    • Clustering.
    • Visualization.
  • Phonetic-aware speaker segmentation.
Zhiyuan Tang
  • More indicators for the VV scoring system, see [2].
  • More indicators, a demo with Shuai.
  • Toolbook writing.