Differences between revisions of "2021-12-06"

From cslt Wiki
(6 intermediate revisions by 4 users not shown)

Latest revision as of 10:57, 13 December 2021

People | This Week | Next Week | Task Tracking (Deadline)
Dong Wang
  • Refine spoof paper
  • Prepare talk on information theory in NNs
  • Prepare talk on representation investigation
  • Finish spoof paper
Yunqi Cai
  • Review papers about CQDs
  • Verify the deconvolution of infrared and visible faces
  • Verify infrared and visible image fusion based on the GLOW model
  • Arrange research plans for interns
Lantian Li
  • Finish course on AI.
  • Study speaker separation and think about structural embedding.
  • Finish ETM response.
  • Exps of hard trials.
Ying Shi
  • Report on e2e KWS
  • Speech engrave (garbage node, sil training data, text-to-speech attention)
  • Analyse fenyinta test data: http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=shiying&step=view_request&cvssid=829
  • More analysis of speech engrave (speech-to-text attention)
  • Speech engrave (text-to-speech attention)
Haoran Sun
  • Some tests on our model
  • Make some more efficient attempts:
    - remove rhythm and pitch encoders
    - increase distance between speakers
    - improve content encoder
    - make use of speaker labels
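The "increase distance between speakers" item could be sketched as a simple separation objective: penalize cosine similarity between mean speaker embeddings. This is only an illustrative guess at the idea, not the model's actual loss; the embeddings below are random stand-ins.

```python
# Sketch (assumption, not the report's actual method): push two speakers'
# embedding clouds apart by minimizing mean-embedding cosine similarity.
import numpy as np

def speaker_separation_loss(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity of the two mean embeddings; lower = more separated."""
    a, b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
# Random stand-ins for per-frame speaker embeddings (20 frames x 64 dims).
loss = speaker_separation_loss(rng.normal(size=(20, 64)),
                               rng.normal(size=(20, 64)))
print(round(loss, 3))
```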
Chen Chen
  • Pre-process audio data and train the GAN directly on wav2vec2 output
  • Use k-means and PCA clustering on wav2vec2 output to build better segment representations
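The clustering step above can be sketched as follows. The feature matrix is a random stand-in for real wav2vec2 frame outputs, and the feature dimension (768), PCA size (32), and cluster count (50) are assumptions for illustration, not values from the report.

```python
# Sketch: build segment representations by reducing wav2vec2 frame features
# with PCA, then grouping frames with k-means (all sizes are assumptions).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 768))  # stand-in for wav2vec2 frame outputs

reduced = PCA(n_components=32).fit_transform(features)          # compress frames
labels = KMeans(n_clusters=50, n_init=10).fit_predict(reduced)  # segment ids

print(reduced.shape, labels.shape)
```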
Pengqi Li
  • Reproduce a series of CAM methods on speaker classification
Qingyang Zhu
Weida Liang
  • Finish the first version of the improved exemplar autoencoder with cycle loss
  • Rethink the theoretical analysis part
  • Test on never-before-seen speaker conversion
  • Review the code of wav2vec, StarGAN, and the PPG-based GAN
Zixi Yan
  • Training wav2vec model
Sirui Li
  • Fine-tune the wav2vec model
  • Compare Tibetan and Chinese fine-tuning results
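Comparing fine-tuning results across languages typically comes down to a character error rate. A minimal CER helper is sketched below; the example strings are made up for illustration, and nothing here is taken from the actual experiments.

```python
# Minimal character error rate: Levenshtein distance over characters,
# normalized by reference length (rolling one-row DP).
def cer(ref: str, hyp: str) -> float:
    d = list(range(len(hyp) + 1))          # distances for the empty-ref row
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i               # prev holds the diagonal cell
        for j, h in enumerate(hyp, 1):
            cur = min(d[j] + 1,            # deletion
                      d[j - 1] + 1,        # insertion
                      prev + (r != h))     # substitution / match
            prev, d[j] = d[j], cur
    return d[len(hyp)] / max(len(ref), 1)

print(cer("abcd", "abxd"))  # 0.25
```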
Haoyu Jiang
  • Face sampling in CNCeleb dataset
  • Filter videos without the target's face
Renmiao Chen
  • Sample some audio, listen, and analyze
  • Divide data
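The "divide data" step can be sketched as a reproducible random split of a file list; the 8:1:1 ratio, seed, and file names are assumptions for illustration.

```python
# Sketch: deterministic shuffle, then cut into train/dev/test portions.
import random

def split_data(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items with a fixed seed and slice into three parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    a = int(n * ratios[0])
    b = a + int(n * ratios[1])
    return items[:a], items[a:b], items[b:]

train, dev, test = split_data([f"utt{i:03d}.wav" for i in range(100)])
print(len(train), len(dev), len(test))  # 80 10 10
```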