Difference between revisions of "2021-12-06"

From cslt Wiki
 
(16 intermediate revisions by 8 users not shown)

Latest revision as of 10:57, 13 December 2021

People  This Week  Next Week  Task Tracking (Deadline)
Dong Wang
  • Refine spoof paper
  • Prepare talk for information theory in NN
  • Prepare talk for representation investigation.
  • Finish spoof paper
Yunqi Cai
  • review papers about CQDs
  • Verify the deconvolution of infrared and visible faces
  • Verify infrared and visible image fusion based on GLOW model
  • Arrange research plans for interns
Lantian Li
  • Finish course on AI.
  • Study speaker separation and think about structural embedding.
  • Finish ETM response.
  • Exps of hard trials.
Ying Shi
  • Report about e2e KWS
  • Speech engrave (garbage node, sil training data, text-to-speech attention)
  • Analyse fenyinta test data (http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=shiying&step=view_request&cvssid=829)
  • More analysis of speech engrave (speech-to-text attention)
  • Speech engrave (text-to-speech attention)
Haoran Sun
  • Some tests on our model
  • Make some more efficient attempts:
    ◦ remove rhythm and pitch encoders
    ◦ increase distance between speakers
    ◦ improve content encoder
    ◦ make use of speaker label
Chen Chen
  • Pre-process audio data & train GAN directly on wav2vec2 output
  • Use k-means and PCA clustering on wav2vec2 output to build better segment representations
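The k-means/PCA item above can be sketched as follows. This is a minimal NumPy-only illustration, not the group's actual pipeline: the 768-dim frame size (wav2vec2-base frame width), the 32 PCA components, and the 8 clusters are all illustrative assumptions, and random vectors stand in for real wav2vec2 frame outputs.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, sorted by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, frame labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each frame to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Stand-in for wav2vec2 frame outputs: 200 frames x 768 dims.
frames = np.random.default_rng(1).normal(size=(200, 768))
reduced = pca_reduce(frames, 32)          # 768 -> 32 dims
centroids, labels = kmeans(reduced, k=8)  # 8 pseudo-unit clusters
print(reduced.shape, centroids.shape, labels.shape)
```

The cluster index of each frame then serves as a discrete pseudo-unit, and runs of identical labels give candidate segment boundaries.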
Pengqi Li
  • Reproduce a series of CAM methods on speaker classification
Qingyang Zhu
Weida Liang
  • Finish the first version of the improved exemplar autoencoder with cycle loss
  • Rethink the theory-analysis part
  • Test on never-before-seen speaker conversion
  • Review the code of wav2vec, StarGAN, and PPG-based GAN
Zixi Yan
  • Training wav2vec model
Sirui Li
  • Fine-tune the wav2vec model
  • Compare Tibetan and Chinese fine-tuning results
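Comparing fine-tuning results across Tibetan and Chinese usually means scoring both systems with character error rate (CER). A minimal sketch, assuming the fine-tuned models have already produced hypothesis transcripts; the function names and sample strings are illustrative:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (1-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def cer(refs, hyps):
    """Character error rate over paired reference/hypothesis transcripts."""
    errs = sum(edit_distance(list(r), list(h)) for r, h in zip(refs, hyps))
    chars = sum(len(r) for r in refs)
    return errs / chars

refs = ["he said hello", "she went home"]
hyps = ["he sed hello", "she went home"]
print(round(cer(refs, hyps), 4))
```

Running the same scorer over the Tibetan and Chinese test sets gives directly comparable numbers, since CER (unlike WER) does not depend on word segmentation.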
Haoyu Jiang
  • Face sampling in CNCeleb dataset
  • Filter videos without the target's face
Renmiao Chen
  • Sample some audio, listen, and analyze
  • Divide data