Difference between revisions of "ASR Status Report 2016-11-28"

From cslt Wiki
 
(30 intermediate revisions by 5 users not shown)

Latest revision as of 01:08, 1 December 2016 (Thursday)

Date People Last Week This Week
2016.11.28


Yanqing Wang
  Last Week:
  • Worked on building a GUI for one-class SVM in Visual Studio (not finished yet)
    • configured the LibSVM toolkit in Visual Studio
    • implemented the basic functions of LibSVM in Visual Studio
    • learned to create a Qt project using Visual Studio and Qt Creator
  This Week:
  • Finish building the GUI for one-class SVM in Visual Studio
  • Make the GUI update dynamically as the given data changes
Hang Luo
  Last Week:
  • Ran joint-learning experiments, including:
    • fixing the language model and feeding its information to the speech model
    • trying a smaller language model and checking its results
  • Interspeech paper talk on multi-task learning and highway connections
  This Week:
  • Continue the experiments and read papers on language recognition models
Ying Shi
  Last Week:
  • Some work on Kazakh speech recognition
  • CNN visualization
  • Paper reading
  • ML-book done
  • DNN with different activation functions (Sigmoid, Tanh, ReLU, p-norm)
  This Week:
  • CNN visualization
Yixiang Chen
  Last Week:
  • Continue replay detection (Freq-Weighting and Mel-Weighting).
  • Pooling replay data for UBM training.
  This Week:
  • Continue replay detection (change of data set and warping).
Lantian Li
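  Last Week:
  • LRE on AP16-OL7. [1] (http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=tangzy&step=view_request&cvssid=574)
    • 'StatisticsComponent'
    • The effect of VAD / padding.
  • Replay detection.
    • Performance-driven Freq-Weighting
  This Week:
  • LRE task.
  • Freq-Warping and CNN training.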
Zhiyuan Tang
  Last Week:
  • Gave a talk titled 'Deep Learning in Speech Recognition' in Chengdu.
  • Decoding with a language mask does not seem to help; not yet conclusive.
  This Week:
  • Use the language mask in a proper way.
  • Prepare materials for the paper accepted by TASLP.
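
The activation functions named in Ying Shi's DNN entry above (Sigmoid, Tanh, ReLU, p-norm) can be written out as a minimal sketch. This is an illustration only, not the code used in the experiments. The p-norm here is assumed to be the group-wise p-norm nonlinearity (each output is the p-norm of a group of consecutive inputs); the group size and p = 2 are assumed values, not settings taken from the report.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def pnorm(values, group_size, p=2.0):
    """Group-wise p-norm: one output per group of `group_size` inputs.

    `group_size` and `p` are illustrative assumptions.
    """
    if group_size <= 0 or len(values) % group_size != 0:
        raise ValueError("input length must be a multiple of group_size")
    outputs = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        outputs.append(sum(abs(v) ** p for v in group) ** (1.0 / p))
    return outputs
```

Unlike the three element-wise functions, the p-norm reduces dimensionality: a layer with N inputs and group size G produces N / G outputs.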

Date People Last Week This Week
2016.11.21 Hang Luo
  • Explored language recognition models, including:
    • evaluating the model at the sentence and frame level; the accuracy is very high
    • shrinking the language model, training it alone and jointly with the speech model, and evaluating the results
  • Continue the basic exploration of joint training.
  • Read papers on multi-language recognition models and related topics.
Ying Shi
  • Struggling with the Kazakh speech recognition system: because HCLG.fst is so large, the decoding job always brings the server down.

I have tried several methods:

  • Changing the size of the word list and corpus; this did not work very well.
  • Pruning the LM with a threshold of 2e-7; the LM size dropped from 290M to 60M, but the resulting WER is very poor.
  • I have uploaded results from several experiments to CVSS [2].
  • Because of too many personal matters, last week's visualization work was delayed; I will try my best to finish it this week.



Yixiang Chen
  • Learned the MFCC extraction mechanism.
  • Read the Kaldi feature-computation code and worked out how to modify MFCC.
  • Frequency-weighting based feature extraction.
  • Continue replay detection (Freq-Weighting and Freq-Warping).
Lantian Li
  • Joint-training on SRE and LRE (LRE task). [3]
    • TDNN is better than LSTM.
    • LRE is a long-term task.
  • Briefly overview Interspeech SRE-related papers.
  • CSLT-Replay detection.
    • Baseline done (Freq / Mel domain).
  • Performance-driven Freq-Weighting and Freq-Warping --> Yixiang.
  • LRE task.
  • Replay detection.
Zhiyuan Tang
  • Prepared the report for Weekly Reading (a brief review of Interspeech16).
  • Language scores as a decoding mask: (1) multiplying probabilities gave very bad results; (2) adding log-softmax gave slightly bad results.
  • Training with the mask failed.
  • Training with shared layers.
  • Explore single tasks.
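
Ying Shi's 2016.11.21 entry judges the pruned LM by its WER. As a reminder of what that metric measures, here is a minimal sketch of word error rate computed as word-level edit distance divided by the reference length. It is not the scoring tool used in the experiments, and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "a x c" against the reference "a b c d" counts one substitution and one deletion over four reference words.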