Difference between revisions of "2014-11-25"

From cslt Wiki

Revision as of 05:13, 24 November 2014 (Mon)

Speech Processing

AM development

Environment

  • Already bought three 760 GPUs
  • grid-9 760 GPU crashed again; random freeze after s ; investigating the reason
  • GPU problems on grid-17?
  • disk (/work2) problem on grid-15

Sparse DNN

  • Performance improvement found when pruned slightly
  • need retraining for unpruned one; training loss
  • The result of AURORA 4 will be available soon.
  • details at http://liuc.cslt.org/pages/sparse.html
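
A minimal sketch of what "pruned slightly" could look like, assuming magnitude-based pruning (the report does not state the actual pruning criterion). The names `prune_by_magnitude` and `sparsity` are hypothetical and only for illustration.

```python
def prune_by_magnitude(weights, fraction):
    """Return a copy of `weights` (list of rows) with the smallest-magnitude
    `fraction` of entries set to zero. ASSUMPTION: magnitude pruning; the
    criterion used in the actual experiments is not given in the report."""
    # Rank every entry by absolute value, remembering its position.
    ranked = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    k = int(len(ranked) * fraction)  # number of entries to zero out
    pruned = [list(row) for row in weights]
    for _, r, c in ranked[:k]:
        pruned[r][c] = 0.0
    return pruned


def sparsity(weights):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total


layer = [[0.5, -0.1], [0.05, -0.9]]
slightly_pruned = prune_by_magnitude(layer, 0.5)
```

After pruning, the network would normally be retrained with the zeroed entries held fixed; that step is not shown here.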

RNN AM

  • The initial nnet does not seem to work well; it needs to be pre-trained, or a lower learn-rate should be tested.
  • For AURORA 4 (1h/epoch), model training is done.
  • Using AURORA 4 short-sentence with a smaller number of targets.(+)
  • Adjusting the learning rate.(+)
  • Trying toolkit of Microsoft.(+)
  • details at http://liuc.cslt.org/pages/rnn.html

A new nnet training scheduler

Drop out & Rectification & convolutive network

  • Drop out
  • dataset:wsj, testset:eval92
       std |  dropout0.4 | dropout0.5 | dropout0.6 | dropout0.7 | dropout0.7_iter7(maxTr-Acc) | dropout0.8 | dropout0.8_iter7(maxTr-Acc)
    ------------------------------------------------------------------------------------------------------------------------------------ 
       4.5 |     5.39    |    4.80    |   4.75     |  4.36      |  4.39                       |    4.55    |    4.71           
    • Frame accuracy does not seem consistent with WER. Use the train-data as cv to verify the learning ability of the model.
   Within a single nnet model, the top training frame accuracy is not consistent with the WER.
    • Decode test_clean_wv1 dataset.
  • AURORA4 dataset
  (1) Train: train_noisy
   drop-retention/testcase(WER) | test_clean_wv1  | test_airport_wv1 | test_babble_wv1 | test_car_wv1 
   ---------------------------------------------------------------------------------------------------------
          std-baseline          |  9.60           |  11.41           |  11.63          |  8.64
   ---------------------------------------------------------------------------------------------------------
             dp-0.3             |  12.91          |  16.55           |  15.37          |  12.60
   ---------------------------------------------------------------------------------------------------------
             dp-0.4             |  11.48          |  14.43           |  13.23          |  11.04
   ---------------------------------------------------------------------------------------------------------
             dp-0.5             |  10.53          |  13.00           |  12.89          |  10.24
   ---------------------------------------------------------------------------------------------------------
             dp-0.6             |  10.02          |  12.32           |  11.81          |  9.29
   ---------------------------------------------------------------------------------------------------------
             dp-0.7             |  9.65           |  12.01           |  12.09          |  8.89
   ---------------------------------------------------------------------------------------------------------
             dp-0.8             |  9.79           |  12.01           |  11.77          |  8.91
   ---------------------------------------------------------------------------------------------------------
             dp-1.0             |  9.94           |  11.33           |  12.05          |  8.32
   ---------------------------------------------------------------------------------------------------------
     baseline_dp0.4_lr0.008     |  9.52           |  12.01           |  11.75          |  9.44
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.4_lr0.0001    |  9.92           |  14.22           |  13.59          |  10.24
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.4_lr0.00001   |  9.06           |  13.27           |  13.14          |  9.33
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.8_lr0.008     |  9.16           |  11.23           |  11.42          |  8.49
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.8_lr0.0001    |  9.22           |  11.52           |  11.77          |  8.82
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.8_lr0.00001   |  9.12           |  11.27           |  11.65          |  8.68
  ---------------------------------------------------------------------------------------------------------
       dp-0.4_follow-std-lr     |  11.33          |  14.60           |  13.50          |  10.95
  ---------------------------------------------------------------------------------------------------------
       dp-0.8_follow-std-lr     |  9.77           |  12.01           |  11.79          |  8.93
  ---------------------------------------------------------------------------------------------------------
         dp-0.4_4-2048          |  11.69          |  16.13           |  14.24          |  11.98
  ---------------------------------------------------------------------------------------------------------
         dp-0.8_4-2048          |  9.46           |  11.60           |  11.98          |  8.78
  ---------------------------------------------------------------------------------------------------------
    • Test with AURORA4 of 7000 (clean + noisy).
    • Followed the standard DNN training learn-rate schedule, so that the learn-rate changes happen at the same time in the different DNN trainings. Similar performance is obtained.
    • Find and test unknown noise test-data.(+)
    • Applied dropout to a normally trained XEnt NNET, e.g. wsj (learn-rate: 1e-4/1e-5). A small learn-rate seems to give a balance between accuracy and training time.
    • Draft the dropout-DNN weight distribution. (++)
  • Rectification
  • Combine drop out and rectifier.(+)
  • Change the learn-rate in the middle of training; modify the train_nnet.sh script (Liu Chao).
  • MaxOut
  • 6min/epoch
1) AURORA4 -15h
   NOTE: gs==groupsize
 (1) Train: train_clean
        model/testcase(WER)    | test_clean_wv1  | test_airport_wv1 | test_babble_wv1 | test_car_wv1 
   ---------------------------------------------------------------------------------------------------------
          std-baseline         |  6.04           |  29.91           |  27.76          |  16.37
   ---------------------------------------------------------------------------------------------------------
          lr0.008_gs6          |                             - 
   ---------------------------------------------------------------------------------------------------------
         lr0.008_gs10          |                             - 
   ---------------------------------------------------------------------------------------------------------
         lr0.008_gs20          |                             - 
   ---------------------------------------------------------------------------------------------------------
      lr0.008_l1-0.01          |                             - 
   ---------------------------------------------------------------------------------------------------------
       lr0.008_l1-0.001        |                             - 
   ---------------------------------------------------------------------------------------------------------
      lr0.008_l1-0.0001        |                             - 
   ---------------------------------------------------------------------------------------------------------
    lr0.008_l1-0.000001        |                             - 
   ---------------------------------------------------------------------------------------------------------
        lr0.008_l2-0.01        |                             - 
   ---------------------------------------------------------------------------------------------------------
           lr0.006_gs10        |                             - 
   ---------------------------------------------------------------------------------------------------------
           lr0.004_gs10        |                             - 
   ---------------------------------------------------------------------------------------------------------
          lr0.002_gs10         |  6.21           |  28.48           |  27.30          |  16.37
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs1          |                             -
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs2          |                             -
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs4          |                             -
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs6          |  6.04           |  25.17           |  24.31          |  14.19
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs8          |  5.85           |  25.72           |  24.35          |  14.28
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs10         |  6.23           |  27.04           |  25.51          |  14.22
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs15         |  5.94           |  30.10           |  27.53          |  19.00
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs20         |  6.32           |  28.10           |  26.47          |  16.98
   ---------------------------------------------------------------------------------------------------------
  • pretraining based maxout
  • P-norm
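
For reference, a minimal sketch of the two group nonlinearities named above, using their standard textbook definitions. The group size corresponds to `gs` in the table; the function names `maxout` and `pnorm` are hypothetical and do not refer to the toolkit code used in the experiments.

```python
def _groups(values, group_size):
    """Split a flat list of pre-activations into consecutive groups."""
    if group_size <= 0 or len(values) % group_size != 0:
        raise ValueError("length must be a positive multiple of group_size")
    return [values[i:i + group_size] for i in range(0, len(values), group_size)]


def maxout(pre_activations, group_size):
    """Maxout: each output is the maximum over one group of linear units."""
    return [max(g) for g in _groups(pre_activations, group_size)]


def pnorm(pre_activations, group_size, p=2.0):
    """P-norm: each output is the p-norm over one group of linear units."""
    return [
        sum(abs(x) ** p for x in g) ** (1.0 / p)
        for g in _groups(pre_activations, group_size)
    ]


hidden = [1.0, -2.0, 3.0, 0.0, 5.0, 4.0]
out_gs2 = maxout(hidden, 2)   # three outputs, one per group of 2
out_gs3 = maxout(hidden, 3)   # two outputs, one per group of 3
```

A larger group size gives fewer outputs for the same number of linear units, which is one reason the results above vary with `gs`.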


  • Convolutive network (+)
  • AURORA 4
                 |  wer | hid-layers | hid-dim | delta-order | splice | lda-dim | learn-rate	| pooling | TBA
-----------------------------------------------------------------------------------------------------------------------
 cnn_std_baseline| 6.70 |     4      | 1200	|      0      |    4   |   198   |   0.008	|   3     |patch-dim1 7 
-----------------------------------------------------------------------------------------------------------------------
 cnn_std_1000_3  | 6.61 |     4      | 1000	|      0      |    4   |   198   |   0.008	|   3     |patch-dim1 7 
-----------------------------------------------------------------------------------------------------------------------
 cnn_std_1400_3  | 6.61 |     4      | 1400	|      0      |    4   |   198   |   0.008	|   3     |patch-dim1 7 
-----------------------------------------------------------------------------------------------------------------------
 cnn_std_1200_4  | 6.91 |     4      | 1200	|      0      |    4   |   198   |   0.008	|   4     |patch-dim1 6 
-----------------------------------------------------------------------------------------------------------------------
 cnn_std_1200_2  | -    |     4      | 1200	|      0      |    4   |   198   |   0.008	|   2     |patch-dim1 8 
-----------------------------------------------------------------------------------------------------------------------
 cnn_std_1200_3  | 6.66 |     5      | 1200	|      0      |    4   |   198   |   0.008	|   3     |patch-dim1 7 
-----------------------------------------------------------------------------------------------------------------------
  • READ paper

Denoising & Farfield ASR

  • ICASSP paper submitted.
  • HOLD

VAD

  • Frame energy feature extraction, done
  • Harmonics and Teager energy features under investigation
  • Previous results to be organized for a paper
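
A minimal sketch of the two energy features mentioned above, assuming the usual definitions: per-frame log energy, and the discrete Teager energy operator x[n]^2 - x[n-1]*x[n+1]. Frame length, hop size, and the function names are illustrative assumptions, not the settings used in the experiments.

```python
import math


def frame_log_energy(samples, frame_len, hop):
    """Log energy of each full frame; a small floor avoids log(0)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(math.log(sum(s * s for s in frame) + 1e-10))
    return energies


def teager_energy(samples):
    """Discrete Teager energy operator for the interior samples."""
    return [
        samples[n] ** 2 - samples[n - 1] * samples[n + 1]
        for n in range(1, len(samples) - 1)
    ]


signal = [0.0, 0.0, 1.0, 1.0]
log_e = frame_log_energy(signal, frame_len=2, hop=2)  # silent frame, then active frame
teo = teager_energy([1, 2, 3, 4])
```

A simple energy-based VAD would threshold `log_e` per frame; how the threshold is chosen is left open here.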

Speech rate training

  • Data ready on tencent set; some errors on speech rate dependent model.
  • Retrain new model

Scoring

  • Timbre comparison done.
  • harmonics based timbre comparison: frequency based feature is better
  • GMM based timbre comparison is done. Similar to speaker recognition
  • TODO: Code check-in and technical report.

Confidence

  • Reproduce the experiments on fisher dataset.
  • Use the fisher DNN model to decode all-wsj dataset
  • preparing scoring for puqiang data

Speaker ID

  • Preparing GMM-based server.
  • EER ~ 11.2% (GMM-based system)
  • test different number of components; fast i-vector computing
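
A minimal sketch of how an EER figure like the one above can be computed from trial scores, assuming a simple threshold sweep over the observed scores (the scoring tool actually used is not stated in the report). The name `equal_error_rate` and the example scores are hypothetical.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER: sweep thresholds over all observed scores and return
    the mean of false-accept and false-reject rates where they are closest."""
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # A trial is accepted when its score is >= the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


targets = [0.9, 0.8, 0.7, 0.3]     # same-speaker trial scores (made up)
impostors = [0.1, 0.2, 0.4, 0.6]   # different-speaker trial scores (made up)
eer = equal_error_rate(targets, impostors)
```

With few trials the sweep only approximates the crossing point; interpolating between neighbouring thresholds would give a smoother estimate.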

Language ID

  • GMM-based language ID is ready.
  • Delivered to Jietong

Emotion detection

  • Sinovoice is implementing the server


Text Processing

LM development

Domain specific LM

  • domain lm(need to discuss with xiaoxi)
  • embedded language model(this week)
  • train some more LMs with Zhenlong (dianzishu sogou bbs chosen)("need result").
  • keep on training sogou2T lm(14/16 on 3rd iteration).(this week)
  • new dict.
  • hand this work over to hanzhenglong, with a simple document (this week)

tag LM

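different weight (http://cslt.riit.tsinghua.edu.cn/cgi-bin/cvss/cvss_request.pl?account=lr&step=view_request&cvssid=304 2014-Nov-23,Monday)
   method       | tag-jsgf                              | corpus | weight | wer   | ser   | add_wer
   --------------------------------------------------------------------------------------------------
   experiment 3 | 500 (490 less frequent and 10 unseen) | 500    | 0.1    | 16.72 | 77.92 | -
                |                                       |        | 0.3    | 15.42 | 71.25 | -
                |                                       |        | 0.5    | 15.40 | 69.58 | -
                |                                       |        | 0.7    | 15.28 | 68.75 | -
                |                                       |        | 0.8    | 15.38 | 68.33 | -
                |                                       |        | 1      | 15.98 | 69.17 | -
                |                                       |        | 2      | 19.08 | 70.83 | -
   --------------------------------------------------------------------------------------------------
   experiment 4 | 100 (90 less frequent and 10 unseen)  | 100    | 0.008  | 15.28 | 69.58 | -
                |                                       |        | 0.02   | 14.84 | 69.58 | -
                |                                       |        | 0.05   | 15.11 | 69.58 | -
                |                                       |        | 0.1    | 15.30 | 69.75 | -
                |                                       |        | 0.3    | 16.01 | 70.42 | -
   --------------------------------------------------------------------------------------------------
   experiment 5 | 500                                   | 100    | 0.01   | 17.57 | 78.75 | -
                |                                       |        | 0.05   | 16.84 | 77.08 | -
                |                                       |        | 0.08   | 16.59 | 76.25 | -
                |                                       |        | 0.15   | 16.76 | 75.42 | -
   --------------------------------------------------------------------------------------------------
   experiment 6 | 1280                                  | 500    | 0.1    | 17.42 | 77.92 | -
                |                                       |        | 0.5    | 15.20 | 69.17 | -
                |                                       |        | 0.8    | 15.30 | 68.33 | -
                |                                       |        | 1      | 15.69 | 69.58 | -
   --------------------------------------------------------------------------------------------------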
  • conclusion:
 1. compare experiment 3 with experiment 5:
   same jsgf file, but the number of tags in the corpus is different; when more tags are added
 to the corpus, the optimal weight is larger.
 2. compare experiment 3 with experiment 6:
   same number of tags in the corpus, but different jsgf sizes; the different jsgf sizes have about the
 same optimal weight.
  • need to do
  • test adding the weight to the tag probability (hanzhenglong), and hand the work over to hanzhenglong (this week)
  • write a summary of tag-lm and the journal paper (wxx and yuanb) (two weeks).
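
The conclusions above can be checked directly against the weight/WER pairs in the table. The sketch below copies those pairs and picks the weight with the lowest WER per experiment; the variable and function names are hypothetical.

```python
# (weight, wer) pairs copied from the "different weight" table above.
results = {
    "experiment 3": [(0.1, 16.72), (0.3, 15.42), (0.5, 15.40), (0.7, 15.28),
                     (0.8, 15.38), (1, 15.98), (2, 19.08)],
    "experiment 4": [(0.008, 15.28), (0.02, 14.84), (0.05, 15.11),
                     (0.1, 15.30), (0.3, 16.01)],
    "experiment 5": [(0.01, 17.57), (0.05, 16.84), (0.08, 16.59), (0.15, 16.76)],
    "experiment 6": [(0.1, 17.42), (0.5, 15.20), (0.8, 15.30), (1, 15.69)],
}


def best_weight(rows):
    """Return the (weight, wer) pair with the lowest WER."""
    return min(rows, key=lambda row: row[1])


best = {name: best_weight(rows) for name, rows in results.items()}
for name, (weight, wer) in best.items():
    print(f"{name}: best weight {weight}, wer {wer}")
```

On the tested grids the best weights are 0.7 (experiment 3) against 0.08 (experiment 5), and 0.7 (experiment 3) against 0.5 (experiment 6). Note that experiment 6 was not run at 0.7, so the comparison in conclusion 2 is between neighbouring grid points.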

RNN LM

  • rnn
  • test RNNLM on Chinese data from jietong-data
  • check how the rnnlm code initializes and updates the learning rate.
  • lstm+rnn
  • check how the lstm-rnnlm code initializes and updates the learning rate.

Word2Vector

W2V based doc classification

  • Initial results with variational Bayesian GMM obtained. Performance is not as good as the conventional GMM.
  • Non-linear inter-language transform: English-Spanish-Czech: wv model training done, transform model under investigation
  • SSA-based local linear mapping still running.
  • k-means classes change to 2.
  • Knowledge vector started
  • give a basic report
  • Character to word conversion
  • prepare the task: word similarity
  • prepare the dict.

Translation

  • v4.0 demo released
  • cut the dict and use new segment-tool

QA

detail: [1]

Spell mistake

  • retrain the ngram model(caoli)
  • prepare the test and development set(caoli)
  • need to discuss it with duxk

improve fuzzy match

  • add Synonyms similarity using MERT-4 method(hold)

improve lucene search

  • use the MERT-4 method to find good values for multiple features, such as IDF, NER, baidu_weight, keyword, etc. (liurong, this month)

Multi-Scene Recognition

  • add the triples search to QA engine
  • demo (liurong, two weeks)


  • the new intern will install SEMPRE