2014-11-17

From cslt Wiki

Revision as of 08:13, 17 November 2014

Speech Processing

AM development

Environment

  • Already bought three 760 GPUs.
  • The grid-9 760 GPU crashed again.

Sparse DNN

RNN AM

  • The initial nnet does not seem to work well; it needs pre-training, or a lower learning rate should be tested.
  • On AURORA4 training takes 1 h/epoch; model training is done.
  • Use AURORA4 short sentences with a smaller number of targets. (+)
  • Adjust the learning rate. (+)
  • Try Microsoft's toolkit. (+)
  • details at http://liuc.cslt.org/pages/rnn.html

A new nnet training scheduler

Noise training

  • Paper has been submitted.

Drop out & Rectification & convolutive network

  • Dropout
  • dataset: WSJ, test set: eval92; WER (%):

       std |  dropout0.4 | dropout0.5 | dropout0.6 | dropout0.7 | dropout0.7_iter7(maxTr-Acc) | dropout0.8 | dropout0.8_iter7(maxTr-Acc)
    ------------------------------------------------------------------------------------------------------------------------------------
       4.5 |     5.39    |    4.80    |   4.75     |  4.36      |  4.39                       |    4.55    |    4.71

    • Frame accuracy seems inconsistent with WER. Use the training data as the CV set to verify the learning ability of the model: within a single nnet model, the best training frame accuracy is not consistent with the WER.
    • Decode the test_clean_wv1 dataset.
  • AURORA4 dataset
  (1) Train: train_noisy
   drop-retention/testcase(WER) | test_clean_wv1  | test_airport_wv1 | test_babble_wv1 | test_car_wv1 
   ---------------------------------------------------------------------------------------------------------
          std-baseline          |  9.60           |  11.41           |  11.63          |  8.64
   ---------------------------------------------------------------------------------------------------------
             dp-0.3             |  12.91          |  16.55           |  15.37          |  12.60
   ---------------------------------------------------------------------------------------------------------
             dp-0.4             |  11.48          |  14.43           |  13.23          |  11.04
   ---------------------------------------------------------------------------------------------------------
             dp-0.5             |  10.53          |  13.00           |  12.89          |  10.24
   ---------------------------------------------------------------------------------------------------------
             dp-0.6             |  10.02          |  12.32           |  11.81          |  9.29
   ---------------------------------------------------------------------------------------------------------
             dp-0.7             |  9.65           |  12.01           |  12.09          |  8.89
   ---------------------------------------------------------------------------------------------------------
             dp-0.8             |  9.79           |  12.01           |  11.77          |  8.91
   ---------------------------------------------------------------------------------------------------------
             dp-1.0             |  9.94           |  11.33           |  12.05          |  8.32
   ---------------------------------------------------------------------------------------------------------
     baseline_dp0.4_lr0.008     |  9.52           |  12.01           |  11.75          |  9.44
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.4_lr0.0001    |  9.92           |  14.22           |  13.59          |  10.24
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.4_lr0.00001   |  9.06           |  13.27           |  13.14          |  9.33
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.8_lr0.008     |  9.16           |  11.23           |  11.42          |  8.49
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.8_lr0.0001    |  9.22           |  11.52           |  11.77          |  8.82
  ---------------------------------------------------------------------------------------------------------
     baseline_dp0.8_lr0.00001   |  9.12           |  11.27           |  11.65          |  8.68
  ---------------------------------------------------------------------------------------------------------
       dp-0.4_follow-std-lr     |  11.33          |  14.60           |  13.50          |  10.95
  ---------------------------------------------------------------------------------------------------------
       dp-0.8_follow-std-lr     |  9.77           |  12.01           |  11.79          |  8.93
  ---------------------------------------------------------------------------------------------------------
         dp-0.4_4-2048          |  11.69          |  16.13           |  14.24          |  11.98
  ---------------------------------------------------------------------------------------------------------
         dp-0.8_4-2048          |  9.46           |  11.60           |  11.98          |  8.78
  ---------------------------------------------------------------------------------------------------------
    • Test with the AURORA4 7000 set (clean + noisy).
    • Follow the standard DNN training learning-rate schedule to avoid the differing learning-rate change points across DNN training runs; similar performance is obtained.
    • Find and test unseen-noise test data. (+)
    • Applied dropout to normally trained XEnt nnets, e.g. WSJ (learning rate 1e-4/1e-5). A small learning rate seems to balance accuracy and training time. A sketch of the dropout operation follows this list.
    • Draft the dropout-DNN weight distribution. (++)
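
A minimal sketch of inverted dropout with a given drop-retention (the dp- values in the table above); an illustration in numpy, not the Kaldi nnet code used for these experiments.

    import numpy as np

    def dropout_forward(h, retention, train=True):
        """Inverted dropout: keep each unit with prob `retention`, then rescale."""
        if not train or retention >= 1.0:
            return h  # no dropout at test time, or with retention 1.0 (dp-1.0)
        mask = np.random.rand(*h.shape) < retention
        return h * mask / retention  # rescale so the expected activation matches test time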
  • Rectification
  • Combine dropout and rectifier. (+)
  • Change the learning rate in the middle of training; modify the train_nnet.sh script. (Liu Chao)
  • MaxOut
  • 6min/epoch
1) AURORA4 (15 h)
   NOTE: gs == group size
 (1) Train: train_clean
        model/testcase(WER)    | test_clean_wv1  | test_airport_wv1 | test_babble_wv1 | test_car_wv1 
   ---------------------------------------------------------------------------------------------------------
          std-baseline         |  6.04           |  29.91           |  27.76          |  16.37
   ---------------------------------------------------------------------------------------------------------
          lr0.008_gs6          |                             - 
   ---------------------------------------------------------------------------------------------------------
         lr0.008_gs10          |                             - 
   ---------------------------------------------------------------------------------------------------------
         lr0.008_gs20          |                             - 
   ---------------------------------------------------------------------------------------------------------
      lr0.008_l1-0.01          |                             - 
   ---------------------------------------------------------------------------------------------------------
       lr0.008_l1-0.001        |                             - 
   ---------------------------------------------------------------------------------------------------------
      lr0.008_l1-0.0001        |                             - 
   ---------------------------------------------------------------------------------------------------------
    lr0.008_l1-0.000001        |                             - 
   ---------------------------------------------------------------------------------------------------------
        lr0.008_l2-0.01        |                             - 
   ---------------------------------------------------------------------------------------------------------
           lr0.006_gs10        |                             - 
   ---------------------------------------------------------------------------------------------------------
           lr0.004_gs10        |                             - 
   ---------------------------------------------------------------------------------------------------------
          lr0.002_gs10         |  6.21           |  28.48           |  27.30          |  16.37
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs1          |                             -
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs2          |                             -
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs4          |                             -
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs6          |  6.04           |  25.17           |  24.31          |  14.19
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs8          |  5.85           |  25.72           |  24.35          |  14.28
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs10         |  6.23           |  27.04           |  25.51          |  14.22
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs15         |  5.94           |  30.10           |  27.53          |  19.00
   ---------------------------------------------------------------------------------------------------------
          lr0.001_gs20         |  6.32           |  28.10           |  26.47          |  16.98
   ---------------------------------------------------------------------------------------------------------
  • P-norm (see the maxout / p-norm sketch below)
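
For reference, a toy numpy sketch of the two group nonlinearities compared here, maxout and p-norm, with gs == group size as in the table above; an illustration under assumed shapes, not the training code used for these experiments.

    import numpy as np

    def maxout(x, gs):
        """x: (batch, dim) with dim % gs == 0; output dim is dim // gs."""
        b, d = x.shape
        return x.reshape(b, d // gs, gs).max(axis=2)

    def pnorm(x, gs, p=2.0):
        """Group p-norm: y_j = (sum_i |x_ji|^p)^(1/p) over each group of gs units."""
        b, d = x.shape
        return (np.abs(x.reshape(b, d // gs, gs)) ** p).sum(axis=2) ** (1.0 / p)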
  • Convolutive network (+)
  • AURORA 4
                    |  wer | hid-layers | hid-dim | delta-order | splice | lda-dim | learn-rate | pooling | patch-dim1
   --------------------------------------------------------------------------------------------------------------------
   cnn_std_baseline | 6.70 |     4      |  1200   |      0      |    4   |   198   |   0.008    |    3    |     7
   cnn_std_1000_3   | 6.61 |     4      |  1000   |      0      |    4   |   198   |   0.008    |    3    |     7
   cnn_std_1400_3   | 6.61 |     4      |  1400   |      0      |    4   |   198   |   0.008    |    3    |     7
   cnn_std_1200_4   | 6.91 |     4      |  1200   |      0      |    4   |   198   |   0.008    |    4    |     6
   cnn_std_1200_2   | -    |     4      |  1200   |      0      |    4   |   198   |   0.008    |    2    |     8
   cnn_std_1200_3   | 6.66 |     5      |  1200   |      0      |    4   |   198   |   0.008    |    3    |     7
  • READ paper

Denoising & Farfield ASR

  • ICASSP paper submitted.
  • HOLD

VAD

  • Frame energy feature extraction: done.
  • Harmonics and Teager energy features are being investigated; a frame-energy / Teager-energy sketch follows this list.
  • Previous results to be organized into a paper.
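
A minimal sketch of the two feature types mentioned above, log frame energy and the Teager-Kaiser energy operator; the 16 kHz rate and frame sizes are assumptions for illustration.

    import numpy as np

    def log_frame_energy(x, rate=16000, frame_ms=25, shift_ms=10):
        """Log energy per frame (assumed 25 ms frames, 10 ms shift)."""
        flen, shift = rate * frame_ms // 1000, rate * shift_ms // 1000
        n = 1 + max(0, (len(x) - flen) // shift)
        return np.array([np.log(np.sum(x[i*shift:i*shift+flen].astype(np.float64)**2) + 1e-10)
                         for i in range(n)])

    def teager_energy(x):
        # Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]
        x = x.astype(np.float64)
        return x[1:-1]**2 - x[:-2]*x[2:]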

Speech rate training

  • 100 h randomly selected from the 1000 h tec dataset.
  • Baseline and ROS nnet training done; decoding will start soon.
  • The ROS model seems superior to the normal one on fast speech.

low resource language AM training

  • HOLD
  • Uyghur language model has been released to JT. Done.

Scoring

  • Timbre comparison is under testing.

Confidence

  • Reproduce the experiments on the Fisher dataset.
  • Use the Fisher DNN model to decode the all-wsj dataset.
  • Preparing scoring for the puqiang data.

Speaker ID

  • Preparing the GMM-based server.
  • EER ~ 11.2% (GMM-based system); a sketch of the EER computation follows this list.
  • Test different numbers of components; fast i-vector computation.
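
A hedged sketch of how an EER such as the 11.2% above can be computed from target and non-target trial scores; a simple threshold sweep for illustration, not the actual evaluation script.

    import numpy as np

    def eer(target_scores, nontarget_scores):
        """Sweep thresholds until false rejection catches up with false acceptance."""
        for t in np.sort(np.concatenate([target_scores, nontarget_scores])):
            frr = np.mean(target_scores < t)       # targets wrongly rejected
            far = np.mean(nontarget_scores >= t)   # non-targets wrongly accepted
            if frr >= far:
                return (frr + far) / 2.0
        return 1.0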

Language ID

  • The GMM-based language ID system is ready.
  • Delivered to Jietong.

Emotion detection

  • Sinovoice is implementing the server


Text Processing

LM development

Domain specific LM

  • domain lm
  • Merge the weibo, baiduhi and baiduzhidao LMs and test (results still needed).
  • Confirm the size of the ARPA LM with xiaomin for the business application (e.g. 1e-13).
  • Get the general test data from miaomin; this test set may come from online data.
  • Trained a new LM: mobile.
  • Find the optimal lambda for interpolating the following LMs: baidu_hi, mobile, sichuanmobile (a sketch of the weight search follows this list).
  • Train some more LMs with Zhenlong.
  • Keep on training the sogou2T LM.
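
Finding the interpolation lambdas can be done with a simple EM over per-LM word probabilities on held-out text, in the spirit of SRILM's compute-best-mix; the sketch below assumes a precomputed probs matrix and is only an illustration.

    import numpy as np

    def best_mix(probs, iters=50):
        """probs: (n_words, n_lms) per-word probabilities on dev text."""
        lam = np.full(probs.shape[1], 1.0 / probs.shape[1])
        for _ in range(iters):
            post = probs * lam                        # E-step: per-word responsibilities
            post /= post.sum(axis=1, keepdims=True)
            lam = post.mean(axis=0)                   # M-step: updated mixture weights
        return lam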


  • new dict.
    • Tested the earlier vocabulary on 6000.txt with perplexity:

                      old150K | new166K | new150K
      baiduzhidao     394     | 369     | 333
      baiduhi         217     | 190     | 188

    • Segmented the baiduzhidao, baiduhi and weibo corpora with the new 150576-word dictionary; Hanzhenglong will test it.
    • Built new 100K, 150K and 200K vocabularies.
    • Fixed some bugs in the sogou dict spider.
    • New toolkit: find a method to update the new dict; it can get a new word list from sogou and word information from baidu. (two weeks)

tag LM

  • set new test and got some good results: http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/11-16_Bin_Yuan#Result
  • problems:
    • fix the bug with the "#0" symbol
    • check the code: the result differs across work folders
  • to do:
    • check the relation between the tag weight and the size of the dict
    • short terms should be penalized
    • write a summary about tag-lm
    • mix the seed LM with the big LM; test address-tag on the big LM
  • results when "北京" is not in the corpus (tag-lm); a corpus-tagging sketch follows the table:

    method | baseline | weight0.1 | weight0.5 | weight1 | weight2 | weight3
    ------------------------------------------------------------------------
    wer    | 56.58    | 69.49     | 62.23     | 58.03   | 56.90   | -
    "北京" | 6/10     | 4/10      | 4/10      | 2/10    | 1/10    | 0
    detail | 288 ins, 5075 del, 3178 sub | 196 ins, 6016 del, 4278 sub | 190 ins, 5870 del, 3334 sub | 243 ins, 5294 del, 3223 sub | 344 ins, 4558 del, 3687 sub | -
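
As a reference for the tag-lm setup, a minimal sketch of the corpus-tagging step: entity words (e.g. 北京) are replaced by a class tag before n-gram training, and the tag is expanded with a weighted word list at decode time. The sketch shows only the tagging, and the tag name is an assumption.

    def tag_corpus(lines, tag_words, tag="<address>"):
        """Replace any token found in tag_words with the class tag."""
        return [" ".join(tag if tok in tag_words else tok for tok in line.split())
                for line in lines]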

RNN LM

  • rnn
    • RNNLM => ARPA conversion: write a report.
    • test RNNLM on Chinese data from jietong-data
    • check the rnnlm code: how the learning rate is initialized and updated (a sketch of such a schedule follows this list).
  • lstm+rnn
    • check the lstm-rnnlm code: how the learning rate is initialized and updated.
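
For reference, Mikolov-style rnnlm training starts from a fixed learning rate and halves it each epoch once the validation perplexity stops improving by a minimum factor; a hedged sketch, with names and the threshold illustrative only.

    def next_lr(lr, prev_ppl, ppl, halving, min_improvement=1.003):
        """Return the learning rate for the next epoch and the halving flag."""
        if prev_ppl is not None and ppl * min_improvement > prev_ppl:
            halving = True                 # dev perplexity no longer improving enough
        if halving:
            lr /= 2.0
        return lr, halving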

Word2Vector

W2V based doc classification

  • Initial results with the variational Bayesian GMM obtained; performance is not as good as the conventional GMM.
  • Non-linear inter-language transform, English-Spanish-Czech: wv model training done; the transform model is under investigation (a linear-map sketch follows this list).
  • SSA-based local linear mapping is still running.
  • Number of k-means classes changed to 2.
  • Knowledge vector started:
    • format the data
    • give a basic report
    • yuanbin will continue this work with the help of xingchao
  • Character-to-word conversion:
    • prepare the task: word similarity
    • prepare the dict
  • Google word vector training: some ideas will be discussed in the weekly report.
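
The non-linear inter-language transform can be checked against the simple linear baseline (Mikolov et al.): learn a matrix W that maps source-language word vectors to target-language vectors by least squares over a seed dictionary. A sketch, with X and Y as assumed inputs.

    import numpy as np

    def learn_transform(X, Y):
        """X: (n_pairs, src_dim), Y: (n_pairs, tgt_dim); rows are translation pairs."""
        W, *_ = np.linalg.lstsq(X, Y, rcond=None)
        return W  # map a new source vector v with v @ W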

Translation

  • v4.0 demo released
  • Cut down the dict and use the new segmentation tool.

QA

detail: http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/Hulan-2014-11-06

Spell mistake

  • Retrain the n-gram model (caoli); a toy noisy-channel sketch follows this list.
  • Prepare the test and development sets (caoli).
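
A toy noisy-channel speller for reference: generate edit-distance-1 candidates and rank them with the n-gram LM. The lm_score interface and the alphabet are assumptions for this sketch, not the project's actual code.

    def edits1(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes  = [l + r[1:] for l, r in splits if r]
        replaces = [l + c + r[1:] for l, r in splits if r for c in alphabet]
        inserts  = [l + c + r for l, r in splits for c in alphabet]
        return set(deletes + replaces + inserts)

    def correct(word, context, vocab, lm_score):
        """Pick the in-vocabulary candidate the n-gram LM scores highest."""
        candidates = [w for w in edits1(word) | {word} if w in vocab]
        return max(candidates, key=lambda w: lm_score(context + [w])) if candidates else word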

improve fuzzy match

  • Add synonym similarity using the MERT-4 method. (hold)

improve lucene search

  • our vsm method; different results in lucene:

    method   | lucene | vsm_idf(haiguan) | vsm_idf(baidu) | vsm_idf(train) | vsm_idf(calculate)
    --------------------------------------------------------------------------------------------
    accuracy | 0.6628 | 0.6228           | 0.6197         | 0.5827         | 0.5426

  • lucene top
    • top10 (82.95%), top20 (86.34%), top50 (90.23%), top100 (94.11%), top200 (96.18%), top1000 (97.31%), top2000 (97.87%), top5000 (98.75%), top10000 (99.06%)
    • test the result of top(100,200,1000) in the full QA pipeline (lucene+fuzzymatch) (caoli)
  • lucene optimization (liurong)
    • rewrite the method to select the 50 standard questions so that they do not share a template (liurong)
    • check the word segmentation for templates (liurong)
    • boost the query keywords using IDF (a query-boost sketch follows this list):

    method   | default | idf_train | idf_train_norm | idf_baidu | idf_baidu_norm
    ------------------------------------------------------------------------------
    accuracy | 0.66228 | 0.651629  | 0.57644        | 0.647869  | 0.65288

  • use the MERT-4 method to find good weights for multiple features, like IDF, NER, baidu_weight, keyword, etc. (liurong, this month)
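
The IDF boosting above can be expressed directly in Lucene query syntax, which supports a per-term ^boost; a sketch where the document-frequency table and corpus size are assumed inputs.

    import math

    def boosted_query(terms, doc_freq, n_docs):
        """Build a Lucene query string like 'keyword^2.31 word^0.85'."""
        parts = []
        for t in terms:
            idf = math.log(n_docs / (1.0 + doc_freq.get(t, 0)))
            parts.append("%s^%.2f" % (t, max(idf, 0.1)))  # floor keeps the boost positive
        return " ".join(parts)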

Multi-Scene Recognition

  • Add the triples search to the QA engine.
  • Discuss the details and write a report. (liurong)
  • demo (liurong, two weeks)


  • The new intern will install SEMPRE.