Xinsong-beamforming-result

Far-field SNR: 14 dB, restaurant. The training data, recorded on 2016.4.22, has 250 sentences; the test data has 11 sentences. All data is 16 kHz, 16-bit.

AM: 10000h, 7*2048, MPE; LM: 1e-7, 5-gram

ID3: test data that is also included in the DAE training set

baseline:
--------------------------------------------------------------------------------
                | test random WER (%) | test ID3 WER (%) (within training set)
--------------------------------------------------------------------------------
    c1          |        39.74        |        36.89
--------------------------------------------------------------------------------
    c2          |        30.77        |        34.95
--------------------------------------------------------------------------------
    c3          |        34.62        |        37.38
--------------------------------------------------------------------------------
    c4          |        42.31        |        36.89
--------------------------------------------------------------------------------
   near         |        21.79        |         4.37
--------------------------------------------------------------------------------
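
The WER columns are percentages. As a reminder of how such a figure is typically computed, here is a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words). The function name `wer` and the whitespace tokenization are assumptions for illustration; the scoring tool actually used for this page is not stated.

```python
def wer(reference, hypothesis):
    """Word error rate in percent, via Levenshtein distance over tokens."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the processed part of ref and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cur[j] = min(
                prev[j] + 1,             # deletion (reference word missing)
                cur[j - 1] + 1,          # insertion (extra hypothesis word)
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            )
        prev = cur
    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against a 4-word reference.
example = wer("a b c d", "a x c")
```

For character-based scoring, the same routine applies if each character is separated by a space before the call.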

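beamforming:
--------------------------------------------------------------------------
                                | test random WER (%) | test ID3 WER (%)
--------------------------------------------------------------------------
 DS_post                        |        26.92        |        29.13
--------------------------------------------------------------------------
 SD_post                        |        23.08        |        26.70        (second)
--------------------------------------------------------------------------
 MVDR_post                      |        26.92        |        28.64
--------------------------------------------------------------------------
 sino beamforming_from_xiaoming |        26.92        |        31.07
--------------------------------------------------------------------------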
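
For orientation, the simplest of these methods can be sketched as follows, assuming DS stands for delay-and-sum beamforming: each channel is shifted to undo its arrival delay and the aligned channels are averaged. This is only an illustration with whole-sample delays supplied by hand; `delay_and_sum` and its arguments are hypothetical names, and it does not reproduce the post-filtering or the delay estimation used in the experiments above.

```python
def delay_and_sum(channels, delays):
    """Average time-aligned channels.

    channels: list of equal-length lists of samples, one per microphone.
    delays:   per-channel arrival delay in whole samples (hypothetical input;
              in practice it would be estimated, e.g. by cross-correlation).
    """
    if len(channels) != len(delays):
        raise ValueError("need one delay per channel")
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for t in range(n):
            src = t + delay  # read ahead to undo the arrival delay
            if 0 <= src < n:
                out[t] += signal[src]
    return [v / len(channels) for v in out]


# Toy example: the same pulse reaches the second microphone 2 samples later.
mic1 = [0.0] * 10
mic2 = [0.0] * 10
mic1[3] = 1.0
mic2[5] = 1.0
aligned = delay_and_sum([mic1, mic2], [0, 2])
unaligned = delay_and_sum([mic1, mic2], [0, 0])
```

With the correct delays the pulse adds coherently and keeps its full amplitude in `aligned`; with zero delays it is smeared across two samples at half amplitude in `unaligned`.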

Four-channel cnn_tdnn model: 160 Fbanks

----------------------------------------------------------------------------------------------
                                                      | test random WER (%) | test ID3 WER (%)
----------------------------------------------------------------------------------------------
 tdnn_dae_1*1024_tdnn_lr_0.008                        |        38.46        |        10.19
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_1*128_cnn_1*1024_tdnn_lr_0.008          |        39.74        |         5.34
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_1*128_cnn_1*512_tdnn_lr_0.008           |        48.72        |         7.28
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_1*64_cnn_1*1024_tdnn_lr_0.008           |        39.74        |         7.28
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*64_cnn_1*1024_tdnn_lr_0.008           |        50.00        |        25.24
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*64_cnn_1*1024_tdnn_lr_0.008_nopooling |        44.87        |        23.30
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_1*32_cnn_1*1024_tdnn_lr_0.008           |        50.00        |         8.25
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*32_cnn_1*1024_tdnn_lr_0.008           |        46.15        |        33.98
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*32_cnn_1*1024_tdnn_lr_0.008_nopooling |        47.44        |        28.16
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_1*16_cnn_1*1024_tdnn_lr_0.008           |        46.15        |         8.25
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*16_cnn_1*1024_tdnn_lr_0.008           |        62.82        |        17.96
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*16_cnn_1*1024_tdnn_lr_0.008_nopooling |        50.00        |        22.82
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*16_cnn_1*256_tdnn_lr_0.008            |        48.72        |        16.99
----------------------------------------------------------------------------------------------
 cnn_tdnn_dae_2*16_cnn_1*256_tdnn_lr_0.008_nopooling  |        48.72        |        19.90
----------------------------------------------------------------------------------------------


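Single-channel cnn_tdnn model: 40 Fbanks
--------------------------------------------------------------------------------------------
                                                    | test random WER (%) | test ID3 WER (%)
--------------------------------------------------------------------------------------------
 cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c1 |        25.64        |         9.22
--------------------------------------------------------------------------------------------
 cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c2 |        28.21        |         7.28
--------------------------------------------------------------------------------------------
 cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c3 |        20.51        |         4.37        (first)
--------------------------------------------------------------------------------------------
 cnn_tdnn_mix_dae_1*128_cnn_1*1024_tdnn_lr_0.008_c4 |        21.79        |         5.34
--------------------------------------------------------------------------------------------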
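
To compare systems across the tables on this page, a relative WER reduction is often easier to read than the raw difference. The sketch below is only a convenience calculation; `relative_wer_reduction` is a hypothetical helper name, and the inputs are the "test random" figures for c3 (baseline vs. single-channel model) and for SD_post vs. the best far-field baseline channel (c2).

```python
def relative_wer_reduction(baseline_wer, system_wer):
    """Relative WER reduction in percent; positive means the system is better."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Channel c3: baseline 34.62 vs. single-channel cnn_tdnn 20.51 (test random).
c3_gain = relative_wer_reduction(34.62, 20.51)

# SD_post 23.08 vs. the best far-field baseline channel, c2 at 30.77 (test random).
sd_gain = relative_wer_reduction(30.77, 23.08)

print(f"c3 single-channel model: {c3_gain:.2f}% relative reduction")
print(f"SD_post vs. c2 baseline: {sd_gain:.2f}% relative reduction")
```

With only 11 test sentences, differences of a few points between rows correspond to a handful of words, so these ratios are indicative at best.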