2013-07-22

Data sharing

  • LM count files still undelivered!

DNN progress

Experiments

  • Sparse DNN.

                  1200-1200-1200-3536   1200-1200-1200-3536-sparse0.3 (sparsity 1/5)
  original atlas: RT 2.3                RT 2.3
  atlas sparse:   RT 54                 RT 14
  NIST smatmat:   RT 27.3               RT 5.98

                  800-800-800-2108      800-800-800-2108-sparse0.3 (sparsity 2/5)
  original atlas: RT 1.1                RT 1.1
  NIST smatmat:   RT 11.9               RT 5.5
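For intuition, the gap between dense BLAS and sparse kernels is easy to reproduce in miniature. The sketch below is our own illustration, not the ATLAS/NIST smatmat benchmark above: it times a dense matrix-vector product against a SciPy CSR product on one 1200x1200 layer, using the 1/5 rate from the table and the 1/15 break-even quoted in the conclusions below; absolute RT numbers will of course differ.

 import time
 import numpy as np
 import scipy.sparse as sp

 def time_matvec(mult, x, repeats=100):
     """Average wall-clock seconds for one call of mult(x)."""
     t0 = time.perf_counter()
     for _ in range(repeats):
         mult(x)
     return (time.perf_counter() - t0) / repeats

 rng = np.random.default_rng(0)
 W = rng.standard_normal((1200, 1200)).astype(np.float32)  # one hidden layer
 x = rng.standard_normal(1200).astype(np.float32)

 for keep in (1 / 5, 1 / 15):  # fraction of weights kept non-zero
     mask = rng.random(W.shape) < keep
     W_sp = sp.csr_matrix(np.where(mask, W, 0.0))
     t_dense = time_matvec(lambda v: W @ v, x)
     t_sparse = time_matvec(lambda v: W_sp @ v, x)
     print(f"keep {keep:.3f}: dense {t_dense*1e6:8.1f} us/mult, "
           f"CSR {t_sparse*1e6:8.1f} us/mult")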

Conclusions:

  1. ATLAS works well for both the non-sparse and the sparse models.
  2. Sparsity does not pay off when the sparsity rate is modest: sparse computation outperforms dense computation only once the weights are compressed beyond about 1/15 (i.e. fewer than ~1/15 of the weights kept).
  3. In other words, the first cost of employing sparsity is the error-rate increase that comes with compressing the weights to 1/15.
  4. The sparse approach seems more useful for storage: once more than half of the weights are zero, CSR/CSC formats start to save space (see the sketch after this list).
  5. Possibly try unit-based sparsity instead of weight-level sparsity.
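The 1/2 break-even in point 4 is simple arithmetic: with 4-byte values and 4-byte column indices, CSR spends roughly 8 bytes per non-zero against 4 bytes per dense entry, plus one row pointer per row. A minimal sketch of the calculation (the helper names are ours, not from the experiments):

 def dense_bytes(rows, cols, val_bytes=4):
     return rows * cols * val_bytes

 def csr_bytes(rows, cols, nnz, val_bytes=4, idx_bytes=4):
     # values + column indices + (rows + 1) row pointers
     return nnz * (val_bytes + idx_bytes) + (rows + 1) * idx_bytes

 rows, cols = 1200, 1200
 for keep in (1.0, 1 / 2, 1 / 5, 1 / 15):
     nnz = int(rows * cols * keep)
     ratio = csr_bytes(rows, cols, nnz) / dense_bytes(rows, cols)
     print(f"non-zeros kept {keep:.3f}: CSR/dense storage ratio {ratio:.2f}")

Running it shows the ratio crossing 1.0 right at 1/2 non-zeros, matching the conclusion.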

Tencent exps

GPU & CPU merge

  1. Hold


Embedded progress

  • Tested various PS models:
 ID          model     feature  WER    RT    storage
 semi_10000  semi HMM  s2-4x    6.30%  0.80  10.2M
 semi_5000   semi HMM  s2-4x    6.70%  0.74  5.2M
 semi_5000   semi HMM  1c-d-dd  9.11%  0.91  1.3M
 ptm_5000    PTM HMM   s2-4x    6.47%  2.15  1.3M

So there is no single model that wins on all the criteria; semi_5000 looks like an acceptable trade-off.
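As a quick check of that conclusion, the sketch below (a hypothetical helper, with the numbers copied from the table above) tests whether any model dominates another on WER, RT and storage simultaneously; with these figures none does, so each model is a different trade-off.

 models = {
     "semi_10000":        (6.30, 0.80, 10.2),
     "semi_5000/s2-4x":   (6.70, 0.74, 5.2),
     "semi_5000/1c-d-dd": (9.11, 0.91, 1.3),
     "ptm_5000":          (6.47, 2.15, 1.3),
 }

 def dominates(a, b):
     # a dominates b if it is at least as good everywhere (lower is
     # better for all three criteria) and strictly better somewhere.
     return all(x <= y for x, y in zip(a, b)) and a != b

 for name, vals in models.items():
     rivals = [n for n, v in models.items()
               if n != name and dominates(v, vals)]
     print(name, "dominated by:", rivals or "nobody")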