"Gigabye LM": difference between revisions

From cslt Wiki

Latest revision as of 08:23, 14 September 2012 (Friday)

1. Initial character-based models, without any pruning. Sizes and perplexities are listed below.

Training uses the Gigabytes data except the cna portion; perplexity (ppl) is measured on a subset of the cna data, with big52gb applied.

2gram:

25M 2gram.4000.gz: 0 zeroprobs, logprob= -9.39983e+06 ppl= 161.965 ppl1= 177.141


3gram:

47M 3gram.500.gz: 0 zeroprobs, logprob= -6.34868e+06 ppl= 85.1361 ppl1= 94.2525

117M 3gram.1000.gz: 0 zeroprobs, logprob= -7.43809e+06 ppl= 80.6408 ppl1= 87.7439

195M 3gram.2000.gz: 0 zeroprobs, logprob= -7.95872e+06 ppl= 79.9875 ppl1= 86.5196

221M 3gram.3000.gz: 0 zeroprobs, logprob= -8.04799e+06 ppl= 80.2418 ppl1= 86.7277

229M 3gram.4000.gz: 0 zeroprobs, logprob= -8.15697e+06 ppl= 82.6585 ppl1= 89.3392

4gram:

205M 4gram.500.gz: 0 zeroprobs, logprob= -6.25395e+06 ppl= 79.6739 ppl1= 88.0716

472M 4gram.1000.gz: 0 zeroprobs, logprob= -7.21607e+06 ppl= 70.737 ppl1= 76.774
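
The logprob, ppl and ppl1 figures above are tied together by a simple relation. The sketch below assumes they follow the SRILM `ngram -ppl` convention, where ppl = 10^(-logprob / N) with N counting scored tokens plus sentence ends, and ppl1 uses the same formula without the sentence ends. The function names are illustrative and not part of any tool. Inverting the formula recovers the implied N for each line, which suggests that the models with a smaller number in the file name (presumably the character vocabulary size) score fewer tokens, so their perplexities are not directly comparable with the 4000 models.

```python
import math


def implied_token_count(logprob: float, ppl: float) -> float:
    """Invert ppl = 10 ** (-logprob / n) to recover the denominator n."""
    return -logprob / math.log10(ppl)


def perplexity(logprob: float, n: float) -> float:
    """Perplexity under the assumed convention ppl = 10 ** (-logprob / n)."""
    return 10.0 ** (-logprob / n)


# (logprob, ppl, ppl1) copied from the result lines above.
reported = {
    "2gram.4000": (-9.39983e06, 161.965, 177.141),
    "3gram.500": (-6.34868e06, 85.1361, 94.2525),
    "3gram.4000": (-8.15697e06, 82.6585, 89.3392),
}


def summarize(name: str) -> tuple[float, float]:
    """Return (tokens + sentence ends, tokens only) implied by one line."""
    logprob, ppl, ppl1 = reported[name]
    n_with_eos = implied_token_count(logprob, ppl)
    n_tokens = implied_token_count(logprob, ppl1)
    return n_with_eos, n_tokens


if __name__ == "__main__":
    for name in reported:
        n_with_eos, n_tokens = summarize(name)
        print(f"{name}: N(ppl) ~ {n_with_eos:,.0f}, N(ppl1) ~ {n_tokens:,.0f}")
```

The difference between the two counts for one model is the implied number of sentences in the test subset, under the same assumption.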


2. pruning the 4k 3gram LM.

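{| class="wikitable"
! Model !! 2gram threshold !! 3gram threshold !! size !! ppl !! fst size
|-
| 1 || 1e-7 || 1e-7 || 30M || 102.796 || 860M
|-
| 2 || 1e-6 || 1e-6 || 5M || 150.96 || 152M
|-
| 3 || 1e-7 || 1e-6 || 11M || 137.467 || 224M
|}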
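
The size/perplexity trade-off of pruning can be made explicit with a little arithmetic. The sketch below assumes that the "4k 3gram LM" being pruned is the unpruned 3gram.4000.gz from section 1 (229M, ppl 82.6585) and that the pruned perplexities were measured on the same test subset; neither is stated on this page. The names `BASE_SIZE_M`, `BASE_PPL` and `tradeoff` are illustrative.

```python
# Assumed reference point: unpruned 3gram.4000.gz from section 1.
BASE_SIZE_M = 229.0
BASE_PPL = 82.6585

# Model id -> (LM size in M, ppl, FST size in M), copied from the table above.
pruned = {
    1: (30.0, 102.796, 860.0),
    2: (5.0, 150.96, 152.0),
    3: (11.0, 137.467, 224.0),
}


def tradeoff(size_m: float, ppl: float) -> tuple[float, float]:
    """Return (relative size reduction, relative ppl increase) vs. the base LM."""
    size_reduction = 1.0 - size_m / BASE_SIZE_M
    ppl_increase = ppl / BASE_PPL - 1.0
    return size_reduction, ppl_increase


if __name__ == "__main__":
    for model, (size_m, ppl, fst_m) in pruned.items():
        size_red, ppl_inc = tradeoff(size_m, ppl)
        print(
            f"model {model}: size -{size_red:.1%}, ppl +{ppl_inc:.1%}, "
            f"fst/LM size ratio {fst_m / size_m:.1f}"
        )
```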

3. word-based 3-gram

tri-gram size:

{| class="wikitable"
! - !! org !! th-7 !! th-7/6 !! th-6
|-
| 10k || 52M || 23M || 8M || 4M
|-
| 20k || 57M || 24M || 9M || 4M
|}

final fst size:

{| class="wikitable"
! - !! org !! th-7 !! th-7/6 !! th-6
|-
| 10k || - || 770M || 193M || 135M
|-
| 20k || - || - || 217M || 142M
|}

Tests are performed on 863 M49 with the LDA+MLLT (tri2b) system, measured by character error rate (CER). The NUM part is removed from the decoding results. The pair following each CER is (1/acweight, t/utt).

{| class="wikitable"
! - !! th-6 !! th-7/6 !! th-7
|-
| 10k || 23.77(13,0.92) || 22.41(11,0.93) || 21.96(11,0.93)
|-
| 20k || 21.92(13,0.99) || 20.33(12,0.97) || 19.38(12,0.96)
|}
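
Each cell packs three numbers into the form CER(1/acweight, t/utt). A minimal sketch of how such cells could be unpacked is given below; the `Result` type, the regular expression and `parse_cell` are illustrative helpers, not part of the decoding setup, and t/utt is carried through unchanged because its unit is not stated here.

```python
import re
from typing import NamedTuple


class Result(NamedTuple):
    cer: float        # character error rate, in percent
    acweight: float   # acoustic weight: reciprocal of the first number in the pair
    t_per_utt: float  # second number in the pair, copied as-is


# Matches cells such as "23.77(13,0.92)" or "22.95(13, 1.0)".
CELL = re.compile(r"\s*([\d.]+)\(\s*([\d.]+)\s*,\s*([\d.]+)\s*\)\s*")


def parse_cell(cell: str) -> Result:
    """Split one 'CER(1/acweight, t/utt)' cell into its three numbers."""
    match = CELL.fullmatch(cell)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    cer, inv_acweight, t_per_utt = map(float, match.groups())
    return Result(cer, 1.0 / inv_acweight, t_per_utt)


# tri2b rows copied from the table above; columns are th-6, th-7/6, th-7.
rows = {
    "10k": ["23.77(13,0.92)", "22.41(11,0.93)", "21.96(11,0.93)"],
    "20k": ["21.92(13,0.99)", "20.33(12,0.97)", "19.38(12,0.96)"],
}
parsed = {lm: [parse_cell(c) for c in cells] for lm, cells in rows.items()}

if __name__ == "__main__":
    for lm, results in parsed.items():
        for r in results:
            print(f"{lm}: CER {r.cer:.2f}, acweight {r.acweight:.4f}, t/utt {r.t_per_utt}")
```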

Results with LDA+MLLT+MMI

{| class="wikitable"
! - !! th-6 !! th-7/6 !! th-7
|-
| 10k || 22.95(13,1.0) || 21.83(13,1.0) || 21.41(10,0.98)
|-
| 20k || 20.71(11,1.1) || 19.26(11,1.1) || 18.44(10,1.1)
|}


Results with LDA+MLLT+bMMI

{| class="wikitable"
! - !! th-6 !! th-7/6 !! th-7
|-
| 10k || 22.68(10,1.0) || 21.46(10,1.0) || 20.96(10,1.0)
|-
| 20k || 20.39(12,1.1) || 18.97(11,1.1) || 18.23(10,1.1)
|}
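
To compare the three acoustic models side by side, the CER values from the three tables can be collected and turned into relative reductions. The following is a sketch only: the dictionary layout and the helper names `relative_reduction` and `best_configuration` are illustrative, and the numbers are copied from the tables above with the (1/acweight, t/utt) pairs dropped.

```python
COLUMNS = ("th-6", "th-7/6", "th-7")

# system -> LM -> CER (%) per column, copied from the three tables above.
cer = {
    "tri2b": {"10k": (23.77, 22.41, 21.96), "20k": (21.92, 20.33, 19.38)},
    "MMI": {"10k": (22.95, 21.83, 21.41), "20k": (20.71, 19.26, 18.44)},
    "bMMI": {"10k": (22.68, 21.46, 20.96), "20k": (20.39, 18.97, 18.23)},
}


def relative_reduction(old: float, new: float) -> float:
    """Relative CER reduction when moving from `old` to `new`."""
    return (old - new) / old


def best_configuration() -> tuple[float, str, str, str]:
    """Return (CER, system, LM, column) for the lowest CER in the tables."""
    return min(
        (value, system, lm, COLUMNS[i])
        for system, by_lm in cer.items()
        for lm, values in by_lm.items()
        for i, value in enumerate(values)
    )


if __name__ == "__main__":
    print("best:", best_configuration())
    for lm in ("10k", "20k"):
        for i, col in enumerate(COLUMNS):
            gain = relative_reduction(cer["tri2b"][lm][i], cer["bMMI"][lm][i])
            print(f"{lm} {col}: tri2b -> bMMI relative CER reduction {gain:.1%}")
```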