2013-04-26
Data sharing
- AM/lexicon/LM are shared.
- LM count files are still being transferred.
DNN progress
400 hour DNN training
Test Set | Tencent Baseline | bMMI | fMMI | BN | Hybrid |
---|---|---|---|---|---|
1900 | 8.4 | 7.65 | 7.35 | 6.57 | |
2044 | 22.4 | 24.44 | 24.03 | 21.77 | |
online1 | 35.6 | 34.66 | 34.33 | 31.44 | |
online2 | 29.6 | 27.23 | 26.80 | 24.10 | |
map | 24.5 | 27.54 | 27.69 | 23.79 | |
notepad | 16 | 19.81 | 21.75 | 15.81 | |
general | 36 | 38.52 | 38.90 | 33.61 | |
speedup | 26.8 | 27.88 | 26.81 | 22.82 | |
- The Tencent baseline uses 700 h of online data + 700 h of 863 data, HLDA+MPE, and an 88k lexicon.
- Our results use a 400-hour AM and the 88k LM, trained with ML + bMMI.
Tencent test result
- AM: 70 h of training data (2 days, 15 machines, 10 threads)
- LM: 88k LM
- Test case: general
Feature | GMM-bMMI | DNN | DNN-MMI |
---|---|---|---|
PLP(-5,+5) | 38.4 | 26.5 | 23.8 |
PLP+LDA+MLLT(-5,+5) | 38.4 | 28.7 | |
GPU & CPU merge
- Investigate the possibility of merging the GPU and CPU code, and try to find an easier way to do so (1 week); one common approach is sketched below.
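
A minimal sketch of one way such a merge is often done, assuming a numpy-style API: hide the matrix operations behind a thin backend object so GPU and CPU builds share a single training code path. `CpuBackend`, `GpuBackend`, and `forward` are hypothetical names for illustration, not the actual CSLT code.

```python
import numpy as np

class CpuBackend:
    """CPU matrix operations implemented with numpy."""
    def matmul(self, a, b):
        return np.dot(a, b)

    def sigmoid(self, x):
        return 1.0 / (1.0 + np.exp(-x))

class GpuBackend(CpuBackend):
    """Would override the same methods with GPU kernels; the DNN code
    below never needs to know which backend it was handed."""
    pass  # GPU implementation omitted in this sketch

def forward(backend, weights, biases, x):
    """One DNN forward pass, identical for either backend."""
    h = x
    for w, b in zip(weights, biases):
        h = backend.sigmoid(backend.matmul(h, w) + b)
    return h
```

With this split, only the backend classes differ between builds, which is the sense in which the GPU and CPU code could be "merged".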
L1 sparse initial training
- Starting to investigate; a sketch of the basic idea follows.
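
A minimal sketch of L1-driven sparsification, assuming plain SGD (the function name and hyperparameters are hypothetical): after each gradient step, apply the soft-thresholding operator, which is the closed-form proximal update for an L1 penalty and drives small weights exactly to zero.

```python
import numpy as np

def sgd_step_l1(w, grad, lr=0.1, l1=0.01):
    """One SGD step followed by the L1 proximal (soft-threshold) update."""
    w = w - lr * grad                                    # plain gradient step
    return np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)

# Tiny weights are thresholded to zero, large ones shrink slightly:
w = np.array([0.5, -0.00005, 0.02, -0.3])
print(sgd_step_l1(w, np.zeros_like(w)))  # second entry becomes 0
```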
Kaldi/HTK merge
- HTK2Kaldi: the tool that comes with Kaldi does not work.
- Kaldi2HTK: implementation done; testing still to be confirmed (a sketch of the HTK-format handling involved follows).
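
A minimal sketch of one ingredient of such a bridge, assuming the standard file formats: read an HTK feature file (12-byte big-endian header of nSamples, sampPeriod, sampSize, parmKind, followed by float32 frames) and emit it as a Kaldi text-format archive entry. The real HTK2Kaldi/Kaldi2HTK task also covers models; `htk_to_kaldi_text_ark` is a hypothetical helper name.

```python
import struct
import numpy as np

def htk_to_kaldi_text_ark(htk_path, utt_id, out):
    """Convert one HTK feature file into a Kaldi text-ark entry."""
    with open(htk_path, "rb") as f:
        n_samples, samp_period, samp_size, parm_kind = struct.unpack(
            ">iihh", f.read(12))          # big-endian HTK header
        dim = samp_size // 4              # 4 bytes per float32 coefficient
        feats = np.frombuffer(f.read(n_samples * samp_size),
                              dtype=">f4").reshape(n_samples, dim)
    out.write("%s  [\n" % utt_id)         # Kaldi text-matrix entry
    for row in feats:
        out.write("  " + " ".join("%g" % v for v in row) + "\n")
    out.write(" ]\n")
```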
Embedded progress
- Significant speed degradation on the embedded platform (running at roughly 1/60 of the normal speed).
- Planning for a sparse DNN to cut the computation (see the sketch after this list).
- QA LM training still fails; Mengyuan needs to do more work on this.
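
A minimal sketch of why a sparse DNN helps on the embedded side: with a layer's weight matrix stored in CSR form, the matrix-vector product only touches the nonzero weights, so the cost scales with the sparsity rather than with the full layer size. `csr_matvec` is illustrative, not the planned implementation.

```python
import numpy as np

def csr_matvec(data, indices, indptr, x):
    """Compute y = W @ x where W is stored in CSR (data/indices/indptr)."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

# Example: W = [[1,0,2],[0,3,0]] times x = [1,1,1] gives [3, 3].
print(csr_matvec(np.array([1., 2., 3.]), np.array([0, 2, 1]),
                 np.array([0, 2, 3]), np.array([1., 1., 1.])))
```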