2013-07-26
From cslt Wiki
Contents
Data sharing
- LM count files still undelivered!
DNN progress
Experiments
- Discriminative DNN
Sequence-discriminative training with MPE, using DNN-based alignments and denominator lattices. Network structure: 100 + 4 × 8000 + 2100:
 | cross-entropy (original) | MPE (it1) | MPE (it2) | MPE (it3) | MPE (it4) |
---|---|---|---|---|---|
map | 22.98 | 23.91 | | | |
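As a reminder of what the MPE iterations above optimize, here is a toy sketch of the MPE objective: the expected raw accuracy over lattice hypotheses weighted by their posteriors. The hypothesis set and numbers are made up for illustration and have nothing to do with the experiment in the table.

```python
# Toy illustration of the MPE objective, sum_s P(s|O) * A(s),
# over a small (pruned) hypothesis set. Numbers are invented.

def expected_accuracy(hyps):
    """hyps: list of (posterior, raw_accuracy) pairs from a lattice.
    Returns the expected accuracy that MPE training maximizes."""
    total_post = sum(p for p, _ in hyps)
    # Normalize in case pruned-lattice posteriors do not sum to 1.
    return sum(p * a for p, a in hyps) / total_post

hyps = [(0.6, 5.0),   # best path: posterior 0.6, 5 correct phones
        (0.3, 4.0),
        (0.1, 2.0)]
print(expected_accuracy(hyps))  # 0.6*5 + 0.3*4 + 0.1*2 = 4.4
```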
- Sparse DNN on the ARM board
 | 1200-1200-1200-3536 | 1200-1200-1200-3536-sparse0.3 (sparsity 1/5) |
---|---|---|
original atlas | RT 2.3 | RT 2.3 |
atlas sparse | RT 54 | RT 14 |
NIST smatmat | RT 27.3 | RT 5.98 |

 | 800-800-800-2108 | 800-800-800-2108-sparse0.3 (sparsity 2/5) | 600-600-600-1500 |
---|---|---|---|
original atlas | RT 1.3 | RT 1.1 | RT 0.9 |
NIST smatmat | RT 11.9 | RT 5.5 | RT 6.5 |
- To be done:
- Try the SuiteSparse library
- Test accuracy on a large data set
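The speedups in the sparse runs above come from skipping zero weights in the matrix-vector product. Below is a minimal sketch of a CSR (compressed sparse row) matrix-vector kernel of the kind such sparse routines build on; it is an illustration, not the ATLAS/NIST code used in the benchmarks.

```python
# Minimal CSR sparse matrix-vector product: only stored nonzeros are
# multiplied, so cost scales with the number of nonzero weights.

def csr_matvec(data, indices, indptr, x):
    """y = A @ x for A in CSR form (data, indices, indptr)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y

# A = [[1, 0, 2],
#      [0, 0, 3]]
data, indices, indptr = [1.0, 2.0, 3.0], [0, 2, 2], [0, 2, 3]
print(csr_matvec(data, indices, indptr, [1.0, 1.0, 1.0]))  # [3.0, 3.0]
```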
Tencent exps
GPU & CPU merge
- Hold
Confidence estimation
- We are interested in estimating confidence directly from the DNN output. This confidence is naturally a 'posterior' and does not rely on decoding graphs, so it generalizes easily, e.g., when examining which output is best among multiple decoders.
- The first design uses the best path from decoding/alignment and computes confidence directly from the state-posterior matrix. Coding is finished and initial testing looks OK.
- To be done:
- Large-scale test.
- CI-phone posterior-based (instead of state posterior-based), full-path (instead of best-path) confidence estimation.
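The best-path design above can be sketched as follows: given the DNN state-posterior matrix and the aligned state per frame, score the utterance by the per-frame geometric mean of the posteriors along the path. The scoring details (geometric mean, no frame weighting) are assumptions for illustration, not the actual implementation.

```python
# Hedged sketch of best-path confidence from the state-posterior matrix.
import math

def best_path_confidence(posteriors, alignment):
    """posteriors: per-frame posterior rows (each row sums to 1).
    alignment: aligned state index per frame (the best path).
    Returns the per-frame geometric mean of path posteriors."""
    logs = [math.log(posteriors[t][s]) for t, s in enumerate(alignment)]
    return math.exp(sum(logs) / len(logs))

post = [[0.7, 0.2, 0.1],
        [0.1, 0.8, 0.1],
        [0.2, 0.1, 0.7]]
print(round(best_path_confidence(post, [0, 1, 2]), 3))  # 0.732
```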
- Because the DNN-based confidence is independent of the decoding graph, it is simple to compare or combine results from different graphs. E.g., decoding can be run on both a general large graph and a user-specific graph; the confidences of the two results are then compared and the better one is selected.
- To be done
- Coding finished.
- Debugging and testing next week.
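The graph-independent selection step described above amounts to decoding with each graph and keeping the hypothesis with the higher DNN-based confidence. A minimal sketch; graph names, hypotheses, and confidence values are illustrative only.

```python
# Sketch: select the decoding result with the highest DNN-based
# confidence across several graphs. All values below are invented.

def select_best(results):
    """results: list of (graph_name, hypothesis, confidence)."""
    return max(results, key=lambda r: r[2])

results = [("general", "call mom", 0.61),
           ("user",    "call tom", 0.74)]
print(select_best(results))  # ('user', 'call tom', 0.74)
```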
Embedded progress
- The DNN front-end (FE) now runs at 0.7 RT, so it can be used in simple grammar tasks.
- To be done
- Shrink the NN structure (4 layers to 2 layers) and test the performance.
- The Kaldi decoder is costly when the graph is large. Need to improve the indexing of the FST structure.
- Integrate the DNN FE with the pocket-sphinx decoder.