2013-06-07
Data sharing
- LM count files still undelivered!
DNN progress
Experiments
- sparse DNN: sticky training (retrain the nnet while keeping the sparseness); a sketch of the masking scheme follows the table and conclusion below.
Zero out small values (test set: 1900), at extreme sparseness:
| Threshold | 0 | 0.2 | 0.3 | 0.4 | 0.5 |
|---|---|---|---|---|---|
| Shrinkage (%) | 0.0 | 66.4 | 81.6 | 90.0 | 94.0 |
| WER without sticky (%) | 7.55 | 9.46 | 53.23 | 98.99 | - |
| WER with sticky (%) | 7.55 | 7.56 | 7.87 | 8.81 | 9.87 |
Conclusion: an extremely sparse network largely retains the performance of the full DNN. The sparsity structure seems to matter more than tuning the parameters within that structure.
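A minimal sketch of the sticky-training scheme above, assuming the usual magnitude pruning with a fixed binary mask (the shapes, threshold, and learning rate are illustrative placeholders, not the values used in the experiment):

```python
import numpy as np

def sparsify(weights, threshold):
    """Zero out weights whose magnitude is below the threshold."""
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

def sticky_step(weights, grad, mask, lr=0.01):
    """One 'sticky' retraining step: re-apply the mask after the update
    so pruned connections stay at exactly zero."""
    return (weights - lr * grad) * mask

# Illustrative usage: prune once, then retrain with the sparsity
# pattern held fixed (the gradient here is a random placeholder).
W = np.random.randn(512, 512).astype(np.float32)
W, mask = sparsify(W, threshold=0.3)
for _ in range(10):
    grad = np.random.randn(*W.shape).astype(np.float32)
    W = sticky_step(W, grad, mask)
```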
- fixed-point DNN forwarding (a quantization sketch follows this list):
  - working on migrating the ATLAS library to ARM.
  - working on an ATLAS/MKL-independent implementation.
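A minimal sketch of fixed-point forwarding for one affine layer, assuming symmetric linear quantization with a per-matrix scale (the bit width and shapes are illustrative assumptions, not the project's actual format):

```python
import numpy as np

def quantize(x, num_bits=16):
    """Map a float array to fixed-point integers with a shared scale."""
    scale = (2 ** (num_bits - 1) - 1) / np.max(np.abs(x))
    return np.round(x * scale).astype(np.int32), scale

def fixed_point_forward(x, w):
    """Forward one affine layer with integer multiply-accumulate,
    rescaling back to float only at the output."""
    x_q, x_scale = quantize(x)
    w_q, w_scale = quantize(w)
    acc = x_q.astype(np.int64) @ w_q.astype(np.int64)  # integer MAC
    return acc / (x_scale * w_scale)

# Illustrative check against the float result.
x = np.random.randn(1, 8).astype(np.float32)
W = np.random.randn(8, 4).astype(np.float32)
print(np.max(np.abs(fixed_point_forward(x, W) - x @ W)))  # small error
```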
Tencent exps
- 6000h model training; expected to finish around the 25th.
- working on sequence-discriminative training (DT) of DNNs; refer to "Error Back Propagation for Sequence Training of Context-Dependent Deep Networks for Conversational Speech Transcription". A sketch of the error signal follows.
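For context, sequence training of this kind replaces the frame-level cross-entropy error signal with lattice-based occupancies; a minimal sketch of an MMI-style outer-layer error signal (the occupancy arrays below are random placeholders, not real lattice posteriors):

```python
import numpy as np

def mmi_error_signal(gamma_num, gamma_den):
    """Outer-layer error signal for MMI-style sequence training:
    numerator (reference) minus denominator (lattice) state
    occupancies, per frame and per senone."""
    return gamma_num - gamma_den

# Placeholder occupancies for 5 frames and 4 senones; in practice they
# come from forward-backward over numerator and denominator lattices.
T, S = 5, 4
gamma_num = np.random.dirichlet(np.ones(S), size=T)
gamma_den = np.random.dirichlet(np.ones(S), size=T)
print(mmi_error_signal(gamma_num, gamma_den))
```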
GPU & CPU merge
- in progress.
RNN LM progress
- Initial work has started: 100M training data with a 10k vocabulary reached a perplexity of 180 (a perplexity sketch follows this list).
- More exploration continues.
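Perplexity here is the standard exponential of the average negative log-probability over the evaluation words; a minimal sketch (the probability list is a placeholder for the RNN LM's per-word predictions):

```python
import math

def perplexity(word_probs):
    """Perplexity = exp(-mean log P(w_i | history))."""
    nll = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(nll)

# Placeholder per-word probabilities from a language model.
print(perplexity([0.1, 0.02, 0.3, 0.05]))  # ~13.5
```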
Embedded progress
- Status:
- check the reference and adjust the compilation options
- large-scale AM training on the Tencent 400h data is done (continuous HMM).
| System | WER (%) | RT |
|---|---|---|
| SP model | 8.81 | 0.07 |
| Tencent tone | 6.33 | 0.40 |
| Tencent notone | 5.04 | 0.31 |
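For reference on the metrics: WER is the word-level edit distance between hypothesis and reference divided by the reference length, and RT is decoding time over audio duration; a minimal sketch:

```python
def wer(ref, hyp):
    """Word error rate: (substitutions + deletions + insertions)
    divided by the reference length, via standard edit distance."""
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,           # deletion
                          d[i][j - 1] + 1,           # insertion
                          d[i - 1][j - 1] + sub)     # substitution
    return d[-1][-1] / len(ref)

def rt_factor(decode_seconds, audio_seconds):
    """Real-time factor: processing time divided by audio duration."""
    return decode_seconds / audio_seconds

print(wer("this is a test".split(), "this is test".split()))  # 0.25
```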
- To be done
- Large-scale parallel training.
- NN-based engine (dynamic and static).
- Semi-continuous model with the Tencent data.
- Debug on an external ARM board.