2013-11-22

Data sharing

  • LM count files still undelivered!

AM development

Sparse DNN

  • Optimal Brain Damage (OBD).
  1. Online OBD held.
  2. Investigation of OBD + L1 norm started.
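
For reference, the OBD pruning criterion can be sketched as below. This is a minimal illustration, not the code used in these experiments: it assumes the standard diagonal-Hessian saliency s_i = 0.5 * h_ii * w_i^2, and the function names, weights, and Hessian values are all hypothetical.

```python
def obd_saliencies(weights, hessian_diag):
    """OBD saliency per weight: s_i = 0.5 * h_ii * w_i**2."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def prune_lowest(weights, hessian_diag, fraction):
    """Zero out the given fraction of weights with the smallest saliency."""
    sal = obd_saliencies(weights, hessian_diag)
    n_prune = int(len(weights) * fraction)
    # Indices sorted from least to most salient.
    order = sorted(range(len(weights)), key=lambda i: sal[i])
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Toy values (hypothetical): the weight 2.0 is large in magnitude but sits in
# a flat direction of the loss (tiny h_ii), so OBD still prunes it.
weights = [0.5, -0.01, 2.0, 0.3]
hessian_diag = [1.0, 4.0, 0.001, 2.0]
pruned = prune_lowest(weights, hessian_diag, 0.5)
print(pruned)
```

The toy case shows how the criterion differs from plain magnitude pruning, which would have kept the weight 2.0.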


Engine optimization

  • Graph search is the dominant cost. AM computing : graph search = 35 : 65 for the 1e-5 LM graph.
  • Added an AM score cache: AM computing : graph search = 25 : 75, reduced 5%.
  • Investigating compact FSTs. Apply const FST first, then try to implement a LOUDS FST.
  • Investigate sparse computing.
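
The AM score cache mentioned above can be pictured roughly as follows. This is only a sketch under assumptions: it supposes the decoder asks for the same (frame, state) acoustic score several times while expanding different graph arcs, and the class name, the key layout, and `fake_score` are hypothetical rather than taken from the engine.

```python
class AMScoreCache:
    """Memoize acoustic-model scores keyed by (frame index, state id)."""

    def __init__(self, score_fn):
        self._score_fn = score_fn   # the expensive AM evaluation
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def score(self, frame, state):
        key = (frame, state)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._score_fn(frame, state)
        self._cache[key] = value
        return value


calls = []

def fake_score(frame, state):
    # Stand-in for the real AM computation; records each actual evaluation.
    calls.append((frame, state))
    return -0.1 * frame - 0.01 * state


cache = AMScoreCache(fake_score)
# Several arcs may share the same state within one frame.
for frame, state in [(0, 3), (0, 3), (0, 7), (1, 3), (0, 7)]:
    cache.score(frame, state)
print(cache.hits, cache.misses, len(calls))
```

A real decoder would also drop entries for frames it has already passed, so that memory stays bounded.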

English NN training

  • WSJ + Chinglish data training, done.
  • Preparing to test.

Tencent exps

N/A


LM development

NN LM

  • NN-based CSLM normalization. Normalize 10 NN networks with an additional hidden layer.
  • 10240 x 10240 matrix: extremely slow. 5M data finished, PPL 289.
  • Changed the network to 10240 x 100 + 10240 x 100. 5M data finished, PPL 241; 50M data, PPL 224.
  • 500M data run in progress.
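
To give a sense of why the change of network shape helps with speed, the sketch below compares parameter counts. It assumes the two 10240 x 100 blocks act as a low-rank factorization of the square layer (a 10240-to-100 projection followed by a 100-to-10240 expansion). That reading is an assumption, and all names in the code are hypothetical.

```python
V = 10240   # layer width taken from the notes
R = 100     # bottleneck size taken from the notes

full_params = V * V                # one dense V x V matrix
factored_params = V * R + R * V    # two thin matrices
ratio = full_params / factored_params
print(full_params, factored_params, ratio)


def matvec(matrix, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def factored_apply(x, down, up):
    """Compute up @ (down @ x) without forming the full square matrix."""
    return matvec(up, matvec(down, x))


# Tiny toy instance (3-dim input, rank-1 bottleneck) to show the data flow.
down = [[1.0, 0.0, 1.0]]        # 1 x 3 projection
up = [[2.0], [0.0], [1.0]]      # 3 x 1 expansion
y = factored_apply([1.0, 2.0, 3.0], down, up)
print(y)
```

Under this assumption the factored form holds about 51 times fewer weights, and the multiply-add count per input vector shrinks by the same factor.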

Embedded development

  • Will start to implement the embedded stream mode.

Speech QA

  • Database migration, done.
  • Client bugs fixed.
  • Recording scheduled.
  • Text-based test completed. Correct answer : no answer : incorrect answer = 13 : 5 : 2.