Comparing approaches to convert recurrent neural networks into backoff language models for efficient decoding[1]