140512-Xi Ma

From cslt Wiki
Revision as of 01:51, 12 May 2014 (Monday) by Mx

Last Week:

1. Extracted domain-related text from the original corpus by keyword matching.

2. Annotated the keyword list with pinyin.
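
The keyword-based extraction in item 1 can be sketched roughly as follows. This is only an illustration, assuming plain substring matching on unsegmented sentences; the function name `extract_by_keywords` and the sample data are hypothetical and are not taken from the actual scripts.

```python
def extract_by_keywords(sentences, keywords):
    """Keep every sentence that contains at least one keyword.

    Assumption: matching is a plain substring test, so no word
    segmentation is needed for Chinese text.
    """
    return [s for s in sentences if any(k in s for k in keywords)]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    corpus = ["今天天气很好", "股票市场上涨", "我喜欢音乐"]
    keywords = ["股票", "音乐"]
    for sentence in extract_by_keywords(corpus, keywords):
        print(sentence)
```

Substring matching is the simplest choice; if the corpus is already word-segmented, matching on whole tokens would avoid accidental hits inside longer words.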

This Week:

1. Compute the perplexity (ppl) of each sentence in the original corpus and extract the sentences whose ppl is below a given threshold to form a new training set.

2. Train a language model on the new training set and test its ppl.
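
The perplexity-based selection planned above might look like the sketch below. The notes do not say which model or toolkit scores the sentences, so a toy add-one-smoothed unigram model stands in for the real language model here; `train_unigram`, `sentence_ppl`, `filter_by_ppl` and the threshold value are all hypothetical names and settings chosen for illustration.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Build an add-one-smoothed unigram model from whitespace-tokenized text.

    Assumption: this toy model is only a stand-in for the real language model.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen tokens

    def prob(tok):
        return (counts[tok] + 1) / (total + vocab)

    return prob


def sentence_ppl(prob, sentence):
    """Per-token perplexity: exp of the negative mean log-probability."""
    toks = sentence.split()
    if not toks:
        return float("inf")
    log_sum = sum(math.log(prob(t)) for t in toks)
    return math.exp(-log_sum / len(toks))


def filter_by_ppl(prob, sentences, threshold):
    """Keep the sentences whose perplexity is below the threshold."""
    return [s for s in sentences if sentence_ppl(prob, s) < threshold]


if __name__ == "__main__":
    # Hypothetical data: a tiny in-domain sample and a mixed corpus.
    model = train_unigram(["a b", "a b"])
    selected = filter_by_ppl(model, ["a b", "x y"], threshold=5.0)
    print(selected)
```

The selected sentences would then serve as the new training set for step 2. The threshold trades off training-set size against how closely the selected text matches the scoring model, so it would need to be tuned on held-out data.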