Works this week:

1. Finished training word embeddings with the following models:
   - on the EnWiki dataset (953M): CBOW, Skip-Gram (SG)
   - on the text8 dataset (95.3M): CBOW, Skip-Gram (SG), C&W, GloVe, LBL, and Order (count-based)
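To make the training step above concrete, here is a minimal CBOW sketch in plain numpy: predict the center word from the averaged context vectors with a full softmax. This is an illustrative toy, not the setup used for the EnWiki/text8 runs, which would use the original toolkits (word2vec, GloVe, etc.) with hierarchical softmax or negative sampling; all function and parameter names here are hypothetical.

```python
import numpy as np

def train_cbow(corpus, dim=16, window=2, lr=0.05, epochs=50, seed=0):
    """Toy CBOW: predict the center word from averaged context vectors,
    trained with full softmax cross-entropy (fine for tiny vocabularies;
    real corpora need negative sampling or hierarchical softmax)."""
    rng = np.random.default_rng(seed)
    vocab = sorted({w for sent in corpus for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    W_in = rng.normal(0, 0.1, (V, dim))   # input (context) embeddings
    W_out = rng.normal(0, 0.1, (V, dim))  # output (center-word) weights
    for _ in range(epochs):
        for sent in corpus:
            ids = [idx[w] for w in sent]
            for pos, center in enumerate(ids):
                ctx = [ids[j] for j in range(max(0, pos - window),
                                             min(len(ids), pos + window + 1))
                       if j != pos]
                if not ctx:
                    continue
                h = W_in[ctx].mean(axis=0)        # averaged context vector
                scores = W_out @ h
                p = np.exp(scores - scores.max())
                p /= p.sum()                      # softmax over vocabulary
                p[center] -= 1.0                  # dLoss/dscores
                g_h = W_out.T @ p                 # grad w.r.t. h (before update)
                W_out -= lr * np.outer(p, h)
                W_in[ctx] -= lr * g_h / len(ctx)  # spread grad over context words
    return vocab, idx, W_in
```

The Skip-Gram variant reverses the direction (predict each context word from the center word); the learned `W_in` rows are the word vectors whose quality the tasks below measure.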
2. Used downstream tasks to measure the quality of the word vectors at various dimensions:
   - word similarity (ws)
   - the TOEFL synonym set
   - analogy task
   - text classification
   - named entity recognition (ner)
   - sentence-level sentiment classification based on convolutional neural networks, abbreviated as 'cnn'
   - part-of-speech tagging (pos)
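Two of the intrinsic tasks above reduce to simple vector arithmetic: word similarity compares cosine similarities against human ratings, and the analogy task uses the standard 3CosAdd rule (find d maximizing cos(d, b − a + c), excluding the query words). A hedged sketch, using a hypothetical toy embedding dictionary rather than the actual trained vectors:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity, the score used for the word-similarity (ws) task."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(emb, a, b, c):
    """3CosAdd analogy: answer 'a is to b as c is to ?' by returning the
    word closest to b - a + c, excluding the three query words."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -2.0
    for word, vec in emb.items():
        if word in (a, b, c):
            continue
        s = cosine(vec, target)
        if s > best_sim:
            best, best_sim = word, s
    return best
```

For the ws task one would then compute the Spearman correlation between these cosine scores and the human similarity ratings; the extrinsic tasks (text classification, ner, cnn, pos) instead plug the vectors in as input features.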