Difference between revisions of "Asr-nsfc-weekly-2016-11-21"

From cslt Wiki
(Created page with " {| class="wikitable" !Date!!People !! Last Week !! This Week |- | rowspan="5"|2016.11.21 |- |Tsinghua (清华) || * Kazak acoustic model (TDNN) training completed[http://192.168.0.51...")
 
 
(4 intermediate revisions by 3 users not shown)
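Line 8: Line 8:
  |Tsinghua (清华)
  ||
- *  Kazak acoustic model (TDNN) training completed[http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=tangzy&step=view_request&cvssid=576]
+ *  Kazak acoustic model (TDNN) training completed
- *  Xinjiang University (新大) needs to check the Kazak speech data
+ *  Trained a language model on the Kazak corpus and used it for decoding[http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=tangzy&step=view_request&cvssid=576]: how well the corpus domain matches the training set still needs to be checked
- *  Xinjiang University (新大) needs to upload the language model corpus
+
  ||
- Finish training the Kazak acoustic model
+ Obtain a suitable corpus for language model training and complete the baseline
  |-
  |-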
 
  |Xinjiang University (新大)
  ||
- *
+ * The Kazak vocabulary extraction and acoustic (pronunciation) dictionary building tools are finished and can be shared along with the Uyghur tools.
+ * Problems with Kazak acoustic rules and spelling keep surfacing; we spent a lot of time correcting them.
+ * We built corpus compilation tools for the character and acoustic layers of Uyghur and Kazak. We hope to arrive at a common structure for every language.
  ||
- *
+ * We plan to work out a relatively complete syllable structure and use it to spell-check and correct various corpora.
+ * We assigned students to build a bidirectional text-numeric conversion tool.
+ * A third layer, a morphological analyzer tool, is designed for multilingual use, first for Uyghur and Kazak, mainly for sub-word analysis.
+ |-
  |-
 
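  |Minzu University (民大)
  ||
- *
+ Finalized the spoken-language pronunciation texts
+ *  Proofread 300 entries of the Lhasa pronunciation dictionary
  ||
- *
+ * Select written-language pronunciation texts
+ *  Proofread the Lhasa Tibetan pronunciation dictionary
+ *  Mongolian dictionary data entry
  |-
  |}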

Latest revision as of 14:10, 21 November 2016 (Monday)


Date People Last Week This Week
2016.11.21
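Tsinghua (清华)
  Last Week:
  • Kazak acoustic model (TDNN) training completed
  • Trained a language model on the Kazak corpus and used it for decoding [http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=tangzy&step=view_request&cvssid=576]: how well the corpus domain matches the training set still needs to be checked
  This Week:
  • Obtain a suitable corpus for language model training and complete the baseline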
Xinjiang University (新大)
  Last Week:
  • The Kazak vocabulary extraction and acoustic (pronunciation) dictionary building tools are finished and can be shared along with the Uyghur tools.
  • Problems with Kazak acoustic rules and spelling keep surfacing; we spent a lot of time correcting them.
  • We built corpus compilation tools for the character and acoustic layers of Uyghur and Kazak. We hope to arrive at a common structure for every language.
  This Week:
  • We plan to work out a relatively complete syllable structure and use it to spell-check and correct various corpora.
  • We assigned students to build a bidirectional text-numeric conversion tool.
  • A third layer, a morphological analyzer tool, is designed for multilingual use, first for Uyghur and Kazak, mainly for sub-word analysis.
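Minzu University (民大)
  Last Week:
  • Finalized the spoken-language pronunciation texts
  • Proofread 300 entries of the Lhasa pronunciation dictionary
  This Week:
  • Select written-language pronunciation texts
  • Proofread the Lhasa Tibetan pronunciation dictionary
  • Mongolian dictionary data entry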

Date People Last Week This Week
2016.11.14
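Tsinghua (清华)
  • Initial Kazak model training completed
  • Xinjiang University (新大) needs to check the Kazak speech data
  • Xinjiang University (新大) needs to upload the language model corpus
  • Finish training the Kazak acoustic model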
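Xinjiang University (新大)
  • Completed character encoding conversion, encoding normalization, and related processing for Kazak.
  • Completed a manually built pronunciation dictionary of 6,480 Kazak entries.
  • Also corrected some spelling errors in the Kazak "speech-text" corpus.
  • Complete the Kazak pronunciation dictionary, including both the manual and the automatic parts.
  • Check the Kazak speech corpus for "speech-text" misalignment, missing items, and similar problems, and fix them.
  • Check the Kazak language model corpus for problems and prepare for building the LM.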
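Minzu University (民大)
  • Selection of spoken-language pronunciation texts
  • Finalize the spoken-language pronunciation texts
  • Proofreading of the Lhasa Tibetan pronunciation dictionary
  • Mongolian dictionary data entry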