Difference between revisions of "ASR Status Report 2017-12-25"
From cslt Wiki
(Created page with " {| class="wikitable" !Date!!People !! Last Week !! This Week !! Task Tracking |- | rowspan="8"|2017.12.18 |- |Ying Shi || * Finish the Voice-printer program *...") |
(18 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | !Date!!People !! Last Week !! This Week !! Task Tracking | ||
+ | |- | ||
+ | | rowspan="8"|2017.12.25 | ||
+ | |||
+ | |||
+ | |||
+ | |- | ||
+ | |Miao Zhang | ||
+ | || | ||
+ | * | ||
+ | || | ||
+ | * Read the 16k model script | ||
+ | * Cough recognition code left by Xiaofei | ||
+ | || | ||
+ | * Check the trivial database and make it more reasonable | ||
+ | * Test the 16k model on the database | ||
+ | |- | ||
+ | |||
+ | |||
+ | |||
+ | |- | ||
+ | |Ying Shi | ||
+ | || | ||
+ | * Some functions for voice-printer | ||
+ | ** speaker vector per utterance [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/6/63/SpkerVector2.png here] | ||
+ | ** speaker vector minus base speaker vector [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/6/6b/Spkear_vector.png here] | ||
+ | * CTC for Haibo Wang (token accuracy: 92.80% on the training set, 89.74% on the CV set); not yet tested on the test set | ||
+ | * QR code | ||
+ | ** speaker vector merge phone grayscale [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/f/f3/Speaker_factor_gray.png here] | ||
+ | ** speaker vector merge phone black-and-white map [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/9/97/1514176866%281%29.png here] | ||
+ | ** speaker vector merge phone black-and-white map minus base vector [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/4/4e/SpeakerQrCode2.png here] | ||
+ | * i-vector baseline for Kazakh-Uyghur LRE: performance is 81.85% (utterance level) | ||
+ | || | ||
+ | * Finish the voice-checker copyright and submit it this Wednesday | ||
+ | || | ||
+ | * | ||
+ | |- | ||
+ | |||
+ | |||
+ | |||
+ | |- | ||
+ | |Lantian Li | ||
+ | || | ||
+ | * Complete the recipe for `VV_FACTOR`. | ||
+ | * 16K and 8K deep speaker model comparison. [http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=lilt&step=view_request&cvssid=646] | ||
+ | || | ||
+ | * Patent for `VV_QuickMark`. | ||
+ | * Complete the demo for `VV_FACTOR`. [Assigned to Shouyi Dai] | ||
+ | * Phonetic speaker embedding. | ||
+ | * Overlap training for speaker features. | ||
+ | || | ||
+ | * | ||
+ | |- | ||
+ | |||
+ | |||
+ | |- | ||
+ | |Zhiyuan Tang | ||
+ | || | ||
+ | * Word-level pronunciation accuracy based on likelihood (labels each word as well pronounced, '0', or badly pronounced, '1') | ||
+ | || | ||
+ | * Model adaptation | ||
+ | * If possible, an alpha version of Parrot for testing inside the lab, to collect some data for a better configuration | ||
+ | || | ||
+ | |- | ||
+ | |||
+ | |||
+ | |} | ||
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | |||
{| class="wikitable" | {| class="wikitable" |
Latest revision as of 06:53, 25 December 2017 (Mon)
Date | People | Last Week | This Week | Task Tracking |
---|---|---|---|---|
2017.12.25 | Miao Zhang | | | |
 | Ying Shi | | | |
 | Lantian Li | | | |
 | Zhiyuan Tang | | | |

Date | People | Last Week | This Week | Task Tracking |
---|---|---|---|---|
2017.12.18 | Ying Shi | | | |
 | Lantian Li | | | |
 | Zhiyuan Tang | | | |