ASR Status Report 2017-12-25

{| class="wikitable"
!Date !! People !! Last Week !! This Week !! Task Tracking
|-
| rowspan="8"|2017.12.25
|-
|Miao Zhang
||
* Read the 16k model script
* The cough-recognition code left by Xiaofei
||
* Check the trivial database and make it more reasonable
* Test the 16k model on the database
||
|-
|Ying Shi
||
* Some functions for Voice-printer
** speaker vector per utterance [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/6/63/SpkerVector2.png here]
** speaker vector minus base speaker vector [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/6/6b/Spkear_vector.png here]
* CTC for Haibo Wang (token accuracy 92.80% on the train set, 89.74% on the cv set; not tested on the test set yet)
* QR code (a minimal rendering sketch follows this table)
** speaker vector merged with phone, grayscale map [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/f/f3/Speaker_factor_gray.png here]
** speaker vector merged with phone, black-and-white map [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/9/97/1514176866%281%29.png here]
** speaker vector merged with phone, black-and-white map minus base vector [http://cslt.riit.tsinghua.edu.cn/mediawiki/images/4/4e/SpeakerQrCode2.png here]
* i-vector baseline for Kazak-Uyghur LRE; performance 81.85% (utterance level)
||
* Finish the Voice-checker copyright and submit it this Wednesday
||
|-
|Lantian Li
||
||
||
|-
|Zhiyuan Tang
||
* Word-level pronunciation accuracy based on likelihood (tell whether a word is well pronounced, '0', or badly pronounced, '1'); see the second sketch after this table
||
* Model adaptation
* If possible, an alpha version of Parrot for testing inside the lab, to collect some data for a better configuration
||
|}
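The pipeline behind the Voice-printer images linked above is not described on this page, so the following is only a minimal sketch under stated assumptions: an utterance-level speaker vector is reshaped into a square map, min-max scaled to a grayscale image, thresholded to a black-and-white (QR-code-like) map, and optionally has a base speaker vector subtracted first. The helper names (to_square, gray_map, bw_map) are illustrative, not part of the actual Voice-printer code.

<pre>
# Minimal sketch (assumptions, not the actual Voice-printer code): a
# fixed-dimensional speaker vector per utterance is reshaped into a square
# map, min-max scaled to an 8-bit grayscale image, and thresholded to a
# black-and-white map; the "minus base vector" variant subtracts a base
# vector before rendering.
import numpy as np
from PIL import Image

def to_square(vec, pad_value=0.0):
    """Pad a 1-D vector to the next square length and reshape to 2-D."""
    side = int(np.ceil(np.sqrt(len(vec))))
    padded = np.full(side * side, pad_value, dtype=np.float32)
    padded[:len(vec)] = vec
    return padded.reshape(side, side)

def gray_map(vec):
    """Min-max scale a speaker vector into an 8-bit grayscale image."""
    m = to_square(np.asarray(vec, dtype=np.float32))
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)
    return Image.fromarray((m * 255).astype(np.uint8), mode="L")

def bw_map(vec, threshold=0.5):
    """Threshold the grayscale map into a black-and-white map."""
    g = np.asarray(gray_map(vec), dtype=np.float32) / 255.0
    return Image.fromarray(((g >= threshold) * 255).astype(np.uint8), mode="L")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spk_vec = rng.normal(size=400)          # stand-in for an utterance-level speaker vector
    base_vec = rng.normal(size=400) * 0.1   # stand-in for the base speaker vector
    gray_map(spk_vec).save("speaker_gray.png")
    bw_map(spk_vec).save("speaker_bw.png")
    bw_map(spk_vec - base_vec).save("speaker_bw_minus_base.png")
</pre>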
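For the word-level pronunciation accuracy item above, here is a minimal sketch of the likelihood-based labelling idea, assuming per-frame acoustic log-likelihoods from a forced alignment and known word frame spans. The threshold value and the helper word_pron_labels are hypothetical, not Parrot's actual scoring.

<pre>
# Minimal sketch (assumptions, not Parrot's actual scoring): average the
# per-frame log-likelihood over each aligned word and threshold it,
# labelling a well pronounced word as '0' and a badly pronounced one as '1'.
import numpy as np

def word_pron_labels(frame_loglik, word_spans, threshold=-8.0):
    """frame_loglik: per-frame log-likelihoods (1-D array);
    word_spans: list of (word, start_frame, end_frame) from the alignment;
    threshold: hypothetical decision boundary on the average log-likelihood."""
    labels = []
    for word, start, end in word_spans:
        avg = float(np.mean(frame_loglik[start:end]))
        labels.append((word, 0 if avg >= threshold else 1))
    return labels

if __name__ == "__main__":
    loglik = np.concatenate([np.full(30, -6.0), np.full(20, -12.0)])
    spans = [("hello", 0, 30), ("world", 30, 50)]
    print(word_pron_labels(loglik, spans))   # [('hello', 0), ('world', 1)]
</pre>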
----
{| class="wikitable"
!Date !! People !! Last Week !! This Week !! Task Tracking
|-
| rowspan="8"|2017.12.18
|-
|Ying Shi
||
* Finish the Voice-printer program
* Apply for the software copyright of Voice-printer
* APSIPA 2017
||
* Finish the software copyright of Voice-checker
* Baseline of the similar-language recognition system (i-vector, DNN, PTN); a minimal i-vector scoring sketch follows this table
||
* Focus on function rather than UI
* i-vector LID first
|-
|Lantian Li
||
* Optimize the demos of `VV_Seg` and `VV_QuickMark`.
* Phone-aware scoring on deep speaker features. [http://192.168.0.51:5555/cgi-bin/cvss/cvss_request.pl?account=lilt&step=view_request&cvssid=643]
||
* Phone-aware scoring.
* Overlap training for speaker features.
||
* Test on the trivial dataset
|-
|Zhiyuan Tang
||
* Easy-to-read interfaces for Parrot
||
* Phone-level likelihood for detailed diagnosis, and an alpha version of Parrot for testing inside the lab
||
|}
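For the i-vector LID/LRE baseline mentioned in both tables, here is a minimal scoring sketch, assuming utterance i-vectors have already been extracted (e.g., with Kaldi): each test utterance is scored against per-language mean i-vectors by cosine similarity and utterance-level accuracy is reported. The data and function names are toy stand-ins, not the actual Kazak-Uyghur setup that gave 81.85%.

<pre>
# Minimal sketch (assumptions, not the actual baseline): cosine scoring of
# length-normalized utterance i-vectors against per-language mean i-vectors,
# reporting utterance-level accuracy.
import numpy as np

def length_norm(x):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

def train_language_means(ivectors, labels):
    """ivectors: (N, D) training i-vectors; labels: length-N language ids."""
    ivectors = length_norm(np.asarray(ivectors))
    langs = sorted(set(labels))
    means = np.stack([ivectors[np.asarray(labels) == l].mean(axis=0) for l in langs])
    return langs, length_norm(means)

def classify(ivectors, langs, means):
    scores = length_norm(np.asarray(ivectors)) @ means.T   # cosine similarity
    return [langs[i] for i in scores.argmax(axis=1)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # toy data standing in for Kazak/Uyghur training and test i-vectors
    train_x = np.vstack([rng.normal(0.5, 1, (50, 100)), rng.normal(-0.5, 1, (50, 100))])
    train_y = ["kazak"] * 50 + ["uyghur"] * 50
    test_x = np.vstack([rng.normal(0.5, 1, (20, 100)), rng.normal(-0.5, 1, (20, 100))])
    test_y = ["kazak"] * 20 + ["uyghur"] * 20
    langs, means = train_language_means(train_x, train_y)
    pred = classify(test_x, langs, means)
    acc = np.mean([p == t for p, t in zip(pred, test_y)])
    print("utterance-level accuracy: %.2f%%" % (100 * acc))
</pre>

Cosine scoring against language means is only the simplest backend; the DNN and PTN systems listed in the table would replace the classify step.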