Difference between revisions of "Asr-project-segment"
From cslt Wiki
Latest revision as of 11:56, 30 November 2017 (Thursday)
Introduction
Speaker segmentation is important for many applications, including speaker-dependent adaptation and telephone archive analysis. Traditional approaches include ergodic HMM re-estimation, turn point detection and clustering, and i-vector clustering. All of these methods, however, are highly vulnerable to noise corruption, overlapping speech, and data imbalance.
We developed a deep segmentation approach, based on deep learning, that can analyze the true underlying speaker properties of speech signals. Simple clustering methods applied to these properties then achieve very high segmentation accuracy.
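The "deep features, then simple clustering" idea can be illustrated with a minimal sketch. It is not the project's implementation: the page does not say which network or clustering method is used, so the sketch assumes k-means, assumes the number of speakers is known, and replaces the deep speaker features with synthetic 4-dimensional vectors (one per analysis window). The names `kmeans`, `labels_to_segments`, and `synthetic_features` are hypothetical.

```python
import numpy as np


def kmeans(features, k, n_iter=50):
    """Plain Lloyd's k-means; returns one cluster label per row of `features`."""
    # Farthest-point initialisation keeps the sketch deterministic.
    centers = [features[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centers], axis=0
        )
        centers.append(features[np.argmax(dists)])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        # Assign every window to its nearest center.
        dists = np.linalg.norm(
            features[:, None, :] - centers[None, :, :], axis=2
        )
        labels = np.argmin(dists, axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j)
            else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def labels_to_segments(labels):
    """Merge runs of equal labels into (start, end, label), end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, int(labels[start])))
            start = i
    return segments


def synthetic_features(seed=0):
    """Stand-in for deep speaker features: two speakers, three turns."""
    rng = np.random.default_rng(seed)
    means = {
        0: np.array([5.0, 0.0, 0.0, 0.0]),
        1: np.array([0.0, 5.0, 0.0, 0.0]),
    }
    turns = [(0, 30), (1, 20), (0, 25)]  # (speaker, number of windows)
    frames = [
        means[spk] + 0.3 * rng.standard_normal((n, 4)) for spk, n in turns
    ]
    return np.vstack(frames)


features = synthetic_features()
labels = kmeans(features, k=2)  # assumes the speaker count is known
segments = labels_to_segments(labels)
for start, end, speaker in segments:
    print(f"windows {start}-{end - 1}: speaker cluster {speaker}")
```

When the features separate speakers well, as the synthetic ones do here, the clustering step can stay this simple. Segment boundaries are in window indices; converting them to seconds depends on the window shift, which the page does not state.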