Wangd-wiki-article-2020-nb
From cslt Wiki
2017
- Back in 2017, we set deep speech factorization as our goal. The first paper was published at ICASSP 2018.
- Lantian Li, Dong Wang, Yixiang Chen, Ying Shi, Zhiyuan Tang, "DEEP FACTORIZATION FOR SPEECH SIGNAL", ICASSP 2018. [1]
- We noticed a problem with softmax-based training, caused by the discarding of the output layers
- Lantian Li, Zhiyuan Tang, Dong Wang, "FULL-INFO TRAINING FOR DEEP SPEAKER FEATURE LEARNING", ICASSP 2018. [2]
2018
- 2018/12/26, proposed the idea of deep statistical speaker representation, based on VAE [3]
2019
- We noticed the impact of the irregularity of deep speaker vectors, and explored normalization approaches
- Yang Zhang, Lantian Li, Dong Wang, "VAE-based regularization for deep speaker embedding", Interspeech 2019. [4]
- 2019/04/20, "Normalization in speaker embedding", Speaker recognition workshop, Kunshan, Shanghai. [5]
- 2019/07/17, "Deep Feature Learning and Normalization for Speaker Recognition", report at the India summer school. [6]
- 2019/08/14, presented the first proposal that uses flow to model deep speaker features. (Report in Huawei group discussion)
- 2019/10/27, presented the initial idea of using flow to perform factorization, CSLT weekly meeting. [7]