Reading list from NCMMSC Speech group
From cslt Wiki
| Paper | Referee | Area and notes | Link |
|---|---|---|---|
| George E. Dahl, Dong Yu, Li Deng, and Alex Acero, Context-Dependent | |||
| Lawrence R. Rabiner, A tutorial on hidden Markov models and selected | |||
| End-to-End Text-Dependent Speaker Verification Georg Heigold, Ignacio Moreno, | |||
| Rapid Speaker Adaptation in Eigenvoice Space | 苏腾荣 (Huami) | ||
| G. Hinton, L. Deng, D. Yu et al., “Deep neural networks for acoustic modeling | |||
| Speech recognition with weighted finite-state transducers | 苏腾荣 (Huami) | ||
| Speech Recognition Algorithms Using Weighted Finite-State Transducers Takaaki | |||
| Daniel Povey. Discriminative Training for Large Vocabulary Speech Recognition. | |||
| Haşim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays, Fast and | |||
| Alex Graves, Supervised Sequence Labeling with Recurrent Neural Networks. PhD | |||
| Fast and Accurate Recurrent Neural Network Acoustic Models for Speech | |||
| Lattice-based optimization of sequence classification criteria for | |||
| M. J. F. Gales: Maximum likelihood linear transformations for HMM-based speech | |||
| Woodland, P.C.: Maximum likelihood linear regression for speaker adaptation of | |||
| Tandem connectionist feature extraction for conventional HMM | |||
| A novel scheme for speaker recognition using a phonetically-aware deep neural | |||
| Campbell W M, Sturim D E, Reynolds D A. Support vector machines using GMM | |||
| Campbell W M, Sturim D E, Reynolds D A, et al. SVM based speaker verification | |||
| Douglas A. Reynolds, Thomas F. Quatieri, and Robert B. Dunn, Speaker | |||
| Najim Dehak, Patrick Kenny, Réda Dehak, Pierre Dumouchel, and Pierre Ouellet, | |||
| Analysis of I-vector Length Normalization in Speaker Recognition Systems | |||
| Within-Class Covariance Normalization for SVM-based Speaker Recognition Andrew | |||
| Silke M Witt, Steve J Young, Phone-level pronunciation scoring and assessment | |||
| S. M. Witt. Use of Speech Recognition in Computer-assisted Language Learning | |||
| Andrew J. Hunt, Alan W. Black, Unit selection in a concatenative speech | |||
| Zen H, Tokuda K, Black A W. Statistical parametric speech synthesis[J]. Speech | |||
| Tokuda K, Nankaku Y, Toda T, et al. Speech synthesis based on hidden Markov | |||
| Zen, H., Senior, A., Schuster, M. 2013, Statistical parametric speech synthesis | |||
| parameter generation algorithms for HMM-based speech synthesis, Proc. of | |||
| Statistical parametric speech synthesis, Heiga Zen | 杨辰雨 (I2R, Singapore) | ||
| Z. H. Ling: Deep Learning for Acoustic Modeling in Parametric Speech | |||
| Xu Yi. Separation of functional components of tone and intonation from | |||
| automatic segmentation of speech into sentences and topics. Speech | |||
| ToBI: A standard for labeling English prosody | 杨辰雨 (I2R, Singapore) | ||
| Chinese prosody and prosodic labeling of spontaneous speech | |||
| Shrikanth S. Narayanan and Panayiotis Georgiou, Behavioral Signal Processing: | |||
| Levelt, W., Roelofs, A., 1999, A theory of lexical access in speech production. | |||
| A Highly Robust Audio Fingerprinting System, Jaap Haitsma (Philips) | |||
| Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013. | |||
| Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio, Neural Machine Translation By | |||
| 《Spoken Language Processing: A Guide to Theory, Algorithm, and System | |||
| Speech and Language Processing (Chinese edition: 自然语言处理综论), Daniel Jurafsky | 汪淼淼 (Alibaba) | ||
| Speech enhancement theory and practice, Philipos C. Loizou, | |||
| Statistical methods for speech recognition, Jelinek, | 金琴 (Renmin University of China) | ||
| Hidden Markov Models for Speech Recognition (Edinburgh University Press 1990) | |||
| Machine Learning Paradigms for Speech Recognition | 卢鲤 (Tencent) | ||
| Text-to-speech synthesis, Paul Taylor, University of Cambridge | |||
| A Course in Phonetics, Ladefoged | 冯卉 (Tianjin University) | Recommended by several group members | |
| A Course in Phonetics (7th Ed.). P. Ladefoged & K. Johnson (2015). Cengage | |||
| Acoustics and Auditory Phonetics (3rd Ed.). K. Johnson (2012). | |||
| Articulatory Phonetics. B. Gick, I. Wilson, & D. Derrick (2013). | |||
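| 实验语音学概要 (Outline of Experimental Phonetics), including the revised edition | 熊子瑜 (Institute of Linguistics), 时秀娟 (Tianjin Normal University) | ||
| 实验语音学基础教程 (A Basic Course in Experimental Phonetics), 孔江平 | 时秀娟 (Tianjin Normal University) | ||
| Phonetics, Reetz & Jongman | 孙锐欣 (East China Normal University) | ||
| 《实验语音学概要》 (Outline of Experimental Phonetics), 吴宗济 | 王磊 (音乐雷达) et al. | Speech synthesis, phonology | |
| Speech and Language Processing (Chinese edition: 自然语言处理综论), Daniel Jurafsky | |||
| Duda, Pattern Classification, 2nd edition (a Chinese edition is available) | 谢凌云 (Communication University of China) | ||
| 《现代汉语音典》 (Modern Chinese Phonetic Dictionary), 蔡莲红 and 孔江平 | 王愈 (SinoVoice) | ||
| 《汉语语调实验研究》 (Experimental Studies of Chinese Intonation), 2012, by 林茂灿 | 李爱军 (Institute of Linguistics, CASS) | ||
| Research on Chinese intonation built on the AM theory of English intonation | |||
| Sun-Ah Jun, prosodic | |||
| Kenneth N. Stevens, Acoustic Phonetics | 解炎陆 (Beijing Language and Culture University) | ||
| Ladefoged, 《世界语音》 | |||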
| Theory and Applications of Digital Speech Processing, Lawrence Rabiner, | |||
| T. F. Quatieri, Discrete-Time Speech Signal Processing (English edition) | |||
| Signals and Systems (Chinese edition: 《信号与系统》), Alan V. Oppenheim | 陈谐 (Cambridge) | ||
| Microphone Arrays: Signal Processing Techniques and Applications (Digital Signal Processing) by Michael Brandstein, Darren Ward, Springer, 2001. | |||
| Pattern Recognition and Machine Learning | 王东 (Tsinghua) | ||
| Machine Learning: A Probabilistic Perspective, machine learning algorithmic | |||
| Introduction to statistical pattern recognition. Keinosuke Fukunaga | |||
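| An Introduction to Support Vector Machines | 朱璇 (Samsung Beijing Research Institute) | SVM | |
| 步尚全, 《基础泛函分析》 (Basic Functional Analysis) | 邓侃 (思昂教育) | Functional analysis | |
| 《测度论与概率论基础》 (Foundations of Measure Theory and Probability Theory), Peking University Press | 明怀平 (I2R, Singapore) | ||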
| Daniel Povey, "Discriminative Training for Large Vocabulary Speech | |||
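| Research on context-dependent acoustic models and search strategies (语境相关的声学模型和搜索策略的研究), 高升, PhD thesis, Chinese Academy of Sciences, 2001 | |||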
| Tools | |||
| HTK book | |||
| Kaldi | |||
| Praat | |||
| Theano | |||
| CNTK | |||
| RNNLIB | |||
| Eesen | CTC toolkit | https://github.com/yajiemiao/eesen | |
| Video & online course | |||
| Deep Learning Summer School, Montreal 2015 | |||
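| Introduction to Digital Filters | 王愈 (SinoVoice) | ||
| An online signal processing tutorial that explains the fundamentals of signal analysis and processing in an accessible way, with clear and concise derivations built around commonly used Matlab signal-processing library functions such as freqz. | |||
| 九州语言网 (Jiuzhou Language Network) | 李爱军 (Institute of Linguistics, CASS) | ||
| Those interested in the grammar and phonetics of Chinese dialects can visit 九州语言网, a site under construction at the Institute of Linguistics and led by 熊子瑜. |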