Differences between revisions of “Chapter 27 Speech Synthesis”

From cslt Wiki
Line 7: Line 7:
* AI 100 Questions: How is the sweet navigation voice produced? [http://aigraph.cslt.org/ai100/AI-100-63-甜美的导航声音是如何产生的.pdf]
* Wikipedia: Speech synthesis [http://aigraph.cslt.org/courses/27/Speech_synthesis.pdf][http://aigraph.cslt.org/courses/27/语音合成.pdf]
* Wikipedia: Vocoder [http://aigraph.cslt.org/courses/27/聲碼器.pdf][http://aigraph.cslt.org/courses/27/Vocoder.pdf]
Line 14: Line 15:
* Vocoder 1939 (long) [http://aigraph.cslt.org/courses/27/vocoder-1939.mp4]
* Vocoder 1939 (short) [http://aigraph.cslt.org/courses/27/vocoder-short.mp4]
* Vocal folds [http://aigraph.cslt.org/courses/27/vocalfolder.mp4]
* Vocal tract [http://aigraph.cslt.org/courses/27/vocaltract.mp4]
* Auditory perception [http://aigraph.cslt.org/courses/27/hearing.mp4]
Line 35: Line 38:
* 汤志远, 李蓝天, 王东, 石颖, 蔡云麒, 郑方, 《语音识别基本法》, Tsinghua University Press, 2021. [https://item.jd.com/13143784.html]
* Ning Y, He S, Wu Z, et al. A review of deep learning based speech synthesis[J]. Applied Sciences, 2019, 9(19): 4050. [https://www.mdpi.com/2076-3417/9/19/4050/pdf]
* Dudley H. The vocoder—Electrical re-creation of speech[J]. Journal of the Society of Motion Picture Engineers, 1940, 34(3): 272-278. [https://ieeexplore.ieee.org/abstract/document/7250932]
* Zen H, Tokuda K, Black A W. Statistical parametric speech synthesis[J]. Speech Communication, 2009, 51(11): 1039-1064. [https://nitech.repo.nii.ac.jp/index.php?action=pages_view_main&active_action=repository_action_common_download&item_id=5432&item_no=1&attribute_id=39&file_no=1&page_id=13&block_id=21]
* Dudley H. Remaking speech[J]. The Journal of the Acoustical Society of America, 1939, 11(2): 169-177. [https://asa.scitation.org/doi/pdf/10.1121/1.1916020]
* Dudley, Homer (October 1940). "The Carrier Nature of Speech". Bell System Technical Journal. XIX (4). [https://onlinelibrary.wiley.com/doi/epdf/10.1002/j.1538-7305.1940.tb00843.x]
* Ning Y, He S, Wu Z, et al. A review of deep learning based speech synthesis[J]. Applied Sciences, 2019, 9(19): 4050. [https://www.mdpi.com/2076-3417/9/19/4050/pdf][https://ieeexplore.ieee.org/abstract/document/6768033/]
* Zen H, Tokuda K, Black A W. Statistical parametric speech synthesis[J]. Speech Communication, 2009, 51(11): 1039-1064. [https://www.sciencedirect.com/science/article/abs/pii/S0167639309000648]

Latest revision as of 02:14, 13 August 2023 (Sunday)



  • AI 100 Questions: How is the sweet navigation voice produced? [2]
  • Wikipedia: Speech synthesis [3][4]
  • Wikipedia: Vocoder [5][6]


  • Source-filter model [7]
  • Vocoder 1939 (long) [8]
  • Vocoder 1939 (short) [9]
  • Vocal folds [10]
  • Vocal tract [11]
  • Auditory perception [12]


  • Tacotron2 [13]
  • CycleFlow voice conversion [14]
  • Online demo for TTS and Voice conversion [15]
  • Online TTS demo [16]
  • IBM TTS demo [17]


  • CodePen Web demo for TTS [18]
  • Simple HTML code [19]
  • NVIDIA Tacotron2 [20]


  • 汤志远, 李蓝天, 王东, 石颖, 蔡云麒, 郑方, 《语音识别基本法》, Tsinghua University Press, 2021. [21]
  • Dudley H. The vocoder—Electrical re-creation of speech[J]. Journal of the Society of Motion Picture Engineers, 1940, 34(3): 272-278. [22]
  • Dudley H. Remaking speech[J]. The Journal of the Acoustical Society of America, 1939, 11(2): 169-177. [23]
  • Dudley, Homer (October 1940). "The Carrier Nature of Speech". Bell System Technical Journal. XIX (4). [24]
  • Ning Y, He S, Wu Z, et al. A review of deep learning based speech synthesis[J]. Applied Sciences, 2019, 9(19): 4050. [25][26]
  • Zen H, Tokuda K, Black A W. Statistical parametric speech synthesis[J]. Speech Communication, 2009, 51(11): 1039-1064. [27]