TTS-project-synthesis

From cslt Wiki

Revision as of 03:52, 1 December 2017 (Fri) by Zhangzy (talk | contribs)


Contents

1 Project name
2 Project members
3 Introduction
4 Sample waves

Project name

Text To Speech

Project members

Dong Wang, Zhiyong Zhang

Introduction

Text To Speech

Sample waves

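Synthesis text: 好雨知时节，当春乃发声，随风潜入夜，润物细无声 (roughly: "Good rain knows its season; with spring it raises its voice; it steals in on the night wind, moistening all things, fine and soundless")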

Mono-speaker TTS

Female[1]

Male[2]

Child[3]

Multi-speaker mix-training

Without speaker-vector

Female & Male[4]

Female & Child[5]

Male & Child[6]

With speaker-vector

At synthesis time, we simply swap in the speaker-vector of the target speaker.
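The page does not say how the speaker-vector is fed to the model. As a minimal sketch, assuming the common scheme of appending the speaker-vector to every frame of linguistic features, swapping speakers only changes which vector is looked up. The table `SPEAKER_VECTORS`, the function `build_inputs`, and all numeric values are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical speaker-vector table; real vectors would come from training.
SPEAKER_VECTORS: Dict[str, List[float]] = {
    "female": [1.0, 0.0],
    "male": [0.0, 1.0],
}


def build_inputs(linguistic_frames: Sequence[Sequence[float]],
                 speaker: str) -> List[List[float]]:
    """Append the chosen speaker-vector to every frame of linguistic features.

    Concatenation is an assumed conditioning scheme, not one confirmed
    by the project page.
    """
    speaker_vec = SPEAKER_VECTORS[speaker]
    return [list(frame) + speaker_vec for frame in linguistic_frames]


# The same (made-up) linguistic frames, rendered for two different speakers.
frames = [[0.2, 0.7, 0.1], [0.5, 0.5, 0.0]]
female_inputs = build_inputs(frames, "female")
male_inputs = build_inputs(frames, "male")
```

Under this assumption the linguistic part of the input is identical across speakers, and only the trailing speaker-vector dimensions differ.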

Specific person

Female[7]

Male[8]

Interpolate the speaker-vectors of different speakers

Female & Male with different ratios

(1) 0.0:1.0[9]

(2) 0.1:0.9[10]

(3) 0.2:0.8[11]

(4) 0.3:0.7[12]

(5) 0.4:0.6[13]

(6) 0.5:0.5[14]

(7) 0.6:0.4[15]

(8) 0.7:0.3[16]

(9) 0.8:0.2[17]

(10) 0.9:0.1[18]

(11) 1.0:0.0[19]
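The ratio sweep above can be read as a linear blend of two vectors. The following is a minimal sketch, assuming the first number of each ratio is the weight on the female vector and the second the weight on the male vector (the page does not state the order). The names `interpolate_vectors` and `ratio_sweep` and the 3-dimensional vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def interpolate_vectors(vec_a: Sequence[float], vec_b: Sequence[float],
                        weight_a: float, weight_b: float) -> List[float]:
    """Blend two same-length vectors as weight_a * vec_a + weight_b * vec_b."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same dimensionality")
    if abs(weight_a + weight_b - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return [weight_a * a + weight_b * b for a, b in zip(vec_a, vec_b)]


def ratio_sweep(steps: int = 10) -> List[Tuple[float, float]]:
    """Return the ratio pairs 0.0:1.0, 0.1:0.9, ..., 1.0:0.0."""
    return [(i / steps, (steps - i) / steps) for i in range(steps + 1)]


# Hypothetical 3-dimensional speaker-vectors, for illustration only.
female_vec = [1.0, 0.0, 2.0]
male_vec = [0.0, 1.0, -2.0]

# One blended vector per ratio in the sweep (11 in total).
blended = [interpolate_vectors(female_vec, male_vec, wf, wm)
           for wf, wm in ratio_sweep()]
```

With this reading, the two endpoints of the sweep reproduce the unmodified vectors of the individual speakers, and the 0.5:0.5 point is their midpoint.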

Mono-speaker Emotion TTS

Specific emotion

Neutral emotion [20]
Happy emotion [21]
Sorrow emotion [22]
Angry emotion [23]

Interpolated emotion

Angry & neutral with different ratios

(1) 0.0:1.0 [24]
(2) 0.1:0.9 [25]
(3) 0.2:0.8 [26]
(4) 0.3:0.7 [27]
(5) 0.4:0.6 [28]
(6) 0.5:0.5 [29]
(7) 0.6:0.4 [30]
(8) 0.7:0.3 [31]
(9) 0.8:0.2 [32]
(10) 0.9:0.1 [33]
(11) 1.0:0.0 [34]

Multi-speaker Multi-emotion

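Synthesis text: '据了解，天津市今年粮食种植面积达六百万亩，预计全年粮食总产量可达二十公斤，比去年提高了' (roughly: "It is understood that Tianjin's grain planting area this year has reached six million mu, and total grain output for the year is expected to reach twenty kilograms, an increase over last year of")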

Female

female-angry [35]
female-happy [36]
female-neutral [37]
female-sorrow [38]

Male

male-angry [39]
male-happy [40]
male-neutral [41]
male-sorrow [42]

Retrieved from "http://index.cslt.org/mediawiki/index.php?title=TTS-project-synthesis&oldid=29365"