Abstract
We propose a technique for synthesizing speech in an arbitrary target speaker's voice with a desired degree of style expressivity. In MLLR-based speaker adaptation for the multiple regression hidden semi-Markov model (MRHSMM), the quality of synthesized speech depends crucially on the initial MRHSMM, which is trained on a particular source speaker's data, so it is not always possible to synthesize natural-sounding speech in a given target speaker's voice. To overcome this problem, we adapt speaker and style simultaneously from an average voice model. Experimental results show that the proposed technique yields more natural-sounding speech than the conventional technique, which uses speaker adaptation only.
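As a rough illustration of the two ingredients named in the abstract, the sketch below assumes the commonly used forms: a multiple-regression mean written as a linear function of a style vector, and an MLLR-style affine transform applied to that mean. It does not reproduce the paper's estimation procedure or its simultaneous adaptation from an average voice model. The function names, dimensions, and all numeric values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Illustrative sketch only: names, shapes, and numbers are assumptions,
# not values or procedures taken from the paper.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def mr_mean(regression_matrix, style_vector):
    """Multiple-regression mean: mu = H @ [1, v_1, ..., v_L]."""
    xi = [1.0] + list(style_vector)  # prepend the bias term
    return matvec(regression_matrix, xi)

def affine_adapt(mean, transform, bias):
    """MLLR-style affine transform of a mean vector: mu' = A @ mu + b."""
    return [y + b for y, b in zip(matvec(transform, mean), bias)]

# Hypothetical 2-dimensional feature space and 1-dimensional style space.
H = [[1.0, 0.5],
     [2.0, -1.0]]   # regression matrix (assumed values)
A = [[1.0, 0.0],
     [0.0, 2.0]]    # speaker transform matrix (assumed values)
b = [0.5, 0.0]      # speaker transform bias (assumed values)

neutral = mr_mean(H, [0.0])           # style weight 0
styled = mr_mean(H, [1.0])            # style weight 1
adapted = affine_adapt(styled, A, b)  # styled mean moved toward the target speaker
```

Changing the style vector moves the mean along the directions stored in `H`, while `A` and `b` shift the result toward a target speaker. In a real system both sets of parameters would be estimated from adaptation data rather than set by hand.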
Original language | English |
---|---|
Title of host publication | 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP |
Pages | 4633-4636 |
Number of pages | 4 |
Publication status | Published - 2008 Sep 16 |
Externally published | Yes |
Event | 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP - Las Vegas, NV, United States Duration: 2008 Mar 31 → 2008 Apr 4 |
Keywords
- Average voice model
- Expressive speech synthesis
- Hidden Markov model
- Speaker adaptation
- Style control
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering