Abstract
This paper presents a technique for controlling and emphasizing the speaker characteristics of synthetic speech. The key idea comes from the way professional impersonators imitate voices: they effectively exaggerate a target speaker's voice characteristics. To model and control the degree of speaker characteristics, we use a speech synthesis framework based on the multiple-regression hidden semi-Markov model (MRHSMM). In MRHSMM, the mean parameters are given by multiple regression on a low-dimensional control vector. The control vector represents how much the target speaker's model parameters differ from those of the average voice model. By changing the control vector during speech synthesis, we can control the degree of the target speaker's voice characteristics. Results of subjective experiments show that emphasizing speaker characteristics improves the speaker reproducibility of synthetic speech.
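The regression described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a state's mean vector is computed as a regression matrix applied to the control vector augmented with a bias term, and that the control vector is one-dimensional, with 0 giving the average voice, 1 giving the target speaker, and values above 1 emphasizing the target speaker. The function names `regression_matrix` and `mrhsmm_mean` and all numeric values are hypothetical.

```python
from typing import List


def regression_matrix(average_mean: List[float],
                      target_mean: List[float]) -> List[List[float]]:
    """Build a toy regression matrix H for one state (assumed form).

    Column 0 holds the average voice mean (bias term); column 1 holds
    the offset of the target speaker from the average voice.
    """
    return [[a, t - a] for a, t in zip(average_mean, target_mean)]


def mrhsmm_mean(H: List[List[float]], control: List[float]) -> List[float]:
    """Return mu = H * [1, v_1, ..., v_L]^T for one state."""
    xi = [1.0] + list(control)  # control vector augmented with a bias term
    if any(len(row) != len(xi) for row in H):
        raise ValueError("each row of H needs len(control) + 1 entries")
    return [sum(h * x for h, x in zip(row, xi)) for row in H]


# Hypothetical two-dimensional mean vectors, for illustration only.
average = [1.0, 2.0]
target = [1.5, 1.0]
H = regression_matrix(average, target)

mean_average = mrhsmm_mean(H, [0.0])     # reproduces the average voice mean
mean_target = mrhsmm_mean(H, [1.0])      # reproduces the target speaker mean
mean_emphasized = mrhsmm_mean(H, [1.5])  # extrapolates past the target
```

Under these assumptions, moving the control value beyond 1 pushes each mean parameter further from the average voice in the direction of the target speaker, which corresponds to the exaggeration the abstract attributes to impersonators.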
Original language | English |
---|---|
Pages (from-to) | 2631-2634 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 2009 |
Externally published | Yes |
Event | 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009, Brighton, United Kingdom; Duration: 2009 Sep 6 → 2009 Sep 10 |
Keywords
- HMM-based speech synthesis
- Multiple-regression HSMM
- Speaker characteristics
- Style control
- Voice imitation
ASJC Scopus subject areas
- Human-Computer Interaction
- Signal Processing
- Software
- Sensory Systems