A style control technique for HMM-based expressive speech synthesis

Takashi Nose, Junichi Yamagishi, Takashi Masuko, Takao Kobayashi

Research output: Contribution to journal › Article › peer-review

107 Citations (Scopus)

Abstract


This paper describes a technique for controlling the degree of expressivity of a desired emotional expression and/or speaking style of synthesized speech in an HMM-based speech synthesis framework. With this technique, multiple emotional expressions and speaking styles of speech are modeled in a single model by using a multiple-regression hidden semi-Markov model (MRHSMM). A set of control parameters, called the style vector, is defined, and each speech synthesis unit is modeled by using the MRHSMM, in which mean parameters of the state output and duration distributions are expressed by multiple-regression of the style vector. In the synthesis stage, the mean parameters of the synthesis units are modified by transforming an arbitrarily given style vector that corresponds to a point in a low-dimensional space, called style space, each of whose coordinates represents a certain specific speaking style or emotion of speech. The results of subjective evaluation tests show that style and its intensity can be controlled by changing the style vector.
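
The regression relationship described in the abstract can be illustrated with a small sketch. It assumes the common multiple-regression form in which a mean vector is a regression matrix applied to the style vector augmented with a constant 1. The names `augment`, `regress_mean`, `H_output`, and `H_duration`, the two-axis style space (axis 0 "joyful", axis 1 "sad", origin as neutral), and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def augment(style_vector: Sequence[float]) -> List[float]:
    """Prepend a constant 1 so the regression includes a bias term."""
    return [1.0, *style_vector]


def regress_mean(H: Sequence[Sequence[float]],
                 style_vector: Sequence[float]) -> List[float]:
    """Return mu = H * xi, where xi = [1, v_1, ..., v_L].

    H has one row per feature dimension and L + 1 columns.
    """
    xi = augment(style_vector)
    for row in H:
        if len(row) != len(xi):
            raise ValueError("each row of H must have len(style_vector) + 1 entries")
    return [sum(h * x for h, x in zip(row, xi)) for row in H]


# Hypothetical regression matrices for one state (illustrative values only).
# Columns: bias, "joyful" axis, "sad" axis.
H_output = [
    [5.0, 0.5, -0.25],  # e.g. a pitch-like feature dimension
    [1.0, 0.0, 0.5],    # e.g. a spectral feature dimension
]
H_duration = [[10.0, -2.0, 4.0]]  # mean state duration in frames

neutral = regress_mean(H_output, [0.0, 0.0])       # origin of the style space
joyful = regress_mean(H_output, [1.0, 0.0])        # full "joyful" style
half_joyful = regress_mean(H_output, [0.5, 0.0])   # weakened intensity
exaggerated = regress_mean(H_output, [1.5, 0.0])   # beyond the training point
sad_duration = regress_mean(H_duration, [0.0, 1.0])

print(neutral, joyful, half_joyful, exaggerated, sad_duration)
```

Because the mean is linear in the style vector, scaling the vector along one axis moves the mean proportionally. That is one way to read the abstract's claim that style intensity can be controlled by changing the style vector.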

Original language: English
Pages (from-to): 1406-1413
Number of pages: 8
Journal: IEICE Transactions on Information and Systems
Issue number: 9
Publication status: Published - 2007 Sep
Externally published: Yes


Keywords

  • Emotional expression
  • HMM-based speech synthesis
  • Hidden semi-Markov model (HSMM)
  • Multiple-regression HSMM (MRHSMM)
  • Speaking style
  • Style interpolation

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence

