Robust estimation of multiple-regression HMM parameters for dimension-based expressive dialogue speech synthesis

Tomohiro Nagata, Hiroki Mori, Takashi Nose

Research output: Conference article, peer-reviewed

1 citation (Scopus)

Abstract

This paper describes spontaneous dialogue speech synthesis based on multiple-regression hidden semi-Markov model (MRHSMM), which enables users to specify paralinguistic information of synthesized speech with a dimensional representation. Paralinguistic aspects of synthesized speech are controlled by multiple regression models whose explanatory variables are abstract dimensions such as pleasant-unpleasant and aroused-sleepy. For robust estimation of the regression matrices of the MRHSMM with unbalanced spontaneous dialogue speech samples, the re-estimation formulae were derived in the framework of the maximum a posteriori (MAP) estimation. The result of a perceptual experiment confirmed that the naturalness of synthesized speech was improved by applying the MAP estimation for regression matrices. In addition, a high correlation (R ≃ 0.7) was observed between given and perceived paralinguistic information, which implies that the proposed method could successfully reflect intended paralinguistic messages on the synthesized speech.
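The idea of a MAP-estimated regression matrix can be illustrated with a minimal sketch. This is not the paper's re-estimation formulae, which are derived inside the HSMM training framework; it is a simplified single-Gaussian analogue under stated assumptions: each observed feature vector o has mean H·[1, v1, v2], where v1 and v2 stand for the pleasant-unpleasant and aroused-sleepy dimensions, the observation covariance is the identity, and H has an isotropic Gaussian prior centred on `prior_mean` with precision `tau`. The function name, parameter names, and the prior form are hypothetical.

```python
import numpy as np


def map_regression_matrix(obs, styles, prior_mean, tau):
    """Simplified MAP estimate of a regression matrix H (illustrative only).

    Assumed model (not taken from the paper): o_n ~ N(H @ xi_n, I) with
    xi_n = [1, v1, v2], and a Gaussian prior on H centred on prior_mean
    with precision tau. tau = 0 reduces to the least-squares (ML) estimate.

    obs:        (N, M) observed feature vectors
    styles:     (N, D) dimensional style scores, e.g. D = 2
    prior_mean: (M, D + 1) prior mean of H
    tau:        non-negative prior weight
    """
    # Augment each style vector with a leading 1 for the bias column of H.
    xi = np.hstack([np.ones((len(styles), 1)), styles])   # (N, D + 1)
    # Stationarity condition: H (sum xi xi^T + tau I) = sum o xi^T + tau H0
    lhs = xi.T @ xi + tau * np.eye(xi.shape[1])           # symmetric
    rhs = obs.T @ xi + tau * prior_mean                   # (M, D + 1)
    # lhs is symmetric, so H^T = solve(lhs, rhs^T).
    return np.linalg.solve(lhs, rhs.T).T


def predicted_mean(H, style):
    """Mean vector for a requested style point, mu = H @ [1, v1, v2]."""
    return H @ np.concatenate([[1.0], np.asarray(style, dtype=float)])
```

With few or unevenly distributed style samples, the term `tau * np.eye(...)` keeps the linear system well conditioned and pulls H toward the prior mean, which is the intuition behind using a prior for robustness with unbalanced data. A larger `tau` trusts the prior more; `tau = 0` ignores it.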

Original language: English
Pages (from-to): 1549-1553
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - Jan 1 2013
Externally published: Yes
Event: 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013 - Lyon, France
Duration: Aug 25 2013 to Aug 29 2013

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
