Style estimation of speech based on multiple regression hidden semi-Markov model

Takashi Nose, Yoichi Kato, Takao Kobayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a technique for estimating the degree or intensity of emotional expressions and speaking styles that appear in speech. The key idea builds on a style control technique for speech synthesis that uses the multiple regression hidden semi-Markov model (MRHSMM); the proposed technique can be viewed as the inverse process of style control. We derive an algorithm, based on a maximum likelihood (ML) criterion, for estimating the predictor variables of the MRHSMM, each of which represents the intensity of an emotion or the variability of a speaking style as it appears in the acoustic features. We also show preliminary experimental results that demonstrate the ability of the proposed technique on synthetic and acted speech samples with emotional expressions and speaking styles.
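To make the "inverse process" idea concrete, here is a minimal sketch in Python. It is not the algorithm derived in the paper. It assumes that each state's mean is a linear function of a style vector v (mean_i(v) = b_i + A_i v), that covariances are diagonal, and that the frame-to-state alignment is already known. It ignores the HSMM duration distributions entirely. Under those assumptions the ML estimate of v reduces to a weighted least-squares problem. All names (`estimate_style`, `solve_linear`, the `states` dictionary layout) are hypothetical and chosen only for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: style vector not identifiable")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_style(frames, alignment, states):
    """ML estimate of v, assuming mean_i(v) = b_i + A_i v (illustrative only).

    Solves (sum_t A^T S^-1 A) v = sum_t A^T S^-1 (o_t - b), where A, b and
    the diagonal covariance S belong to the state aligned to frame t.
    """
    n_style = len(states[0]["A"][0])
    lhs = [[0.0] * n_style for _ in range(n_style)]
    rhs = [0.0] * n_style
    for obs, s in zip(frames, alignment):
        st = states[s]
        for d, x in enumerate(obs):
            w = 1.0 / st["var"][d]      # inverse variance of dimension d
            row = st["A"][d]            # regression weights of dimension d
            resid = x - st["b"][d]      # observation minus bias term
            for j in range(n_style):
                rhs[j] += w * row[j] * resid
                for k in range(n_style):
                    lhs[j][k] += w * row[j] * row[k]
    return solve_linear(lhs, rhs)


# Toy example (made-up parameters): two states, 2-D features, 2-D style space.
states = [
    {"b": [1.0, 0.0], "A": [[1.0, 0.0], [0.5, 2.0]], "var": [1.0, 0.5]},
    {"b": [-1.0, 2.0], "A": [[0.0, 1.0], [1.0, 1.0]], "var": [2.0, 1.0]},
]
true_v = [0.5, -1.0]
alignment = [0, 0, 1, 1, 1]
# Noise-free frames generated from the model itself at style true_v.
frames = [
    [b + sum(a * v for a, v in zip(row, true_v))
     for b, row in zip(states[s]["b"], states[s]["A"])]
    for s in alignment
]
v_hat = estimate_style(frames, alignment, states)
```

Because the toy frames are generated without noise from the same model, `v_hat` matches `true_v` up to floating-point rounding. With real speech the alignment is itself unknown, so an estimator of this kind would normally sit inside an iterative procedure that re-estimates state occupancies. That part is outside this sketch.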

Original language: English
Title of host publication: International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
Pages: 2900-2903
Number of pages: 4
Publication status: Published - 2007 Dec 1
Externally published: Yes
Event: 8th Annual Conference of the International Speech Communication Association, Interspeech 2007 - Antwerp, Belgium
Duration: 2007 Aug 27 to 2007 Aug 31

Publication series

Name: International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
Volume: 4

Other

Other: 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
Country: Belgium
City: Antwerp
Period: 07/8/27 to 07/8/31

Keywords

  • Emotional speech
  • Multiple regression HSMM
  • Speaking style
  • Style estimation
  • Style modeling

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Modelling and Simulation
  • Linguistics and Language
  • Communication
