Evaluation of prosodic contextual factors for HMM-based speech synthesis

Shuji Yokomizo, Takashi Nose, Takao Kobayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

This paper explores the effect of prosodic contextual factors on speech synthesis based on hidden Markov models (HMMs). In HMM-based speech synthesis, a variety of contextual factors are taken into account in model training so as to model not only phonetic features but also prosodic ones. A baseline system uses many contextual factors, and the resulting cost of parameter tying by context clustering is relatively high compared with that in speech recognition. We examine the choice of prosodic contexts using objective measures on English and Japanese speech data, which have different linguistic and prosodic characteristics. Experimental results show that more compact context sets also give performance comparable or close to that of the conventional full context set.

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher: International Speech Communication Association
Pages: 430-433
Number of pages: 4
Publication status: Published - 2010
Externally published: Yes

Publication series

Name: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

Keywords

  • Computation time
  • Context clustering
  • Contextual factor
  • HMM-based speech synthesis
  • Prosody modeling

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
