Discontinuous observation HMM for prosodic-event-based F0 generation

Tomoki Koriyama, Takashi Nose, Takao Kobayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper examines F0 modeling and generation techniques for spontaneous speech synthesis. In a previous study, we proposed a prosodic-unit HMM in which the synthesis unit is defined as the segment between two prosodic events represented in the ToBI labeling framework. To take advantage of the prosodic-unit HMM, continuous F0 sequences must be modeled from discontinuous F0 data that include unvoiced regions. Conventional F0 models such as the MSD-HMM and the continuous F0 HMM are not always suited to this requirement. To overcome this problem, we propose an alternative F0 model, the discontinuous observation HMM (DO-HMM), in which unvoiced frames are regarded as missing data. We objectively evaluate the performance of the DO-HMM by comparing it with the conventional F0 modeling techniques and discuss the results.
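The idea of regarding unvoiced frames as missing data can be illustrated with a generic HMM forward pass. The sketch below is not the DO-HMM formulation from the paper, which the abstract does not spell out; it only shows one common way to handle missing observations, namely marginalizing them out so that a missing frame contributes an emission likelihood of 1 in every state. The function names, the single-Gaussian emissions, and the use of NaN to mark unvoiced frames are all assumptions made for illustration.

```python
import numpy as np

def log_gauss(x, means, variances):
    """Per-state log density of a scalar x under 1-D Gaussians."""
    return -0.5 * (np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances)

def forward_loglik(obs, log_pi, log_A, means, variances):
    """Log-likelihood of a 1-D sequence under a Gaussian HMM.

    Assumption: NaN marks an unvoiced (missing) frame. Such a frame is
    marginalized out, i.e. its emission log-likelihood is 0 in every state.
    """
    n_states = len(means)

    def emit(x):
        if np.isnan(x):
            return np.zeros(n_states)  # log(1): frame carries no evidence
        return log_gauss(x, means, variances)

    alpha = log_pi + emit(obs[0])
    for x in obs[1:]:
        # Log-domain sum over previous states i for each next state j.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + emit(x)
    return np.logaddexp.reduce(alpha)
```

With this treatment the state sequence still advances through unvoiced frames via the transition probabilities, so the voiced frames on either side of a gap remain linked by a single model rather than being scored as separate segments.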

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 462-465
Number of pages: 4
Publication status: Published - 2012 Dec 1
Externally published: Yes
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: 2012 Sep 9 to 2012 Sep 13

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 1

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country: United States
City: Portland, OR
Period: 12/9/9 to 12/9/13

Keywords

  • Discontinuous observation HMM
  • F0 modeling
  • HMM-based speech synthesis
  • Prosody generation
  • Spontaneous speech

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication

  • Cite this

    Koriyama, T., Nose, T., & Kobayashi, T. (2012). Discontinuous observation HMM for prosodic-event-based F0 generation. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 (pp. 462-465). (13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012; Vol. 1).