A speech parameter generation algorithm using local variance for HMM-based speech synthesis

Vataya Chunwijitra, Takashi Nose, Takao Kobayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

This paper proposes a parameter generation algorithm that uses a local variance (LV) constraint on the spectral parameter trajectory for HMM-based speech synthesis. In the parameter generation process, we take into account both the HMM likelihood of speech feature vectors and a likelihood for LVs. To model LV precisely, we use dynamic features of LV with context-dependent HMMs. The objective experimental results show that the proposed technique can generate a better spectral trajectory, in terms of spectral and LV distortions, than a conventional technique with a global variance (GV) constraint. The subjective experimental results also show that the proposed technique significantly improves the reproducibility of the synthetic speech compared with the conventional one.

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 1150-1153
Number of pages: 4
Publication status: Published - 2012 Dec 1
Externally published: Yes
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: 2012 Sep 9 to 2012 Sep 13

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 2

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country: United States
City: Portland, OR
Period: 12/9/9 to 12/9/13

Keywords

  • HMM-based speech synthesis
  • Local variance
  • Over-smoothing problem
  • Speech parameter generation

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication
