A style control technique for singing voice synthesis based on multiple-regression HSMM

Takashi Nose, Misa Kanemoto, Tomoki Koriyama, Takao Kobayashi

Research output: Contribution to journal › Conference article › peer-review

5 Citations (Scopus)

Abstract

This paper proposes a technique for controlling singing style in HMM-based singing voice synthesis. A style control technique based on the multiple-regression HSMM (MRHSMM), originally proposed for HMM-based expressive speech synthesis, is applied to the conventional singing voice synthesis technique. Pitch adaptive training is introduced into the MRHSMM to improve the modeling accuracy of the fundamental frequency (F0) associated with notes. A robust vibrato modeling technique based on a moving average filter is also proposed to reproduce natural-sounding vibrato even when the vibrato in the original singing voice is unclear. Subjective evaluation results show that users can intuitively control the singing style while preserving the naturalness of the synthetic voice.
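Two of the ideas named in the abstract can be illustrated with a minimal Python sketch. It assumes the usual MRHSMM formulation, in which a state's mean vector is a linear regression on a low-dimensional style vector, and it uses a plain centered moving average to split an F0 contour into a slow trend and a faster residual. The regression matrix `H`, the style values, the window width, and the function names `mrhsmm_mean` and `moving_average` are all illustrative assumptions. The smoothing step is not claimed to be the paper's actual vibrato modeling procedure.

```python
import numpy as np


def mrhsmm_mean(H, style):
    """Return mu = H @ xi, where xi = [1, v_1, ..., v_L] is the augmented style vector.

    H is a hypothetical (D x (L + 1)) regression matrix for one model state.
    """
    xi = np.concatenate(([1.0], np.asarray(style, dtype=float)))
    return np.asarray(H, dtype=float) @ xi


def moving_average(x, width):
    """Centered moving average with edge padding; width is assumed to be odd."""
    x = np.asarray(x, dtype=float)
    pad = width // 2
    padded = np.pad(x, pad, mode="edge")      # repeat the end values at both edges
    kernel = np.ones(width) / width           # uniform weights that sum to 1
    return np.convolve(padded, kernel, mode="valid")  # same length as x


# Toy example: one-dimensional log-F0 mean controlled by a 2-D style vector.
H = [[5.0, 0.2, -0.1]]                        # assumed values, not from the paper
neutral = mrhsmm_mean(H, [0.0, 0.0])          # style vector at the origin
styled = mrhsmm_mean(H, [1.0, 0.0])           # move along the first style axis

# Toy F0 contour: a constant note pitch plus a sinusoidal fluctuation.
t = np.arange(200)
f0 = 5.0 + 0.02 * np.sin(2 * np.pi * t / 20)
trend = moving_average(f0, 21)                # slow, note-level component
residual = f0 - trend                         # faster, vibrato-like component
```

Changing the style vector shifts the mean linearly, which is what makes continuous control of the style possible at synthesis time. The residual after smoothing is one simple way to look at the fluctuation around the note-level pitch.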

Original language: English
Pages (from-to): 378-382
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2013 Jan 1
Externally published: Yes
Event: 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013 - Lyon, France
Duration: 2013 Aug 25 to 2013 Aug 29

Keywords

  • HMM-based singing voice synthesis
  • Multiple-regression HSMM
  • Pitch adaptive training
  • Style control
  • Vibrato modeling

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
