Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis

Junichi Yamagishi, Simon King, Steve Renals, Takashi Nose, Heiga Zen, Keiichi Tokuda, Zhen Hua Ling, Tomoki Toda

Research output: Article (peer-reviewed)

133 citations (Scopus)

Abstract

This paper describes a speaker-adaptive HMM-based speech synthesis system. The new system, called “HTS-2007,” employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available. In addition, a comparison study with several speech synthesis techniques shows the new system is very robust: It is able to build voices from less-than-ideal speech data and synthesize good-quality speech even for out-of-domain sentences.
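The abstract refers to constrained linear-regression transforms (CSMAPLR is a structural MAP variant of constrained MLLR). As a rough, hedged illustration only, the sketch below applies a single hypothetical transform (A, b) to one Gaussian's mean and covariance; the actual system estimates many transforms over a regression-class tree with MAP priors and combines them with further MAP adaptation, none of which is shown here.

```python
# Minimal sketch (not the authors' implementation): applying one
# constrained linear transform (A, b) to a single Gaussian, in the
# spirit of CMLLR/CSMAPLR-style adaptation. Conventions for the
# direction of the transform differ across toolkits.
import numpy as np

def adapt_gaussian(mean, cov, A, b):
    """Adapt a Gaussian with a constrained affine transform:
    mu' = A @ mu + b, Sigma' = A @ Sigma @ A.T (one common convention)."""
    mean_adapted = A @ mean + b
    cov_adapted = A @ cov @ A.T
    return mean_adapted, cov_adapted

# Toy example: a 3-dimensional Gaussian and a hypothetical transform.
rng = np.random.default_rng(0)
mean = rng.normal(size=3)
cov = np.eye(3)
A = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
b = 0.05 * rng.normal(size=3)

new_mean, new_cov = adapt_gaussian(mean, cov, A, b)
print(new_mean)
print(new_cov)
```

Because the transform also rotates and scales the covariance, estimating full transforms of this kind is one way the paper obtains full-covariance behaviour from diagonal-covariance models.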

Original language: English
Pages (from-to): 1208-1230
Number of pages: 23
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 17
Issue number: 6
DOI
Publication status: Published - Aug 2009
Externally published: Yes

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
