Transform mapping using shared decision tree context clustering for HMM-based cross-lingual speech synthesis

Daiki Nagahama, Takashi Nose, Tomoki Koriyama, Takao Kobayashi

Research output: Conference article, peer-reviewed

3 citations (Scopus)

Abstract

This paper proposes a novel transform mapping technique based on shared decision tree context clustering (STC) for HMM-based cross-lingual speech synthesis. In the conventional cross-lingual speaker adaptation based on state mapping, the adaptation performance is not always satisfactory when there are mismatches of languages and speakers between the average voice models of input and output languages. In the proposed technique, we alleviate the effect of the mismatches on the transform mapping by introducing a language-independent decision tree constructed by STC, and represent the average voice models using language-independent and dependent tree structures. We also use a bilingual speech corpus for keeping speaker characteristics between the average voice models of different languages. The experimental results show that the proposed technique decreases both spectral and prosodic distortions between original and generated parameter trajectories and significantly improves the naturalness of synthetic speech while keeping the speaker similarity compared to the state mapping.

Original language: English
Pages (from-to): 770-774
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - Jan 1, 2014
Event: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: Sep 14, 2014 to Sep 18, 2014

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
