Speech factorization for HMM-TTS based on cluster adaptive training.

Javier Latorre, Vincent Wan, Mark J.F. Gales, Langzhou Chen, K. K. Chin, Kate Knill, Masami Akamine

Research output: Conference contribution

28 Citations (Scopus)

Abstract

This paper presents a novel approach to factorize and control different speech factors in HMM-based TTS systems. Cluster adaptive training (CAT) is used to factorize speaker identity and expressiveness (i.e. emotion). Within a CAT framework, each speech factor can be modelled by a different set of clusters. Users can control speaker identity and expressiveness independently by modifying the weights associated with each set. These weights are defined in a continuous space, so variations of speaker and emotion are also continuous. Additionally, given a speaker who has only neutral-style training data, the approach is able to synthesise speech with that speaker's voice and different expressions. Lastly, the paper discusses how a generalization of the basic factorization concept could allow the production of expressive speech from neutral voices for other HMM-TTS systems not based on CAT.
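The weight-based control described in the abstract can be illustrated with a minimal sketch. It assumes the standard CAT formulation, in which a Gaussian mean is a bias vector plus a weighted sum of cluster mean vectors, and it assumes that the speaker and expression clusters contribute additively. The function names, the two-dimensional vectors, and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def weighted_sum(weights: Sequence[float], cluster_means: Sequence[Vector]) -> Vector:
    """Interpolate cluster mean vectors using one weight per cluster."""
    if len(weights) != len(cluster_means):
        raise ValueError("one weight per cluster is required")
    dim = len(cluster_means[0])
    out = [0.0] * dim
    for w, mean in zip(weights, cluster_means):
        for d in range(dim):
            out[d] += w * mean[d]
    return out


def cat_mean(
    bias: Vector,
    speaker_weights: Sequence[float],
    speaker_clusters: Sequence[Vector],
    expr_weights: Sequence[float],
    expr_clusters: Sequence[Vector],
) -> Vector:
    """Assumed factorized mean: bias + speaker part + expression part."""
    spk = weighted_sum(speaker_weights, speaker_clusters)
    exp = weighted_sum(expr_weights, expr_clusters)
    return [b + s + e for b, s, e in zip(bias, spk, exp)]


# Hypothetical toy model: two speaker clusters, one expression cluster.
bias = [1.0, 2.0]
speaker_clusters = [[0.5, 0.0], [0.0, 0.5]]
expr_clusters = [[1.0, 1.0]]

# Same speaker weights, different expression weights.
neutral = cat_mean(bias, [1.0, 0.0], speaker_clusters, [0.0], expr_clusters)
expressive = cat_mean(bias, [1.0, 0.0], speaker_clusters, [0.5], expr_clusters)

# A second speaker reuses the same expression weights unchanged.
other_expressive = cat_mean(bias, [0.0, 1.0], speaker_clusters, [0.5], expr_clusters)
```

Because the two weight sets are separate in this sketch, changing the expression weight shifts the mean by the same amount for every speaker, which is the sense in which a speaker with only neutral training data could borrow an expression learned from others. Since the weights are real-valued, intermediate values give intermediate voices or emotion intensities.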

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 970-973
Number of pages: 4
Publication status: Published - 1 Dec 2012
Externally published: Yes
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: 9 Sep 2012 to 13 Sep 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
2

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country/Territory: United States
City: Portland, OR
Period: 9 Sep 2012 to 13 Sep 2012

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication
