Speaker-independent HMM-based voice conversion using quantized fundamental frequency

Takashi Nose, Takao Kobayashi

Research output: Conference contribution

2 Citations (Scopus)

Abstract

This paper proposes a segment-based voice conversion technique between arbitrary speakers that requires only a small amount of training data. In the proposed technique, an input speech utterance of the source speaker is decoded into phonetic and prosodic symbol sequences, and the converted speech is then generated from the pre-trained target speaker's HMM using the decoded information. To reduce the required amount of training data, we use a speaker-independent model for decoding the input speech and model adaptation for training the target speaker's model. Experimental results show that no training data from the source speaker is needed, and that the proposed technique, with only ten sentences of the target speaker's adaptation data, outperforms the conventional GMM-based technique trained on parallel data of 200 sentences.
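The pipeline described in the abstract can be sketched as two stages: speaker-independent decoding of each segment into a phonetic symbol plus a quantized-F0 (prosodic) symbol, followed by generation from the target speaker's model indexed by those symbols. The sketch below is a toy illustration under stated assumptions, not the paper's implementation: the actual system decodes with speaker-independent phone HMMs and synthesizes from an adapted target-speaker HMM, whereas here a log-scale F0 quantizer and a per-symbol lookup table stand in for those components, and all data values are hypothetical.

```python
# Conceptual sketch of segment-based voice conversion with quantized F0.
# NOT the paper's method: F0 quantization into a few discrete symbols and a
# per-symbol target lookup stand in for HMM decoding and HMM-based synthesis.
import math

def quantize_f0(f0_hz, n_levels=4, lo=80.0, hi=320.0):
    """Map an F0 value (Hz) to a discrete prosodic symbol on a log scale."""
    if f0_hz <= 0:  # unvoiced segment
        return "q0"
    f0_hz = min(max(f0_hz, lo), hi)  # clamp to the modeled F0 range
    x = (math.log(f0_hz) - math.log(lo)) / (math.log(hi) - math.log(lo))
    return f"q{min(int(x * n_levels) + 1, n_levels)}"

def decode(utterance):
    """Speaker-independent decoding: each segment -> (phone, F0 symbol)."""
    return [(phone, quantize_f0(f0)) for phone, f0 in utterance]

def synthesize(symbols, target_model):
    """Generate target-speaker parameters from the decoded symbol sequence."""
    return [target_model[sym] for sym in symbols]

# Hypothetical source utterance: (phone, mean F0 in Hz) per segment.
source = [("a", 120.0), ("i", 180.0), ("u", 0.0)]
symbols = decode(source)

# Hypothetical target model: per-symbol mean F0 of the target speaker
# (in the paper, obtained by adapting an HMM with ~10 sentences).
target_model = {("a", "q2"): 210.0, ("i", "q3"): 260.0, ("u", "q0"): 0.0}

print(symbols)
print(synthesize(symbols, target_model))
```

Because the decoded representation is purely symbolic, no parallel source-target data is needed: any source speaker's utterance maps to the same symbol inventory, which is the property the abstract exploits.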

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher: International Speech Communication Association
Pages: 1724-1727
Number of pages: 4
Publication status: Published - 2010
Externally published: Yes

Publication series

Name: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
