Voice conversion from arbitrary speakers based on deep neural networks with adversarial learning

Sou Miyamoto, Takashi Nose, Suzunosuke Ito, Harunori Koike, Yuya Chiba, Akinori Ito, Takahiro Shinozaki

Research output: Conference contribution

Abstract

In this study, we propose a technique for voice conversion from arbitrary speakers based on deep neural networks, realized by introducing adversarial learning into conventional voice conversion. In addition to a generative model, adversarial learning uses a discriminative model that classifies input speech as either natural speech or converted speech, and it is expected to enable more natural voice conversion. Experiments showed that the proposed method was effective in enhancing the global variance (GV) of the mel-cepstrum, but the naturalness of the converted speech was slightly lower than that of speech converted using the conventional variance compensation technique.
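
The abstract reports results in terms of global variance (GV) without defining it. As a minimal sketch, assuming the common definition of GV as the per-dimension variance of a feature trajectory over the frames of one utterance, the quantity can be computed as follows. The function name `global_variance` and the toy two-dimensional trajectories are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def global_variance(frames: Sequence[Sequence[float]]) -> List[float]:
    """Per-dimension variance over time of one utterance's feature frames.

    Assumes the common GV definition (population variance across frames);
    the paper's exact formulation is not given in the abstract.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dims = len(frames[0])
    # Mean of each feature dimension across all frames.
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # Mean squared deviation from that mean, per dimension.
    return [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dims)]


# Toy 2-dimensional trajectories with made-up values (hypothetical data).
natural = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
# An over-smoothed conversion: same mean, half the excursion around it.
smoothed = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]

gv_natural = global_variance(natural)
gv_smoothed = global_variance(smoothed)
print(gv_natural, gv_smoothed)
```

In this toy case the smoothed trajectory has a lower GV in every dimension than the natural one, which is the kind of gap that variance compensation, or here adversarial learning, is meant to reduce.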

Original language: English
Title of host publication: Advances in Intelligent Information Hiding and Multimedia Signal Processing - Proceedings of the 13th International Conference on Intelligent Information Hiding and Multimedia Signal Processing,
Editors: Junzo Watada, Lakhmi C. Jain, Jeng-Shyang Pan, Pei-Wei Tsai
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 97-103
Number of pages: 7
ISBN (Print): 9783319638584
Publication status: Published - 2018
Event: 13th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2017 - Matsue, Shimane, Japan
Duration: 2017-08-12 to 2017-08-15

Publication series

Name: Smart Innovation, Systems and Technologies
Volume: 82
ISSN (Print): 2190-3018
ISSN (Electronic): 2190-3026

Other

Other: 13th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2017
Country/Territory: Japan
City: Matsue, Shimane
Period: 2017-08-12 to 2017-08-15

ASJC Scopus subject areas

  • General Decision Sciences
  • General Computer Science
