CycleGAN-Based High-Quality Non-Parallel Voice Conversion with Spectrogram and WaveRNN

Aoi Kanagaki, Masaya Tanaka, Takashi Nose, Ryohei Shimizu, Akira Ito, Akinori Ito

Research output: Conference contribution

Abstract

This paper proposes Scyclone, a high-quality voice conversion (VC) technique without parallel data training. Scyclone improves speech naturalness and speaker similarity of the converted speech by introducing CycleGAN-based spectrogram conversion with a simplified WaveRNN-based vocoder. In Scyclone, a linear spectrogram is used as the conversion feature, which avoids quality degradation due to extraction errors. The subjective experiments show that Scyclone is significantly better than CycleGAN-VC2, one of the existing state-of-the-art parallel-data-free VC techniques.

Original language: English
Title of host publication: 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 356-357
Number of pages: 2
ISBN (electronic): 9781728198026
DOI
Publication status: Published - 13 Oct 2020
Event: 9th IEEE Global Conference on Consumer Electronics, GCCE 2020 - Kobe, Japan
Duration: 13 Oct 2020 to 16 Oct 2020

Publication series

Name: 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020

Conference

Conference: 9th IEEE Global Conference on Consumer Electronics, GCCE 2020
Country/Territory: Japan
City: Kobe
Period: 13 Oct 2020 to 16 Oct 2020

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Media Technology
  • Instrumentation
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
