Integration of accent sandhi and prosodic features estimation for Japanese text-to-speech synthesis

Daisuke Fujimaki, Takashi Nose, Akinori Ito

Research output: Conference contribution

Abstract

In recent years, Japanese text-to-speech (TTS) synthesis methods have been actively researched. Generating high-quality synthetic speech requires estimating appropriate prosodic information. However, manual annotation is costly, and automatic annotation introduces estimation errors. This paper examines the integration of accent sandhi and prosodic feature estimation into acoustic modeling for Japanese TTS to overcome these problems. The proposed method achieves total optimization of the F0 model by using linguistic features from a dictionary. Objective and subjective evaluations confirmed that the cost of creating accent labels was reduced and that the accuracy of prosodic feature estimation was improved.

Original language: English
Title of host publication: 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 358-359
Number of pages: 2
ISBN (Electronic): 9781728198026
DOI
Publication status: Published - 2020-10-13
Event: 9th IEEE Global Conference on Consumer Electronics, GCCE 2020 - Kobe, Japan
Duration: 2020-10-13 to 2020-10-16

Publication series

Name: 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020

Conference

Conference: 9th IEEE Global Conference on Consumer Electronics, GCCE 2020
Country/Territory: Japan
City: Kobe
Period: 2020-10-13 to 2020-10-16

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Media Technology
  • Instrumentation
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
