A study on tailor-made speech synthesis based on deep neural networks

Shuhei Yamada, Takashi Nose, Akinori Ito

Research output: Conference contribution

2 Citations (Scopus)

Abstract

We propose “tailor-made speech synthesis,” a speech synthesis technique that enables users to control synthetic speech naturally and intuitively. As a first step toward realizing tailor-made speech synthesis, we introduce an F0 context into speaker model training for speech synthesis based on deep neural networks (DNNs). The F0 context represents the relative log F0 at the mora or accent-phrase level of the training data. It allows users to control the F0 of synthetic speech continuously, in contrast to the conventional F0 context in HMM-based techniques. Experiments showed that the F0 context was effective for controlling F0: the F0 of the synthetic voice followed the value of the F0 context.
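The abstract does not say how the relative log F0 is computed. As a minimal sketch only, the snippet below assumes it is the mean log F0 of each segment (for example, a mora) minus the mean log F0 of the whole utterance. The function name `relative_log_f0`, the frame-level input format, and the choice of reference are illustrative assumptions, not the authors' implementation.

```python
import math


def relative_log_f0(f0_hz, segments):
    """Return one relative log-F0 value per segment (e.g. per mora).

    Hypothetical helper: the reference level (utterance mean) is an assumption.
    f0_hz: per-frame F0 in Hz, with 0.0 marking unvoiced frames.
    segments: list of (start_frame, end_frame) pairs, end exclusive.
    """
    # Utterance-level reference: mean log F0 over all voiced frames.
    voiced = [math.log(f) for f in f0_hz if f > 0.0]
    if not voiced:
        return [0.0] * len(segments)
    reference = sum(voiced) / len(voiced)

    contexts = []
    for start, end in segments:
        seg = [math.log(f) for f in f0_hz[start:end] if f > 0.0]
        # Fully unvoiced segments fall back to 0.0 (no offset from the reference).
        contexts.append(sum(seg) / len(seg) - reference if seg else 0.0)
    return contexts


# Toy utterance: two moras, the second one an octave above the first.
f0 = [0.0, 100.0, 100.0, 200.0, 200.0, 0.0]
moras = [(0, 3), (3, 6)]
print(relative_log_f0(f0, moras))
```

Under this assumption, a continuous-valued feature of this kind could be appended to the linguistic input of the DNN and shifted at synthesis time to raise or lower the F0 of a given mora.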

Original language: English
Title of host publication: Advances in Intelligent Information Hiding and Multimedia Signal Processing - Proceeding of the 12th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2016
Editors: Hsiang-Cheh Huang, Jeng-Shyang Pan, Pei-Wei Tsai
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 159-166
Number of pages: 8
ISBN (Print): 9783319502083
Publication status: Published - 2017
Event: 12th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2016 - Kaohsiung, Taiwan, Province of China
Duration: 2016-11-21 to 2016-11-23

Publication series

Name: Smart Innovation, Systems and Technologies
Volume: 63
ISSN (Print): 2190-3018
ISSN (Electronic): 2190-3026

Other

Other: 12th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2016
Country/Territory: Taiwan, Province of China
City: Kaohsiung
Period: 2016-11-21 to 2016-11-23

ASJC Scopus subject areas

  • General Decision Sciences
  • General Computer Science
