Complex cepstrum as phase information in statistical parametric speech synthesis

Ranniery Maia, Masami Akamine, M. J. F. Gales

Research output: Conference contribution

31 Citations (Scopus)

Abstract

Statistical parametric synthesizers usually rely on a simplified model of speech production where a minimum-phase filter is driven by a zero or random phase excitation signal. However, this procedure does not take into account the natural mixed-phase characteristics of the speech signal. This paper addresses this issue by proposing the use of the complex cepstrum for modeling phase information in statistical parametric speech synthesizers. Here a frame-based complex cepstrum is calculated through the interpolation of pitch-synchronous magnitude and unwrapped phase spectra. The noncausal part of the frame-based complex cepstrum is then modeled as phase features in the statistical parametric synthesizer. At synthesis time, the generated phase parameters are used to derive coefficients of a glottal filter. Experimental results show that the proposed approach effectively embeds phase information in the synthetic speech, resulting in close-to-natural waveforms and better speech quality.
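
As a rough illustration of the quantities named in the abstract, the sketch below computes a textbook complex cepstrum from the log-magnitude and unwrapped phase spectra of a single frame, then separates its causal and noncausal parts. It is not the authors' pitch-synchronous interpolation procedure or their glottal filter derivation. The function names (`complex_cepstrum`, `split_cepstrum`, `response_from_noncausal`), the FFT size, and the toy three-tap frame are all assumptions made for this example.

```python
import numpy as np


def complex_cepstrum(frame, n_fft=1024):
    """Return (cepstrum, delay) for a real frame.

    The cepstrum is the inverse FFT of log|X| + j*arg(X), with arg(X)
    unwrapped and its integer-sample linear-phase term removed.
    """
    spectrum = np.fft.fft(frame, n_fft)
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    phase = np.unwrap(np.angle(spectrum))

    # A pure delay of d samples contributes -pi*d at the Nyquist bin.
    half = n_fft // 2
    delay = -int(np.round(phase[half] / np.pi))
    phase = phase + np.pi * delay * np.arange(n_fft) / half

    cepstrum = np.fft.ifft(log_magnitude + 1j * phase).real
    return cepstrum, delay


def split_cepstrum(cepstrum):
    """Split into (c[0], causal part c[1], c[2], ..., noncausal part c[-1], c[-2], ...)."""
    half = len(cepstrum) // 2
    causal = cepstrum[1:half]
    noncausal = cepstrum[half + 1:][::-1]  # reversed so index 0 holds c[-1]
    return cepstrum[0], causal, noncausal


def response_from_noncausal(noncausal, n_fft=1024):
    """Maximum-phase impulse response implied by the noncausal cepstrum alone."""
    cepstrum = np.zeros(n_fft)
    cepstrum[n_fft - len(noncausal):] = noncausal[::-1]
    return np.fft.ifft(np.exp(np.fft.fft(cepstrum))).real


# Toy mixed-phase frame (an assumption, not speech data): one zero inside
# the unit circle (minimum-phase factor) and one outside (maximum-phase factor).
frame = np.convolve([1.0, 0.5], [0.3, 1.0])
cepstrum, delay = complex_cepstrum(frame)
c0, causal, noncausal = split_cepstrum(cepstrum)
max_phase = response_from_noncausal(noncausal)
```

In this toy case the causal coefficients come only from the minimum-phase factor and the noncausal coefficients only from the maximum-phase factor, which is the separation that lets the noncausal part act as a phase feature. A real analysis would work on windowed speech frames, where phase unwrapping is considerably more delicate than it is here.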

Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 4581-4584
Number of pages: 4
DOI
Publication status: Published - 23 Oct 2012
Externally published: Yes
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Kyoto, Japan
Duration: 25 Mar 2012 to 30 Mar 2012

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Country/Territory: Japan
City: Kyoto
Period: 25 Mar 2012 to 30 Mar 2012

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
