A maximum likelihood approach to the detection of moments of maximum excitation and its application to high-quality speech parameterization

Ranniery Maia, Yannis Stylianou, Masami Akamine

Research output: Contribution to journal, Conference article

1 Citation (Scopus)

Abstract

This paper presents an algorithm to detect moments of maximum excitation (MME) in speech. It assumes a model in which speech can be represented as a sequence of pulses, located at the MME, convolved with a time-varying minimum-phase impulse response. Assuming that, within the glottal cycle, speech concentrates more energy at the MME than at other instants, the locations and amplitudes of the excitation pulses are determined through maximum likelihood estimation. The proposed approach provides a fully automatic and consistent method for detecting MME in speech, without relying on ad hoc procedures, which usually do not work well across different speech styles without some amount of adjustment. Experiments with speech parameterization, in the context of complex cepstrum analysis and synthesis, have shown that the proposed MME-based processing can improve the signal-to-error reconstruction ratio by up to 10% compared with the use of glottal closure instant estimates provided by a well-known algorithm.
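
The signal model described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the pulse locations are already known, replaces the time-varying minimum-phase response with a single fixed, hypothetical impulse response `h`, and assumes a white Gaussian error, under which the maximum likelihood amplitudes reduce to a least-squares fit. The function names and the dB definition of the signal-to-error ratio are likewise assumptions made for illustration.

```python
import numpy as np

def synthesize(pulse_positions, pulse_amplitudes, h, n_samples):
    """Pulse train convolved with a fixed impulse response h.
    Simplification: the abstract describes a time-varying response."""
    excitation = np.zeros(n_samples)
    excitation[np.asarray(pulse_positions)] = pulse_amplitudes
    return np.convolve(excitation, h)[:n_samples]

def ls_amplitudes(s, pulse_positions, h):
    """Least-squares pulse amplitudes for known positions.
    Under an assumed white Gaussian error this equals the ML estimate."""
    n = len(s)
    # Column k holds h shifted so that it starts at pulse position k.
    H = np.zeros((n, len(pulse_positions)))
    for k, p in enumerate(pulse_positions):
        length = min(len(h), n - p)
        H[p:p + length, k] = h[:length]
    amps, *_ = np.linalg.lstsq(H, s, rcond=None)
    return amps

def ser_db(s, s_hat):
    """Signal-to-error ratio in dB (assumed definition)."""
    err = s - s_hat
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum(err ** 2))

# Hypothetical toy data: four pulses and a decaying response.
rng = np.random.default_rng(0)
h = 0.9 ** np.arange(40)
positions = [20, 100, 185, 270]
true_amps = np.array([1.0, 0.8, 1.2, 0.9])
clean = synthesize(positions, true_amps, h, 400)
noisy = clean + 0.01 * rng.standard_normal(400)

est_amps = ls_amplitudes(noisy, positions, h)
recon = synthesize(positions, est_amps, h, 400)
```

Here `ser_db(noisy, recon)` measures how well the pulse-plus-response model reconstructs the observed signal. In the paper the pulse locations are themselves estimated, which this sketch does not attempt.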

Original language: English
Pages (from-to): 603-607
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2015-January
Publication status: Published - 2015 Jan 1
Externally published: Yes
Event: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany
Duration: 2015 Sep 6 to 2015 Sep 10

Keywords

  • Epoch detection
  • Pitch marking
  • Speech analysis
  • Speech modeling
  • Speech parameterization

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
