An optimized multi-duration HMM for spontaneous speech recognition

Yuichi Ohkawa, Akihiro Yoshida, Motoyuki Suzuki, Akinori Ito, Shozo Makino

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Spontaneous speech exhibits wide variation in speaking style and speed, both of which are known to degrade speech recognition accuracy. In this paper, we describe an optimized multi-duration HMM (OMD). An OMD is a kind of multi-path HMM with at most two parallel paths. Each path is trained on speech samples with either short or long phoneme durations. The thresholds used to divide the phoneme samples are determined through a phoneme recognition experiment, and the HMM topologies are determined from the same recognition results. Next, we combine the OMD in parallel with an ordinary HMM trained on spontaneous speech and an HMM trained on read speech. With this 'all-parallel' model, a 19.3% reduction in word error rate was obtained compared with the ordinary HMM trained on spontaneous speech.

Original language: English
Title of host publication: EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
Publisher: International Speech Communication Association
Pages: 485-488
Number of pages: 4
Publication status: Published - 2003
Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
Duration: 2003 Sep 1 to 2003 Sep 4

Other

Other: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003
Country/Territory: Switzerland
City: Geneva
Period: 03/9/1 to 03/9/4

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Linguistics and Language
  • Communication
