Abstract
In spontaneous speech, various changes in speaking style and speed can be observed, and these are known to degrade speech recognition accuracy. In this paper, we describe an optimized multi-duration HMM (OMD). An OMD is a kind of multi-path HMM with at most two parallel paths. Each path is trained using speech samples with either short or long phoneme durations. The thresholds used to divide the phoneme samples are determined through a phoneme recognition experiment. Not only the thresholds but also the HMM topologies are determined using the recognition results. Next, we combine the OMD in parallel with an ordinary HMM trained on spontaneous speech and an HMM trained on read speech. Using this 'all-parallel' model, a 19.3% reduction in word error rate was obtained compared with the ordinary HMM trained on spontaneous speech.
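The duration-based split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `split_by_duration`, the `(phoneme, duration)` sample format, and the rule that a phoneme without a threshold keeps a single path are all assumptions made for illustration. The abstract states only that samples are divided into short and long groups by a per-phoneme threshold and that an OMD has at most two parallel paths.

```python
from collections import defaultdict


def split_by_duration(samples, thresholds):
    """Partition phoneme samples into per-path training sets.

    samples:    iterable of (phoneme, duration) pairs; the duration unit
                (e.g. frames) is an assumption, not taken from the paper.
    thresholds: dict mapping phoneme -> duration threshold (hypothetical values).

    A phoneme with no threshold keeps all samples in one 'single' path,
    which is one assumed reading of "at most two parallel paths".
    """
    paths = defaultdict(lambda: {"short": [], "long": [], "single": []})
    for phoneme, duration in samples:
        threshold = thresholds.get(phoneme)
        if threshold is None:
            paths[phoneme]["single"].append(duration)
        elif duration < threshold:
            paths[phoneme]["short"].append(duration)
        else:
            paths[phoneme]["long"].append(duration)
    # Convert to plain dicts so the result has no default-creating behavior.
    return {phoneme: dict(groups) for phoneme, groups in paths.items()}
```

In this sketch each resulting group would be used to train one path of the corresponding phoneme model. How the thresholds themselves are chosen (through the phoneme recognition experiment mentioned above) is not modeled here.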
Original language | English |
---|---|
Title of host publication | EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology |
Publisher | International Speech Communication Association |
Pages | 485-488 |
Number of pages | 4 |
Publication status | Published - 2003 |
Event | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland (Duration: 2003 Sep 1 → 2003 Sep 4) |
Other
Other | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 |
---|---|
Country/Territory | Switzerland |
City | Geneva |
Period | 2003 Sep 1 → 2003 Sep 4 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Linguistics and Language
- Communication