Abstract
We propose a fast speaker adaptation method using an aspect model. The performance of a speaker-independent (SI) model is very sensitive to environmental factors such as microphones, speakers, and noise. Speaker adaptation techniques aim for near speaker-dependent (SD) performance using only a small amount of speaker-specific data, and they are often based on an initial SI model. A central goal of adaptation algorithms is to update a large number of parameters from only a small amount of adaptation data. Using an aspect model reduces the number of free parameters that must be estimated from the adaptation data. In this paper, we introduce an aspect model into the acoustic model for rapid speaker adaptation, extending the formulation of probabilistic latent semantic analysis (PLSA) to continuous-density hidden Markov models (HMMs). We carried out an isolated-word recognition experiment on a Korean database and compared the results with those of the conventional expectation-maximization (EM) algorithm, maximum a posteriori (MAP) adaptation, and maximum likelihood linear regression (MLLR).
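The parameter-reduction idea can be illustrated with a toy sketch. The abstract does not give the paper's actual formulation for continuous-density HMMs, so the following is only an assumption-laden illustration: a set of diagonal-covariance Gaussian "aspects" is held fixed, and EM re-estimates only one weight per aspect from the adaptation data. The function names `log_gauss_diag` and `adapt_aspect_weights`, the synthetic data, and all numeric settings are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log density of each frame under each diagonal Gaussian.

    x: (N, D) frames; means, variances: (K, D). Returns an (N, K) array.
    """
    diff = x[:, None, :] - means[None, :, :]
    per_dim = np.log(2.0 * np.pi * variances)[None, :, :] + diff ** 2 / variances[None, :, :]
    return -0.5 * per_dim.sum(axis=2)

def adapt_aspect_weights(x, means, variances, n_iter=20):
    """EM over the K aspect weights only; the Gaussians stay fixed (assumption)."""
    k = means.shape[0]
    weights = np.full(k, 1.0 / k)              # start from uniform weights
    log_b = log_gauss_diag(x, means, variances)  # fixed during adaptation
    for _ in range(n_iter):
        # E-step: posterior of each aspect for each frame (log domain for stability)
        log_post = np.log(np.maximum(weights, 1e-12))[None, :] + log_b
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: each new weight is the average posterior of its aspect
        weights = post.mean(axis=0)
    return weights

# Synthetic "adaptation data" drawn near the second aspect (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=1.0, size=(200, 1))
means = np.array([[-3.0], [3.0]])
variances = np.ones((2, 1))
weights = adapt_aspect_weights(x, means, variances)
```

In this sketch only K numbers are estimated from the adaptation data, regardless of the feature dimension, which is the sense in which a latent-aspect parameterization can keep the number of free parameters small.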
Original language | English |
---|---|
Pages (from-to) | 1221-1224 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 2008 Dec 1 |
Event | INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association, Brisbane, QLD, Australia; Duration: 2008 Sep 22 → 2008 Sep 26 |
Keywords
- Aspect model
- PLSA
- SD model
- SI model
- Speaker adaptation
ASJC Scopus subject areas
- Human-Computer Interaction
- Signal Processing
- Software
- Sensory Systems