Abstract
This paper describes an N-gram language model adaptation technique. Because an N-gram model requires a large sample corpus for probability estimation, it is difficult to apply an N-gram model to a specific small task. This paper proposes N-gram task adaptation using a large corpus of the general task (TI text) and a small corpus of the specific task (AD text). A simple weighting is employed to mix the TI and AD texts. In addition to mixing the two texts, the effect of vocabulary is also investigated. The experimental results show that an adapted N-gram model with a proper vocabulary size has significantly lower perplexity than the task-independent models.
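The abstract does not say how the weighting is applied. As a minimal sketch of one plausible reading, the Python below scales the AD-text bigram counts by a constant before adding them to the TI-text counts, then compares perplexity on a small in-task sentence. The function names, the toy corpora, the weight value of 5.0, the bigram order, and the add-one smoothing are all assumptions made for illustration; none of them come from the paper.

```python
import math
from collections import Counter


def bigram_counts(sentences):
    """Count bigram histories and bigrams over tokenized sentences."""
    history, bigram = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            history[prev] += 1
            bigram[(prev, cur)] += 1
    return history, bigram


def mix_counts(ti_counts, ad_counts, weight):
    """Add AD counts, scaled by `weight`, to a copy of the TI counts."""
    mixed = Counter(ti_counts)
    for key, count in ad_counts.items():
        mixed[key] += weight * count
    return mixed


def bigram_prob(prev, cur, history, bigram, vocab_size, alpha=1.0):
    """Add-alpha smoothed bigram probability (smoothing choice is an assumption)."""
    return (bigram[(prev, cur)] + alpha) / (history[prev] + alpha * vocab_size)


def perplexity(sentences, history, bigram, vocab_size):
    """Per-token perplexity of the smoothed bigram model on `sentences`."""
    log_sum, n_tokens = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            log_sum += math.log(bigram_prob(prev, cur, history, bigram, vocab_size))
            n_tokens += 1
    return math.exp(-log_sum / n_tokens)


# Hypothetical toy corpora: a "large" general-task text and a small task text.
ti_text = [["the", "market", "rose"], ["the", "market", "fell"],
           ["the", "weather", "is", "fine"]]
ad_text = [["book", "a", "flight"], ["book", "a", "hotel"]]
test_text = [["book", "a", "flight"]]

# Vocabulary: every word in either corpus plus the end-of-sentence symbol.
vocab = {w for sent in ti_text + ad_text for w in sent} | {"</s>"}

ti_hist, ti_bi = bigram_counts(ti_text)
ad_hist, ad_bi = bigram_counts(ad_text)

ad_weight = 5.0  # hypothetical mixing weight, not a value from the paper
mix_hist = mix_counts(ti_hist, ad_hist, ad_weight)
mix_bi = mix_counts(ti_bi, ad_bi, ad_weight)

ppl_ti = perplexity(test_text, ti_hist, ti_bi, len(vocab))
ppl_adapted = perplexity(test_text, mix_hist, mix_bi, len(vocab))
print(f"TI-only perplexity: {ppl_ti:.2f}")
print(f"Adapted perplexity: {ppl_adapted:.2f}")
```

In this sketch the vocabulary is the union of both corpora. Restricting or enlarging `vocab` changes the smoothing denominator, which is one simple way to explore the vocabulary effect the abstract mentions.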
Original language | English |
---|---|
Pages | 2735-2738 |
Number of pages | 4 |
Publication status | Published - 1997 |
Externally published | Yes |
Event | 5th European Conference on Speech Communication and Technology, EUROSPEECH 1997 - Rhodes, Greece; Duration: 1997 Sept 22 → 1997 Sept 25 |
Conference
Conference | 5th European Conference on Speech Communication and Technology, EUROSPEECH 1997 |
---|---|
Country/Territory | Greece |
City | Rhodes |
Period | 97/9/22 → 97/9/25 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Linguistics and Language
- Communication