N-GRAM LANGUAGE MODEL ADAPTATION USING SMALL CORPUS FOR SPOKEN DIALOG RECOGNITION

Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda

Research output: Contribution to conference › Paper › peer-review

Abstract

This paper describes an N-gram language model adaptation technique. Because an N-gram model requires a large sample corpus for reliable probability estimation, it is difficult to build an N-gram model for a specific task with little data. In this paper, N-gram task adaptation is proposed using a large corpus from a general task (TI text) and a small corpus from the specific task (AD text). A simple weighting scheme is employed to mix the TI and AD texts. In addition to mixing the two texts, the effect of vocabulary size is also investigated. The experimental results show that an adapted N-gram model with a properly chosen vocabulary size has significantly lower perplexity than the task-independent models.
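The core idea of the abstract, mixing a large task-independent (TI) corpus with a small adaptation (AD) corpus under a simple weighting, can be sketched as count-level interpolation of bigram statistics. The exact mixing formula and the weight parameter below are illustrative assumptions, not the paper's own equations.

```python
from collections import Counter

def bigram_counts(tokens):
    """Count adjacent word pairs in a token sequence."""
    return Counter(zip(tokens, tokens[1:]))

def mixed_bigram_prob(ti_tokens, ad_tokens, ad_weight):
    """Estimate bigram probabilities from weighted, merged counts.

    `ad_weight` multiplies the small-corpus (AD) counts before they
    are merged with the large-corpus (TI) counts, so the specific
    task can influence the model despite its size. This count-level
    weighting is a hypothetical stand-in for the paper's "simple
    weighting" of the two texts.
    """
    counts = Counter()
    for bg, c in bigram_counts(ti_tokens).items():
        counts[bg] += c
    for bg, c in bigram_counts(ad_tokens).items():
        counts[bg] += ad_weight * c
    # Normalize by the weighted count of each history word.
    history = Counter()
    for (w1, _), c in counts.items():
        history[w1] += c
    return {bg: c / history[bg[0]] for bg, c in counts.items()}
```

With a larger `ad_weight`, word pairs frequent in the adaptation text dominate the estimate even though the AD corpus is much smaller; in practice the model would also need smoothing and a shared vocabulary, which the paper investigates separately.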

Original language: English
Pages: 2735-2738
Number of pages: 4
Publication status: Published - 1997
Externally published: Yes
Event: 5th European Conference on Speech Communication and Technology, EUROSPEECH 1997 - Rhodes, Greece
Duration: 1997 Sept 22 – 1997 Sept 25


ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Linguistics and Language
  • Communication
