Learning lexicons from spoken utterances based on statistical model selection

Ryo Taguchi, Naoto Iwahashi, Takashi Nose, Kotaro Funakoshi, Mikio Nakano

Research output: Conference article, peer-reviewed

9 citations (Scopus)

Abstract

This paper proposes a method for the unsupervised learning of lexicons from pairs of a spoken utterance and an object as its meaning without any a priori linguistic knowledge other than a phoneme acoustic model. In order to obtain a lexicon, a statistical model of the joint probability of a spoken utterance and an object is learned based on the minimum description length principle. This model consists of a list of word phoneme sequences and three statistical models: the phoneme acoustic model, a word-bigram model, and a word meaning model. Experimental results show that the method can acquire acoustically, grammatically and semantically appropriate words with about 85% phoneme accuracy.
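The abstract does not spell out the coding scheme, so the following is only a minimal sketch of model selection under the minimum description length principle, assuming the common two-part approximation: description length = negative log-likelihood + (k / 2) * log N, where k is the number of free parameters and N the number of training pairs. The `Candidate` class, the candidate names, and all numbers are hypothetical illustrations, not values or structures from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical candidate model (e.g. one possible word list)."""
    name: str
    log_likelihood: float  # total natural-log likelihood of the training pairs
    num_params: int        # number of free parameters in the candidate


def description_length(c: Candidate, num_samples: int) -> float:
    """Two-part MDL approximation (an assumption, not the paper's exact code):
    cost of the data given the model plus cost of encoding the model."""
    data_cost = -c.log_likelihood
    model_cost = 0.5 * c.num_params * math.log(num_samples)
    return data_cost + model_cost


def select_by_mdl(candidates, num_samples):
    """Return the candidate with the shortest total description length."""
    return min(candidates, key=lambda c: description_length(c, num_samples))


# Made-up numbers purely for illustration.
candidates = [
    Candidate("small_lexicon", log_likelihood=-1200.0, num_params=20),
    Candidate("medium_lexicon", log_likelihood=-1000.0, num_params=60),
    Candidate("large_lexicon", log_likelihood=-980.0, num_params=200),
]

best = select_by_mdl(candidates, num_samples=500)
print(best.name)
```

In this toy setting the largest candidate fits the data slightly better, but the extra parameters cost more than the fit gains, so the criterion prefers the mid-sized one. That trade-off is the general reason an MDL criterion can discourage a lexicon from growing without bound.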

Original language: English
Pages (from-to): 2731-2734
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 26 Nov 2009
Externally published: Yes
Event: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009 - Brighton, United Kingdom
Duration: 6 Sep 2009 to 10 Sep 2009

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems
