Investigation of combining various major language model technologies including data expansion and adaptation

Ryo Masumura, Taichi Asami, Takanobu Oba, Hirokazu Masataki, Sumitaka Sakauchi, Akinori Ito

研究成果: Article査読

1 被引用数 (Scopus)

抄録

This paper aims to investigate the performance improvements made possible by combining various major language model (LM) technologies together and to reveal the interactions between LM technologies in spontaneous automatic speech recognition tasks. While it is clear that recent practical LMs have several problems, isolated use of major LM technologies does not appear to offer sufficient performance. In consideration of this fact, combining various LM technologies has been also examined. However, previous works only focused on modeling technologies with limited text resources, and did not consider other important technologies in practical language modeling, i.e., use of external text resources and unsupervised adaptation. This paper, therefore, employs not only manual transcriptions of target speech recognition tasks but also external text resources. In addition, unsupervised LM adaptation based on multi-pass decoding is also added to the combination. We divide LM technologies into three categories and employ key ones including recurrent neural network LMs or discriminative LMs. Our experiments show the effectiveness of combining various LM technologies in not only in-domain tasks, the subject of our previous work, but also out-of-domain tasks. Furthermore, we also reveal the relationships between the technologies in both tasks.

本文言語English
ページ(範囲)2452-2461
ページ数10
ジャーナルIEICE Transactions on Information and Systems
E99D
10
DOI
出版ステータスPublished - 2016 10

ASJC Scopus subject areas

  • ソフトウェア
  • ハードウェアとアーキテクチャ
  • コンピュータ ビジョンおよびパターン認識
  • 電子工学および電気工学
  • 人工知能

フィンガープリント

「Investigation of combining various major language model technologies including data expansion and adaptation」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル