Comparison of speech recognition performance between kaldi and google cloud speech API

Takashi Kimura, Takashi Nose, Shinji Hirooka, Yuya Chiba, Akinori Ito

Research output: Conference contribution

2 Citations (Scopus)

Abstract

In recent years, the number of systems with a speech interface has grown. Such interfaces include a spoken dialogue function, and high performance is required of the spoken dialogue system. A spoken dialogue system includes a speech recognition module. In this study, we focus on that module and aim to improve the spoken dialogue system by enhancing the performance of speech recognition. Among speech recognition systems, Kaldi is widely used in many kinds of research. Several speech recognition services are also provided as Web APIs, such as IBM Watson Speech to Text, Microsoft Bing Speech API, and Google Cloud Speech API, the last of which is known for its high performance. This paper compares the speech recognition performance of Kaldi and Google Cloud Speech API in terms of word error rate (WER) and real-time factor (RTF), and confirms the recognition performance of each system.
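The two metrics named in the abstract have standard definitions, which can be illustrated with a minimal sketch. WER is the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words; RTF is the processing time divided by the audio duration. The function names and the whitespace tokenization below are assumptions made for illustration only. The abstract does not state how the authors tokenized transcripts or measured time, so this should not be read as their evaluation procedure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words (one row of the Levenshtein table).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent recognizing / duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words, and `real_time_factor(2.0, 8.0)` describes a recognizer that needs 2 seconds for 8 seconds of audio. An RTF below 1 means recognition runs faster than real time.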

Original language: English
Title of host publication: Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing - Proceeding of the Fourteenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing
Editors: Lakhmi C. Jain, Pei-Wei Tsai, Akinori Ito, Jeng-Shyang Pan
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 109-115
Number of pages: 7
ISBN (Print): 9783030037475
DOI
Publication status: Published - 2019
Event: 14th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2018 - Sendai, Japan
Duration: 2018-11-26 to 2018-11-28

Publication series

Name: Smart Innovation, Systems and Technologies
Volume: 110
ISSN (Print): 2190-3018
ISSN (Electronic): 2190-3026

Other

Other: 14th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2018
Country/Territory: Japan
City: Sendai
Period: 2018-11-26 to 2018-11-28

ASJC Scopus subject areas

  • General Decision Sciences
  • General Computer Science
