Automatic assessment of English proficiency for Japanese learners without reference sentences based on deep neural network acoustic models

Jiang Fu, Yuya Chiba, Takashi Nose, Akinori Ito

研究成果: Article査読

11 被引用数 (Scopus)

抄録

Speech-based computer-assisted language learning (CALL) systems should recognize the utterances of the learner with high accuracy and evaluate the language proficiency of the specific speaker with appropriate methods. In this paper, we discuss the automatic assessment of the second language (L2) for non-native speakers. There are many existing works on pronunciation evaluation by applying the goodness of pronunciation (GOP) method. This paper introduces an automatic proficiency evaluation system that combines various kinds of non-native acoustic models and native ones, such as Gaussian mixture model (GMM)-hidden Markov model (HMM) and deep neural network (DNN)-HMM. Most of existing works assume that we know the transcription of an utterance (the reference sentence) when evaluating the utterance, especially in reading and repeating tasks. To realize a reference-free proficiency evaluation, we propose a novel machine score named as the reference-free error rate (RER) to evaluate English proficiency. In our experiments, the DNN-based non-native acoustic models outperformed the traditional acoustic models on non-native speech recognition. Thus, we calculated the RER by regarding the recognition result from the DNN-based non-native acoustic model as “reference” and the result from the native acoustic model as “recognition result”. The proposed RER has high correlation with human proficiency scores, which indicates the effectiveness of RER for automatically estimating the proficiency. By combining the RER with other machine scores such as the log-likelihood scores, we obtained high correlation (reading aloud task: [Formula presented]; constrained interactive dialogue task: [Formula presented]; spontaneous English conversation task: [Formula presented]) to the human scores.

本文言語English
ページ(範囲)86-97
ページ数12
ジャーナルSpeech Communication
116
DOI
出版ステータスPublished - 2020 1月

ASJC Scopus subject areas

  • ソフトウェア
  • モデリングとシミュレーション
  • 通信
  • 言語および言語学
  • 言語学および言語
  • コンピュータ ビジョンおよびパターン認識
  • コンピュータ サイエンスの応用

フィンガープリント

「Automatic assessment of English proficiency for Japanese learners without reference sentences based on deep neural network acoustic models」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル