Automatic assessment of English proficiency for Japanese learners without reference sentences based on deep neural network acoustic models

Jiang Fu, Yuya Chiba, Takashi Nose, Akinori Ito

Research output: Contribution to journal › Article

Abstract

Speech-based computer-assisted language learning (CALL) systems should recognize learners' utterances with high accuracy and evaluate each speaker's language proficiency with appropriate methods. In this paper, we discuss the automatic assessment of second language (L2) proficiency for non-native speakers. Many existing works evaluate pronunciation by applying the goodness of pronunciation (GOP) method. This paper introduces an automatic proficiency evaluation system that combines various kinds of non-native and native acoustic models, such as Gaussian mixture model (GMM)-hidden Markov model (HMM) and deep neural network (DNN)-HMM. Most existing works assume that the transcription of an utterance (the reference sentence) is known when evaluating the utterance, especially in reading and repeating tasks. To realize reference-free proficiency evaluation, we propose a novel machine score, the reference-free error rate (RER), to evaluate English proficiency. In our experiments, the DNN-based non-native acoustic models outperformed the traditional acoustic models on non-native speech recognition. Thus, we calculated the RER by regarding the recognition result from the DNN-based non-native acoustic model as the “reference” and the result from the native acoustic model as the “recognition result”. The proposed RER has a high correlation with human proficiency scores, which indicates the effectiveness of RER for automatically estimating proficiency. By combining the RER with other machine scores such as the log-likelihood scores, we obtained high correlation with the human scores (reading aloud task: [Formula presented]; constrained interactive dialogue task: [Formula presented]; spontaneous English conversation task: [Formula presented]).
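
The RER computation described in the abstract can be illustrated with a minimal sketch. It assumes a word-level error rate based on Levenshtein edit distance, normalized by the length of the pseudo-reference; the abstract does not state the unit (word or phone) or the exact normalization, so both are assumptions. The function names `edit_distance` and `reference_free_error_rate` are hypothetical and do not come from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (assumed scoring unit: words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def reference_free_error_rate(nonnative_result, native_result):
    """Error rate of the native-model output against the non-native-model output.

    Assumption: the non-native DNN-HMM recognition result plays the role of
    the "reference" and the native-model result plays the role of the
    "recognition result", as the abstract describes. No human transcription
    is involved.
    """
    ref = nonnative_result.split()
    hyp = native_result.split()
    if not ref:
        raise ValueError("pseudo-reference is empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, with two invented recognition outputs, `reference_free_error_rate("i like to play tennis", "i like to pray tennis")` gives 0.2: one substitution over five pseudo-reference words. The intuition is that the two recognizers should disagree more on less proficient speech, so a higher RER would suggest lower proficiency.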

Original language: English
Pages (from-to): 86-97
Number of pages: 12
Journal: Speech Communication
Volume: 116
Publication status: Published - 2020 Jan

Keywords

  • Acoustic models
  • Automatic proficiency assessment
  • Computer-assisted language learning (CALL)
  • Deep neural network (DNN)
  • Japanese learners
  • Non-native speech
  • Speech recognition

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
