A precise evaluation method of prosodic quality of non-native speakers using average voice and prosody substitution

Hafiyan Prafianto, Takashi Nose, Akinori Ito

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a method that improves the consistency of human evaluation of non-native speakers' utterances and allows prosodic features such as accent and rhythm to be evaluated separately. In this method, human evaluators rate accent and rhythm independently, using an average voice model and prosody substitution. We also investigated the advantages of evaluating these features independently. We found that, when the prosodic features are not evaluated independently, the accent scores are affected by the quality of the rhythm and vice versa. The correlation coefficient between the accent score and the rhythm score of identical utterances was 0.23 with the conventional method and -0.026 with the proposed method. This interdependence also leads to greater disagreement between the scores given by different evaluators. With the conventional method, 23% of evaluator pairs had an inter-evaluator correlation of the rhythm score above 0.5, whereas with the proposed method, 67% of pairs did.
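The two statistics reported in the abstract (the correlation between two sets of scores, and the share of evaluator pairs whose correlation exceeds 0.5) can be illustrated with a minimal sketch. It assumes the correlation is Pearson's r, which the abstract does not state, and the function names, the evaluator labels, and the scores below are hypothetical illustrations, not data or code from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def fraction_of_pairs_above(scores_by_evaluator, threshold=0.5):
    """Share of evaluator pairs whose score correlation exceeds the threshold.

    scores_by_evaluator maps an evaluator label to that evaluator's scores,
    listed in the same utterance order for every evaluator.
    """
    pairs = list(combinations(scores_by_evaluator.values(), 2))
    if not pairs:
        raise ValueError("need at least two evaluators")
    hits = sum(1 for a, b in pairs if pearson(a, b) > threshold)
    return hits / len(pairs)


# Hypothetical rhythm scores from three evaluators for four utterances.
rhythm_scores = {
    "evaluator_A": [1, 2, 3, 4],
    "evaluator_B": [2, 4, 6, 8],
    "evaluator_C": [4, 3, 2, 1],
}
agreement = fraction_of_pairs_above(rhythm_scores, threshold=0.5)
```

In this made-up example only the A-B pair agrees, so one of the three pairs exceeds the threshold. The same `pearson` function applied to the accent and rhythm scores of identical utterances would give the kind of cross-feature correlation the abstract compares between the two methods.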

Original language: English
Title of host publication: ICALIP 2016 - 2016 International Conference on Audio, Language and Image Processing - Proceedings
Editors: Fa-Long Luo, Xiaoqing Yu, Wanggen Wan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 208-212
Number of pages: 5
ISBN (Electronic): 9781509006533
Publication status: Published - 2017 Feb 7
Event: 5th International Conference on Audio, Language and Image Processing, ICALIP 2016 - Shanghai, China
Duration: 2016 Jul 11 → 2016 Jul 12

Publication series

Name: ICALIP 2016 - 2016 International Conference on Audio, Language and Image Processing - Proceedings

Other

Other: 5th International Conference on Audio, Language and Image Processing, ICALIP 2016
Country/Territory: China
City: Shanghai
Period: 16/7/11 → 16/7/12

Keywords

  • Average voice
  • CALL system
  • CAPT system
  • Evaluation of prosodic quality
  • Prosody substitution

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computer Graphics and Computer-Aided Design
