Estimation of user's internal state before the user's first utterance using acoustic features and face orientation

Yuya Chiba, Masashi Ito, Akinori Ito

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Introducing user models (e.g., models of a user's beliefs, skill, and familiarity with the system) is believed to make a dialogue system's responses more flexible. Conventionally, the user's internal state is estimated from the linguistic information in the previous utterance, but this approach cannot be applied to a user who has not made an input utterance in the first place. We are therefore developing a method to estimate the internal state of a spoken dialogue system's user before his/her input utterance. In a previous report, we used three acoustic features and a visual feature based on manual labels. In this paper, we introduced new features for the estimation: the length of filled pauses and face orientation angles. We then examined the effectiveness of the proposed features through experiments. As a result, we obtained a three-class discrimination accuracy of 85.6% in an open test, which was 1.5 points higher than the result obtained using the previous feature set.

Original language: English
Title of host publication: Proceedings - 5th International Conference on Human System Interactions, HSI 2012
Pages: 23-28
Number of pages: 6
DOI: https://doi.org/10.1109/HSI.2012.13
Publication status: Published - 2012 Dec 1
Event: 5th International Conference on Human System Interactions, HSI 2012 - Perth, WA, Australia
Duration: 2012 Jun 6 → 2012 Jun 8

Publication series

Name: International Conference on Human System Interaction, HSI
ISSN (Print): 2158-2246
ISSN (Electronic): 2158-2254

Other

Other: 5th International Conference on Human System Interactions, HSI 2012
Country: Australia
City: Perth, WA
Period: 12/6/6 → 12/6/8

Keywords

  • multimodal information
  • non-verbal information
  • spoken dialogue system
  • user modeling

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Software

  • Cite this

    Chiba, Y., Ito, M., & Ito, A. (2012). Estimation of user's internal state before the user's first utterance using acoustic features and face orientation. In Proceedings - 5th International Conference on Human System Interactions, HSI 2012 (pp. 23-28). [6473758] (International Conference on Human System Interaction, HSI). https://doi.org/10.1109/HSI.2012.13