Filler prediction based on bidirectional LSTM for generation of natural response of spoken dialog

Yoshihiro Yamazaki, Yuya Chiba, Takashi Nose, Akinori Ito

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Most conventional response generation models do not generate speech disfluencies, including fillers, because they are trained on written-language corpora. Training a response generation model for spoken language therefore requires inserting fillers into written sentences. In this paper, we propose a filler prediction model based on a bidirectional LSTM (BLSTM). This approach can take the whole utterance into account and model both the positions and the kinds of fillers simultaneously. Experiments showed that the proposed method surpasses the conventional approach in prediction accuracy.
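The abstract does not state how positions and kinds of fillers are encoded. One plausible formulation, sketched here purely as an assumption, is sequence labeling: every gap around the written tokens receives either a "no filler" label or the kind of filler that belongs there, so a single label sequence carries both position and kind. A tagger such as a BLSTM would then be trained to map the written tokens to these labels. The filler inventory `FILLERS`, the label `NO_FILLER`, and both function names are hypothetical and are not taken from the paper.

```python
# Hypothetical filler inventory; the paper's actual filler classes
# are not given in the abstract.
FILLERS = {"um", "uh", "well"}
NO_FILLER = "<none>"


def to_tagging_example(spoken_tokens):
    """Split a spoken utterance into written tokens and per-gap labels.

    labels[i] describes the gap before tokens[i]; the final label
    describes the gap after the last token. Each label is NO_FILLER
    or the filler word found in that gap.
    """
    tokens, labels = [], []
    pending = NO_FILLER
    for tok in spoken_tokens:
        if tok in FILLERS:
            # Simplification: if several fillers are adjacent,
            # only the last one is kept.
            pending = tok
        else:
            tokens.append(tok)
            labels.append(pending)
            pending = NO_FILLER
    labels.append(pending)  # gap after the final token
    return tokens, labels


def insert_fillers(tokens, labels):
    """Rebuild a spoken-style utterance from tokens and gap labels."""
    out = []
    for tok, label in zip(tokens, labels):
        if label != NO_FILLER:
            out.append(label)
        out.append(tok)
    if labels[len(tokens)] != NO_FILLER:
        out.append(labels[len(tokens)])
    return out
```

Under this encoding, `to_tagging_example` would turn a transcribed utterance into a training pair, and `insert_fillers` would apply predicted labels to a written sentence. Because a bidirectional model reads the utterance in both directions, the label for each gap can depend on the words on either side of it.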

Original language: English
Title of host publication: 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 360-361
Number of pages: 2
ISBN (Electronic): 9781728198026
Publication status: Published - 2020 Oct 13
Event: 9th IEEE Global Conference on Consumer Electronics, GCCE 2020 - Kobe, Japan
Duration: 2020 Oct 13 → 2020 Oct 16

Publication series

Name: 2020 IEEE 9th Global Conference on Consumer Electronics, GCCE 2020

Conference

Conference: 9th IEEE Global Conference on Consumer Electronics, GCCE 2020
Country: Japan
City: Kobe
Period: 20/10/13 → 20/10/16

Keywords

  • dialog system
  • filler prediction
  • response generation
  • speech disfluencies

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Media Technology
  • Instrumentation
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
