Joint transition-based dependency parsing and disfluency detection for automatic speech recognition texts

Masashi Yoshikawa, Hiroyuki Shindo, Yuji Matsumoto

Research output: Chapter in Book/Report/Conference proceedingConference contribution

12 Citations (Scopus)

Abstract

Joint dependency parsing with disfluency detection is an important task in speech language processing. Recent methods show high performance for this task, although most authors make the unrealistic assumption that input texts are transcribed by human annotators. In real-world applications, the input text is typically the output of an automatic speech recognition (ASR) system, which implies that the text contains not only disfluency noises but also recognition errors from the ASR system. In this work, we propose a parsing method that handles both disfluency and ASR errors using an incremental shift-reduce algorithm with several novel features suited to ASR output texts. Because the gold dependency information is usually annotated only on transcribed texts, we also introduce an alignment-based method for transferring the gold dependency annotation to the ASR output texts to construct training data for our parser. We conducted an experiment on the Switchboard corpus and show that our method outperforms conventional methods in terms of dependency parsing and disfluency detection.

Original languageEnglish
Title of host publicationEMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings
PublisherAssociation for Computational Linguistics (ACL)
Pages1036-1041
Number of pages6
ISBN (Electronic)9781945626258
DOIs
Publication statusPublished - 2016
Externally publishedYes
Event2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016 - Austin, United States
Duration: 2016 Nov 12016 Nov 5

Publication series

NameEMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016
CountryUnited States
CityAustin
Period16/11/116/11/5

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Computational Theory and Mathematics

Fingerprint Dive into the research topics of 'Joint transition-based dependency parsing and disfluency detection for automatic speech recognition texts'. Together they form a unique fingerprint.

Cite this