What makes reading comprehension questions easier?

Saku Sugawara, Kentaro Inui, Satoshi Sekine, Akiko Aizawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

A challenge in creating a dataset for machine reading comprehension (MRC) is to collect questions that require a sophisticated understanding of language to answer beyond using superficial cues. In this work, we investigate what makes questions easier across 12 recent MRC datasets with three question styles (answer extraction, description, and multiple choice). We propose to employ simple heuristics to split each dataset into easy and hard subsets and examine the performance of two baseline models for each of the subsets. We then manually annotate questions sampled from each subset with both validity and requisite reasoning skills to investigate which skills explain the difference between easy and hard questions. From this study, we observed that (i) the baseline performances for the hard subsets degrade remarkably compared to those of the entire datasets, (ii) hard questions require knowledge inference and multiple-sentence reasoning in comparison with easy questions, and (iii) multiple-choice questions tend to require a broader range of reasoning skills than answer extraction and description questions. These results suggest that one might overestimate recent advances in MRC.

Original language: English
Title of host publication: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun'ichi Tsujii
Publisher: Association for Computational Linguistics
Pages: 4208-4219
Number of pages: 12
ISBN (Electronic): 9781948087841
Publication status: Published - 2020
Event: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018 - Brussels, Belgium
Duration: 2018 Oct 31 to 2018 Nov 4

Publication series

Name: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

Conference

Conference: 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Country: Belgium
City: Brussels
Period: 18/10/31 to 18/11/4

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
