Language models as knowledge bases: On entity representations, storage capacity, and paraphrased queries

Benjamin Heinzerling, Kentaro Inui

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Pretrained language models have been suggested as a possible alternative or complement to structured knowledge bases. However, this emerging LM-as-KB paradigm has so far only been considered in a very limited setting, which only allows handling 21k entities whose names are found in common LM vocabularies. Furthermore, a major benefit of this paradigm, namely querying the KB using natural language paraphrases, is underexplored. Here we formulate two basic requirements for treating LMs as KBs: (i) the ability to store a large number of facts involving a large number of entities and (ii) the ability to query stored facts. We explore three entity representations that allow LMs to handle millions of entities and present a detailed case study on paraphrased querying of facts stored in LMs, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.
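
For readers unfamiliar with the LM-as-KB setup, the following minimal sketch illustrates the general idea of querying a pretrained masked language model for a stored fact using natural-language prompts, including a paraphrase of the same query. The model name and prompts are illustrative assumptions, not the authors' experimental setup or entity representations.

    # Minimal sketch of querying a masked LM as a knowledge base.
    # Illustrative only; the model and prompts are assumptions, not the paper's setup.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-cased")

    # The same fact queried with two natural-language paraphrases.
    queries = [
        "Dante was born in [MASK].",
        "The birthplace of Dante is [MASK].",
    ]

    for query in queries:
        predictions = fill_mask(query, top_k=3)
        answers = ", ".join(pred["token_str"].strip() for pred in predictions)
        print(f"{query} -> {answers}")

In this toy setting the "KB" is simply whatever the pretrained model has memorized; the paper's contribution concerns how to represent entities so that millions of them can be stored and retrieved, and how robust retrieval is to such paraphrased queries.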

Original language: English
Title of host publication: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 1772-1791
Number of pages: 20
ISBN (Electronic): 9781954085022
Publication status: Published - 2021
Event: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Virtual, Online
Duration: 2021 Apr 19 - 2021 Apr 23

Publication series

Name: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021
City: Virtual, Online
Period: 2021 Apr 19 - 2021 Apr 23

ASJC Scopus subject areas

  • Software
  • Computational Theory and Mathematics
  • Linguistics and Language
