Learning abbreviations from Chinese and English terms by modeling non-local information

Xu Sun, Naoaki Okazaki, Jun'ichi Tsujii, Houfeng Wang

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

The present article describes a robust approach for abbreviating terms. First, in order to incorporate non-local information into abbreviation generation tasks, we present an implicit solution and an explicit solution: a latent variable model and label encoding with global information, respectively. Although the two approaches compete with one another, we find that they are also highly complementary. We propose a combination of the two approaches and show that the proposed method outperforms all existing methods on abbreviation generation datasets. To reduce the computational complexity of learning non-local information, we further present an online training method, which can reach the objective optimum with accelerated training speed. We used a Chinese newswire dataset and an English biomedical dataset for experiments. Experiments revealed that the proposed abbreviation generator with non-local information achieved the best results for both Chinese and English.

Original language: English
Article number: 5
Journal: ACM Transactions on Asian Language Information Processing
Volume: 12
Issue number: 2
Publication status: Published - 2013

Keywords

  • Abbreviation processing
  • Machine learning
  • Non-local information
  • Stochastic learning

ASJC Scopus subject areas

  • Computer Science (all)
