Reducing lexical features in parsing by word embeddings

Hiroya Komatsu, Zen Den, Naoaki Okazaki, Kentaro Inui

Research output: Contribution to conference › Paper

1 Citation (Scopus)

Abstract

The high dimensionality of lexical features in parsing can be memory-consuming and cause over-fitting problems. We propose a general framework that replaces all lexical feature templates with low-dimensional features induced from word embeddings. Applied to a near state-of-the-art dependency parser (Huang et al., 2012), our method improves on the baseline, performs better than using cluster bit-string features, and outperforms a recent neural-network-based parser. A further analysis shows that our framework has the effect hypothesized by Andreas and Klein (2014), namely (i) connecting unseen words to known ones, and (ii) encouraging common behaviors among in-vocabulary words.
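To make the contrast in the abstract concrete, the following is a minimal illustrative sketch (not the paper's actual implementation; all names and the toy vocabulary are hypothetical) of how a high-dimensional one-hot lexical feature can be replaced by a low-dimensional dense embedding feature, with a shared fallback vector that connects unseen words to known behavior:

```python
import numpy as np

# Toy setup: a tiny vocabulary and random "pre-trained" embeddings.
# In the paper's setting, embeddings would come from a real training corpus.
VOCAB = ["the", "cat", "sat"]
EMB_DIM = 4
rng = np.random.default_rng(0)
embeddings = {w: rng.standard_normal(EMB_DIM) for w in VOCAB}
unk_vector = np.zeros(EMB_DIM)  # shared back-off for out-of-vocabulary words

def lexical_one_hot(word):
    """High-dimensional indicator feature: one slot per vocabulary word,
    so the feature space grows with the vocabulary."""
    vec = np.zeros(len(VOCAB))
    if word in VOCAB:
        vec[VOCAB.index(word)] = 1.0
    return vec

def embedding_feature(word):
    """Low-dimensional replacement: a fixed-size dense vector; unseen
    words back off to a shared vector instead of firing no feature."""
    return embeddings.get(word, unk_vector)

print(lexical_one_hot("cat").shape)    # grows with |VOCAB|: (3,)
print(embedding_feature("cat").shape)  # fixed and small: (4,)
print(embedding_feature("dog"))        # unseen word -> shared back-off vector
```

The point of the sketch is only the dimensionality contrast: the indicator feature's size scales with the vocabulary, while the embedding feature stays at a fixed low dimension regardless of how many word types appear.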

Original language: English
Pages: 106-113
Number of pages: 8
Publication status: Published - 2015 Jan 1
Event: 29th Pacific Asia Conference on Language, Information and Computation, PACLIC 2015 - Shanghai, China
Duration: 2015 Oct 30 - 2015 Nov 1


ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Linguistics and Language


Cite this

    Komatsu, H., Den, Z., Okazaki, N., & Inui, K. (2015). Reducing lexical features in parsing by word embeddings. 106-113. Paper presented at 29th Pacific Asia Conference on Language, Information and Computation, PACLIC 2015, Shanghai, China.