Fast Newton-CG method for batch learning of conditional random fields

Yuta Tsuboi, Yuya Unno, Hisashi Kashima, Naoaki Okazaki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We propose a fast batch learning method for linear-chain Conditional Random Fields (CRFs) based on Newton-CG methods. Newton-CG methods are variants of Newton's method for high-dimensional problems. They require only Hessian-vector products rather than the full Hessian matrix. To speed up Newton-CG methods for CRF learning, we derive a novel dynamic programming procedure for the Hessian-vector products of the CRF objective function. The proposed procedure reuses the byproducts of the time-consuming gradient computation when computing the Hessian-vector products, which drastically reduces the total computation time of the Newton-CG methods. In experiments on natural language processing tasks, the proposed method outperforms a conventional quasi-Newton method. Remarkably, the proposed method is competitive with online learning algorithms, which are fast but unstable.
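The general idea of Newton-CG described in the abstract can be illustrated with a minimal sketch. It is not the paper's dynamic programming procedure for CRFs: as a stand-in assumption, the objective here is L2-regularized logistic regression, the function names (`grad_and_hvp`, `conjugate_gradient`, `newton_cg`) are hypothetical, and the Newton step is taken without a line search. What the sketch does show is how the Newton direction can be found from Hessian-vector products alone, and how a quantity cached during the gradient computation (here the sigmoid outputs) can be reused for those products.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def grad_and_hvp(w, X, y, lam):
    """Return the gradient at w and a closure computing Hessian-vector products.

    Toy objective (an assumption, not the CRF objective):
        sum_i log-loss(x_i, y_i; w) + 0.5 * lam * ||w||^2, with y_i in {0, 1}.
    """
    s = sigmoid(X @ w)            # byproduct of the gradient computation
    grad = X.T @ (s - y) + lam * w
    d = s * (1.0 - s)             # curvature weights, reused by every product

    def hvp(v):
        # H v = X^T diag(d) X v + lam v, without ever forming H
        return X.T @ (d * (X @ v)) + lam * v

    return grad, hvp


def conjugate_gradient(hvp, b, tol=1e-10, max_iter=100):
    """Approximately solve H x = b using only products hvp(v) = H v."""
    x = np.zeros_like(b)
    r = b.copy()                  # residual b - H x for the initial x = 0
    p = r.copy()                  # current search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Hp = hvp(p)
        alpha = rs_old / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


def newton_cg(X, y, lam=1.0, n_steps=20, tol=1e-8):
    """Plain Newton-CG iteration: solve H step = -grad by CG, then update w."""
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        grad, hvp = grad_and_hvp(w, X, y, lam)
        if np.linalg.norm(grad) < tol:
            break
        w = w + conjugate_gradient(hvp, -grad)
    return w


# Small synthetic problem (made-up data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (rng.random(50) < 0.5).astype(float)
w_star = newton_cg(X, y)
final_grad, _ = grad_and_hvp(w_star, X, y, lam=1.0)
print("gradient norm at solution:", np.linalg.norm(final_grad))
```

In this sketch each call to `hvp` costs two matrix-vector multiplications and never builds the full Hessian, which is the property that makes Newton-CG attractive when the number of parameters is large. A practical implementation would typically add a line search or trust region and stop the inner CG loop early.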

Original language: English
Title of host publication: AAAI-11 / IAAI-11 - Proceedings of the 25th AAAI Conference on Artificial Intelligence and the 23rd Innovative Applications of Artificial Intelligence Conference
Pages: 489-494
Number of pages: 6
Publication status: Published - 2011 Nov 2
Event: 25th AAAI Conference on Artificial Intelligence and the 23rd Innovative Applications of Artificial Intelligence Conference, AAAI-11 / IAAI-11 - San Francisco, CA, United States
Duration: 2011 Aug 7 → 2011 Aug 11

Publication series

Name: Proceedings of the National Conference on Artificial Intelligence
Volume: 1

Other

Other: 25th AAAI Conference on Artificial Intelligence and the 23rd Innovative Applications of Artificial Intelligence Conference, AAAI-11 / IAAI-11
Country: United States
City: San Francisco, CA
Period: 11/8/7 → 11/8/11

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
