Joint English spelling error correction and POS tagging for language learners writing

Keisuke Sakaguchi, Tomoya Mizumoto, Mamoru Komachi, Yuji Matsumoto

Research output: Contribution to conference, Paper, peer-review

3 Citations (Scopus)

Abstract

We propose an approach to correcting spelling errors and assigning part-of-speech (POS) tags simultaneously for sentences written by learners of English as a second language (ESL). ESL writing contains several types of errors, such as preposition, determiner, verb, noun, and spelling errors. Spelling errors often interfere with POS tagging and syntactic parsing, which makes other error detection and correction tasks very difficult. In studies of grammatical error detection and correction in ESL writing, spelling correction has been regarded as a preprocessing step in a pipeline. However, several types of spelling errors in ESL writing are difficult to correct in a preprocessing step, for example, homophones (e.g. *hear/here), confusion (*quiet/quite), split (*now a day/nowadays), merge (*swimingpool/swimming pool), inflection (*please/pleased) and derivation (*badly/bad), where the incorrect word is actually in the vocabulary and grammatical information is needed to disambiguate. To correct these spelling errors, as well as typical typographical errors (*begginning/beginning), we propose a joint analysis of POS tagging and spelling error correction with a CRF (Conditional Random Field)-based model. We present an approach that achieves significantly better accuracies for both POS tagging and spelling correction than existing approaches that use either individual or pipeline analysis. We also show that the joint model can deal with novel types of misspelling in ESL writing.
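To illustrate why joint analysis can help with real-word errors such as *here/hear, the following is a minimal sketch of joint decoding over (candidate word, POS tag) states. It is not the authors' model: the lexicon, confusion sets, tag-bigram scores, and change penalty are all hypothetical hand-set values, whereas a CRF would learn its feature weights from annotated data. The names `LEXICON`, `CONFUSION`, `TRANSITION`, `CHANGE_PENALTY`, `states`, and `joint_decode` are assumptions introduced for this example only.

```python
# Toy joint decoding of spelling candidates and POS tags with Viterbi search.
# All scores below are invented for illustration; nothing here comes from the paper.

# Hypothetical lexicon: word -> {tag: score for that word/tag pair}.
LEXICON = {
    "i": {"PRP": 2.0},
    "can": {"MD": 2.0, "NN": 0.5},
    "here": {"RB": 2.0},
    "hear": {"VB": 2.0},
    "you": {"PRP": 2.0},
}

# Hypothetical confusion sets: in-vocabulary words that learners mix up.
CONFUSION = {"here": ["hear"], "hear": ["here"]}

# Hypothetical tag-bigram scores; unlisted pairs score 0.0.
TRANSITION = {
    ("<s>", "PRP"): 1.0,
    ("PRP", "MD"): 1.5,
    ("MD", "VB"): 2.0,
    ("MD", "RB"): 0.2,
    ("VB", "PRP"): 1.5,
    ("RB", "PRP"): 0.3,
}

# Cost of replacing the word the learner actually wrote.
CHANGE_PENALTY = 1.0


def states(token):
    """Return (candidate word, tag, local score) triples for one input token."""
    out = []
    for cand in [token] + CONFUSION.get(token, []):
        penalty = 0.0 if cand == token else CHANGE_PENALTY
        for tag, score in LEXICON.get(cand, {}).items():
            out.append((cand, tag, score - penalty))
    if not out:
        # Unknown word: keep it unchanged with a placeholder tag.
        out.append((token, "UNK", 0.0))
    return out


def joint_decode(tokens):
    """Find the best-scoring sequence of (word, tag) pairs for the sentence."""
    # best maps the last (word, tag) state to (total score, path so far).
    best = {("<s>", "<s>"): (0.0, [])}
    for token in tokens:
        new_best = {}
        for word, tag, local in states(token.lower()):
            for (_, prev_tag), (prev_score, path) in best.items():
                score = prev_score + local + TRANSITION.get((prev_tag, tag), 0.0)
                key = (word, tag)
                if key not in new_best or score > new_best[key][0]:
                    new_best[key] = (score, path + [(word, tag)])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(joint_decode("I can here you".split()))
```

With these toy scores, the input "I can here you" decodes to i/PRP can/MD hear/VB you/PRP. A dictionary lookup run as a preprocessing step would leave "here" untouched because it is a valid word; in the joint search, the strong modal-to-verb transition outweighs the penalty for changing the word.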

Original language: English
Pages: 2357-2374
Number of pages: 18
Publication status: Published - 2012 Dec 1
Event: 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India
Duration: 2012 Dec 8 to 2012 Dec 15

Other

Other: 24th International Conference on Computational Linguistics, COLING 2012
Country: India
City: Mumbai
Period: 12/12/8 to 12/12/15

Keywords

  • Part-of-speech tagging
  • Spelling error correction

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language
