This paper addresses issues in annotating geographical entities in microblogs and reports preliminary results of our effort to annotate Japanese microblog texts. Unlike prior work, we annotate not only geographical location entities but also facility entities, such as stations, restaurants, and schools. We discuss (i) how to build a gazetteer of geographical entities with sufficiently broad coverage, (ii) what types of ambiguity need to be considered, (iii) why annotators tend to disagree, and (iv) what technical problems must be addressed to automate the annotation of geographical entities. All annotation data and the annotation guidelines are publicly available for research purposes from our web site.
- Corpus annotation
- Location reference expressions
- Natural language processing
ASJC Scopus subject areas
- Computer Science(all)