Experience mining: Building a large-scale database of personal experiences and opinions from web documents

Kentaro Inui, Shuya Abe, Kazuo Hara, Hiraku Morita, Chitose Sao, Megumi Eguchi, Asuka Sumida, Koji Murakami, Suguru Matsuyoshi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

45 Citations (Scopus)

Abstract

This paper proposes a new UGC-oriented language technology application, which we call experience mining. Experience mining aims to automatically collect instances of personal experiences and opinions from the rapidly growing volume of user-generated content (UGC), such as weblog and forum posts, and to store them in an experience database with semantically rich indices. After discussing the technical issues raised by this new task, we focus on its central problem, factuality analysis, and propose both a task definition and a machine learning-based solution. Our empirical evaluation indicates that the factuality analysis task is sufficiently well-defined to achieve high inter-annotator agreement and that our Factorial CRF-based model considerably outperforms the baseline. We also present an application system, which currently stores over 50M experience instances extracted from 150M Japanese blog posts with semantic indices and is scheduled to start serving as an experience search engine for unrestricted users in October.

Original language: English
Title of host publication: Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Pages: 314-321
Number of pages: 8
Publication status: Published - 2008 Dec 1
Externally published: Yes
Event: 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008 - Sydney, NSW, Australia
Duration: 2008 Dec 9 - 2008 Dec 12

Publication series

Name: Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008

Other

Other: 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Country/Territory: Australia
City: Sydney, NSW
Period: 08/12/9 - 08/12/12

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Electrical and Electronic Engineering
