Experience mining: Building a large-scale database of personal experiences and opinions from web documents

Kentaro Inui, Shuya Abe, Kazuo Hara, Hiraku Morita, Chitose Sao, Megumi Eguchi, Asuka Sumida, Koji Murakami, Suguru Matsuyoshi

Research output: Conference contribution

45 citations (Scopus)

Abstract

This paper proposes a new UGC-oriented language technology application, which we call experience mining. Experience mining aims to automatically collect instances of personal experiences and opinions from the rapidly growing volume of user-generated content (UGC), such as weblog and forum posts, and to store them in an experience database with semantically rich indices. After discussing the technical issues raised by this new task, we focus on its central problem, factuality analysis, and propose both a task definition and a machine learning-based solution. Our empirical evaluation indicates that the factuality analysis task is sufficiently well defined to achieve high inter-annotator agreement and that our Factorial CRF-based model considerably outperforms the baseline. We also present an application system, which currently stores over 50M experience instances extracted from 150M Japanese blog posts with semantic indices and is scheduled to start serving as an experience search engine for unrestricted users in October.

Original language: English
Title of host publication: Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Pages: 314-321
Number of pages: 8
Publication status: Published - 2008
Externally published: Yes
Event: 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008 - Sydney, NSW, Australia
Duration: 9 Dec 2008 to 12 Dec 2008

Publication series

Name: Proceedings - 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008

Other

Other: 2008 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2008
Country/Territory: Australia
City: Sydney, NSW
Period: 9 Dec 2008 to 12 Dec 2008

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Electrical and Electronic Engineering
