TY - JOUR
T1 - Learning Query Patterns by Using Wikipedia Articles as Supervised Data to Retrieve Web Pages for Multi-document Summarization
AU - Tanaka, Shohei
AU - Okazaki, Naoaki
AU - Ishizuka, Mitsuru
N1 - Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
AB - This paper presents a novel method for acquiring a set of query patterns that can retrieve documents containing important information about an entity. Given an existing Wikipedia category that should contain the entity, we first extract all entities that are the subjects of the articles in the category. From these articles, we extract triplets of the form (subject-entity, query pattern, concept) that are expected to appear in the search results of the query patterns. We then select a small set of query patterns such that, when search queries are formulated with these patterns, the overall precision and coverage of the information returned from the Web are optimized. We model this optimization problem as a Weighted Maximum Satisfiability (Weighted Max-SAT) problem. Experimental results demonstrate that the proposed method outperforms methods based on statistical measures such as frequency and pointwise mutual information (PMI), which are widely used in relation extraction. Keywords: summarization, Weighted Max-SAT, Wikipedia, query pattern.
UR - http://www.scopus.com/inward/record.url?scp=85024746891&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85024746891&partnerID=8YFLogxK
U2 - 10.1527/tjsai.26.366
DO - 10.1527/tjsai.26.366
M3 - Article
AN - SCOPUS:85024746891
VL - 26
SP - 366
EP - 375
JO - Transactions of the Japanese Society for Artificial Intelligence
JF - Transactions of the Japanese Society for Artificial Intelligence
SN - 1346-0714
IS - 2
ER -