TY - GEN
T1 - Approximation of optimal two-dimensional association rules for categorical attributes using semidefinite programming
AU - Fujisawa, Katsuki
AU - Hamuro, Yukinobu
AU - Katoh, Naoki
AU - Tokuyama, Takeshi
AU - Yada, Katsutoshi
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1999.
PY - 1999
Y1 - 1999
N2 - We consider the problem of finding two-dimensional association rules for categorical attributes. Suppose we have two conditional attributes A and B both of whose domains are categorical, and one binary target attribute whose domain is {“positive”, “negative”}. We want to split the Cartesian product of domains of A and B into two subsets so that a certain objective function is optimized, i.e., we want to find a good segmentation of the domains of A and B. We consider in this paper the objective function that maximizes the confidence under the constraint of the upper bound of the support size. We first prove that the problem is NP-hard, and then propose an approximation algorithm based on semidefinite programming. In order to evaluate the effectiveness and efficiency of the proposed algorithm, we carry out computational experiments for problem instances generated by real sales data consisting of attributes whose domain size is a few hundred at maximum. Approximation ratios of the solutions obtained, measured by comparing solutions for semidefinite programming relaxation, range from 76% to 95%. It is observed that the performance of generated association rules is significantly superior to that of one-dimensional rules.
AB - We consider the problem of finding two-dimensional association rules for categorical attributes. Suppose we have two conditional attributes A and B both of whose domains are categorical, and one binary target attribute whose domain is {“positive”, “negative”}. We want to split the Cartesian product of domains of A and B into two subsets so that a certain objective function is optimized, i.e., we want to find a good segmentation of the domains of A and B. We consider in this paper the objective function that maximizes the confidence under the constraint of the upper bound of the support size. We first prove that the problem is NP-hard, and then propose an approximation algorithm based on semidefinite programming. In order to evaluate the effectiveness and efficiency of the proposed algorithm, we carry out computational experiments for problem instances generated by real sales data consisting of attributes whose domain size is a few hundred at maximum. Approximation ratios of the solutions obtained, measured by comparing solutions for semidefinite programming relaxation, range from 76% to 95%. It is observed that the performance of generated association rules is significantly superior to that of one-dimensional rules.
UR - http://www.scopus.com/inward/record.url?scp=73349106769&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=73349106769&partnerID=8YFLogxK
U2 - 10.1007/3-540-46846-3_14
DO - 10.1007/3-540-46846-3_14
M3 - Conference contribution
AN - SCOPUS:73349106769
SN - 354066713X
SN - 9783540667131
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 148
EP - 159
BT - Discovery Science - 2nd International Conference, DS 1999, Proceedings
A2 - Arikawa, Setsuo
A2 - Furukawa, Koichi
PB - Springer Verlag
T2 - 2nd International Conference on Discovery Science, DS 1999
Y2 - 6 December 1999 through 8 December 1999
ER -