Algorithms for finding attribute value group for binary segmentation of categorical databases

Yasuhiko Morimoto, Takeshi Fukuda, Takeshi Tokuyama

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

We consider the problem of finding a set of attribute values that gives a high-quality binary segmentation of a database. The quality of a segmentation is defined by an objective function suited to the user's goal, such as "mean squared error," "mutual information," or "χ²," each of which is defined in terms of the distribution of a given target attribute. Our goal is to find value groups on a given conditional domain that split a database into two segments so as to optimize the value of the objective function. Though the problem is intractable for general objective functions, there are feasible algorithms for finding high-quality binary segmentations when the objective function is convex, and we prove that the typical criteria mentioned above are all convex. We propose two practical algorithms, based on computational geometry techniques, which find a much better value group than conventional heuristics do.
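To make the problem statement concrete, the following is a minimal sketch, not the algorithms proposed in the paper. It assumes a numeric target attribute and the mean squared error criterion, and compares an exhaustive search over all value groups with the classical shortcut for this special case: sort the attribute values by their target mean and scan only the prefixes of that order. The function names and the `toy_stats` data are hypothetical and exist only for illustration.

```python
from itertools import combinations


def between_group_ss(stats, group):
    """Between-segment sum of squares for the split `group` vs. the rest.

    `stats` maps each attribute value to (row count, sum of target values).
    Maximizing this quantity is equivalent to minimizing the squared error
    within the two segments.
    """
    n_in = sum(stats[v][0] for v in group)
    s_in = sum(stats[v][1] for v in group)
    n_all = sum(count for count, _ in stats.values())
    s_all = sum(total for _, total in stats.values())
    n_out, s_out = n_all - n_in, s_all - s_in
    if n_in == 0 or n_out == 0:
        return 0.0  # degenerate split: one segment is empty
    return s_in ** 2 / n_in + s_out ** 2 / n_out - s_all ** 2 / n_all


def best_group_exhaustive(stats):
    """Try every non-empty proper subset of values (exponential in #values)."""
    values = sorted(stats)
    best_score, best_group = float("-inf"), frozenset()
    for size in range(1, len(values)):
        for combo in combinations(values, size):
            score = between_group_ss(stats, combo)
            if score > best_score:
                best_score, best_group = score, frozenset(combo)
    return best_score, best_group


def best_group_sorted(stats):
    """Sort values by target mean, then scan only prefixes of that order."""
    order = sorted(stats, key=lambda v: stats[v][1] / stats[v][0])
    best_score, best_group = float("-inf"), frozenset()
    for k in range(1, len(order)):
        score = between_group_ss(stats, order[:k])
        if score > best_score:
            best_score, best_group = score, frozenset(order[:k])
    return best_score, best_group


# Hypothetical per-value statistics: value -> (row count, sum of target).
toy_stats = {
    "a": (10, 5.0),
    "b": (8, 40.0),
    "c": (12, 14.0),
    "d": (5, 24.0),
    "e": (7, 7.0),
}

if __name__ == "__main__":
    print(best_group_exhaustive(toy_stats))
    print(best_group_sorted(toy_stats))
```

Both searches should report the same score, with groups that are either identical or complements of each other, since a group and its complement describe the same segmentation. The sorted scan covers only this single-mean special case; it is shown here to illustrate why exhaustive search over value groups becomes impractical as the number of attribute values grows.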

Original language: English
Pages (from-to): 1269-1279
Number of pages: 11
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 14
Issue number: 6
Publication status: Published - Nov 2002

Keywords

  • Binary segmentation
  • Categorical test
  • Data mining
  • Data reduction
  • Decision tree
  • Value groups

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
