Implementation and Evaluation of Decision Trees with Range and Region Splitting

Yasuhiko Morimoto, Takeshi Fukuda, Shinichi Morishita, Takeshi Tokuyama

Research output: Article (peer-reviewed)

21 citations (Scopus)


We propose an extension of an entropy-based heuristic for constructing a decision tree from a large database with many numeric attributes. Conventional methods for handling numeric attributes are inefficient when any of those attributes are strongly correlated. Our approach offers one solution to this problem. For each pair of strongly correlated numeric attributes, we compute a two-dimensional association rule with respect to these attributes and the objective attribute of the decision tree. In particular, we consider a family ℛ of grid-regions in the plane associated with the pair of attributes. For R ∈ ℛ, the data can be split into two classes: data inside R and data outside R. We compute the region Ropt ∈ ℛ that minimizes the entropy of the splitting, and add the splitting associated with Ropt (for each pair of strongly correlated attributes) to the set of candidate tests in an entropy-based heuristic. We give efficient algorithms for cases in which ℛ is the family of (1) x-monotone connected regions, (2) based-monotone regions, (3) rectangles, and (4) rectilinear convex regions. The algorithms have been implemented as a subsystem of SONAR (System for Optimized Numeric Association Rules), developed by the authors. We have confirmed that the optimal region can be computed efficiently. Diverse experiments also show that our approach can create compact trees whose accuracy is comparable with or better than that of conventional trees. More importantly, it can capture non-linear correlations among numeric attributes that could not be found without region splitting.
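
The region-splitting criterion in the abstract can be illustrated with a minimal sketch for case (3), rectangles. It is not the authors' algorithm: it assumes a binary objective attribute, assumes the plane has already been bucketed into a grid, and searches all rectangles exhaustively using 2-D prefix sums. The names `pos_grid`, `neg_grid`, `split_entropy`, and `best_rectangle` are hypothetical and chosen only for this illustration.

```python
import math


def entropy(pos, neg):
    """Entropy in bits of a group holding pos positive and neg negative tuples."""
    total = pos + neg
    if pos == 0 or neg == 0:  # empty or pure group
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def split_entropy(in_pos, in_neg, out_pos, out_neg):
    """Size-weighted entropy of the split 'inside R' versus 'outside R'."""
    n_in = in_pos + in_neg
    n_out = out_pos + out_neg
    n = n_in + n_out
    if n == 0:
        return 0.0
    return (n_in / n) * entropy(in_pos, in_neg) + (n_out / n) * entropy(out_pos, out_neg)


def prefix_sums(grid):
    """2-D prefix sums: s[i][j] is the sum of grid[0..i-1][0..j-1]."""
    rows, cols = len(grid), len(grid[0])
    s = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            s[i + 1][j + 1] = grid[i][j] + s[i][j + 1] + s[i + 1][j] - s[i][j]
    return s


def rect_sum(s, r1, c1, r2, c2):
    """Sum over the rectangle with inclusive corners (r1, c1) and (r2, c2)."""
    return s[r2 + 1][c2 + 1] - s[r1][c2 + 1] - s[r2 + 1][c1] + s[r1][c1]


def best_rectangle(pos_grid, neg_grid):
    """Brute-force search for the rectangle that minimizes the split entropy.

    pos_grid[i][j] and neg_grid[i][j] hold the number of positive and negative
    tuples that fall into grid bucket (i, j). Returns (entropy, (r1, c1, r2, c2)).
    """
    rows, cols = len(pos_grid), len(pos_grid[0])
    sp, sn = prefix_sums(pos_grid), prefix_sums(neg_grid)
    total_pos, total_neg = sp[rows][cols], sn[rows][cols]
    best_h, best_rect = math.inf, None
    for r1 in range(rows):
        for r2 in range(r1, rows):
            for c1 in range(cols):
                for c2 in range(c1, cols):
                    in_pos = rect_sum(sp, r1, c1, r2, c2)
                    in_neg = rect_sum(sn, r1, c1, r2, c2)
                    h = split_entropy(in_pos, in_neg,
                                      total_pos - in_pos, total_neg - in_neg)
                    if h < best_h:
                        best_h, best_rect = h, (r1, c1, r2, c2)
    return best_h, best_rect
```

On a grid with r rows and c columns this sketch examines on the order of r²c² rectangles, which is workable only for coarse grids. The other region families named in the abstract (x-monotone, based-monotone, rectilinear convex) are too numerous to enumerate this way and are not covered here.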

Publication status: Published - 1997

ASJC Scopus subject areas

  • Software
  • Discrete Mathematics and Combinatorics
  • Computational Theory and Mathematics
  • Artificial Intelligence

