Inducción de Árboles de Decisión basada en un Índice de Validación de Cluster (Inducing Decision Trees based on a Cluster Quality Index)

Octavio Loyola-González (octavioloyola@bioplantas.cu)¹, Miguel Angel Medina-Pérez (miguel.medina.perez@gmail.com)², Milton García-Borroto (mgarciab@ceis.cujae.edu.cu)³

¹ Centro de Bioplantas
² Instituto Tecnológico y de Estudios Superiores de Monterrey (ITESM-CEM)
³ Instituto Superior Politécnico “José Antonio Echeverría”

This paper appears in: Revista IEEE América Latina

Publication Date: April 2015
Volume: 13, Issue: 4
ISSN: 1548-0992


Abstract:
Decision trees are popular classifiers in data mining, artificial intelligence, and pattern recognition because they are accurate and easy to comprehend. In this paper, we introduce a new procedure for inducing decision trees that aims to produce trees that are more accurate, more compact, and more balanced. Each candidate split is evaluated using the Rand statistic, a cluster validation index based on external criteria, which many authors consider the best existing index of this kind. We compared our method with other state-of-the-art methods, and the results over 30 databases from the UCI Repository support these claims. We also introduce a new equation to measure the balance of a binary tree.
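
As an illustration of how a candidate split can be scored with the Rand statistic, the following minimal Python sketch treats the two branches of a binary split as one partition of the objects and the class labels as the reference partition, then measures the fraction of object pairs on which the two partitions agree. The abstract does not specify how the paper maps a split to a partition, so this mapping, the threshold-based numeric split, and the names `rand_statistic`, `score_split`, and `best_threshold` are all assumptions made for the example, not the authors' implementation.

```python
from itertools import combinations


def rand_statistic(labels_a, labels_b):
    """Fraction of object pairs on which two partitions agree.

    A pair agrees when both partitions put the two objects together,
    or both partitions put them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    if len(labels_a) < 2:
        raise ValueError("at least two objects are needed to form a pair")

    agreements = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        together_a = labels_a[i] == labels_a[j]
        together_b = labels_b[i] == labels_b[j]
        if together_a == together_b:
            agreements += 1
        total += 1
    return agreements / total


def score_split(values, classes, threshold):
    """Score the hypothetical binary split `value <= threshold`.

    Assumption: the two branches are treated as two clusters and are
    compared against the class labels.
    """
    branches = [v <= threshold for v in values]
    return rand_statistic(branches, classes)


def best_threshold(values, classes):
    """Return the midpoint threshold with the highest Rand statistic.

    Requires at least two distinct feature values; otherwise there is
    no candidate split and max() raises ValueError.
    """
    distinct = sorted(set(values))
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: score_split(values, classes, t))


if __name__ == "__main__":
    feature = [1.0, 2.0, 3.0, 4.0]
    labels = ["a", "a", "b", "b"]
    print(best_threshold(feature, labels))
```

The pairwise loop is quadratic in the number of objects, which keeps the sketch easy to read; a contingency-table formulation would give the same value more efficiently on large nodes.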

Index Terms:
supervised classification, decision trees, validation indexes, Rand statistic, gain ratio, Gini index

