Abstract—Decision trees are among the most widely used classification models. They owe their popularity to their efficiency in both the learning and the classification phases and, above all, to the high interpretability of the learned classifiers. The latter aspect is of primary importance in domains where understanding and validating the decision process is as important as the accuracy of the prediction. Pruning is a common technique for reducing the size of decision trees, which improves their interpretability and may reduce the risk of overfitting. In the present work, we investigate the integration of evolutionary algorithms and decision tree pruning, and present a decision tree post-pruning strategy based on the well-known multi-objective evolutionary algorithm NSGA-II. Our approach is compared with the default pruning strategies of the decision tree learners C4.5 (J48, on which the proposed method is based) and C5.0. We empirically show that evolutionary algorithms can be profitably applied to the classical problem of decision tree pruning: the proposed strategy generates a more varied set of solutions than both J48 and C5.0. Moreover, the trees produced by our method tend to be smaller than the best candidates produced by the classical tree learners, while preserving most of their accuracy and sometimes improving it.
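To illustrate what a "set of solutions" means in a multi-objective setting, the following minimal sketch filters candidate pruned trees down to the non-dominated ones, the core notion that NSGA-II's ranking builds on. It is not the authors' implementation: the two objectives (tree size to be minimized, accuracy to be maximized) are an assumption drawn from the abstract, and the names `Candidate`, `dominates`, `pareto_front`, and all numeric values are hypothetical.

```python
from typing import List, Tuple

# Assumption: each candidate pruned tree is summarized by two objectives,
# (number_of_nodes, accuracy). Size is minimized, accuracy is maximized.
Candidate = Tuple[int, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one of them."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical (size, accuracy) pairs, for illustration only.
candidates = [(41, 0.86), (25, 0.85), (25, 0.80), (9, 0.78), (13, 0.77)]
front = pareto_front(candidates)
print(front)
```

Here the tree with 25 nodes and accuracy 0.80 is discarded because another tree of the same size is more accurate, and the 13-node tree is discarded because a smaller tree is also more accurate. The remaining trees represent different trade-offs between size and accuracy, from which a user can choose.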
Index Terms—Data mining, decision trees, evolutionary computation, pruning methodologies.
A. Brunello and A. Montanari are with the University of Udine, Udine, Italy (e-mail: email@example.com, firstname.lastname@example.org). E. Marzano is with Gap SRL Company, Udine, Italy (e-mail: email@example.com).
G. Sciavicco is with the University of Ferrara, Ferrara, Italy (e-mail: firstname.lastname@example.org).
Cite: Andrea Brunello, Enrico Marzano, Angelo Montanari, and Guido Sciavicco, "Decision Tree Pruning via Multi-Objective Evolutionary Computation," International Journal of Machine Learning and Computing, vol. 7, no. 6, pp. 167-175, 2017.