An Algorithm for the Optimization of Multiple Classifiers in Data Mining Based on Graphs

Andrei Kelarev1, Joe Ryan2, John Yearwood1
1 P.O. Box 663, Ballarat, Victoria 3353, Australia
2School of Electrical Engineering and Computer Science, University of Newcastle, Callaghan, NSW 2308, Australia

Abstract

This article develops an efficient combinatorial algorithm based on labeled directed graphs and motivated by applications in data mining for designing multiple classifiers. Our method originates from the standard approach described in [37]. It defines a representation of a multiclass classifier in terms of several binary classifiers. We are using labeled graphs to introduce additional structure on the classifier. Representations of this sort are known to have serious advantages. An important property of these representations is their ability to correct errors of individual binary classifiers and produce correct combined output. For every representation like this we develop a combinatorial algorithm with quadratic running time to compute the largest number of errors of individual binary classifiers which can be corrected by the combined multiple classifier. In addition, we consider the question of optimizing the classifiers of this type and find all optimal representations for these multiple classifiers.