In a well-known paper, Holte (Machine Learning, 1993, Vol. 11, pp. 65-91) argued that "very simple classification rules perform well on most commonly used data sets". Holte proposed an algorithm named 1R, which selects the most informative attribute in a data set and builds a one-level decision tree from it. In that paper, 1R was tested on a benchmark of sixteen UCI data sets and compared with Quinlan's C4 (Quinlan, 1986). C4 builds a decision tree that can rely on many, potentially all, of the attributes, while 1R is restricted to one, so C4 typically generates more complex decision trees than 1R. One could therefore expect C4 to be significantly more accurate than 1R. Holte argued that this was not the case. He showed that the average accuracy of 1R over the entire benchmark was 80.2%, compared with 85.9% for C4: a difference of just 5.7 percentage points, or 7.1% relative to 1R's accuracy.
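To make the idea of a one-level rule concrete, here is a minimal Python sketch in the spirit of 1R. It is a simplification and not Holte's published procedure: it assumes nominal attributes only, picks the attribute whose rule makes the fewest training errors, and ignores the discretization of numeric attributes and the handling of missing values. The names `one_r` and `predict` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Simplified 1R-style learner (assumption: nominal attributes only).

    Returns (attribute_index, rule, default_class), where rule maps each
    observed attribute value to the majority class seen with that value.
    """
    default = Counter(labels).most_common(1)[0][0]  # overall majority class
    best = None
    for a in range(len(rows[0])):
        # Count class frequencies for every value of attribute a.
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[a]][y] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Training errors: cases outside the majority class of their value.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[0]:
            best = (errors, a, rule)
    _, attr, rule = best
    return attr, rule, default

def predict(model, row):
    """Classify one row; unseen attribute values fall back to the default."""
    attr, rule, default = model
    return rule.get(row[attr], default)

# Hypothetical toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
        ("rainy", "mild"), ("overcast", "hot"), ("overcast", "mild")]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

model = one_r(rows, labels)
print(model[0], predict(model, ("sunny", "hot")))
```

On this toy data the first attribute separates the classes without error, so the whole classifier reduces to a lookup on a single attribute, which is exactly the restriction that distinguishes 1R from C4.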
|
Several arguments have been raised over the years against Holte's conclusions. This talk presents a new one. It will be shown that when random successes are taken into account in the very same experiment, entirely different results (and conclusions) emerge. The experiment to be presented and discussed demonstrates yet again that accuracy is the wrong yardstick for measuring and comparing classifiers' performance.
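One common way to take random successes into account is Cohen's Kappa, which rescales accuracy by the agreement expected from chance alone. The sketch below is an illustration only: it assumes Kappa is the correction in question, and both confusion matrices are hypothetical, not taken from Holte's experiment. It shows two classifiers with identical accuracy but very different chance-corrected scores.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of cases of true class i predicted as class j.
    Kappa is undefined when chance agreement equals 1; not handled here.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_tot = [sum(confusion[i]) for i in range(k)]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected if predictions were independent of the true class.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical data: 90 negative and 10 positive cases.
always_majority = [[90, 0], [10, 0]]  # predicts "negative" for everything
informed = [[85, 5], [5, 5]]          # finds half of the positives

acc_a, kappa_a = accuracy_and_kappa(always_majority)
acc_b, kappa_b = accuracy_and_kappa(informed)
print(acc_a, kappa_a)
print(acc_b, kappa_b)
```

Both classifiers are 90% accurate, yet the first earns all of its successes from the class distribution and gets a Kappa of 0, while the second gets a Kappa of about 0.44.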
|
A brief introduction to both AUC (area under the ROC curve) and Cohen's Kappa will be given, followed by a presentation and an open discussion of the findings.
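As a pointer to what AUC measures, the following sketch uses its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as half. The function name `auc` and the scores are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than efficiency.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are 1 (positive) or 0 (negative); both classes must be present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A positive "wins" a pair if it is scored above the negative.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for six cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]

print(auc(scores, labels))
print(auc([0.5] * 6, labels))  # uninformative scorer
```

A scorer that gives every case the same score obtains an AUC of 0.5 regardless of the class distribution, which is why AUC, like Kappa, does not reward successes that come from chance alone.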
|
Contact information:
Daniel Potthoff, Wilco van den Heuvel
|