Some features speak loud, but together they all speak louder: A study on the correlation between classification error and feature usage in decision-tree classification ensembles

Bárbara Cervantes, Raúl Monroy, Miguel Angel Medina-Pérez, Miguel Gonzalez-Mendoza, Jose Ramirez-Marquez

    Research output: Contribution to journal › Article › peer-review

    17 Scopus citations

    Abstract

    While diversity has been argued to be the rationale for the success of an ensemble of classifiers, little has been said on how uniform use of the feature space influences classification error. Following an observation from a recent result, published elsewhere, we hypothesize that, among several ensembles of decision trees, those with a more uniform feature-use frequency also have a smaller classification error. This paper provides further support for this hypothesis. We have conducted experiments over 60 classification datasets, using 42 different types of decision tree ensembles, to test our hypothesis. Our results validate the hypothesis, prompting the design of ensemble construction methods that make a more uniform use of features, for classification problems of low and medium dimensionality.
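One way to make "uniform feature-use frequency" concrete is sketched below. This is an illustrative assumption, not the measure used in the paper: the abstract does not say how uniformity is quantified, so the sketch uses normalized Shannon entropy of the per-feature split counts. The names `feature_use_frequency`, `uniformity`, and the toy ensembles are hypothetical; each tree is represented simply as the list of feature indices tested at its split nodes.

```python
import math
from collections import Counter


def feature_use_frequency(ensemble, n_features):
    """Return, for each feature, the fraction of split nodes that test it.

    `ensemble` is a list of trees; each tree is a list of feature indices,
    one per split node (a hypothetical, simplified representation).
    """
    counts = Counter(f for tree in ensemble for f in tree)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("ensemble contains no split nodes")
    return [counts.get(j, 0) / total for j in range(n_features)]


def uniformity(freqs):
    """Normalized Shannon entropy in [0, 1]; 1 means perfectly uniform use.

    Assumption: entropy is only one possible uniformity score, chosen here
    for illustration; it is not taken from the paper.
    """
    if len(freqs) < 2:
        return 1.0
    h = -sum(p * math.log(p) for p in freqs if p > 0)
    return h / math.log(len(freqs))


# Toy ensembles over 3 features (made-up data for illustration only).
skewed = [[0, 0, 1], [0, 0, 0], [0, 2, 0]]      # feature 0 dominates
balanced = [[0, 1, 2], [2, 0, 1], [1, 2, 0]]    # every feature used equally

skewed_score = uniformity(feature_use_frequency(skewed, 3))
balanced_score = uniformity(feature_use_frequency(balanced, 3))
print(f"skewed: {skewed_score:.3f}, balanced: {balanced_score:.3f}")
```

Under the hypothesis stated in the abstract, one would expect ensembles with a higher score of this kind to tend toward lower classification error; testing that would require pairing such a score with measured error rates across many datasets, as the authors report doing.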

    Original language: English
    Pages (from-to): 270-282
    Number of pages: 13
    Journal: Engineering Applications of Artificial Intelligence
    Volume: 67
    State: Published - Jan 2018

    Keywords

    • Decision tree ensemble
