Abstract
In machine learning, the imbalanced classification problem arises when the classes in the data are unevenly distributed. Strategies based on boosting ensemble algorithms, which combine weak learners into a single strong learner, have shown improved results on this problem. In particular, decision trees are often used as base learners in ensemble learning for classification or regression. However, boosting ensemble algorithms sometimes generate a large number of decision trees, and the resulting model can grow too large to be understandable and interpretable. Additionally, the use of weights adds further complexity to the final result. For this reason, in this paper we present RuleCOSI, a novel method for combining and simplifying the output of an ensemble of binary decision trees into a single set of production rules. The proposed method takes into account the weight of each decision tree and, using a combination matrix, generates a single set of simplified production rules with performance comparable to that of the original boosting ensemble. To measure the performance and demonstrate the applicability of the proposed method, we carried out an empirical validation using three different boosting algorithms on several well-known machine learning datasets as well as real-life data collected from a manufacturing company. The results are acceptable in most of the experiments: the method reduces the complexity of the boosting ensemble output while maintaining similar performance.
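The general idea of turning weighted decision trees into weighted production rules can be sketched as follows. This is a minimal illustration only, not the RuleCOSI algorithm: the abstract does not describe the combination matrix, so the sketch simply reads one rule off each root-to-leaf path, drops redundant conditions, and lets the rules vote with their tree's weight. All class names, function names, features, thresholds, and weights below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# A condition is (feature, operator, threshold); operator is "<=" or ">".
Condition = Tuple[str, str, float]


@dataclass
class Leaf:
    label: int


@dataclass
class Node:
    feature: str
    threshold: float
    left: "Tree"   # branch taken when value <= threshold
    right: "Tree"  # branch taken when value > threshold


Tree = Union[Leaf, Node]


@dataclass
class Rule:
    conditions: List[Condition]
    label: int
    weight: float  # weight of the tree the rule came from


def simplify(conditions) -> List[Condition]:
    """Keep only the tightest bound per (feature, operator) pair."""
    bounds: Dict[Tuple[str, str], float] = {}
    for feat, op, thr in conditions:
        key = (feat, op)
        if key not in bounds:
            bounds[key] = thr
        elif op == "<=":
            bounds[key] = min(bounds[key], thr)
        else:
            bounds[key] = max(bounds[key], thr)
    return [(feat, op, thr) for (feat, op), thr in bounds.items()]


def tree_to_rules(tree: Tree, weight: float, prefix=()) -> List[Rule]:
    """Emit one rule per root-to-leaf path of a binary decision tree."""
    if isinstance(tree, Leaf):
        return [Rule(simplify(prefix), tree.label, weight)]
    go_left = prefix + ((tree.feature, "<=", tree.threshold),)
    go_right = prefix + ((tree.feature, ">", tree.threshold),)
    return (tree_to_rules(tree.left, weight, go_left)
            + tree_to_rules(tree.right, weight, go_right))


def covers(rule: Rule, x: Dict[str, float]) -> bool:
    return all(x[f] <= t if op == "<=" else x[f] > t
               for f, op, t in rule.conditions)


def predict(rules: List[Rule], x: Dict[str, float]) -> int:
    """Weighted vote over every rule that covers the sample."""
    votes: Dict[int, float] = {}
    for rule in rules:
        if covers(rule, x):
            votes[rule.label] = votes.get(rule.label, 0.0) + rule.weight
    return max(votes, key=votes.get)


# Two hypothetical boosted trees with made-up weights.
tree1 = Node("x", 5.0, Leaf(0), Node("x", 8.0, Leaf(1), Leaf(0)))
tree2 = Node("y", 2.0, Leaf(0), Leaf(1))
rules = tree_to_rules(tree1, 0.7) + tree_to_rules(tree2, 0.4)
```

Calling `predict(rules, {"x": 6, "y": 1})` lets the rule from the heavier tree win the vote. A method such as the one described in the abstract would go further and merge the per-tree rule sets into a single simplified rule set rather than keeping them side by side.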
| Original language | English |
|---|---|
| Pages (from-to) | 64-82 |
| Number of pages | 19 |
| Journal | Expert Systems with Applications |
| Volume | 126 |
| DOIs | |
| State | Published - 15 Jul 2019 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 9: Industry, Innovation, and Infrastructure
Keywords
- Boosting
- Decision trees
- Ensemble learning
- Imbalanced classification
- Rule extraction
Fingerprint
Dive into the research topics of 'RuleCOSI: Combination and simplification of production rules from boosted decision trees for imbalanced classification'. Together they form a unique fingerprint.