Understanding the role of diversity in ensemble-based AutoML methods for classification tasks
Date
2025-04-17
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
Ensemble-based Automated Machine Learning (AutoML) methods have gained prominence for their ability to combine diverse machine learning models and achieve superior generalization performance. Despite their empirical success, the mechanisms driving this performance, particularly the role of model diversity, are not yet adequately understood. This study applies recently proposed theoretical frameworks on the role of diversity in ensembles to shed light on this issue. We focus on AutoML methods for classification tasks and use AUTO-SKLEARN, a widely used ensemble-based AutoML method, as a basis. More specifically, we examine how individual model diversity and performance evolve across the four key phases of AUTO-SKLEARN (base-learners, meta-learning, Bayesian Optimization (BO), and Caruana Ensemble). We also examine how they contribute to the diversity and performance of the final ensemble produced by the AutoML method. Using datasets from the AutoML benchmark, we empirically validate these insights by analyzing error rates and diversity measures across these phases. Our findings highlight the trade-off between individual model accuracy and ensemble diversity, showing that phases like BO improve the mean error rate of classifiers by nearly 50% but reduce their mean diversity by 20%. However, the Caruana phase increases diversity by 50% compared to the BO phase, allowing better generalization despite the higher mean error rate of the selected individual models (48% higher than BO). This work provides theoretical and empirical evidence that diversity is critical to the success of ensemble-based AutoML methods, along with a deeper understanding of diversity's impact on generalization performance and the role of the different AutoML phases. These findings can help advance the development of more robust and theoretically grounded AutoML frameworks.
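The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the diversity measure used here (mean pairwise disagreement), the majority-vote combiner, the toy labels, and all function names are assumptions chosen only to show how a set of less accurate but more diverse classifiers can yield a lower ensemble error than a set of more accurate but identical ones.

```python
from collections import Counter
from itertools import combinations


def error_rate(pred, target):
    """Fraction of positions where two label sequences differ."""
    return sum(p != t for p, t in zip(pred, target)) / len(target)


def mean_error(preds, y):
    """Average individual error rate of a set of classifiers."""
    return sum(error_rate(p, y) for p in preds) / len(preds)


def mean_disagreement(preds):
    """Assumed diversity proxy: average pairwise disagreement rate."""
    pairs = list(combinations(preds, 2))
    return sum(error_rate(a, b) for a, b in pairs) / len(pairs)


def majority_vote(preds):
    """Combine predictions by taking the most common label per sample."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*preds)]


# Hypothetical toy labels and predictions (not data from the study).
y = [0, 1, 0, 1, 0, 1]

# Three classifiers that each err on a different pair of samples.
diverse = [
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 1, 0],
]

# Three copies of one more accurate classifier (no diversity).
identical = [[0, 1, 0, 1, 0, 0]] * 3

for name, preds in (("diverse", diverse), ("identical", identical)):
    ensemble_error = error_rate(majority_vote(preds), y)
    print(name, mean_error(preds, y), mean_disagreement(preds), ensemble_error)
```

In this toy setting the diverse set has a mean individual error of 1/3 but an ensemble error of 0, while the identical set has a mean individual error of 1/6 and the same ensemble error of 1/6, because identical models cannot correct each other's mistakes.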
Keywords
AUTO-SKLEARN
Automated machine learning (AutoML)
Bayesian optimization (BO)
Diversity
Ensemble learning
Citation
Osei, S., Masegosa, A. R., & Masegosa, A. D. (2025). Understanding the role of diversity in ensemble-based AutoML methods for classification tasks. IEEE Access, 13, 63566-63586. https://doi.org/10.1109/ACCESS.2025.3554093