The impact of parameter optimization of ensemble learning on defect prediction
SM ISO690:2012
MARUF OZTURK, Muhammed. The impact of parameter optimization of ensemble learning on defect prediction. In: Computer Science Journal of Moldova. 2019, nr. 1(79), pp. 85-128. ISSN 1561-4042.
Computer Science Journal of Moldova
Issue 1(79) / 2019 / ISSN 1561-4042


UDC: 004.4
Pages 85-128

Maruf Ozturk Muhammed
Department of Computer Engineering Faculty of Engineering Isparta
Available in IBN: 30 May 2019


Machine learning algorithms have configurable parameters that practitioners generally leave at their default settings. Tuning the parameters of a machine learning algorithm is called hyperparameter optimization (HO); it is performed to find the most suitable parameter setting in classification experiments. Existing studies propose using either the default classification model or an optimal parameter configuration. This work investigates the effects of applying HO to ensemble learning algorithms in terms of defect prediction performance. Further, this paper presents a new ensemble learning algorithm, called novelEnsemble, for defect prediction data sets. The method has been tested on 27 data sets and compared with three alternatives. Welch’s Heteroscedastic F Test is used to examine the differences between performance parameters. To assess the magnitude of the differences, Cliff’s Delta is applied to the results of the compared algorithms. According to the results of the experiment: 1) ensemble methods featuring HO perform better than a single predictor; 2) although the error of triTraining decreases linearly, it remains at an unacceptable level; 3) novelEnsemble yields promising results, especially in terms of area under the curve (AUC) and Matthews Correlation Coefficient (MCC); 4) the effect of HO is not constant but depends on the scale of the data set; 5) not every ensemble learning approach benefits from HO. To demonstrate the importance of the hyperparameter selection process, the experiment is validated with suitable statistical analyses. The study revealed that the success of HO, contrary to expectations, depends not on the type of the classifiers but rather on the design of the ensemble learners.

Keywords: defect prediction, parameter optimization, ensemble learning.
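
The abstract names two measures whose textbook definitions are short enough to illustrate: Cliff's Delta (an effect size over all pairs of observations from two samples) and the Matthews Correlation Coefficient (computed from a binary confusion matrix). The following is a minimal sketch of those standard definitions only; it is not the paper's code. The function names `cliffs_delta` and `mcc` and the sample scores are hypothetical and chosen for illustration.

```python
import math


def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y - #pairs with x < y) / (n * m).

    Ranges from -1 (every x is below every y) to +1 (every x is above every y).
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention
    (an assumption here, not something stated in the source).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical AUC scores of two predictors on a few data sets.
tuned = [0.81, 0.79, 0.85]
default = [0.70, 0.74]
effect = cliffs_delta(tuned, default)
```

Here `effect` measures how consistently the first sample's scores exceed the second's. An MCC of 0 corresponds to chance-level prediction, and 1 to perfect agreement with the true labels.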