ClEnDAE: A classifier based on ensembles with built-in dimensionality reduction through denoising autoencoders

Title: ClEnDAE: A classifier based on ensembles with built-in dimensionality reduction through denoising autoencoders
Publication Type: Journal Article
Year of Publication: 2021
Authors: Francisco J. Pulgar, Francisco Charte, A. J. Rivera-Rivas, and M. J. del Jesus
Journal: Information Sciences
Volume: 565
Pagination: 146-176
Keywords: Classification, Deep learning, Denoising autoencoders, Dimensionality reduction, Ensembles, Feature fusion
Abstract

High dimensionality affects most classification algorithms: the predictive performance of many traditional classifiers decreases considerably as the number of features increases, and numerous proposals try to mitigate this effect. This study proposes ClEnDAE, a new ensemble-based classifier whose components incorporate denoising autoencoders (DAEs) to reduce the dimensionality of the input space. On the one hand, the ensemble improves predictive performance by combining several components that work jointly. On the other hand, the DAEs generate a new, higher-level and smaller feature space, which reduces the effects of high dimensionality. An experimental study evaluates the behavior of ClEnDAE in two parts. The first compares ClEnDAE with a model based on a basic DAE and with the original, untreated data. The second compares ClEnDAE with traditional dimensionality reduction methods to determine the improvement achieved by the proposed algorithm. In both parts, the results show that ClEnDAE offers better predictive performance than the other models analyzed. The main advantage of ClEnDAE is that it combines the ensemble-based methodology, in which several components work in parallel, with DAEs, which generate new low-dimensional features that provide more relevant information. As a result, its classification performance is better than that of other classic proposals.
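
The general idea in the abstract (several components, each reducing dimensionality with a DAE before classifying, combined into an ensemble) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture, which the abstract does not specify. Everything below is an assumption made for illustration: the one-hidden-layer DAE with Gaussian corruption, the bootstrap sampling, the nearest-centroid classifier, the majority vote, the synthetic data, and all names (`train_dae`, `Component`, `DAEEnsemble`, `make_blobs`).

```python
import numpy as np


def train_dae(X, n_hidden, noise_std=0.1, lr=0.01, epochs=300, seed=0):
    """Train a one-hidden-layer denoising autoencoder by gradient descent.

    The input is corrupted with Gaussian noise and the network is asked to
    reconstruct the clean input. Returns the encoder as a function.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        H = np.tanh(X_noisy @ W1 + b1)                     # encode
        X_hat = H @ W2 + b2                                # linear decode
        err = (X_hat - X) / n        # gradient of the halved MSE w.r.t. X_hat
        dH = (err @ W2.T) * (1.0 - H ** 2)                 # backprop through tanh
        W2 -= lr * (H.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X_noisy.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return lambda Z: np.tanh(Z @ W1 + b1)


class Component:
    """One ensemble member: a DAE encoder plus a nearest-centroid classifier."""

    def __init__(self, n_hidden, seed):
        self.n_hidden, self.seed = n_hidden, seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        boot = rng.integers(0, len(X), len(X))  # bootstrap rows for the DAE only
        self.encode = train_dae(X[boot], self.n_hidden, seed=self.seed)
        Z = self.encode(X)                      # reduced feature space
        self.classes = np.unique(y)
        self.centroids = np.stack([Z[y == c].mean(axis=0) for c in self.classes])
        return self

    def predict(self, X):
        Z = self.encode(X)
        dist = ((Z[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return self.classes[dist.argmin(axis=1)]


class DAEEnsemble:
    """Majority vote over components. Assumes integer labels 0..K-1."""

    def __init__(self, n_components=5, n_hidden=3, seed=0):
        self.members = [Component(n_hidden, seed + i) for i in range(n_components)]

    def fit(self, X, y):
        self.n_classes = int(np.max(y)) + 1
        for member in self.members:
            member.fit(X, y)
        return self

    def predict(self, X):
        counts = np.zeros((self.n_classes, X.shape[0]), dtype=int)
        cols = np.arange(X.shape[0])
        for member in self.members:
            counts[member.predict(X), cols] += 1  # one vote per sample
        return counts.argmax(axis=0)


def make_blobs(n_per_class, n_features, rng):
    """Synthetic two-class data: Gaussian blobs centred at -0.5 and +0.5."""
    X = np.vstack([rng.normal(-0.5, 0.5, (n_per_class, n_features)),
                   rng.normal(+0.5, 0.5, (n_per_class, n_features))])
    y = np.repeat([0, 1], n_per_class)
    return X, y


rng = np.random.default_rng(42)
X_train, y_train = make_blobs(100, 20, rng)
X_test, y_test = make_blobs(50, 20, rng)

model = DAEEnsemble(n_components=5, n_hidden=3).fit(X_train, y_train)
y_pred = model.predict(X_test)
acc = float((y_pred == y_test).mean())
print(f"test accuracy on synthetic data: {acc:.2f}")
```

Each component here classifies in a 3-dimensional encoded space instead of the original 20 features. The synthetic blobs are deliberately easy to separate, so the printed accuracy says nothing about the results reported in the paper.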

Notes

TIN2015-68454-R; PID2019-107793GB-I00 / AEI /10.13039/501100011033

DOI: 10.1016/j.ins.2021.02.060