Resampling Multilabel Datasets by Decoupling Highly Imbalanced Labels

Title: Resampling Multilabel Datasets by Decoupling Highly Imbalanced Labels
Publication Type: Conference Paper
Year of Publication: 2015
Authors: Charte, Francisco, Rivera Antonio J., del Jesus M. J., and Herrera F.
Conference Name: 10th International Conference on Hybrid Artificial Intelligent Systems, HAIS 2015
Date Published: 6
Conference Location: Bilbao (Spain)
ISBN Number: 978-3-319-19643-5

Multilabel classification is a task that has been broadly studied in recent years. However, how to learn from imbalanced multilabel datasets (MLDs) has only recently been addressed. In this regard, a few proposals can be found in the literature, most of them based on resampling techniques adapted from the traditional classification field. The success of these methods varies considerably depending on the traits of the chosen MLDs. One characteristic that significantly influences the behavior of multilabel resampling algorithms is the joint appearance of minority and majority labels in the same instances. It has been shown that MLDs with a high level of concurrence among imbalanced labels can hardly benefit from resampling methods. This paper proposes an original resampling algorithm, called REMEDIAL, which is based neither on removing majority instances nor on creating minority ones, but on a procedure to decouple highly imbalanced labels. As will be experimentally demonstrated, this is an interesting approach for certain MLDs.
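
The decoupling idea can be illustrated with a minimal sketch. It is not the REMEDIAL algorithm as published, since the abstract does not give its details. The sketch assumes a label counts as a minority label when its imbalance ratio (count of the most frequent label divided by its own count) exceeds the mean ratio, and it splits every instance that mixes minority and majority labels, whereas the paper may select instances by a different criterion. The names `imbalance_ratios` and `decouple_labels` are hypothetical.

```python
def imbalance_ratios(Y):
    """Per-label ratio: count of the most frequent label / count of this label.

    Y is a list of binary label rows. Assumption: every label appears
    at least once, otherwise the ratio is undefined.
    """
    n_labels = len(Y[0])
    counts = [sum(row[j] for row in Y) for j in range(n_labels)]
    if min(counts) == 0:
        raise ValueError("every label must appear in at least one instance")
    top = max(counts)
    return [top / c for c in counts]


def decouple_labels(X, Y):
    """Split instances in which minority and majority labels appear together.

    Each such instance is replaced by two copies with identical features:
    one keeps only its majority labels, the other only its minority labels.
    No instance is removed and no synthetic feature vector is created.
    """
    ratios = imbalance_ratios(Y)
    mean_ir = sum(ratios) / len(ratios)
    # Assumed criterion: a label is "minority" if its ratio exceeds the mean.
    minority = [r > mean_ir for r in ratios]

    new_X, new_Y = [], []
    for features, labels in zip(X, Y):
        has_min = any(l and m for l, m in zip(labels, minority))
        has_maj = any(l and not m for l, m in zip(labels, minority))
        if has_min and has_maj:
            # Copy that keeps only the majority labels.
            new_X.append(list(features))
            new_Y.append([0 if m else l for l, m in zip(labels, minority)])
            # Copy that keeps only the minority labels.
            new_X.append(list(features))
            new_Y.append([l if m else 0 for l, m in zip(labels, minority)])
        else:
            # Instance has no minority/majority concurrence: keep it unchanged.
            new_X.append(list(features))
            new_Y.append(list(labels))
    return new_X, new_Y
```

In this sketch the total number of occurrences of each label is unchanged; only the co-occurrence of rare and frequent labels within a single instance is removed, at the cost of duplicated feature vectors.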