|A First Approach to Deal with Imbalance in Multi-label Datasets
|Year of Publication
|Charte, Francisco, Rivera-Rivas A.J., del Jesus M. J., and Herrera F.
|8th International Conference on Hybrid Artificial Intelligent Systems (HAIS 2013)
Learning from imbalanced datasets has been studied in depth for binary and multi-class classification. The problem also affects multi-label datasets; in fact, the imbalance level in multi-label datasets tends to be much larger than in binary or multi-class datasets. Nevertheless, proposals on how to measure and deal with imbalance in multi-label classification are scarce. In this paper, we introduce two measures aimed at obtaining information about the imbalance level in multi-label datasets. Furthermore, two preprocessing methods designed to reduce the imbalance level in multi-label datasets are proposed, and their effectiveness is validated experimentally. Finally, an analysis is provided for determining when these methods should be applied, depending on the dataset characteristics.
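The abstract does not define the two measures, so the following is only a minimal sketch of how imbalance in a multi-label dataset might be quantified. It assumes a per-label imbalance ratio (the count of the most frequent label divided by the count of each label) and the mean of those ratios across labels. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def label_counts(label_sets):
    """Count how many instances carry each label."""
    counts = Counter()
    for labels in label_sets:
        # set() guards against a label being listed twice for one instance.
        counts.update(set(labels))
    return counts


def imbalance_ratios(label_sets):
    """Assumed per-label measure: majority label count / this label's count.

    The most frequent label gets 1.0; rarer labels get larger values.
    """
    counts = label_counts(label_sets)
    if not counts:
        raise ValueError("the dataset contains no labels")
    majority = max(counts.values())
    return {label: majority / n for label, n in counts.items()}


def mean_imbalance_ratio(label_sets):
    """Assumed dataset-level measure: average of the per-label ratios."""
    ratios = imbalance_ratios(label_sets)
    return sum(ratios.values()) / len(ratios)


# Hypothetical toy dataset: each instance is the set of labels it carries.
toy = [{"a", "b"}, {"a"}, {"a", "c"}, {"a", "b"}]

ratios = imbalance_ratios(toy)
mean_ratio = mean_imbalance_ratio(toy)
print(ratios)
print(mean_ratio)
```

In the toy data, label "a" appears in all four instances, "b" in two, and "c" in one, so the rarer a label is, the higher its ratio. A resampling method of the kind the abstract mentions could use such values to decide which labels are under-represented, though the exact criteria used in the paper are not stated here.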