|Title||A First Approach to Deal with Imbalance in Multi-label Datasets|
|Publication Type||Conference Paper|
|Year of Publication||2013|
|Authors||Charte, Francisco, Rivera, Antonio J., del Jesus, M. J., and Herrera, F.|
|Conference Name||8th International Conference on Hybrid Artificial Intelligent Systems (HAIS 2013)|
|Conference Location||Salamanca (Spain)|
The process of learning from imbalanced datasets has been studied in depth for binary and multi-class classification. This problem also affects multi-label datasets. In fact, the imbalance level in multi-label datasets tends to be much larger than in binary or multi-class datasets. Nevertheless, proposals on how to measure and deal with imbalance in multi-label classification are scarce. In this paper, we introduce two measures aimed at obtaining information about the imbalance level in multi-label datasets. Furthermore, two preprocessing methods designed to reduce the imbalance level in multi-label datasets are proposed, and their effectiveness is validated experimentally. Finally, an analysis for determining when these methods should be applied, depending on the dataset characteristics, is provided.