A First Approach to Deal with Imbalance in Multi-label Datasets

Author
Abstract
Learning from imbalanced datasets has been studied in depth for binary and multi-class classification. The problem also affects multi-label datasets; in fact, the imbalance level in multi-label datasets tends to be much larger than in binary or multi-class datasets. Nevertheless, proposals on how to measure and deal with imbalance in multi-label classification are scarce. In this paper, we introduce two measures aimed at obtaining information about the imbalance level in multi-label datasets. Furthermore, two preprocessing methods designed to reduce the imbalance level in multi-label datasets are proposed, and their effectiveness is validated experimentally. Finally, we provide an analysis for determining when these methods should be applied, depending on the dataset characteristics.
Year of Publication
2013
Date Published
9
Conference Location
Salamanca (Spain)
ISBN Number
978-3-642-40845-8
DOI
10.1007/978-3-642-40846-5_16
Pages
150-160
Notes

TIN2012-33856, TIN2011-28488, TIC-3928, P10-TIC-6858