|Concurrence among Imbalanced Labels and Its Influence on Multilabel Resampling Algorithms
|Year of Publication
|Charte, Francisco, Rivera-Rivas A.J., del Jesus M. J., and Herrera F.
|9th International Conference on Hybrid Artificial Intelligent Systems (HAIS 2014)
In the context of multilabel classification, learning from imbalanced data has received considerable attention recently. Several algorithms to tackle this problem have been proposed in the last five years, as well as various measures to assess the imbalance level. Some of the proposed methods are based on resampling techniques, a well-known approach whose utility in traditional classification has been proven. This paper aims to describe how a specific characteristic of multilabel datasets (MLDs), the level of concurrence among imbalanced labels, could have a great impact on the behavior of resampling algorithms. To this end, a measure named SCUMBLE, designed to evaluate this concurrence level, is proposed and its usefulness is experimentally tested. As a result, a straightforward guideline on the effectiveness of multilabel resampling algorithms depending on MLD characteristics can be inferred.