LI-MLC: A Label Inference Methodology for Addressing High Dimensionality in the Label Space for Multilabel Classification

Author	Francisco Charte Ojeda, Antonio Jesús Rivera Rivas, Maria José del Jesus Díaz, Francisco Herrera Triguero
Abstract	Multilabel classification (MLC) has generated considerable research interest in recent years, as a technique that can be applied to many real-world scenarios. To process multilabel data sets (MLDs) with binary or multiclass classifiers, transformation methods have been proposed, as well as adapted algorithms able to work with this type of data set. However, until now, few studies have addressed the problem of how to deal with MLDs having a large number of labels. This characteristic can be defined as high dimensionality in the label space (output attributes), in contrast to the traditional high dimensionality problem, which usually concerns the feature space (addressed by feature selection) or the sample space (addressed by instance selection). The purpose of this paper is to analyze dimensionality in the label space in MLDs, and to present a transformation methodology based on the use of association rules to discover label dependencies. These dependencies are used to reduce the label space, to ease the work of any MLC algorithm, and to infer the deleted labels in a final postprocessing stage. The proposed process is validated through extensive experimentation with several MLDs and classification algorithms, resulting in a statistically significant improvement of performance in some cases, as will be shown.
Year of Publication	2014
Journal	IEEE Transactions on Neural Networks and Learning Systems
Volume	25
Start Page	1842
Pages	1842-1854
DOI	10.1109/TNNLS.2013.2296501
Notes	TIN2012-33856,TIN2011-28488,TIC-3928,P10-TIC-6858
Bibliography media	Document 2014-TNNLS-LI-MLC.pdf
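
The abstract outlines a three-step idea: discover label dependencies with association rules, drop the labels that can be inferred, and restore them after classification. The following is a minimal sketch of that idea only, not the authors' implementation. The single-antecedent rule form, the `min_support` and `min_confidence` thresholds, the greedy rule selection, the function names, and the toy data are all assumptions made for illustration; the classifier trained on the reduced label space is omitted.

```python
from itertools import permutations


def mine_rules(label_sets, min_support=0.3, min_confidence=0.9):
    """Find single-label rules 'antecedent -> consequent' (assumed rule form).

    Returns a dict mapping each removable consequent label to its antecedent.
    """
    n = len(label_sets)
    labels = sorted(set().union(*label_sets))
    count = {l: sum(l in s for s in label_sets) for l in labels}

    candidates = []
    for a, b in permutations(labels, 2):
        both = sum(a in s and b in s for s in label_sets)
        support = both / n
        confidence = both / count[a]
        if support >= min_support and confidence >= min_confidence:
            candidates.append((confidence, support, a, b))

    # Greedy selection (an assumption): strongest rules first, and never
    # remove a label that another accepted rule needs as its antecedent.
    candidates.sort(key=lambda c: (-c[0], -c[1], c[2], c[3]))
    rules, antecedents = {}, set()
    for _, _, a, b in candidates:
        if a in rules or b in rules or b in antecedents:
            continue
        rules[b] = a
        antecedents.add(a)
    return rules


def reduce_labels(label_sets, rules):
    """Drop every label that appears as the consequent of a rule."""
    removed = set(rules)
    return [s - removed for s in label_sets]


def infer_labels(predicted_sets, rules):
    """Postprocessing: add a removed label back when its antecedent is predicted."""
    restored = []
    for s in predicted_sets:
        full = set(s)
        for consequent, antecedent in rules.items():
            if antecedent in s:
                full.add(consequent)
        restored.append(full)
    return restored


# Toy label sets (hypothetical data, one set of labels per instance).
train = [
    {"beach", "sea"},
    {"beach", "sea", "sunset"},
    {"sea"},
    {"beach", "sea"},
    {"mountain"},
    {"mountain", "sunset"},
]

rules = mine_rules(train)                 # only 'beach -> sea' passes both thresholds
reduced = reduce_labels(train, rules)     # 'sea' is removed from every instance
predictions = [{"beach"}, {"mountain"}]   # stand-in for a classifier's output
final = infer_labels(predictions, rules)  # 'sea' is restored where 'beach' was predicted
```

In this toy run the instance labeled only `sea` ends up with an empty label set after reduction, which illustrates the trade-off of the approach: a smaller label space for the classifier in exchange for relying on the rules to recover the removed labels.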
