R Ultimate Multilabel Dataset Repository

Title: R Ultimate Multilabel Dataset Repository
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Charte, Francisco; Charte, David; Rivera-Rivas, A.J.; del Jesus, M. J.; and Herrera, F.
Conference Name: 11th International Conference on Hybrid Artificial Intelligent Systems, HAIS 2016
Date Published: 4
Conference Location: Seville (Spain)
ISBN Number: 978-3-319-32033-5

Multilabeled data is everywhere on the Internet. From news on digital media and blog entries to videos hosted on YouTube, objects are usually tagged with a set of labels so that they can be categorized into several non-exclusive groups. However, publicly available multilabel datasets (MLDs) are not so common. Only a handful of websites provide a few of them, and they use disparate file formats. Finding suitable MLDs, converting them into the required format, and locating the appropriate bibliographic data to cite them are some of the difficulties usually faced by researchers and practitioners. This paper introduces RUMDR (R Ultimate Multilabel Dataset Repository), a new multilabel dataset repository that aims to bring together all public MLDs, along with mldr.datasets, an R package that eases the process of retrieving MLDs and their bibliographic information, exporting them to the desired file formats, and partitioning them.