Resources available

Cometa, the comprehensive multi-label data archive, is a collection of multi-label datasets, available in different file formats and pre-partitioned following several strategies. The dozens of available datasets can be used with tools such as MULAN, MEKA, KEEL and LibSVM, and with the utiml, mldr and mldr.datasets R packages. A detailed description of each dataset, including attributes, labels, multi-label metrics and label relationship plots, is also provided. Users can easily sort the datasets by several of these metrics and choose those best suited to their needs.
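As an illustration of the kind of multi-label metrics mentioned in the dataset descriptions, two widely used ones are label cardinality (the average number of active labels per instance) and label density (cardinality divided by the total number of labels). The following is a minimal Python sketch on a small made-up label matrix; `toy_labels` and the function names are hypothetical and are not taken from Cometa or from any of the tools listed.

```python
from typing import List


def label_cardinality(labels: List[List[int]]) -> float:
    """Average number of active (1) labels per instance."""
    if not labels:
        raise ValueError("labels must contain at least one instance")
    return sum(sum(row) for row in labels) / len(labels)


def label_density(labels: List[List[int]]) -> float:
    """Label cardinality normalised by the total number of labels."""
    n_labels = len(labels[0])
    if n_labels == 0:
        raise ValueError("each instance needs at least one label column")
    return label_cardinality(labels) / n_labels


# Hypothetical binary label matrix: 4 instances, 3 labels.
toy_labels = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]

card = label_cardinality(toy_labels)
dens = label_density(toy_labels)
print(f"cardinality={card:.2f} density={dens:.3f}")
```

Sorting datasets by measures like these is one way to pick, for example, datasets with few labels per instance.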

A framework for easily running Emerging Pattern Mining (EPM) algorithms. EPM is a data mining task that describes a set of data using supervised learning. With this framework you can run the most important EPM algorithms in the literature, with the purpose of discovering emerging trends in timestamped data or interesting differences between multiple variables and classes. Access this framework.
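To make the idea of an emerging pattern concrete: a pattern is commonly called emerging when its support grows markedly from one dataset (or class) to another, which is usually measured with the growth rate, the ratio of the two supports. The sketch below is only an illustration of that definition on made-up transactions; the names `support`, `growth_rate`, `background` and `target` are hypothetical and do not reflect the framework's own API or algorithms.

```python
import math
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(pattern: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item of the pattern."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if pattern <= t)
    return hits / len(transactions)


def growth_rate(pattern: FrozenSet[str],
                background: List[Transaction],
                target: List[Transaction]) -> float:
    """Ratio of the pattern's support in target to its support in background."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_tg == 0:
        return 0.0          # pattern absent from the target set
    if s_bg == 0:
        return math.inf     # pattern appears only in the target set
    return s_tg / s_bg


# Hypothetical toy data: two small sets of transactions.
background = [frozenset("ab"), frozenset("a"), frozenset("bc"), frozenset("c")]
target = [frozenset("ab"), frozenset("abc"), frozenset("ab"), frozenset("c")]

gr_ab = growth_rate(frozenset("ab"), background, target)
gr_ac = growth_rate(frozenset("ac"), background, target)
print(gr_ab, gr_ac)
```

A pattern would then be reported as emerging when its growth rate exceeds a user-chosen threshold.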

SIMiDat dmServer is a single web-based data mining application that runs a wide range of computational intelligence algorithms without the need to install any software on your computer. The website incorporates different algorithms to preprocess and analyse data easily through a GUI. Visit dmServer.

KEEL (Knowledge Extraction based on Evolutionary Learning) is an open source (GPLv3) Java software tool that can be used for a large number of different knowledge data discovery tasks. KEEL provides a simple GUI based on data flow to design experiments with different datasets and computational intelligence algorithms (paying special attention to evolutionary algorithms) in order to assess the behavior of the algorithms. It allows a complete analysis of new computational intelligence proposals in comparison with existing ones. Moreover, KEEL has been designed with a two-fold goal: research and education. Visit KEEL.



R packages available at CRAN


mldr: Exploratory Data Analysis and Manipulation of Multi-Label Data Sets

Exploratory data analysis and manipulation functions for multi-label data sets along with an interactive Shiny application to ease their use.

SDEFSR: Subgroup Discovery with Evolutionary Fuzzy Systems in R

Implementation of evolutionary fuzzy systems for the data mining task called "subgroup discovery". It also provides a Shiny app to make the analysis easier. The algorithms work with data sets provided in KEEL, ARFF and CSV format and also with data.frame objects.

mldr.datasets: R Ultimate Multilabel Dataset Repository

Large collection of multilabel datasets along with the functions needed to export them to several formats, to make partitions, and to obtain bibliographic information.


ruta: Implementation of Unsupervised Neural Architectures

Implementation of several unsupervised neural networks, from building their architecture to their training and evaluation. Available networks are auto-encoders including their main variants: sparse, contractive, denoising, robust and variational, as described in Charte et al. (2018) doi:10.1016/j.inffus.2017.12.007.

smartdata: Data Preprocessing

Eases data preprocessing tasks, providing a data flow based on a pipe operator for cleansing, transformation, oversampling, or instance/feature selection operations.

predtoolsTS: Time Series Prediction Tools

Makes time series prediction easier by automating the process through four main functions: prep(), modl(), pred() and postp(). Features different preprocessing methods to homogenize variance and to remove trend and seasonality. It can also bring together different predictive models for comparison. Features ARIMA and Data Mining Regression models (using caret).
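As a rough illustration of what removing trend and seasonality can mean in practice, one common generic technique is differencing: subtracting the value one season earlier removes a fixed seasonal pattern, and subtracting the previous value removes a linear trend. The Python sketch below shows this on a made-up series; it is an assumption chosen for illustration and is not the implementation of prep() or of any other predtoolsTS function.

```python
from typing import List


def difference(series: List[float], lag: int = 1) -> List[float]:
    """Return series[i] - series[i - lag] for every valid position i."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical toy series: linear trend (+1 per step) plus a period-4 season.
seasonal = [0.0, 2.0, -1.0, 1.0]
series = [float(t) + seasonal[t % 4] for t in range(12)]

# Seasonal differencing (lag 4) cancels the repeating pattern and leaves
# a constant level that comes from the trend.
deseasonalized = difference(series, lag=4)

# First differencing (lag 1) then removes that remaining constant level.
detrended = difference(deseasonalized, lag=1)

print(deseasonalized)
print(detrended)
```

Each differencing step shortens the series by its lag, so the forecasts of a model fitted on the differenced data have to be transformed back afterwards.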


Clustering: Techniques for Evaluating Clustering

The design of this package allows us to run different clustering packages and compare their results, to determine which algorithm behaves best on the data provided.