Cometa, the comprehensive multi-label data archive, is a collection of multi-label datasets available at cometa.ujaen.es in different file formats and pre-partitioned following several strategies. The dozens of available datasets can be used with tools such as MULAN, MEKA, KEEL, LibSVM and the utiml, mldr and mldr.datasets R packages. A detailed description of each dataset, including attributes, labels, multi-label metrics and label relationship plots, is also provided. Users can easily sort the datasets by several of these metrics and choose the ones best suited to their needs.
A framework for easily executing Emerging Pattern Mining (EPM) algorithms. EPM is a data mining task that describes a set of data using supervised learning. With this framework you can run the most important EPM algorithms in the literature, with the purpose of discovering emerging trends in timestamped data or interesting differences between multiple variables and classes. Access this framework.
SIMiDat dmServer is a web-based data mining application that runs a wide range of computational intelligence algorithms without requiring you to install any software on your computer. The website incorporates different algorithms to preprocess and analyse data easily through a GUI. Visit dmServer.
KEEL (Knowledge Extraction based on Evolutionary Learning) is an open source (GPLv3) Java software tool that can be used for a large number of different knowledge data discovery tasks. KEEL provides a simple GUI based on data flow to design experiments with different datasets and computational intelligence algorithms (paying special attention to evolutionary algorithms) in order to assess the behavior of the algorithms. It allows a complete analysis of new computational intelligence proposals in comparison to existing ones. Moreover, KEEL has been designed with a two-fold goal: research and education. Visit KEEL.
R package available at CRAN
Functions intended to work with the API of the Spanish Government
A parser and a writer for 'WEKA' Attribute-Relation File Format
Allows forecasting time series using nearest neighbors regression. When the forecasting horizon is greater than 1, two multi-step-ahead forecasting strategies can be used. The model built is autoregressive, that is, it is based only on the observations of the time series. The nearest neighbors used in a prediction can be inspected and plotted.
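As a rough illustration of autoregressive nearest neighbors forecasting, the Python sketch below pairs every window of past values with the value that follows it, finds the windows closest to the most recent one, and averages their successors. It is not the package's code: the function name knn_forecast and its parameters are hypothetical, and the recursive multi-step strategy shown is an assumption, since the description above does not name the two strategies.

```python
from math import sqrt


def knn_forecast(series, lags, k, horizon):
    """Forecast `horizon` values with k-nearest-neighbors regression.

    Hypothetical helper for illustration only; not the R package's API.
    """
    values = list(series)
    if len(values) <= lags:
        raise ValueError("series must be longer than the number of lags")

    # Training examples: each window of `lags` consecutive observations is
    # paired with the observation that immediately follows it.
    examples = [
        (values[i:i + lags], values[i + lags])
        for i in range(len(values) - lags)
    ]

    def distance(a, b):
        # Euclidean distance between two windows of equal length.
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    window = values[-lags:]  # the most recent observations form the query
    forecasts = []
    for _ in range(horizon):
        # The k windows closest to the query are the nearest neighbors.
        neighbors = sorted(examples, key=lambda ex: distance(ex[0], window))[:k]
        # The prediction is the mean of the neighbors' successor values.
        prediction = sum(target for _, target in neighbors) / len(neighbors)
        forecasts.append(prediction)
        # Recursive strategy (assumed): slide the query window forward,
        # treating the prediction as if it had been observed.
        window = window[1:] + [prediction]
    return forecasts
```

For example, knn_forecast(monthly_values, lags=12, k=3, horizon=6) would produce six forecasts, each one feeding the next; monthly_values stands for any list of numbers with more than 12 entries. Because the training examples are built only from the observed series, the neighbors behind each prediction remain real historical windows that can be listed or plotted.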
Exploratory data analysis and manipulation functions for multi-label data sets along with an interactive Shiny application to ease their use.
Implementation of evolutionary fuzzy systems for the data mining task called "subgroup discovery". It also provides a Shiny app to make the analysis easier. The algorithms work with data sets provided in KEEL, ARFF and CSV format and also with data.frame objects.
Large collection of multilabel datasets along with the functions needed to export them to several formats, to make partitions, and to obtain bibliographic information.
Implementation of several unsupervised neural networks, from building their architecture to their training and evaluation. Available networks are auto-encoders including their main variants: sparse, contractive, denoising, robust and variational, as described in Charte et al. (2018) doi:10.1016/j.inffus.2017.12.007.
Eases data preprocessing tasks, providing a data flow based on a pipe operator which simplifies cleansing, transformation, oversampling, or instance/feature selection operations.
Makes time series prediction easier by automating the process with four main functions: prep(), modl(), pred() and postp(). Features different preprocessing methods to homogenize variance and to remove trend and seasonality. Can also bring together different predictive models for comparison. Features ARIMA and data mining regression models (using caret).