Title: Smartdata: Data preprocessing to achieve smart data in R
Publication Type: Journal Article
Year of Publication: 2019
Authors: Cordon, I., Luengo Julián, García Salvador, Herrera F., and Charte Francisco
Date Published: 09/2019
Keywords: Data preprocessing, machine learning, Preprocessing, Smart data

As the amount of available data grows exponentially, data scientists are aware that finding the value in the data is key to exploiting it successfully. However, data rarely presents itself in an ordered, clean way. In contrast to raw data, the term smart data is becoming increasingly visible in both the specialized literature and industry. While publicly available software packages exist to deal with raw data, there is no unified framework that encompasses all the fields required to transform such raw data into smart data. In this paper, the novel smartdata package is introduced. Written in R and available from the CRAN repository, it includes the most recent and relevant algorithms for treating raw data from multiple perspectives, now unified under a simple yet powerful API that enables data scientists to easily pipeline their applications. The main features of the package, as well as some illustrative examples of its use, are detailed throughout this manuscript.


BigDaP-TOOLS - BBVA Foundation Grants for Scientific Research Teams 2016 (Ayudas Fundación BBVA a Equipos de Investigación Científica 2016)