Preprocessing vague imbalanced datasets and its use in genetic fuzzy classifiers

Author	Ana Palacios; L. Sánchez; I. Couso
Keywords	Classification algorithms; Context; data handling; Euclidean distance; fuzzy set theory; Fuzzy systems; genetic algorithms; genetic fuzzy classifier; genetic fuzzy system; Genetics; imbalanced dataset preprocessing; minimum error based classification system; Nearest neighbor searches; objective function; pattern classification; Pediatrics; Training
Abstract	When there is a substantial difference between the number of cases in the majority and minority classes, minimum error-based classification systems tend to overlook the minority-class instances. This can be corrected either by preprocessing the dataset or by altering the objective function of the classifier. In this paper we analyze the first approach in the context of genetic fuzzy systems (GFS), and in particular of those that can operate with imprecisely observed, low-quality data. We analyze the different preprocessing mechanisms for imbalanced datasets and show that they must be extended to solve problems where the data are both imprecise and imbalanced. In addition, we include a comprehensive description of a new algorithm that is able to preprocess imprecise imbalanced datasets. Several real-world datasets are used to evaluate the proposal.
Year of Publication	2010
Date Published	July
DOI	10.1109/FUZZY.2010.5584797
Number of Pages	1-8

