Complementary information for the paper published in IEEE Trans. Fuzzy Systems

NMEEF-SD: Non-dominated Multiobjective Evolutionary Algorithm for Extracting Fuzzy Rules in Subgroup Discovery

This paper describes and analyzes a non-dominated multiobjective evolutionary algorithm for extracting fuzzy rules in subgroup discovery (NMEEF-SD). The algorithm, a hybridization of fuzzy logic and genetic algorithms, addresses subgroup-discovery problems in order to extract novel and interpretable fuzzy rules of interest. This evolutionary fuzzy system is based on the well-known Nondominated Sorting Genetic Algorithm II (NSGA-II) model but is oriented toward the subgroup-discovery task, using specific operators to promote the extraction of interpretable, high-quality subgroup-discovery rules. The proposal includes different mechanisms to improve diversity in the population and permits the use of different combinations of quality measures in the evolutionary process. An elaborate experimental study, reinforced by nonparametric tests, was performed to verify the validity of the proposal, which was compared with other subgroup-discovery methods. The results show that NMEEF-SD obtains the best results among the algorithms studied.
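
To make the NSGA-II basis more concrete, the following is a minimal sketch of Pareto dominance and non-dominated sorting, the ranking idea that NSGA-II builds on. It is not the authors' implementation: the function names, the assumption that every objective is maximized, and the example objective values are all hypothetical, and the simple repeated-scan procedure shown here stands in for the faster bookkeeping used in NSGA-II itself.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: all objectives are maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Split the indices of `points` into successive non-dominated fronts.

    Simple repeated-scan version: peel off the points that nobody
    remaining dominates, then repeat on what is left.
    """
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical rules scored on two objectives, e.g. (measure_1, measure_2).
rules = [(0.9, 0.10), (0.8, 0.20), (0.7, 0.15), (0.6, 0.05), (0.9, 0.20)]
fronts = non_dominated_sort(rules)
print(fronts)  # [[4], [0, 1], [2], [3]]
```

Rule 4 is at least as good as every other rule on both objectives, so it forms the first front alone; rules 0 and 1 trade one objective against the other and share the second front.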

IV. Experimental Study

In this experimental study, the aim was to analyze which combinations of quality measures used in the evolutionary process of NMEEF-SD offer better results and to compare the performance of the algorithm with other subgroup discovery (SD) algorithms, both evolutionary and non-evolutionary. Therefore, we first studied the behavior of the NMEEF-SD algorithm with respect to the use of different combinations of quality measures within the evolutionary process.
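
As an illustration of what a quality measure looks like in practice, the sketch below computes a few measures that are common in the subgroup-discovery literature (support, confidence, sensitivity, and unusualness, also known as weighted relative accuracy). Which measures the paper actually combines is not stated on this page, so this selection is an assumption. The sketch also uses crisp example counts for simplicity, whereas NMEEF-SD works with fuzzy rules; the function name, parameter names, and example counts are hypothetical.

```python
from typing import Dict


def quality_measures(n: int, n_cond: int, n_class: int, n_both: int) -> Dict[str, float]:
    """Compute common SD quality measures for one rule 'Cond -> Class'.

    n       : total number of examples
    n_cond  : examples covered by the rule antecedent
    n_class : examples belonging to the target class
    n_both  : examples covered by the antecedent AND in the target class
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Guard against rules that cover nothing or classes with no examples.
    confidence = n_both / n_cond if n_cond > 0 else 0.0
    sensitivity = n_both / n_class if n_class > 0 else 0.0
    coverage = n_cond / n
    # Unusualness (WRAcc): coverage times the gain in accuracy over the class prior.
    unusualness = coverage * (confidence - n_class / n) if n_cond > 0 else 0.0
    return {
        "support": n_both / n,
        "confidence": confidence,
        "sensitivity": sensitivity,
        "unusualness": unusualness,
    }


# Hypothetical rule: covers 20 of 100 examples, 15 of which are in a class of size 40.
measures = quality_measures(n=100, n_cond=20, n_class=40, n_both=15)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

For these counts the rule is correct on 75% of the examples it covers, against a class prior of 40%, which gives an unusualness of about 0.07. A multiobjective algorithm can treat two or three such measures as separate objectives instead of aggregating them into a single score.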

The best combination was then compared with other evolutionary and classical SD algorithms. The experimentation was undertaken with real datasets from the UCI repository. The properties of these datasets are presented in Table II: number of variables (nv), number of discrete variables (nvD), number of continuous variables (nvC), number of classes (nc), and number of examples (ns).

Table II. Properties of the datasets used from the UCI repository

Name           nv   nvD  nvC  nc   ns
Appendicitis    7    0    7    2    106
Australian     14    8    6    2    690
Balance         4    0    4    3    625
Breast-w        9    9    0    2    699
Bridges         7    4    3    2    102
Bupa            6    0    6    2    345
Car             6    6    0    4   1728
Chess          36   36    0    2   3196
Cleveland      13    0   13    5    303
Dermatology    33   33    0    6    366
Diabetes        8    0    8    2    768
Echo            6    1    5    2    131
German         20   13    7    2   1000
Glass           9    0    9    6    214
Haberman        3    0    3    2    306
Hayesroth       4    4    0    3    132
Heart          13    6    7    2    270
Hepatitis      19   13    6    2    155
Hypothyroid    25   18    7    2   3163
Ionosphere     34    0   34    2    351
Iris            4    0    4    3    150
Led             7    0    7   10    500
Lymp           18   18    0    4    148
Marketing      13   13    0   10   8993
Mushrooms      22   22    0    2   8124
Nursery         8    8    0    5  12960
Tic-tac-toe     9    9    0    2    958
Vehicle        18    0   18    4    846
Vote           16   16    0    2    435
Wine           13    0   13    3    178
IV.B. Quality measures analysis
The complete results table can be found below:
IV.C. Comparison of the existing evolutionary algorithms for subgroup discovery
The complete results table can be found below:
IV.D. Comparison of NMEEF-SD and the classical subgroup discovery algorithms
The complete results table can be found below:
Comparison of the results obtained with and without the re-initialisation based on coverage operator
The complete results table can be found below: