Obtaining fuzzy rules from interval-censored data with genetic algorithms and a random sets-based semantic of the linguistic labels

Abstract  

Fuzzy memberships can be understood as coverage functions of random sets. This interpretation makes sense in the context of
fuzzy rule learning: a random-sets-based semantic of the linguistic labels is compatible with the use of fuzzy statistics
for obtaining knowledge bases from data. In particular, in this paper we formulate the learning of a fuzzy-rule-based classifier
as a problem of statistical inference. We propose to learn rules by maximizing the likelihood of the classifier. Furthermore,
we have extended this methodology to interval-censored data, and propose to use upper and lower bounds of the likelihood to
evolve rule bases. Combining descent algorithms and a co-evolutionary scheme, we are able to obtain rule-based classifiers
from imprecise data sets, and can also identify the conflictive instances in the training set: those that contribute the most
to the indetermination of the likelihood of the model.
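
The opening claim of this abstract, that a fuzzy membership can be read as the coverage function of a random set, can be checked numerically. The sketch below is only an illustration under stated assumptions, not the authors' method: it assumes a triangular membership and builds the random set as the alpha-cut at a uniformly drawn level. The names `triangular` and `coverage_estimate` are hypothetical.

```python
import random


def triangular(x, a, b, c):
    """Triangular membership with support (a, c) and peak at b (assumes a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def coverage_estimate(x, membership, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(x belongs to the random set).

    The random set is the alpha-cut {y : membership(y) >= alpha},
    with alpha drawn uniformly from (0, 1].
    """
    rng = random.Random(seed)
    mu_x = membership(x)
    hits = 0
    for _ in range(n_draws):
        alpha = 1.0 - rng.random()  # uniform on (0, 1]
        if mu_x >= alpha:
            hits += 1
    return hits / n_draws


if __name__ == "__main__":
    mu = lambda x: triangular(x, 0.0, 1.0, 2.0)
    for point in (0.25, 0.5, 1.0, 1.5):
        print(point, mu(point), coverage_estimate(point, mu))
```

With enough draws the estimated coverage should approach the membership value at each point, which is the sense in which the two notions coincide for this construction.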

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0627-6
  • Authors
    • Luciano Sánchez, University of Oviedo Computer Science Department Campus de Viesques 33071 Gijón Asturias Spain
    • Inés Couso, University of Oviedo Statistics Department, Facultad de Ciencias 33071 Oviedo Asturias Spain

Addressing data complexity for imbalanced data sets: analysis of SMOTE-based oversampling and evolutionary undersampling

Abstract  

In the classification framework there are problems in which the number of examples per class is not equitably distributed,
formally known as imbalanced data sets. This situation is a handicap when trying to identify the minority classes, as the
learning algorithms are not usually adapted to such characteristics. A usual approach to deal with the problem of imbalanced
data sets is the use of a preprocessing step. In this paper we analyze the usefulness of the data complexity measures in order
to evaluate the behavior of undersampling and oversampling methods. Two classical learning methods, C4.5 and PART, are considered
over a wide range of imbalanced data sets built from real data. Specifically, oversampling techniques and an evolutionary
undersampling one have been selected for the study. We extract behavior patterns from the results in the data complexity space
defined by the measures, coding them as intervals. Then, we derive rules from the intervals that describe both good and bad
behaviors of C4.5 and PART for the different preprocessing approaches, thus obtaining a complete characterization of the data
sets and the differences between the oversampling and undersampling results.
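
For readers unfamiliar with SMOTE-style oversampling, the core idea is to create synthetic minority examples by interpolating between a minority example and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that interpolation step only; the function name `smote_like`, its parameters, and the toy data are assumptions for illustration and do not reproduce the exact variants studied in the paper.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating minority examples.

    minority: list of equal-length tuples of floats (at least two points).
    k: number of nearest minority neighbours to choose from.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        idx = rng.randrange(len(minority))
        x = minority[idx]
        # Rank the other minority points by squared Euclidean distance to x.
        others = [p for i, p in enumerate(minority) if i != idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


if __name__ == "__main__":
    minority_class = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_like(minority_class, n_new=3):
        print(point)
```

Every synthetic point lies on a segment between two existing minority examples, so the oversampled class stays inside the region already occupied by the minority data.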

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0625-8
  • Authors
    • Julián Luengo, University of Granada Department of Computer Science and Artificial Intelligence 18071 Granada Spain
    • Alberto Fernández, University of Jaén Department of Computer Science 23071 Jaén Spain
    • Salvador García, University of Jaén Department of Computer Science 23071 Jaén Spain
    • Francisco Herrera, University of Granada Department of Computer Science and Artificial Intelligence 18071 Granada Spain

Learning a tensor subspace for semi-supervised dimensionality reduction

Abstract  

High-dimensional data are frequently encountered and processed in real-world applications, and unlabeled samples are readily
available, but labeled or pairwise constrained ones are fairly expensive to capture. Traditionally, when a pattern itself
is an $n_1 \times n_2$ image, the image first has to be vectorized to a vector pattern in $\mathbb{R}^{n_1 \times n_2}$
by concatenating its pixels. However, such a vector representation fails to take into account the spatial locality of pixels
in the images, which are intrinsically matrices. In this paper, we propose a tensor subspace learning-based semi-supervised
dimensionality reduction algorithm (TS2DR), in which an image is naturally represented as a second-order tensor in
$\mathbb{R}^{n_1} \otimes \mathbb{R}^{n_2}$
and domain knowledge in the forms of pairwise similarity and dissimilarity constraints is used to specify whether pairs of
instances belong to the same class or different classes. TS2DR has an analytic form of the global structure preserving embedding transformation, which can be easily computed based on
eigen-decomposition. We also verify the efficiency of TS2DR by conducting unbalanced data classification experiments based on benchmark real-world databases. Numerical results
show that TS2DR tends to capture the intrinsic structure characteristics of the given data and achieves better classification accuracy,
while being much more efficient.
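
The contrast between the vectorized and the second-order tensor representation can be made concrete with a small sketch. It assumes a bilinear projection of the form U^T X V, which is a common way to reduce a matrix-valued pattern; the matrices `u` and `v` below are hypothetical fixed values, not the transformation learned by TS2DR, and the helper names are illustrative.

```python
def vectorize(image):
    """Concatenate the rows of an n1 x n2 image into one long vector."""
    return [pixel for row in image for pixel in row]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def bilinear_project(image, u, v):
    """Reduce an n1 x n2 image to d1 x d2 via U^T X V (u is n1 x d1, v is n2 x d2)."""
    return matmul(matmul(transpose(u), image), v)


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6]]               # n1 = 2, n2 = 3
    u = [[1], [1]]                    # hypothetical 2 x 1 left projection
    v = [[1, 0], [0, 1], [0, 0]]      # hypothetical 3 x 2 right projection
    print(vectorize(image))           # row/column layout is lost
    print(bilinear_project(image, u, v))
```

A linear map on the vectorized image to d outputs needs n1 * n2 * d coefficients, whereas the bilinear form needs only n1 * d1 + n2 * d2, which is one reason matrix-based representations can be cheaper to estimate.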

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0631-x
  • Authors
    • Zhao Zhang, Nanjing Forestry University Department of Computer Science and Technology Nanjing 210037 China
    • Ning Ye, Nanjing Forestry University Department of Computer Science and Technology Nanjing 210037 China

Using fuzzy logic modelling to simulate farmers’ decision-making on diversification and integration in the Mekong Delta, Vietnam

Abstract  

To reveal farmers’ motives for on-farm diversification and integration of farming components in the Mekong Delta, Vietnam,
we developed a fuzzy logic model (FLM) using a 10-step approach. Farmers’ decision-making was mimicked in a three-layer hierarchical
architecture of fuzzy inference systems, using data from 72 farms. The model includes three variables for family motives of
diversification, six variables related to component integration, alongside variables for the production factors and for farmers’
appreciation of market prices and know-how on 10 components. To obtain a good classification rate of the less frequent activities,
additional individual fine-tuning was necessary after general model calibration. To obtain the desired degree of sensitivity
to each variable, it was necessary to use up to five linguistic values for some of the input and output variables in the intermediate
hierarchical layers. The model’s sensitivity to motivational variables determining diversification and integration was of the
same magnitude as its sensitivity to market prices and farmers’ know-how of the activities, but less than its sensitivity
to labour, capital and land endowment. Modelling to support strategic decision-making seems too elaborate for individual farms,
but FLM will be useful to integrate farmers’ opinions in strategic decision-making at higher hierarchical levels.

  • Content Type Journal Article
  • Category Original Paper
  • DOI 10.1007/s00500-010-0618-7
  • Authors
    • Roel Bosma, Wageningen University Aquaculture and Fisheries P.O. Box 338 6700 AH Wageningen The Netherlands
    • Uzay Kaymak, Erasmus University Econometric Institute Rotterdam The Netherlands
    • Jan van den Berg, Delft University of Technology Fac. Technology, Policy and Management Delft The Netherlands
    • Henk Udo, Wageningen University Animal Production Systems Wageningen The Netherlands
    • Johan Verreth, Wageningen University Aquaculture and Fisheries P.O. Box 338 6700 AH Wageningen The Netherlands

Improving the performance and scalability of Differential Evolution on problems exhibiting parameter interactions

Abstract  

Differential Evolution (DE) is a powerful optimization procedure that self-adapts to the search space, although DE lacks diversity
and sufficient bias in the mutation step to make efficient progress on non-separable problems. We present an enhancement to
DE that introduces greater diversity while also directing the search to more promising regions. The Combinatorial Sampling
Differential Evolution (CSDE) is introduced, which can sample vectors in two ways: highly correlated with the search space
or around a ‘better’ individual. The CSDE approach can provide a similar number of samples as crossover, without being biased
towards the principal coordinate axes of a decision space. This approach to sampling vectors is capable of optimizing problems
with extensive parameter interactions. It also demonstrates fast convergence towards the global optimum and is highly scalable
in the decision space on a variety of single and multi-objective problems due to the balance between sampling highly directed
correlated vectors and non-correlated vectors which contribute to sampling diversity.
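
As background for the remark about crossover being biased towards the coordinate axes, the sketch below shows one trial-vector step of the classic DE/rand/1/bin scheme. This is the standard baseline, not CSDE itself; the function name `de_rand_1_bin` and the default values of `f` and `cr` are assumptions chosen for illustration.

```python
import random


def de_rand_1_bin(population, target_idx, f=0.5, cr=0.9, rng=None):
    """Build one DE/rand/1/bin trial vector for population[target_idx].

    population: list of equal-length lists of floats (at least 4 members).
    f: differential weight, cr: crossover rate.
    """
    rng = rng or random.Random()
    candidates = [i for i in range(len(population)) if i != target_idx]
    r1, r2, r3 = rng.sample(candidates, 3)
    a, b, c = population[r1], population[r2], population[r3]
    target = population[target_idx]
    dim = len(target)
    # Mutation: base vector plus a scaled difference of two other members.
    mutant = [a[j] + f * (b[j] - c[j]) for j in range(dim)]
    # Binomial crossover: each coordinate is taken from the mutant with
    # probability cr; j_rand guarantees at least one mutant coordinate.
    j_rand = rng.randrange(dim)
    return [
        mutant[j] if (j == j_rand or rng.random() < cr) else target[j]
        for j in range(dim)
    ]


if __name__ == "__main__":
    pop = [[0.0, 0.0], [1.0, 2.0], [3.0, 1.0], [2.0, 2.0], [4.0, 0.5]]
    print(de_rand_1_bin(pop, target_idx=0, rng=random.Random(1)))
```

With a low `cr`, the trial vector differs from the target in only a few coordinates, so the search moves mostly parallel to the axes. That is the bias which makes non-separable problems hard for this baseline.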

  • Content Type Journal Article
  • Category Focus
  • DOI 10.1007/s00500-010-0614-y
  • Authors
    • Antony W. Iorio, University of New South Wales @ Australian Defense Force Academy Defense and Security Applications Research Centre Northcott Drive Canberra ACT 2600 Australia
    • Xiaodong Li, RMIT University School of Computer Science and Information Technology GPO Box 2476v Melbourne VIC 3001 Australia

Remarks and corrections to the triangular approximations of fuzzy numbers using α-weighted valuations

Abstract  

A recent paper was dedicated to finding the nearest fuzzy triangular approximations of a fuzzy number by using α-weighted valuations.
We prove, by simple examples, that the results of approximations are not always triangular fuzzy numbers and that in fact
they are not fuzzy sets. We give a correct solution of the problem of approximation in a more general case, and we study the
properties of identity, additivity, translation invariance, scale invariance, and monotonicity of the new approximation operator.
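
To make the objection concrete: a triple (a, b, c) defines a triangular fuzzy number only if a ≤ b ≤ c, so an approximation formula that returns parameters violating this ordering does not yield a valid fuzzy set. The sketch below checks that condition and evaluates the membership function; it is an illustration under that standard definition and does not implement the approximation operator from the paper. The function names are hypothetical.

```python
def is_triangular(a, b, c):
    """A triple defines a triangular fuzzy number only if a <= b <= c."""
    return a <= b <= c


def membership(x, a, b, c):
    """Membership degree of x in the triangular fuzzy number (a, b, c)."""
    if not is_triangular(a, b, c):
        raise ValueError("parameters must satisfy a <= b <= c")
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0


if __name__ == "__main__":
    print(is_triangular(1.0, 2.0, 3.0))    # valid ordering
    print(is_triangular(3.0, 2.0, 1.0))    # ordering violated
    print(membership(1.5, 1.0, 2.0, 3.0))
```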

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0620-0
  • Authors
    • Adrian I. Ban, University of Oradea Department of Mathematics and Informatics Universitatii 1 410087 Oradea Romania

Learning concurrently data and rule bases of Mamdani fuzzy rule-based systems by exploiting a novel interpretability index

Abstract  

Interpretability of Mamdani fuzzy rule-based systems (MFRBSs) has been widely discussed in recent years, especially in the
framework of multi-objective evolutionary fuzzy systems (MOEFSs). Here, multi-objective evolutionary algorithms (MOEAs) are
applied to generate a set of MFRBSs with different trade-offs between interpretability and accuracy. In MOEFSs interpretability
has often been measured in terms of complexity of the rule base and only recently partition integrity has also been considered.
In this paper, we introduce a novel index for evaluating the interpretability of MFRBSs, which takes both the rule base complexity
and the data base integrity into account. We discuss the use of this index in MOEFSs, which generate MFRBSs by concurrently
learning the rule base, the linguistic partition granularities and the membership function parameters during the evolutionary
process. The proposed approach has been tested on six real-world regression problems and the results have been compared
with those obtained by applying the same MOEA, with only accuracy and complexity of the rule base as objectives. We show that
our approach achieves the best trade-offs between interpretability and accuracy.

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0629-4
  • Authors
    • Michela Antonelli, University of Pisa Dipartimento di Ingegneria dell’Informazione: Elettronica, Informatica, Telecomunicazioni Via Diotisalvi 2 56122 Pisa Italy
    • Pietro Ducange, University of Pisa Dipartimento di Ingegneria dell’Informazione: Elettronica, Informatica, Telecomunicazioni Via Diotisalvi 2 56122 Pisa Italy
    • Beatrice Lazzerini, University of Pisa Dipartimento di Ingegneria dell’Informazione: Elettronica, Informatica, Telecomunicazioni Via Diotisalvi 2 56122 Pisa Italy
    • Francesco Marcelloni, University of Pisa Dipartimento di Ingegneria dell’Informazione: Elettronica, Informatica, Telecomunicazioni Via Diotisalvi 2 56122 Pisa Italy

The numerical solution of linear fuzzy Fredholm integral equations of the second kind by using finite and divided differences methods

Abstract  

In recent years, many numerical methods have been proposed for solving fuzzy linear integral equations. In this paper, we
use the divided differences and finite differences methods for solving a parametric form of the fuzzy Fredholm integral equations
of the second kind with arbitrary kernel and present some examples to illustrate this method.

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0606-y
  • Authors
    • N. Parandin, Islamic Azad University Department of Mathematics, Kermanshah Branch Kermanshah Iran
    • M. A. Fariborzi Araghi, Islamic Azad University Department of Mathematics, Central Tehran Branch P.O. Box 13185.768 Tehran Iran

A genetic programming method for protein motif discovery and protein classification

Abstract  

Proteins can be grouped into families according to some features such as hydrophobicity, composition or structure, aiming
to establish common biological functions. This paper presents MAHATMA—memetic algorithm-based highly adapted tool for motif
ascertainment—a system that was conceived to discover features (particular sequences of amino acids, or motifs) that occur
very often in proteins of a given family but rarely occur in proteins of other families. These features can be used for the
classification of unknown proteins, that is, to predict their function by analyzing their primary structure. Experiments were
done with a set of enzymes extracted from the Protein Data Bank. The heuristic method used was based on genetic programming
using operators specially tailored for the target problem. The final performance was measured using sensitivity, specificity
and hit rate. The best results obtained for the enzyme dataset suggest that the proposed evolutionary computation method is
effective in finding predictive features (motifs) for protein classification.
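
The way a discovered motif can be scored as a family classifier can be sketched as follows. The example assumes the simplest possible motif, an exact substring, and computes sensitivity and specificity from the resulting confusion counts; the sequences, the motif, and the function names are made up for illustration, and the motifs evolved by MAHATMA may be more expressive than this.

```python
def contains_motif(sequence, motif):
    """Predict family membership if the motif occurs as an exact substring."""
    return motif in sequence


def sensitivity_specificity(sequences, labels, motif):
    """Return (sensitivity, specificity) of the motif as a family predictor.

    labels[i] is True when sequences[i] belongs to the target family.
    """
    tp = fn = tn = fp = 0
    for seq, in_family in zip(sequences, labels):
        predicted = contains_motif(seq, motif)
        if in_family and predicted:
            tp += 1
        elif in_family:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical toy sequences; the first two belong to the target family.
    seqs = ["MKHGSA", "AHGSLL", "MKTTTA", "HGSPPP"]
    family = [True, True, False, False]
    print(sensitivity_specificity(seqs, family, "HGS"))
```

A motif that occurs often in the family but rarely elsewhere scores high on both measures, which matches the selection criterion described in the abstract.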

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0624-9
  • Authors
    • Denise Fukumi Tsunoda, Federal University of Parana Av. Prefeito Lothário Meissner, 632, Room 38 Curitiba PR Brazil
    • Alex Alves Freitas, University of Kent School of Computing Room S107 Canterbury Kent CT2 7NF UK
    • Heitor Silvério Lopes, Federal University of Technology Av. 7 de Setembro, 3165, Bloco D, 3° floor Curitiba PR Brazil

Weighted local sharing and local clearing for multimodal optimisation

Abstract  

Local sharing is a method designed for efficient multimodal optimisation that combines fitness sharing with spatially structured
populations and elitist replacement. In local sharing, the bias towards sharing and the influence of spatial structure is
controlled by the deme (neighbourhood) size. This introduces an undesirable trade-off: to maximise the sharing effect, large
deme sizes must be used, but the opposite must be true if one wishes to maximise the influence of spatial population structure.
This paper introduces two modifications to the local sharing method. The first alters local sharing so that parent selection
and fitness sharing operate at two different spatial levels; parent selection is performed within small demes, while the effect
of fitness sharing is weighted according to the distance between individuals in the entire population structure. The second
method replaces fitness sharing within demes with clearing to produce a method that we call local clearing. The proposed methods,
as tested on several benchmark problems, demonstrate a level of efficiency that surpasses that of traditional fitness sharing
and standard local sharing. Additionally, they offer a level of parameter robustness that surpasses other elitist niching
methods, such as clearing. Through analysis of the local clearing method, we show that this parameter robustness is a result
of the isolated nature of the demes in a spatially structured population being able to independently concentrate on subsets
of the desired optima in a fitness landscape.
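
The two niching mechanisms that the paper builds on can be sketched in their classic, population-wide forms. The code below assumes a one-dimensional search space, positive fitness values under maximisation, a triangular sharing function, and a fixed niche radius; it shows standard fitness sharing and clearing only, not the deme-based local variants proposed in the paper. All names and parameter defaults are illustrative.

```python
def shared_fitness(positions, fitness, sigma, alpha=1.0):
    """Classic fitness sharing: divide each fitness by its niche count."""
    result = []
    for i, x_i in enumerate(positions):
        niche_count = 0.0
        for x_j in positions:
            d = abs(x_i - x_j)
            if d < sigma:
                niche_count += 1.0 - (d / sigma) ** alpha
        # niche_count >= 1 because each individual shares with itself (d = 0).
        result.append(fitness[i] / niche_count)
    return result


def clearing(positions, fitness, radius, capacity=1):
    """Classic clearing: in each niche the best `capacity` individuals keep
    their fitness and the rest are reset to zero (assumes fitness > 0)."""
    order = sorted(range(len(positions)), key=lambda i: fitness[i], reverse=True)
    cleared = list(fitness)
    for rank, i in enumerate(order):
        if cleared[i] <= 0:
            continue  # already cleared by a better individual
        winners = 1
        for j in order[rank + 1:]:
            if cleared[j] > 0 and abs(positions[i] - positions[j]) < radius:
                if winners < capacity:
                    winners += 1
                else:
                    cleared[j] = 0.0
    return cleared


if __name__ == "__main__":
    xs = [0.0, 0.1, 5.0]
    fs = [1.0, 2.0, 3.0]
    print(shared_fitness(xs, fs, sigma=1.0))
    print(clearing(xs, fs, radius=1.0))
```

Sharing penalises crowded individuals gradually, while clearing removes all but the niche winners outright; the paper's local variants apply these ideas within a spatially structured population.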

  • Content Type Journal Article
  • DOI 10.1007/s00500-010-0612-0
  • Authors
    • Grant Dick, University of Otago Department of Information Science Dunedin New Zealand
    • Peter A. Whigham, University of Otago Department of Information Science Dunedin New Zealand