Binary Relation Learning and Classifying for Preselection in Evolutionary Algorithms

Evolutionary algorithms (EAs) are population-based heuristic optimization methods that search by trial and error, so search efficiency is a major concern in both algorithm design and applications. Preselection, which estimates the quality of candidate solutions and discards unpromising ones before fitness evaluation, is a widely used component for reducing the number of fitness evaluations in EAs. Surrogate models, such as regression and classification models, are usually applied for this quality estimation. In some EA frameworks, the relationship between a pair of solutions is enough to distinguish “good” from “bad” solutions; in such cases, it is not necessary to estimate the specific quality of each candidate solution, only the binary relationship within a pair. Following this idea, this article proposes a new preselection strategy, called relationship classification-based preselection (RCPS). In RCPS, a classification model is built to learn the relationship between pairs of solutions from a given training data set, and promising candidate solutions are prescreened using this relation. The mechanism of RCPS is visualized and analyzed, and its advantages over traditional surrogate model-based preselection strategies are illustrated through a comprehensive empirical study. The experimental results on two test suites suggest that RCPS outperforms the comparison preselection strategies, and that to achieve the same accuracy, an EA with RCPS needs fewer fitness evaluations than the same EA without it.
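The prescreening idea can be sketched in a few lines. This is only an illustration, not the authors' implementation: the sphere objective, the squared feature map, and the logistic-regression relation classifier are all assumptions chosen to keep the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy objective (minimization); stands in for an expensive fitness function."""
    return float(np.sum(x ** 2))

# Training set of solution pairs: label 1 if the first solution of the pair is
# better (lower fitness) than the second, else 0.
dim, n_pairs = 5, 400
A = rng.uniform(-5, 5, (n_pairs, dim))
B = rng.uniform(-5, 5, (n_pairs, dim))
y_train = np.array([1.0 if sphere(a) < sphere(b) else 0.0 for a, b in zip(A, B)])

# Squared features make the pairwise relation for the sphere linearly separable;
# standardizing them keeps plain gradient descent stable.
F = np.hstack([A ** 2, B ** 2])
mu, sd = F.mean(axis=0), F.std(axis=0)
X = np.hstack([(F - mu) / sd, np.ones((n_pairs, 1))])

# Minimal relation classifier: logistic regression fitted by gradient descent.
w = np.zeros(X.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (p - y_train) / n_pairs

def better(x_a, x_b):
    """Predict whether x_a beats x_b without calling the fitness function."""
    f = (np.concatenate([x_a ** 2, x_b ** 2]) - mu) / sd
    return float(np.append(f, 1.0) @ w) > 0

# Preselection: keep only the candidate predicted best among trial offspring;
# only that one would then receive a real (expensive) fitness evaluation.
parent = rng.uniform(-5, 5, dim)
candidates = [parent + rng.normal(0, 1, dim) for _ in range(10)]
best = candidates[0]
for c in candidates[1:]:
    if better(c, best):
        best = c
```

The point of the pairwise formulation is visible in `better`: the model never outputs a fitness value, only which of two solutions is preferable, which is all that survivor selection needs.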

A Framework for Scalable Bilevel Optimization: Identifying and Utilizing the Interactions Between Upper-Level and Lower-Level Variables

When solving bilevel optimization problems (BOPs) by evolutionary algorithms (EAs), it is necessary to obtain the lower-level optimal solution for each upper-level solution, which gives rise to a large number of lower-level fitness evaluations, especially for large-scale BOPs. It is interesting to note that some upper-level variables may not interact with some lower-level variables. Under this condition, if the values of one or several upper-level variables change, we only need to optimize the interacting lower-level variables, thus reducing the dimension of the search space and the number of lower-level fitness evaluations. This article proposes a new framework (called GO) to identify and utilize the interactions between upper-level and lower-level variables for scalable BOPs. GO includes two phases: 1) the grouping phase and 2) the optimization phase. In the grouping phase, after the interactions between upper-level and lower-level variables are identified, the variables are divided into three types of subgroups (denoted as types I–III), which contain only upper-level variables, only lower-level variables, and both upper-level and lower-level variables, respectively. In the optimization phase, if a type-I or type-II subgroup includes only one variable, a multistart sequential quadratic programming method is applied; otherwise, a single-level EA is applied. In addition, a criterion is proposed to judge whether a type-II subgroup has multiple optima. If multiple optima exist, a new objective function and a new degree of constraint violation are designed, incorporating information from the upper level, to locate the optimistic solution. Type-III subgroups are optimized by a bilevel EA (BLEA). The effectiveness of GO is demonstrated on a set of scalable test problems by applying it to five representative BLEAs, and GO is further applied to resource pricing in mobile edge computing.
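The three subgroup types of the grouping phase can be illustrated with a small sketch. The interaction structure below is assumed purely for illustration (it is not the paper's identification procedure): given which upper-level and lower-level variables interact, connected components of the interaction graph yield subgroups with only upper-level variables (type I), only lower-level variables (type II), or both (type III).

```python
from collections import defaultdict

# Hypothetical variables and interactions: inter contains (upper, lower) pairs
# of variables that interact. Values chosen only for this sketch.
upper = ["u0", "u1", "u2"]
lower = ["l0", "l1", "l2", "l3"]
inter = {("u1", "l0"), ("u1", "l2"), ("u2", "l2")}

# Union-find merges interacting variables into subgroups (connected components).
parent = {v: v for v in upper + lower}

def find(v):
    while parent[v] != v:
        parent[v] = parent[parent[v]]  # path halving
        v = parent[v]
    return v

for u, l in inter:
    parent[find(u)] = find(l)

groups = defaultdict(set)
for v in upper + lower:
    groups[find(v)].add(v)

def classify(group):
    """Type I: upper-only. Type II: lower-only. Type III: mixed (needs a BLEA)."""
    has_u = any(v in upper for v in group)
    has_l = any(v in lower for v in group)
    if has_u and has_l:
        return "III"
    return "I" if has_u else "II"

for g in groups.values():
    print(sorted(g), "-> type", classify(g))
```

With this assumed structure, `u0` forms a type-I subgroup, `l1` and `l3` form type-II subgroups, and `{u1, u2, l0, l2}` forms one type-III subgroup, so only four lower-level dimensions at most are ever optimized jointly with upper-level variables.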

Sharp Bounds for Genetic Drift in Estimation of Distribution Algorithms

Estimation of distribution algorithms (EDAs) are a successful branch of evolutionary algorithms (EAs) that evolve a probabilistic model instead of a population. Analogous to genetic drift in EAs, EDAs also encounter the phenomenon that the random sampling in the model update can move the sampling frequencies to boundary values not justified by the fitness. This can result in a considerable performance loss. This article gives the first tight quantification of this effect for three EDAs and one ant colony optimizer, namely, for the univariate marginal distribution algorithm, the compact genetic algorithm, population-based incremental learning, and the max–min ant system with iteration-best update. Our results make it possible to choose the parameters of these algorithms in such a way that within a desired runtime, no sampling frequency approaches the boundary values without a clear indication from the objective function.
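The drift phenomenon is easy to reproduce for the compact genetic algorithm (cGA): on a bit that the fitness function ignores, the sampling frequency performs an unbiased random walk with step size 1/K, so for a small (here hypothetical) population size K it soon reaches a border value without any guidance from the fitness. A minimal simulation sketch:

```python
import random

random.seed(1)

def neutral_drift_time(K, max_steps):
    """Simulate one cGA sampling frequency on a fitness-neutral bit.

    The frequency is kept as an integer numerator c (p = c / K). Because the
    bit does not affect fitness, each comparison is won by a random individual,
    so c performs an unbiased random walk until it hits a border."""
    c = K // 2  # start at p = 1/2
    for t in range(1, max_steps + 1):
        p = c / K
        a = random.random() < p  # bit sampled for individual 1
        b = random.random() < p  # bit sampled for individual 2
        if a != b:
            # Fitness ignores this bit: the "winner" is chosen uniformly.
            winner = a if random.random() < 0.5 else b
            c += 1 if winner else -1
        if c in (1, K - 1):  # the usual frequency borders 1/K and 1 - 1/K
            return t
    return None

# With a small hypothetical K, every run drifts to a border quickly.
hit_times = [neutral_drift_time(K=20, max_steps=100_000) for _ in range(50)]
print(sum(t is not None for t in hit_times), "of 50 runs reached a border")
```

The sharp bounds in the article quantify exactly this hitting time, which is what lets one pick K large enough that, within a given budget of iterations, such an unjustified boundary hit is unlikely.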

Characterizing Genetic Programming Error Through Extended Bias and Variance Decomposition

An error function can be used to select between candidate models but it does not provide a thorough understanding of the behavior of a model. A greater understanding of an algorithm can be obtained by performing a bias-variance decomposition. Splitting the error into bias and variance is effective for understanding a deterministic algorithm such as $k$-nearest neighbor, which provides the same predictions when performed multiple times using the same data. However, simply splitting the error into bias and variance is not sufficient for nondeterministic algorithms, such as genetic programming (GP), which potentially produces a different model each time it is run, even when using the same data. This article presents an extended bias-variance decomposition that decomposes error into bias, external variance (error attributable to limited sampling of the problem), and internal variance (error due to random actions performed in the algorithm itself). This decomposition is applied to GP to expose the three components of error, providing a unique insight into the role of maximum tree depth, number of generations, size/complexity of function set, and data standardization in influencing predictive performance. The proposed tool can be used to inform targeted improvements for reducing specific components of model error.
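On synthetic predictions the three-way split can be computed directly. In the sketch below (all numbers assumed), `predictions[d, r]` is the prediction of run `r` of a nondeterministic learner trained on data sample `d`, at one fixed test point: external variance is the spread of the per-sample mean predictions, internal variance is the average spread across runs on the same sample, and the two together with the squared bias recover the mean squared error.

```python
import numpy as np

rng = np.random.default_rng(42)

y_true = 2.0                  # target at the fixed test point
n_datasets, n_runs = 200, 30  # data samples x repeated runs per sample

# Synthetic predictions: a per-dataset effect (limited sampling of the problem)
# plus run-to-run noise (randomness inside the algorithm, e.g. one GP run).
per_dataset = 2.5 + rng.normal(0.0, 0.4, n_datasets)
predictions = per_dataset[:, None] + rng.normal(0.0, 0.7, (n_datasets, n_runs))

bias_sq = (predictions.mean() - y_true) ** 2
external_var = predictions.mean(axis=1).var()  # spread across data samples
internal_var = predictions.var(axis=1).mean()  # spread across runs, averaged

mse = ((predictions - y_true) ** 2).mean()
# With these population-variance estimators (NumPy's default ddof=0) the
# decomposition is exact: mse == bias_sq + external_var + internal_var.
print(bias_sq, external_var, internal_var, mse)
```

For a deterministic learner such as $k$-nearest neighbor all rows of `predictions` for a given sample would be identical, `internal_var` would be zero, and the split collapses back to the classical bias-variance decomposition.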

GP bib: URLs of your papers before December 1, 2020

As mentioned in my blog posting of 8 June 2020, due to the pandemic, maintenance of the genetic programming bibliography fell behind at the start of this year. I am now pleased to be able to say that I have completed a major new release and would like to encourage everyone to check that their entry in the GP bibliography http://gpbib.cs.ucl.ac.uk/ is up to date and especially that it contains URLs for their GP papers and, where appropriate, their PhD thesis.

Thank you

Bill