Genetic Algorithms typically operate upon haploid genomes, which contain at each locus only one allele. However, many biological organisms, including humans, possess diploid genomes with two or more alleles at each locus, with a complex dominance relationship governing the expression of each allele in the phenotype. Multiploidy is not limited to diploidy - many plants are tetraploid, and some varieties of corn are decaploid .
The biological success of multiploidy has inspired some researchers in Evolutionary Computation to seek analogous mechanisms to improve the performance and evolvability of population-based algorithms.
It is thought the biological utility of multiploidy is as a form of memory of previously useful adaptations. The greater variability introduced in the population by diploidy is useful in the presence of viruses, parasites, predators, or other co-evolving populations. For this reason, diploidy has primarily attracted evolutionary computation researchers interested in non-stationary fitness functions. The relative complexity of simulating diploid or multiploid evolution - especially the need to introduce a dominance function - has largely deterred the use of multiploidy for GAs used to optimize over a static fitness function .
The available literature on multiploidy in Evolutionary Computation is primarily focussed on applications to non-static fitness functions. I found little discussing the effects of outlaw genes or other deleterious recessive alleles, which due to their recessive nature can propagate into future generations with little effect on population fitness. Such alleles, when heterozygous, may actually confer improved fitness (e.g. the sickle cell anemia gene in humans), but when homozygous can be highly pernicious. The use of an incest taboo in EC algorithms to reduce the frequency of such combinations has apparently not been researched.
Non-stationary fitness functions are the rule rather the exception in nature. Populations frequently engage in arms races marked by rapid co-evolution, for example between predatory species and their prey, or between a virus and a population of host immune systems. This corresponds to a dynamic fitness function of phenotypic traits.
One view on the usefulness of multiploidy is that it allows for memory of previously discovered solutions which may be useful as the fitness landscape changes. These genes can shield themselves from extinction by becoming recessive. Note that it is not essential that the recessive alleles contribute to fitness by the same mechanisms that they did in the past history of the population . For example, in nature, an allele which produces a phenotypic difference in larger body size may have once been an adaptation to increasing size of predators, but may later help in adapting to a cooling climate.
Calabretta, Galbiati, Nolfi and Parisi found that using diploidy in a GA evolving neural network weights (population size 100) resulted in a higher variation of fitness over the population, with a slight advantage to diploidy in terms of the fitness of the best individual. They also found that the diploid populations performed better with a periodically varying environment, and also exhibited less reduction in fitness when the environment changed suddenly. In this research however, only mutation was used, without crossover, so their results are limited in scope.
Kim, Kim, Lee, Cho and Lee-Kwang also find that a diploid GA improves the performance in the presence of a changing fitness landscape.
Collingwood, Corne and Ross investigated the use of multiploidy in GAs for some well known test problems: the Indecisive(k) problem and the max problem. They found that the multiploid algorithms outperformed haploid algorithms at the same population size on these two problems. In the Indecisive(k) problem, which has many attractive local suboptima, the multiploid algorithms greatly outperformed the haploid algorithm, with the 3-ploid, 7-ploid and 9-ploid algorithms finding the correct solution in 100% of the trials, compared to 60% for the haploid algorithm. The credit the multiploidy with helping the algorithm to recover from genetic drift. They conclude that multiploidy is best suited for problems with suboptima at a large Hamming distance from the global optimum.
Similarly, Osmera, Kvasnicka and Pospichal found that diploid GAs outperformed haploid GAs for a highly multimodal function.
For the evolutionary development of hardware, Hikage, et. al. present in a stepwise manner more difficult problems to their evolving system, AdAM, which is used to evolve an artificial ant which collects simulated food as efficiently as possible. They not only found that the ants evolved faster if the environment was modified in steps as generations , proceeded, but that the diploid ants evolved faster than the haploid ants, by a factor of approximately 3. However, their results are shown for haploid ants evolved in a fixed environment, and the diploid ants in a progressive environment, so it is difficult to make a direct comparison. Their results also contrast with Harts assertion that multiploidy is slower because the average fitness of the population is lower after the same number of generations.
Multiploidy permits a higher degree of variation in a population while maintaining highly adaptive traits through dominance. For example, if two haploid individuals both have brown eyes, their offspring will have brown eyes barring a mutation. However, two diploid parents may produce a blue eyed offspring even if both have brown eyes, if both are heterozygous. Some researchers credit the variation concomitant to multiploidy with preventing premature convergence and providing extra variation to recover from genetic drift.
The greater genotypical variation introduced by diploidy may confer an advantage in arms races, with expression of a recessive allele partially denying the opponent the successful use of a modified strategy. Hillis found diploidy useful in evolving sorting networks, which involved co-evolving simulated parasites, the fitness of which was determined by their ability to stump the sorting network. As the parasites evolved better test case, the sorting networks got better at sorting them.
The interesting wrinkle introduced by multiploidy is the need for a function which determines which allele is expressed when a particular locus is heterozygous - i.e. at least two different alleles are present. Crossover is also complicated by multiploidy. Crossover can occur within each parent (meiosis), or between parents, as in haploid algorithms.
In the biological world, dominance is not a simple thing. Dominance can be only partial, allowing some of the recessive trait to be expressed, or there may be co-dominance, as in the case of the AB blood group. Similarly, in Evolutionary Computation there appear to be as many approaches to dominance as there are researchers.
Collingwood, Corne and Ross use a simple mask, which for each locus specifies which of the p chromosomes is dominant - independent of which allele is present in that chromosome. This is not biologically feasible, as they admit, but greatly simplifies the GA. Their mask is part of the genome (and mutation and crossover), but is not subject to dominance - if it is not crossed, then one mask is randomly selected from one parent to become the child mask.
Osmera, Kvasnicka and Pospichal performed an XOR operation between chromosomes. How this relates to dominance is not clear. Alternatively, they also randomly select on allele or the other at each locus, which is broadly similar to the approach of Collingwood, et. al, described above, except the choice of allele is not part of the chromosome and is not adaptive. For their crossover operator, they randomly select one of the two chromosomes in each of the two parents, and then perform single-point crossover between these two chromosomes in generating two offspring.
Calabretta, Galbiati, Nolfi and Parisi employed an interesting adaptive dominance method. In addition to genes expressing neural network weights (structural genes), there were for each locus a 2-bit dominance modifier gene. Dominance is determined by XORing the first bit of the modifier gene with the corresponding first bit of the modifier gene on the other chromosome. If the result is 0, then the structural gene corresponding to modifier gene with bit 2 equal to zero is expressed, and similarly if the result is 1. In the even of a tie, the genes are considered co-dominant, and the resulting phenotype contains the average of the two values in the structural genes.
Hikage, Hemmi and Shimohara use a much more complex scheme for the GP-based hardware evolution approach. Alleles are subtrees of a grammar tree, and the dominance operator selects one or more of these subtrees for expression. In addition to mutation, haploid crossover, deletion and duplication, they also introduce new operators: Diploid Crossover and Meiosis/Fertilization.
Osmera, Kvasnicka and Pospichal performed an XOR operation between chromosomes. How this relates to dominance is not clear. They also randomly select on allele or the other at each locus, which is broadly similar to the approach of Collingwood, et. al, described above, except the choice of allele is not part of the chromosome and is not adaptive.
Kim, Kim, Lee, Cho and Lee-Kwang employ a winner-take-all approach to diploid dominance. In their research, each chromosome is evaluated independently, and the fittest chromosome is expressed in the phenotype. Once two parents are selected, they perform crossover first on the two dominant chromosomes to generate one offspring, and then on the recessive chromosomes to generate the second offspring. In addition, they allow for some small probability of dominance mutation, i.e. the dominant and recessive chromosome switch places before crossover.
One potential problem that I did not find directly addressed in the literature is the problem of outlaw genes which can lurk in the population; either shielded by their recessive nature0. If these genes are not frequently expressed in the phenotype, they will have little effect on the populations fitness. However, crossover between closely related individuals will more frequently result in individuals homozygous in a deleterious gene.
It is mentioned by Calabretta, et. al. that the evolution of dominance helps to mask deleterious traits. For this to work in Evolutionary Computation, adaptive dominance schemes are needed, e.g. as described above in the work of Collingwood, et. al. However, this masking only works when the trait in question is heterozygous.
One solution to this problem would be institution of an incest taboo1. This would be a simple requirement that closely related genomes not be paired for mating. The most probable reason that this has not appeared in the literature to date is that the evolution of outlaw genes requires a large population and a large number of generations - perhaps beyond the experiments with multiploidy conducted to date.
For advanced applications of evolutionary algorithms with dynamic fitness landscapes, research to date indicates an advantage for multiploidy in adapting to changes in the fitness landscape. For example, in evolutionary robotics, in which the robots are presented with problems of increasing complexity as they evolve. Furthermore, for problems with many suboptima or weak fitness functions, multiploid algorithms are a promising approach for defeating genetic drift and deception.
For artificial life studies, multiploidy and the evolution of same may be an ven more compelling research topic , since co-evolution and arms races should drive organisms to seek the increased variation acheived through mutliploidy. I was unable to locate significant research in this area.
The field of multiploidy in EC is one that appears to be little explored to date. The real nature of the advantages conferred by this approach appear to be not far beyond the hypothesis stage. In all the literature on multiploidy reviewed for this report, not one rigorous argument was provided for why it works - and such an argument may in fact not be accessible. Research is still largely motivated by the success of multiploidy in the living world.
0 Also in conditions which evolve a high degree of
epistasis, some outlaw genes may have only a slightly deleterious
effect on fitness in normal circumstances (this could also be true in
a haploid algorithm).
1In an ALife context, it might be desirable to
evolve an incest