Return to my Genetic Algorithms Page
See my Master's thesis for a complete study of the TCS operator.
The following IEEE paper is a condensed version of the thesis.
Thomas P. Dickens |
Charles L. Karr |
||||||||||||||||||||||||||||||||||||||
| Formatting Note: The layout of this HTML document in a two-column format follows the original published paper. The lengths of the columns and the pagination may not look as good as the original paper. | |||||||||||||||||||||||||||||||||||||||
AbstractIt has been demonstrated that a periodic complete reinitialization of a running genetic algorithm (GA) solution will result in a higher convergence rate for a series of problems. This technique, referred to as the "Total-Comet-Strike" (TCS) operator, has been applied to a number of multi-solution GA problems. Where it has been used, an improvement has been shown both in the number of cases that converge within an imposed time limit and in the average time required for each case. 1. Introduction: The GA environmentThe TCS operator is seen as a benefit in the set of GA problems that are under a time constraint for being solved and have multiple acceptable solutions in a multidi- mensional search space. The idea for the TCS operator occurred in a detailed investigation by Dickens and Karr looking at the performance of a GA for manipulating 9 of the 16 floating-point matrix values in a 4x4 transformation matrix used for an image analysis problem[1]. The goal of that particular study was to implement a GA to achieve a reduction in the total time required for a large set of image transformations compared with the time required using a more traditional iterative optimization method, which took an average of 20 seconds for each matrix solution to complete. The image-transformation GA used the usual [2] weighted reproduction method, 1-point and 2-point cross- overs, survival of the fittest individual from generation to generation*, and a varying mutation rate to help with stag- nation. Local hill-climbing was used to gently refine the best performers. A catastrophe technique, referred to as a Comet Strike, was used [3] to introduce new genetic material into the population when needed. This technique maintained the top performers and reinitialized the remaining population. A set of matrix-element crossover |
operators was also used that expanded on the idea by Wallet et al. [4] of using
operators that swap substring regions defining portions of the transformation-matrix
parameters. A key revelation in that image-transformation study was the analysis of the performance of the GA not in the usual terms of generation-to-best-performer, or generation- to-average-fitness-value, but by looking at the number of GA solutions that completed in different time periods for a large number of GA runs on the same problem. Figure 1 shows this distribution of completion times.
For this test case, 16 images were selected; and each image was transformed into the image space of each of the 16* images, resulting in 162, or 256 different image transformations. Each of these 256 transformations was run 10 times, for a total of 2,560 runs in the case. Looking at selected individuals in this set, there were a few GA runs that completed in less than 100 ms, 17.5% of the runs completed in less than 2.0 seconds, 29% of the runs completed in less than 4.0 seconds, and 59% of the runs did not successfully converge within the allotted 20.0 seconds. The key observation was that 29% of the runs con- verged in the first 4 seconds, while only 12% of the runs converged in the time period 4 to 20 seconds. The hump that can be observed in the first 4 seconds of Figure 1 is |
||||||||||||||||||||||||||||||||||||||
*Similar to De Jong's generation-gap idea; also known as the elitist strategy. |
*Including itself, which would result in an identify matrix. |
||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 2 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||
| further explored in this paper. A more in-depth report is given in the master's thesis
dissertation [5] by Thomas Dickens to the University of Alabama, 1998. The proposed theory was that, if "lucky," the GA will converge within a few seconds, but once the 2.0-second mark is passed, the chance that the GA will converge decreases. Being lucky means that the needed genetic material is available in the initial population and the GA is able to make use of this material for a speedy convergence. If it is not "lucky," the GA would wander off into a local minimum in the multidimensional search space that does not have an acceptable local solution, thereby requiring additional time to recover and locate a quality solution. This nine-dimensional GA case was also studied in detail to assess the range of values acceptable for the nine matrix locations of converged solutions for the same input conditions. Each of the matrix values was normalized into a -1.0 to +1.0 space, and xy plots were generated showing the relationship of every pair of matrix values for the run of solutions to see if the solutions were converging to the same or similar values, at least for a few key matrix values. This was not the case. We did see relationships between certain matrix values as their 2D plot formed a line, as in Figure 2, but none of the plots showed that only a single value was acceptable for the solution. Many of the points appearing in the plots were quite spread out, as in Figure 3, and some plots showed attributes of both, as in Figure 4. This demonstrated that for a set of GA runs on the same input condition, different acceptable solutions were being found in the nine- dimensional search space.
|
This confirmed the idea that the different GA results found on the same problem
represent a set of acceptable solutions to the problem, not the same solution being
located over and over. Thus, having a multi-solution search space may be a significant
attribute of the type of GA problem that benefits from the TCS operator. 2. Related techniquesThe delta coding optimization technique for GAs, proposed [6] and refined [7,8] by Mathias and Whitley, periodically reinitializes the population and also redefines the search space. Their approach is to continually refocus the search space to concentrate the efforts of the GA in promising areas. Mathias and Whitley give no mention to the initial hump observed in Figure 1, nor does their technique try to capitalize on its effect. Other GA optimization techniques have been used, such as (1) the messy GA [9], which allows variable- length strings, dynamically refining the strings to use the higher importance genes, and (2) time-dependent tech- niques[10], which incorporate memory in their strings that can be toggled on and off to track cyclic fitness conditions. Markov chain models have been used to provide in- sights into the working of GAs [11]. They have also been applied to GAs with niching [12]. Markov chain analysis is currently being used only for small GA cases* since the models use a transition probability matrix that becomes unmanageable with larger cases. Hopefully, the under- standing gained with the Markov models may provide insights to create workable models for larger problems. 3. Test casesTest cases that were simpler than the image transformation problem were developed demonstrating the initial hump effect shown in Figure 1. They were designed such that several values produced an "optimum" solution. As an example, consider the 1D case:
Here, we use a range of 0.0 <= x <= 5.0, and F(x) is considered optimum whenever its value is less than a threshold value of -0.4. This 1D function is shown in Figure 5. Equations for 2D and 3D cases were developed using similar methodologies. Each test case includes five runs of 1,000 solutions. The plots in Figures 6-8 show the |
||||||||||||||||||||||||||||||||||||||
*Small is defined as very small populations and short string lengths. |
|||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 3 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||
| resulting five values at each data point as a study of the repeatability of each test
case.
Both 1D and 2D cases were investigated, as seen in the results from running five 1,000-run jobs each and shown in Figures 6 and 7. The search space for these cases was not large enough to avoid the initial guess causing significant early convergence. Figure 8 shows the results achieved in a 3D test case with five 1,000-job runs; these results are comparable to those in Figure 1 and show the same initial hump effect. This 3D test case uses three 12-bit numbers, or a 36-bit genetic string, resulting in a search space of 236, or 68,719,476,736 different combinations. An exhaustive search was run on this case identifying 5,586,729 strings that satisfied the objective*; these strings constitute approximately 0.008% of the possible individuals.
|
The 3D function used, with a threshold of 1.65, is:
4. Total-comet-strike (TCS) operatorIn the image GA case, as shown in Figure 1, the results showed 29% of the runs converging within 4.0 seconds, 59% of the runs not converging within 20.0 seconds, leaving only 12% of the runs converging in the 4.0- to 20.0-second time period. For the 3D test case shown in Figure 8, these numbers relative to the hump are 45.0%, 15.6%, and 39.4%, respectively. It was estimated that if the TCS operator was applied after generation 56 in the 3D case, the percentage of runs to converge should follow the same curve as seen in the first 56 generations, but the curve would now be applied only to the remaining unconverged runs in the case. Initially there should be a drop in the convergence rate, but as the hump is again traversed, the percentage of runs converging per generation should be greater than if the TCS operator had not been used. Figure 9 shows the estimated result of applying the TCS operator every 56 generations. The net gain in the number of converged runs is the sum of the area between the two curves.
|
||||||||||||||||||||||||||||||||||||||
*This took the better part of a week running on an SGI Indigo 2 workstation. |
|||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 4 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||
| These data are replotted in Figure 10 showing the accumulated percentage of converged
runs. The original case achieved only an 84.4% convergence rate. According to our
estimate, the GA with the TCS operator should achieve a 95% convergence rate.
We can quantify the TCS effect with the following equation:
where Rp is the remaining population and Ph is the percentage of solutions converging in the initial hump. The left portion of Table 1 shows this progression for four applications of the TCS operator. To validate our estimate, first we replotted the results from the 3D GA test case seen in Figure 8, replotting it with a coarser x distribution; this is shown in Figure 11. Table 1. 3D case estimate and actual of the total convergence with the TCS operator
|
Then we modified this test case to use the TCS operator every 56 generations, which
produced the results seen in Figure 12. These test cases were also five 1,000-job runs. In both Figures 11 and 12 the average has been included as a line to aid in visualizing the trend. In Figure 12 we see the expected attenuated replication of the initial hump shape in successive 56-generation intervals. Figures 13 and 14 show these average lines, plotted both as per- generation completion and as accumulated completion, as predicted in Figures 9 and 10.
The original run (Figure 11) had 780 of 5,000 runs that did not converge in the desired time, or a 15.6% nonconvergence rate. With the application of the TCS operator, the number of nonconverged runs (seen in Figure 12) dropped to 272 of 5,000, or a 5.44% nonconvergence rate. Table 1 shows both the estimated data and the actual results, which agree very well. Figure 15 shows a plot of the accumulated benefit of using the TCS operator as percentage change from the non-TCS case. Notice that there is an initial penalty after generation 56 as the TCS operator is first invoked. |
||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 5 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||
Table 2 shows the before and after times using the TCS operator for 1,000 runs of the 3D case. The TCS operator saved over half a minute in the 2-minute job, a time reduction of more than 30% for the 1,000-run solution. |
Table 2. Timing, original 3D case compared with the 3D case using the TCS operator
A beneficial side effect of using the TCS operator is the ability of the GA to remember the nonconverged values each time the TCS operator is applied. If the run is one of the few that does not converge, the best of the nonconverged values can be used. 5. Using the TCS operator on the image- transformation caseThe results from the 3D test case were applied to the image-transformation GA. The expected results are tabu- lated in Table 3 and plotted in Figure 16. Table 3. Image-transformation case estimate of the total convergence with the TCS operator
|
||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 6 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||
| The numbers on the top right of Figure 16 (1514/613) report the number of unconverged
runs for the original and the TCS applied cases, respectively. This TCS case was run
(Figure 17) and compared with the estimate from Figure 16 (Figure 18).
The comparison in Figure 18 shows that the actual benefit of the TCS operator for the image-transformation case is less than expected according to either Equation (1) or the experimental results from the 3D test case. Further investigation of the set of 256 image combinations yielded an explanation: some of the runs routinely converged much earlier than others, as shown in Figure 19. On average, these "easy" runs are more likely to converge in an earlier TCS hump; the remaining set of unconverged runs, on average, contains runs that will be harder to converge. |
Equation (2) is an estimate of the TCS effect accounting for variation in convergence rate for input sets.
The equation takes the variation in convergence rate into account by adding an exponential scaling term, Ep, where Rp is the remaining population, Ph is the percentage of solutions converging in the initial hump, and Ep is the exponential scaling term for the input set of the population. For cases where a variation in input conditions does not change the chance of convergence of the GA, this scaling term, Ep, can be set to 1; thus, Equation 2 is a general estimate of the TCS operator effect. Assigning an Ep of 0.73 to the TCS estimate in Figure 18 gave a good estimate when compared with the actual data. Table 4 shows the before and after times of using the TCS operator. The TCS operator saved more than an hour in the 9.5-hour job, a time reduction of more than 11% for the 2,560-run solution. Table 4. Timing of the original image case and the image case with the TCS operator.
|
||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 7 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||
6. Choice of the TCS generationThe effect of the selection of the generation to impose the TCS operator, Ph in Equations 1 and 2, was studied as shown in Figures 20 and 21. Using the original non-TCS data from the 3D test case, the selection of Ph was tried for all integer values in a range from 10 to 150 generations.
The best performance in terms of the minimum number of nonconverged runs occurs when the TCS operator is applied in the range from generation 50 to generation 60, with a nonconvergence rate about 5.5%. The intuitive guess of using generation 56 to apply the TCS operator was done by a visual inspection of Figure 8, selecting a point on the back-side of the initial hump near the bottom of the hump. This proved to be a good choice. The selection of the generation to impose the TCS operator was also studied for the image transformation case, as shown in Figure 21. Using the original non-TCS data, the selection of Ph was tried for every 10th integer value in a range from 10 to 10,000 generations. Equations 1 and 2 were both applied to the data; the resulting integrations for each of these generation values is plotted for Equation 1 (bottom curve) and Equation 2 (top curve).
|
The best performance in terms of the minimum number of nonconverged runs for the 2,560
run test case occurs with when the TCS operator is applied in the range from generation
2,300 to generation 2,900 for Equation 1 (a 21% nonconvergence rate) and from generation
2,700 to generation 4,600 for Equation 2 data (a 43% non- convergence rate). The intuitive guess of using generation 3,600 (about 4.0 seconds) to apply the TCS operator was done by visually inspecting Figure 1 and selecting a point on the back-side of the initial hump near the bottom of the hump. This proved to be a good choice for the application of Equation 2; generation 3,600 is in the middle of the 2,100- generation minimum portion of the Equation 2 curve. An earlier choice, around generation 2,700 (about 3.0 seconds), would have been a better choice of the TCS generation if the Ep value in Equation 2 had been 1.0; in effect, Equation 2 would have reverted back to Equation 1. Such an Ep value of 1.0 would be used for GA cases where the variation in input values does not affect the solvability of the GA. 7. Implementation and application of the TCS operatorA benefit over many of the other GA optimization methods being used is the ease of implementing the TCS operator into existing GA engines. First, the characteristics of the GA would be determined by running a series of cases and plotting generation number versus convergence. Then, a reasonable generation number, N, would be selected to apply the TCS operator, reinitializing the entire population every N generations. We envision applying the TCS operator to those GAs that need to be repetitively run with small changes in input parameters, such as the image-transformation case; the expected benefit would be a reduction in the elapsed time required to run a set of solutions. In the cases presented here, we experienced time reductions from 11% to more than 30%. The TCS operator is also seen as beneficial in time- constrained and real-time environments. In many real-time environments, [13, 14] an early solution can be useful, a "good enough" solution is required at a certain time, and a late solution is worthless. With the TCS operator, intermediate solutions can be returned as the GA continues to better the solution, and at the hard time constraint where a solution is required, the best solution reached can be used regardless of where in the current TCS cycle the GA is stopped. The overall reduction in time for the average run using the TCS operator would yield better results on the average over those of a simple GA. |
||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 8 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||
8. Future studyWe have shown the usefulness of applying the TCS operator for GA cases that exhibit the "initial hump" trend. For our investigation, the choice of the Ph point was selected based on empirical analysis; it is a point on the backside of the initial hump. This generation number is specific to the details of the GA problem; the hump will span a number of generations based on the terrain of the particular search space. Further study needs to be made to understand the characteristics of the initial hump and how to choose the optimal generation number at which to impose the TCS operator. This should also consider population size and other GA parameters, optimizing for total computational needs as well as elapsed time. The Ep scaling factor was also found empirically in our study; it is a result of the variance in the "hardness" of the GA space across a sweep of input parameters. Investigation into the types of GA problems that can use the TCS operator is needed to determine how wide the application of the TCS operator can be. For example, can single solution GA problems also benefit from use of the TCS operator? Further study is needed to understand the types of GA problems that exhibit the "initial hump" trend. 9. ConclusionWe have demonstrated the application of a new operator for use with genetic algorithms: the Total Comet Strike (TCS). When applied to GAs that show an initial hump in convergence versus generation for a large number of runs, significant improvements can be achieved in (1) the number of runs that will converge within a desired time and (2) the average time for convergence for the set of runs. A method has been shown to estimate this effect, and confirmation has been provided through experimentation. Directions for future investigation concerning the TCS operator have been outlined. References[1] Karr, Charles L., Applications of Genetic Algorithms: Industrial and Other, CRC Press, Boca Raton, Fla., 1998. [2] Goldberg, David E., Genetic Algorithms in Search, Optimization & Machine Learning, Addison Wesley, Reading, Mass., 1989. [3] Wei Shi and J.H. Chen, "Intelligent Permuting and Its Optimization," Proceedings for the International Conference on Intelligent Manufacturing, Wuhan, China, 1995, SPIEVol. 2620, pp. 514-519. |
[4] B.C. Wallet, D.J. Marchette, J.L. Solka, "A Matrix
Representation for Genetic Algorithms," Proc. SPIE-Int. Soc. Opt. Eng. (USA),
Automatic Object Recognition Conference, Orlando, Fla., 1996, SPIE Vol. 2756, 1996, pp.
206-214. [5] Thomas P. Dickens, "Method to Achieve Better Performance in Genetic Algorithms Applied to Time- Constrained, Multi-Solution Problems," master's thesis, University of Alabama, 1998. [6] D. Whitley, K. Mathias, and P. Fitzhorn, "Delta Coding: An Iterative Search Strategy for Genetic Algorithms," Proceedings of the Fourth International Conference on Genetic Algorithms, University of California, San Diego, Calif., 1991, Morgan Kaufmann Publishers, pp. 77-84. [7] Keith E. Mathias and L. Darrell Whitley, "Noisy Function Evaluation and the Delta Coding Algorithm," Proceedings for the SPIE-The International Society for Optical Engineering, Neural and Stochastic Methods in Image and Signal Processing III, San Diego, Calif., 1994, SPIE Vol. 2304, pp. 53-64. [8] Keith E. Mathias and L. Darrell Whitley, "Changing Representation During Search: A Comparative Study of Delta Coding," Evolutionary Computation, The Massachusetts Institute of Technology, 1995, Vol. 2, No. 3, pp. 249-278. [9] David E. Goldberg, Kalyanmoy Deb, Hillol Kargupta, and Georges Harik, "Rapid, Accurate Optimization of Difficult Problems Using Fast Messy Genetic Algorithms," Proceedings of the Fifth International Conference on Genetic Algorithms, University of Illinois at Urbana-Champaign, 1993, Morgan Kaufmann Publishers, pp. 56-64. [10] Alessio Gaspar and Philippe Collard, "Time Dependent Optimization With a Folding Genetic Algorithm," 1997 IEEE International Conference on Tools With Artificial Intelligence, Newport Beach, Calif., 1997, IEEE Computer Society Press, pp. 125-132. [11] Joe Suzuki, "A Markov Chain Analysis on a Genetic Algorithm," Proceedings of the Fifth International Conference on Genetic Algorithms, University of Illinois at Urbana- Champaign, 1993, Morgan Kaufmann Publishers, pp. 146-153. [12] Jeffrey Horn, "Finite Markov Chain Analysis of Genetic Algorithms With Niching," Proceedings of the Fifth International Conference on Genetic Algorithms, University of Illinois at Urbana-Champaign, 1993, Morgan Kaufmann Publishers, pp. 110-117. [13] David W. Payton and Thomas E. Bihari, "Intelligent Real-Time Control of Robotic Vehicles." In Advances in Real- Time Systems, edited by John A. Stankovic and Krithi Ramamritham, IEEE Computer Society Press, Los Alamitos, Calif., 1991, pp. 724-738. First printed in CACM, The Association for Computing Machinery, Inc., 1991, Vol. 34, No. 8, pp. 48-63. [14] M. Zanella, "Defining Real-Time: A Background for the Design of Real-Time Knowledge-Based Systems," Tenth International Conference on the Applications of Artificial Intelligence in Engineering, International Centre for Mechanical Sciences (CISM), Udine, Italy, 1995. Published in Applications of Artificial Intelligence in Engineering X, Computational Mechanics Inc., Billerica, Mass., pp. 337-347. |
||||||||||||||||||||||||||||||||||||||
| 0-8186-8548-4/98 © 1998 IEEE | 9 | ||||||||||||||||||||||||||||||||||||||
| Presented at IEEE International Joint Symposia on
Intelligence and Systems, May 21-23, 1998, Rockville, Maryland, USA. |
|||||||||||||||||||||||||||||||||||||||