Return to my Genetic Algorithms Page
This GA Paper is part of Professor Chuck Karr's book, Industrial Applications of Genetic Algorithms (International Series on Computational Intelligence) .
Professor Karr teaches Genetics at the University of Alabama.
Thomas P. Dickens
The Boeing Company
To aid in the quantitative analysis of a pair of simultaneously taken images, one image can be transformed into the image-space of the other. Analysis can then be done by using choosing the same pixel region from both images. Known points of interest are located on both images as a set of reference marker locations. By using the knowledge that the images were taken at the same time, and of the same item of interest, from approximately the same physical location, a simple 4x4 transformation matrix can be used to map the markers from one image to the other.
This paper outlines an approach to develop, and an implementation of, a genetic algorithm to find an acceptable transformation matrix for a given pair of such images. This research resulted in insights for using a GA in a time-constrained environment, techniques in reducing the effective GA search space, and advanced matrix GA operators.
The author works for the Boeing Company in Aerodynamics Computing, specializing in geometry, graphics, and scientific visualization. One assignment involved working with pairs of digital images of a model, I1 and I2, where the images in an image pair are taken with different optical filters to view different characteristics of the model. Upon the model is placed a set of registration targets to assist in down-stream image manipulation. This set of targets is located within each image in its own image-space as shown in Figure 2.1.
Currently in use is an image-calibration program which transforms the second image into the "space" of the first image, S1, creating a new image I2'. The purpose of this is that statistical operations can now be easily done on the image pair I1 and I2' as simple pixel-based operations in the 2D image space, and the image space for both images has the same mapping (S1 == S2'). The approach used in the existing image-calibration program is to take the registration targets in image I2, process them through a 4x4 transformation matrix T21, and end up with the targets in the space of I1. Once this matrix is "found", each pixel in the image can then be transformed into the I1 image space as shown in Figure 2.2 (with over sampling and jittering to avoid aliasing artifacts).
Figure 2.1. Finding transformation matrix T21 for targets in image 2 to the space of image
1.
Figure 2.2. Transforming image 2 through T21 into the space of image 1.
To find the T21 matrix, the image-calibration code calculates an initial guess by calculating the 2-D rotation, scaling, and translation from S2 to S1 for two distant targets. However, this is not a simple 2-D problem, hence the need for a 3-D transformation matrix. The code then cycles on all of the 16 values in T21, perturbing them slightly, looking for a better "fit" from S2 to S1. The program monitors the sum of the square of the distances from the targets in S2' to the targets to S1. This is analogous to physically jiggling a part to find the best-fit orientation. The program iterates until the fitness is within a given tolerance, or until a maximum number of cycles has been tried. Within this iteration logic an "escape" factor of 0.9995 is used to allow this method to escape from local minimums. The current image-calibration code as shown in Figure 2.3 in is use and works well.
Figure 2.3. Current non-GA block diagram.
This paper looks at using a Genetic Algorithm (GA) to find a suitable transformation matrix T21 for the image transformation problem. As seen in Figure 2.4, the fitness function needed for the GA is already well-defined in the current iterative method, as well as the initial matrix calculation algorithm. The goal with the GA implementation is to achieve a T21 which is as good as or better than the current method, but which converges in less time. The current method takes an average of 55 seconds to find a suitable mapping when run on an SGI Indigo2 Extreme. In production use, hundreds of images will be regularly transformed for statistical analysis.
Figure 2.4. GA-based block diagram.
The problem of image-transformation and image-registration occurs in many other fields, such as aerial and satellite photography, medical imaging, industrial imaging, and computer-vision systems. There has been work in these fields with applying Genetic Algorithms to search for image-to-image and image-to-object mappings.
Dasgupta et. al. [1] designed and implemented a structured GA (sGA) for image-to- image registration. The key to their system is a 2-level GA hierarchy with the higher- level genes activating or deactivating sets of lower-level genes, thus achieving a GA which: (1) can maintain diversity over a large number of generations, (2) can achieve a large effect with a single-bit change, and (3) discover potential areas of search space quickly. For their image-to-image registration they used simple 2-D image vectors at each of the four corners of the image. This method would be too limited for the image- to-image registration used here.
Nagao et. al. [2] look at mapping a 3-D object onto a pair of stereo images using a GA. There are many assumptions they make in their system, including camera-to-camera coordination, thus their use of the basic set of x, y, and z rotation and translation parameters is sufficient without needing the additional operators for scaling and perspective. This method uses too many constraints and assumptions to be used for the images in the current problem.
The effort by Wallet et. al. [3] looks at the general problem of using a matrix representation with GAs for various problems. Their focus is on a matrix-based crossover operator which operates on a subset of the matrix rather than a sequence of linear-ordered values. For example, a 2x3 sub-set of a 4x4 matrix can be targeted for crossover and operated on as a set. This allows logical groupings of terms within the matrix to be operated on at the same time, thus manipulating high-level concepts within the matrix. This holds true for some cases in a geometry-based 4x4 transformation matrix, such as the upper four values representing the Z rotation. However, the Y rotation uses the non-adjacent matrix locations R11, R13, R31, and R33, and a global scaling uses the locations R11, R22 and R33. The gains achieved by their suggested matrix-grouping could be further expanded upon by defining a set of N crossover templates to use based on the setup of the matrix usage in the particular GA problem. Another approach is to have the GA manipulate the higher-level parameters directly, such as the rotations, scalings, and transformations, then compose the composite matrix when the transformation is required. The work by Wallet et. al. has inspired the creation of some advanced matrix operators which are detailed below.
A set of 8 image pairs were identified to use for this initial GA investigation. Given these 16 images (8 pairs), 15 of them have been transformed into the space of the 16th, and this has been done for all of the 16 different combinations. Additionally, each of the 16 images was transformed to itself to ensure that the system can correctly generate an identity matrix. This yielded a total of 256 image-pair transformations to use for testing.
To scope the problem, the 256 transformation solutions were run using the existing method. These solutions were analyzed to find "typical" ranges for each of the 16 matrix values, which was used to help define the subset of the search-space which is applicable for this problem. Follow-on tests were run and confirmed that this restricted space is reasonable for the problem in general.
| ------------------- AVERAGES ------------------- | |||
|---|---|---|---|
| 0.99647284237993 | 0.00407807202267 | 0.00000000000000 | 0.00000000000000 |
| -0.00407807202267 | 0.99647284237993 | 0.00000000000000 | 0.00000000000000 |
| 0.00000000000000 | 0.00000000000000 | 1.00000000000000 | 0.00000000000000 |
| 2.04210090637207 | -8.24660944938660 | 0.00000000000000 | 1.00000000000000 |
| ------------------- MINIMUMS ------------------- | |||
| 0.93118642603029 | -0.12693804455969 | 0.00000000000000 | 0.00000000000000 |
| -0.14312800790814 | 0.93118642603029 | 0.00000000000000 | 0.00000000000000 |
| 0.00000000000000 | 0.00000000000000 | 1.00000000000000 | 0.00000000000000 |
| -178.83712768554687 | -161.400878906250 | 0.00000000000000 | 1.00000000000000 |
| ------------------- MAXIMUMS ------------------- | |||
| 1.05537870300592 | 0.14312800790814 | 0.00000000000000 | 0.00000000000000 |
| 0.12693804455969 | 1.05537870300592 | 0.00000000000000 | 0.00000000000000 |
| 0.00000000000000 | 0.00000000000000 | 1.00000000000000 | 0.00000000000000 |
| 186.11801147460937 | 128.21191406250000 | 0.00000000000000 | 1.00000000000000 |
Table 2.1. Survey of 256 initial transformation matrices.
The survey data looks at the average as well as the minimum and maximum values for the non-GA run of 256 transformations. Table 2.1 shows the results of applying a single rotation, scaling, then transformation for the initial calculated guess.
| ------------------- AVERAGES ------------------- | |||
|---|---|---|---|
| 1.00460495973185 | 0.00624317027524 | 0.00000000000000 | 0.00000237714546 |
| -0.00277548941898 | 1.00214623752882 | 0.00000000000000 | 0.00000527097141 |
| 0.00000000000000 | 0.00000000000000 | 0.99999999999997 | 0.00000000000000 |
| 0.41538202893630 | -10.22744641449350 | 0.00000000000000 | 1.00292148791366 |
| ------------------- MINIMUMS ------------------- | |||
| 0.98017712897502 | -0.12516742422596 | 0.00000000000000 | - 0.00006666364776 |
| -0.10974633391626 | 0.92602081328744 | 0.00000000000000 | - 0.00003969191440 |
| 0.00000000000000 | 0.00000000000000 | 0.99999999999971 | 0.00000000000000 |
| -180.84157368082086 | -164.127761212131 | 0.00000000000000 | 0.98647029527729 |
| ------------------- MAXIMUMS ------------------- | |||
| 1.03541033425418 | 0.14765869307001 | 0.00000000000000 | 0.00007621304828 |
| 0.10043706060374 | 1.08316814407260 | 0.00000000000000 | 0.00006239981333 |
| 0.00000000000000 | 0.00000000000000 | 1.00000000000000 | 0.00000000000000 |
| 181.21196540626224 | 123.13948079936416 | 0.00000000000000 | 1.02629497331227 |
Table 2.2. Survey of 256 final transformation matrices.
Table 2.2 shows the ranges for the final matrix for all 256 cases. Many interesting observations can be made from this data. The upper-left 2x2 section of the matrix represents the rotation about Z, the axis normal to the image, and also includes the X and Y scaling terms [4, 5]. As expected in this problem, these values are well used. The two values on the bottom to the left are the X and Y translation values, and as expected, these have the largest swing to line-up the images. The two values on the top of the rightmost column are perspective, or shearing terms. These compensate for one image being tilted away in relation to the other image. The italicized rows and columns are essentially the values from the identity matrix, which indicate that X and Z rotations, as well as the Z scaling, are not required. The deltas as seen in Table 2.3 for these locations are essentially zero. The GA can perhaps be simplified by not including these seven data values as part of the GA search space. The bottom right value is used as a weighting factor used in the mapping of the combined affine transformations into homogeneous coordinates. The range of this value is quite small, and perhaps it too can be left out of the GA search space if desired.
| ------------------- AVERAGES ------------------- | |||
|---|---|---|---|
| -0.00813211735192 | -0.00216509825257 | 0.00000000000000 | -0.00000237714546 |
| -0.00130258260368 | -0.00567339514889 | 0.00000000000000 | -0.00000527097141 |
| 0.00000000000000 | 0.00000000000000 | 0.00000000000003 | 0.00000000000000 |
| 1.62671887743574 | 1.98083696510691 | 0.00000000000000 | -0.00292148791366 |
| ------------------- MINIMUMS ------------------- | |||
| -0.05991555334702 | -0.01406627527258 | 0.00000000000000 | -0.00007621304828 |
| -0.04029728892045 | -0.02787662752075 | 0.00000000000000 | -0.00006239981333 |
| 0.00000000000000 | 0.00000000000000 | 0.00000000000000 | 0.00000000000000 |
| -0.39636359754898 | -0.67255857052405 | 0.00000000000000 | -0.02629497331227 |
| ------------------- MAXIMUMS ------------------- | |||
| 0.02489397080838 | 0.00467318981777 | 0.00000000000000 | 0.00006666364776 |
| 0.03698794842970 | 0.00532152847150 | 0.00000000000000 | 0.00003969191440 |
| 0.00000000000000 | 0.00000000000000 | 0.00000000000029 | 0.00000000000000 |
| 6.45551843191441 | 7.02071723668247 | 0.00000000000000 | 0.01352970472271 |
Table 2.3. Survey of the deltas from the initial to the final transformation matrices.
In Table 2.3, the delta ranges from the initial matrix to the final matrix for all 256 cases are shown. Again, notice the 7 values which are essentially zero and are ignored by the GA implementation.
| ------------------- RANGE ------------------- | |||
|---|---|---|---|
| 0.08480952415540 | 0.01873946509035 | 0.00000000000000 | 0.00014287669604 |
| 0.07728523735015 | 0.03319815599225 | 0.00000000000000 | 0.00010209172773 |
| 0.00000000000000 | 0.00000000000000 | 0.00000000000029 | 0.00000000000000 |
| 6.85188202946339 | 7.69327580720652 | 0.00000000000000 | 0.03982467803498 |
Table 2.4. Range covered from delta-minimum to delta-maximum.
In Table 2.4 we can see the delta range covered for each matrix location.
| ------------------- MINIMUMS ------------------- | |||
|---|---|---|---|
| -0.06 | -0.015 | 0.0 | -0.000077 |
| -0.041 | -0.028 | 0.0 | -0.000063 |
| 0.000 | 0.000 | 0.0 | 0.0 |
| -0.40 | -0.70 | 0.0 | -0.03 |
| ------------------- MAXIMUMS ------------------- | |||
| 0.025 | 0.0047 | 0.0 | 0.000067 |
| 0.037 | 0.0054 | 0.0 | 0.000040 |
| 0.000 | 0.0000 | 0.0 | 0.0 |
| 6.5 | 7.1 | 0.0 | 0.014 |
Table 2.5. Minimum and maximum value ranges as implemented.
The actual minimum and maximum delta ranges implemented within the GA are shown in Table 2.5. The range was expanded slightly to ensure all of the desired values fit within the search-space. Additionally, the GA code was implemented with a global matrix scaling factor to further expand this search-space for investigation.
The GA code is used to identify a set of matrix deltas which, when added to the initial matrix, generate the T21 transformation matrix. To represent these delta range values in the GA code, each matrix delta value is represented by an N-bit string. For an N-bit string used at each matrix location, the resulting data coarseness of that value would be:
| Range | Coarseness | ||||||
|---|---|---|---|---|---|---|---|
| N=4 | N=8 | N=12 | N=16 | N=20 | N=24 | N=28 | |
| 0.031 | 0.002067 | 8.1E-06 | 1.98E-09 | 3.02E-14 | 2.88E-20 | 1.72E-27 | 6.4E- 36 |
| 0.0197 | 0.001313 | 5.15E-06 | 1.26E-09 | 1.92E-14 | 1.83E-20 | 1.09E-27 | 4.06E- 36 |
| 0.000144 | 9.6E-06 | 3.76E-08 | 9.19E-12 | 1.4E-16 | 1.34E-22 | 7.97E-30 | 2.97E- 38 |
| 0.078 | 0.0052 | 2.04E-05 | 4.98E-09 | 7.6E-14 | 7.25E-20 | 4.32E-27 | 1.61E-35 |
| 0.0334 | 0.002227 | 8.73E-06 | 2.13E-09 | 3.25E-14 | 3.1E-20 | 1.85E-27 | 6.89E- 36 |
| 0.000103 | 6.87E-06 | 2.69E-08 | 6.58E-12 | 1E-16 | 9.57E-23 | 5.7E-30 | 2.12E- 38 |
| 6.9 | 0.46 | 0.001804 | 4.41E-07 | 6.72E-12 | 6.41E-18 | 3.82E-25 | 1.42E-33 |
| 7.8 | 0.52 | 0.002039 | 4.98E-07 | 7.6E-12 | 7.25E-18 | 4.32E-25 | 1.61E-33 |
| 0.044 | 0.002933 | 1.15E-05 | 2.81E-09 | 4.29E-14 | 4.09E-20 | 2.44E-27 | 9.08E- 36 |
Table 2.6. Resulting coarseness calculated for different numbers of bits.
The data calculated in Table 2.6 shows the effect of increasing the number of bits used in the delta coding. A simple linear mapping from the minimum to the maximum value for each matrix location is used. The concern here is that if N is too small, then a good T21 cannot be found because as the delta is incremented from one possible value to the next possible value, the needed value is jumped over. To allow the needed value to be approximated to an acceptable value, the number of bits-per-value can be increased. Increasing N, however, will increase the search-space of the problem. Given 9 matrix values to search on, for each 1-bit increase in N, there is a 29, or 512 times increase in the size of the search space. To allow the GA to converge in a reasonable time, the search space should be kept reasonably small.
In additional to the usual reproduction, crossover, and mutation GA operators [6], additional GA operators and techniques were designed and implemented for this specific problem. The entire set of GA operators implemented for this problem follow:
Crossover: A simple 1-point crossover, as shown in Figure 2.5, is used which swaps the alleles (bit-patterns) in two selected parents at a given bit position. This bit position can be chosen anywhere within the strings, within a delta value as well as between delta values.
Figure 2.5. Single -point crossover.
Two-point Crossover: A 2-point crossover, as shown in Figure 2.6, is used which swaps the alleles in two selected parents between two given bit positions. These bit positions can be chosen anywhere within the strings, within a delta value as well as between delta values. For this crossover, the string is considered to wrap back on itself in a ring-like fashion, which results in this 2-point crossover swapping a selection of the ring for two selected parents.
Figure 2.6. Two -point crossover.
Matrix-Element Crossover: The paper by Wallet et. al. [3] inspired the use a matrix-element crossover operator, as seen in Figure 2.7. Their technique of selecting a contiguous sub-region of the 4x4 matrix was expanded to select a number of matrix elements which were related for this specific problem. Templates in the code were designed to define different matrix elements to be used for the 8 different matrix-element crossovers possibilities desired. When the matrix-element crossover was selected, one of these templates was then selected and used to accomplish the crossover.
Figure 2.7. Matrix-element Crossover.
The matrix-element crossover swaps selected matrix elements between two selected parents. These swaps are done on full N-bit matrix elements within the strings and does not mix string values within a given matrix element. Figure 2.8 shows the eight matrix- element templates implemented here. The odd templates fit within the Wallet scheme, while the even numbered templates show the extension to the Wallet scheme. The templates correspond to transformation operators: (1) Z rotation, (2) Z rotation with homogeneous coordinate scaling, (3) X and Y translation, (4) X and Y translation and perspective, (5) X and Y perspective, (6) X and Y scaling, (7) X and Y perspective with homogeneous coordinate scaling, and (8) X, Y, and Z scaling with homogeneous coordinate scaling.
Figure 2.8. Eight Matrix-element Crossover templates.
Maintaining the Best Performer: Similar to De Jong's generation-gap idea, the best performer of each generation is copied as-is to the next generation. This guarantees that the solution as found by the GA will not get worst over time.
Varying Mutation Rate: The classic GA implementation uses a fixed mutation- rate. Being concerned with a timely solution, GA stagnation was studied very closely. GA stagnation is defined as the number of generations in which the best performer remains the same, as guaranteed above. One stagnation related solution was to impose an increasing mutation rate (Ms), which is the product of the no-change generation count (NC) and the mutation rate (MR): Ms = NC * MR. This results in ever-increasing chances for mutation to change the best performer in a positive direction.
Comet-Strike Operator: Another GA stagnation operation is, as coined by Robert Dain [8], the Comet-Strike. The analogy is that of a comet striking the earth, which reeks havoc with the ecosystem and destroys all but the most robust of live. Following the comet-strike new species evolve. The GA equivalent is to maintain the best performer and re-initialize the remaining population. The comet-strike operator is invoked when the no-change generation count (NC) goes above a set threshold and the average/minimum value is less-than M (the bulk of the population consists of similar genetic material).
Total Comet-Strike Operator: A re-initialization of the entire population. For this problem, and possibly other real-time GA problems, this operator allows the GA run to result in a better solution at the end of the imposed time-limit.
Local Hill Climbing: A technique borrowed from the current image-calibration code is to take a "good" solution, and to try small permutation on the matrix values to get a better solution. This is done to the best performer as it is copied from one generation to the next. A GA parameter is used to set the maximum number of hill-climbing iterations, resulting in small hill-climbing steps per generation. If needed, a reoccurring best performer can continue to climb the same hill over many generations.
Gray Code Mapping: Gray-coding is commonly used with GA implementations due to the 1-bit change in neighboring Gray-code values. The option to evaluate the matrix value deltas as Gray-coded numbers was provided and investigated.
The GA implementation uses various parameters during a matrix-solution run. The effect of these parameters was studied in depth to access the parameters influence on the convergence and timeliness of the run. There were some unexpected observations made in this study.
These studies were designed to have the GA run for about 30 seconds maximum. The resulting distance value was tracked as well as the time to run. Five runs at each setting were used to get a statistical spread of the parameter-setting effects. Rather than include pages and pages of multi-line plots, the main points of these studies are summarized. The author can be contacted for access to the detailed analysis.
Maximum Generations (MG) and Population Size (PS): These two parameters are tightly intertwined in this problem due to the time deadline constraint; an increase in the population size will cause each generation to take longer to run. Population sizes in the range of 10 to 100 individuals were studied. Without accounting for the time factor, larger populations generally had better convergence values for a given number of generations. It was observed that the relation between MG and PS is a linear relation, where PS * MG = 1,000,000 resulted in a maximum-generation stopping-time of about 30 seconds. A change in either PS or MG would yield the required value for the other. The population study was re-run with this time-normalized relationship, which showed an advantage in convergence value in a given time-period to a population size of about 40 individuals. A PS of 40 requires a MG of 25,000 for 30 seconds. To achieve a 20 second time for a PS of 40, a MG of 25,000 * 2 / 3 = 17,000 (approximately) was used.
Crossover Types: The 1-point, 2-point, and matrix-element crossover types were used with various usage weights. With 100% usage of the 1-point, or 80% or more of the matrix-element crossovers being used, the GA did not converge within the desired time. Various average mixes of these three crossover operators gave good results; there was not a clear mix which out-performed other mixes. The setting of 20%, 40%, 40% for the 1- point, 2-point, and matrix-element crossover operators proved to be a good robust setting.
Number of Bits: The number of bits used (N) for each matrix delta value, and the resulting size of the search space and little effect on the operation of this GA. As expected, small values of N (8 or less) resulted in unacceptable results. Surprisingly, in the range from N=8 to N=64 bits, there was little difference in the performance in the GA in terms of time-to-converge and resulting distance value when done.
Maintaining the Best Performer: Maintaining the previous generation's best performer in the current generation proved to be a valuable technique. Without this feature, runs were observed which would be converging, then lose the best performer and jump up a bit, continue to converge, then jump again, and so on, as depicted in Figure 2.9. While the average fitness of the population may have been continually decreasing over this entire period, the reality that the GA may be stopped at any given generation and the current best solution is to be used, made this unacceptable in a time-constrained GA.
Figure 2.9. Effect of losing the best-performer on the best-of-generation solution.
Varying Mutation Rate and Comet-Strike Operator: Observation shows that increasingly introducing new genetic material into the population, and also mass introduction of new genetic material into the population, can revitalize the convergence rate in a stagnate run. Without these two mutation-type operations, runs were observed which would stagnate for hundreds and many hundreds of generations. Given this tendency to favor mutation, a mutation study was undertaken to study the effect of increasing the basic single-bit mutation occurrence. Figure 2.10 shows the results of a series of runs where this mutation rate was increased, and the resulting average fitness for multiple runs is plotted. There is a trend to favor more and more chance of mutation up to a certain point (about 0.002 percent). Additional increases in mutation have a degrading effect on the GA's performance as the system is becoming less of a GA engine and more of a random-search algorithm.
Figure 2.10. Effect mutation rates on the best-of-generation solution.
Local Hill Climbing:Observation shows that this technique allowed the best performers to slide into a near-by better solution. Gray Code Mapping:As expected, the use of Gray-code mapping had a positive effect on the generation-to-generation performance of the GA, resulting in a 9.5% better convergence value after a given number of generations. However, this is not all to the story. As discovered in implementing the Gray-coding capabilities [7], the mapping from a binary integer to Gray-code is a simple 1-line equation. However, the mapping from Gray-code to a binary integer uses an iterative algorithm which loops on the number of bits (N). This code proved to be rather expensive to execute as was needed for each individual at each generation, and multiple times for the best performers in the local hill- climbing. While 9.5% better in average final value, the Gray-code execution time was 2.2 times longer (49 S versus 22 S for the settings used). With the emphasis on convergence value within a specific time-frame, the use of Gray-coding was detrimental and was thus dropped for this problem.
Use of a Matrix Sub-set: The technique of identifying a sub-set of the transformation matrix and only searching with 9 of the 16 matrix values proved to both allow acceptable convergence to be achieved as well as providing a time savings. Comparing using all 16 matrix values to only using 9 resulted in a 33% drop in the time- to-run, and a 23% better convergence value. Reducing the scope of the problem based on observed numerical results was beneficial.
The 16 test images were used to test the performance of the GA code. Each image was used as the reference, with the GA mapping the other 16 images into the image-space of the reference. This provided a sampling of 256 different image mappings, including each image being used as a reference for itself which should result in a transformation being the identity matrix.
| Number of bits for each delta value | 10 |
| Use Gray code encoding | False |
| Maximum Generation | 17,000 |
| Population Size | 40 |
| Percent chance of 1-point crossover | 20% |
| Percent chance of 2-point crossover | 40% |
| Percent chance of matrix-element crossover | 40% |
| Percent chance of mutation | 0.0002% |
| Number of no-change generations before comet-hit | 20 |
| Hill-climbing iterations per generation | 50 |
Table 2.7. Parameters used for the GA.
Parameter notes: The choice of 17,000 generations was chosen to give an average time of about 20 seconds if the GA did not converge.
Test-case Details: After a study of the problem, the originally stated target of 0.1 for the distance value was not required. A value of 0.4 is within the acceptable range for this problem. To provide a fair comparison, the original algorithm was also run with this relaxed distance target; with a resulting average per-run time of 20.0 secondswith every value below the specified 0.4 figure. It should be noted that the time-variance in the original method was very small; each run took the same time within a 10% tolerance.
The 256 test-case transformations were run 10 times each, for a total of 2560 GA-based transformations. Table 2.8 shows the results looking at CPU time, GA generations, and the final fitness-function value, the sum of the distances squared. This value is considered by itself, and also as the best and the worst for each of the 10 runs for an image pair.
| Measure | Minimum | Average | Maximum |
|---|---|---|---|
| CPU Time | 0.09 S | 13.5 S | 39.9 S |
| Generations | 62 | 11,359 | 17,000 |
| (Sum of Distances)2 | 0.143 | 0.664 | 2.89 |
| Best of 10 Sums | 0.143 | 0.40 | 0.99 |
| Worst of 10 Sums | 0.40 | 1.07 | 2.89 |
Table 2.8. Summary of results from the GA run.
Another look at the resulting data, Chart 2.1, shows the number of runs resulting in ranges of distance values; the tallest bar is in the range from 0.3 to 0.4.
Chart 2.1. Distance results of the 2560 test-case runs.
Looking the data as a whole, the GA algorithm ran in 9.6 hours versus 14.2 hours for the non-GA method. This is a reduction of 32% in time. The final values achieved did not fare as well, with an average convergence value of 0.664, 66% higher than the specified 0.4 value. Looking at Chart 2.1, we see that over 1028 runs achieved the 0.4 goal, with 50 of them jumping from above 0.4 to below 0.3 in a single generationthus 42% of the runs achieved the specified 0.4 goal. The remaining 1482 runs (58%) did not achieve the 0.4 goal, and are distributed in value ranges in decreasing amounts, with a single run at the largest final value of 2.89.
Looking at select individuals, there were some GA runs which completed in less than 100 mS. 17.5% of the runs completed in less than 2.0 seconds and 29% of the runs completed in less than 4.0 seconds. The majority of the runs, 59%, completed in the range of 19.5 to 20.5 seconds. These results indicate that, if "lucky", the GA algorithm will converge in a few seconds, but once the 2.0 second mark is passed, the chance that the GA will converge decreases, until at about 20 seconds the maximum generation count is reached. This timing spread results in an average run time of 13.5 seconds, with two groups; one around 2.0 seconds and a tighter group at 20 seconds.
Chart 2.2. Time results of the 2560 test-case runs.
The correlation between Charts 2.1 and 2.2 are that the 1078 runs which achieved the distance goal in Chart 2.1 are spread out from 0 to 19 seconds in Chart 2.2, and the 1482 runs around 20 seconds in Chart 2.2 are spread out above the distance of 0.4 in Chart 2.1. This led to the idea of implementing a Total-Comet-Strike every 4.0 seconds (0.2 * the maximum-generations), to take advantage of a completely fresh gene-pool, thinking that maintaining the best performer overshadowed a new comet-strike generated population and remaining in the area of the local minimum already found. This is essentially running 5 sequential GAs with MG = MG/5, with the run terminating if any of these sub-runs reaches the convergence value. Preliminary results implementing this additional capability looks very promising, with an initial result over 35 runs as presented in Table 2.9. All three averages, CPU time, generations, and distance value, have improved from 2.1 to 6.4 percent. The downside is that the maximum distance value has grown to 4.8.
| Measure | Minimum | Average | Maximum |
|---|---|---|---|
| CPU Time | 1.37 S | 12.9 S | 20.9 S |
| Generations | 1139 | 10,628 | 17,000 |
| (Sum of Distances)2 | 0.32 | 0.64 | 4.8 |
Table 2.9. Summary of results from the GA run.
A Genetic Algorithm has been successfully implemented to locate an acceptable transformation matrix to map one image into the space of another image. The current state of the GA does find an acceptable solution below the desired target of 0.4 in 42% of the runs in the 2560-run test-case. Tests with another image-set test-case yields similar results. The desire to better the time using the GA algorithm over the existing iterative approach was achieved for the test case, but the average distance result did not achieve its goal.
Total-Comet-Strike (TCS) Operator:The idea of re-initializing the GA every MG/N generations, by implementing the Total-Comet-Strike operator, was prompted by this effort and looks promising. The next logical addition is to cache all of the final values of the N sub-runs, and if the solution has not converged in the N sub-runs, then either the best of the five should be used and the run terminates, or the best of the N is used and the run continues for another period of time. A time-based or maximum- generation termination needs to be maintained to avoid the rouge case which refuses to converge.
GA use in a time-constrained environment: Revisiting the requirement of achieving a 0.4 distance goal within 20 seconds, on a per-individual case, this goal cannot be guaranteed with the GA approach. However, this requirement needs to be considered for the overall population of runs in the job-set, and if no one run is time critical and the desire is for the entire job-set to complete faster, this GA approach is usable. With the addition of the sub-run TCS technique as described above, the times should be improved even more. As a last resort, for a run which fails to converge, the run can be processed with the existing method with a maximum run-time of twice the targeted 20 seconds (20 for the failed GA, and 20 for the non-GA run). This time could be reduced by avoiding the initialization overhead by carrying both algorithms in the same program which share the same front and back ends (Figures 2.3 and 2.4). This hybrid approach would result in an overall reduction in time for a set of runs, and will also guarantee all runs reach the specified minimum distance value.
Applicability in a production environment: With the above-mentioned additions to make the algorithm finite in execution time with an acceptable convergence value, this GA-based hybrid method will result in a net time savings without sacrificing the quality of the solution.
Repeatability: A concern with using algorithms which utilize randomness is the repeatability of the solution for the same dataset. In my experience working with engineers, answers, even good answers, which are different for the same set of inputs, as regarded as suspect. With computer-generated randomness as used by the GA, the random-number generator could be seeded with a calculated value derived from the two input images. This would generate the exact same calculated value for the same set of images and parameter settings, which would generate in the same exact result. If the code is to be run on a variety of computers or operating system versions, care should be taken to ensure that the random number generators on the set of computers and operating systems all behave the same; generating the same set of "random" values. A custom random-number generator could be implemented as part of the GA code, which would run the same across different platforms. The drawback to this is that the random number generator provided with the OS has generally been tuned to perform well on the specific hardware. With the time-critical nature of the requirement, this traded-off would need to be weighed carefully.
Matrix Techniques: The technique of calculating an initial transformation matrix, and then using the GA to find the deltas to this initial matrix, proved to be good. Identifying the necessary matrix locations and not utilizing the remaining locations also proved to be useful in the performance of this GA implementation. The creation of a sub- matrix crossover which groups matrix values which are related to the matrix usage worked well; it was not the only crossover method required, but the addition of this method adied the algorithm in general.
[1] Dasgupta, D.; McGregor, D.R. (1992). Digital image registration using structured genetic algorithms. Proceedings of the SPIE - The International Society for Optical Engineering vol.1766 p.226-34.
[2] Nagao, T.; Agui, T.; Nagahashi, H. (1995). Fitting three-dimensional models to stereo images using a genetic algorithm. Proceedings of the SPIE - The International Society for Optical Engineering Conference Title: Proc. SPIE - Int. Soc. Opt. Eng. (USA) vol.2501, pt.1 p.129-39.
[3] Wallet, B.C.; Marchette, D.J.; Solka, J.L. (1996). A matrix representation for genetic algorithms. Proc. SPIE - Int. Soc. Opt. Eng. (USA) vol.2756 p.206-14.
[4] John J. Craig. (1986). Introduction to Robotics: Mechanics & Control. Addison-Wesely Publishing Company.
[5] F. S. Hill Jr. (1990). Computer Graphics. Macmillan Publishing Company. <> [6] David E. Goldberg. (1989). Genetic Algorithms in Search, Optimization & Machine Learning. Addison-Wesely. Publishing Company.
[7] Willian H. Press et. al. (1992). Numerical Recipies in C: The Art of Scientific Computing, second edition. Cambridge University Press.
[8] Robert Dain is a student in this class and should have his project included in this publication. He was a fellow Boeing employee, but has since left Boeing to pursue his fortune at Microsoft.