Figure 1.
Explanation of Pareto optimality in a scenario with two objective functions: non-dominated solutions (blue circles) are Pareto-optimal and lie on the so-called Pareto front (purple dotted curve). The remaining solutions (orange circles) are dominated by those belonging to the Pareto front.
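The dominance relation described in this caption can be sketched in a few lines. The snippet below is only an illustration: it assumes that both objectives are minimized (the caption does not state the direction), and the names `dominates` and `pareto_front`, as well as the sample points, are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (assumption: all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective values, chosen for illustration only.
points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
front = pareto_front(points)
```

Here `(3.0, 4.0)` is dominated by `(2.0, 3.0)` and `(5.0, 5.0)` is dominated by `(1.0, 5.0)`, so only the first three points remain on the front.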
Figure 2.
Illustration of the 4-bit parity problem: given a 4-bit input string, the goal is to return 1 when the number of 1-bits is even and 0 when it is odd. Note that the parity value of a sequence made only of 0-bits is 1.
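As a minimal sketch of the target mapping described in this caption, the full truth table can be enumerated directly. The function names `parity_target` and `count_errors` are hypothetical, and counting misclassified patterns is shown only as one plausible way to score a candidate, not necessarily the measure used in the experiments.

```python
from itertools import product


def parity_target(bits):
    """Return 1 if the number of 1-bits is even, 0 if it is odd."""
    return 1 if sum(bits) % 2 == 0 else 0


# All 16 possible 4-bit input strings with their expected output.
truth_table = {bits: parity_target(bits) for bits in product((0, 1), repeat=4)}


def count_errors(predict):
    """Number of input patterns on which `predict` disagrees with the target."""
    return sum(1 for bits, target in truth_table.items() if predict(bits) != target)


# A constant predictor that always answers 1 is wrong on every odd-parity input.
errors_constant_one = count_errors(lambda bits: 1)
```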
Figure 3.
Double-pole balancing problem: two poles (green rectangles) are placed on top of a wheeled mobile cart (orange rectangle), which can move on a horizontal surface (light blue rectangle).
Figure 4.
Grid navigation problem: an agent (red circle) has to navigate in a grid world. The agent starts from one of four possible initial locations (yellow squares) and its goal is to reach the target location (red square). The agent can move left, up, right or down.
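The four moves available to the agent can be sketched as a simple transition function. Several details below are assumptions made for illustration: the origin is placed at the top-left cell, moves that would leave the grid keep the agent in place along that axis, and the grid size of 5 used in the example is hypothetical.

```python
# Displacement (dx, dy) for each action; assumes y grows downwards.
MOVES = {
    "left": (-1, 0),
    "up": (0, -1),
    "right": (1, 0),
    "down": (0, 1),
}


def step(pos, action, grid_size):
    """Apply one move and clamp the result to the grid boundaries."""
    dx, dy = MOVES[action]
    x = min(max(pos[0] + dx, 0), grid_size - 1)
    y = min(max(pos[1] + dy, 0), grid_size - 1)
    return (x, y)


# Example: moving up from the centre of a hypothetical 5x5 grid.
new_pos = step((2, 2), "up", grid_size=5)
```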
Figure 5.
Schematic of the considered EAs. (a) GGA; (b) OpenAI-ES; (c) SSS; (d) SSSHC.
Figure 6.
Performance of GGA, OpenAI-ES, SSS and SSSHC during evolution. The shaded areas are bounded by the first and third quartiles of the data. We use a logarithmic scale on the y-axis to improve readability. Fitness is averaged over 30 replications.
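The curve and band in such a plot can be computed per evaluation step from the replications. The following is a sketch under stated assumptions: `summarize` is a hypothetical helper, the input data are toy values, and the "inclusive" quartile interpolation of Python's `statistics.quantiles` is a choice made here, since the caption does not specify the method.

```python
from statistics import mean, quantiles


def summarize(curves):
    """curves: one fitness list per replication, all of the same length.
    Returns, for each step, (mean, first quartile, third quartile)."""
    summary = []
    for values in zip(*curves):  # values at the same step across replications
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        summary.append((mean(values), q1, q3))
    return summary


# Toy data: 5 replications, 2 recorded steps each (not the paper's data).
curves = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
bands = summarize(curves)
```

The mean gives the plotted line, while the two quartiles give the lower and upper edges of the shaded area.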
Figure 7.
Final performance achieved by the different methods (see Table 5). Boxes are bounded in the range , with the whiskers extending to data within . Medians are indicated with yellow lines. The notation indicates that the fitness values of the two considered methods do not statistically differ (Mann-Whitney U test with Bonferroni correction, , see also Table 6).
Figure 8.
Analysis of the algorithm performance on the different problems. Black lines mark the standard deviations. We use the logarithmic scale on the y-axis to improve readability. Bars denote the average fitness from 30 replications.
Figure 9.
Algorithm performance on the Ackley, Griewank, Rastrigin, Rosenbrock and Sphere test functions. Boxes are bounded in the range , with the whiskers extending to data within . Medians are indicated with yellow lines. We use the logarithmic scale on the y-axis to improve readability. Data represents the average fitness from 30 replications.
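For reference, the five benchmark functions named in this caption can be written as follows. These are the common textbook definitions; the exact variants, dimensionality and search ranges used in the experiments are not stated here, so this should be read as a sketch rather than the evaluated implementation.

```python
import math


def sphere(x):
    return sum(v * v for v in x)


def rastrigin(x):
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)


def rosenbrock(x):
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


def griewank(x):
    total = sum(v * v for v in x) / 4000.0
    prod = math.prod(math.cos(v / math.sqrt(i)) for i, v in enumerate(x, start=1))
    return 1.0 + total - prod


def ackley(x):
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e


# Each function has a global minimum of 0: at the origin for all
# functions except Rosenbrock, whose minimum is at (1, ..., 1).
origin = [0.0] * 4
ones = [1.0] * 4
```

All five are minimization problems with an optimum of 0, which is consistent with lower values in the plot indicating better performance.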
Table 1.
Parameters characterizing the double-pole balancing problem. The track extremes are placed at -2.4 m and 2.4 m, respectively. We denote the absence of information with the symbol “-”.
| Parameter | Length | Mass |
|---|---|---|
| Track | 4.8 m | - |
| Cart | - | 1.0 kg |
| Long pole | 1.0 m | 0.5 kg |
| Short pole | 0.1 m | 0.05 kg |
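The parameters of Table 1 can be collected in a small configuration object, from which the track limits of -2.4 m and 2.4 m follow directly. The class name, field names and the `cart_out_of_track` helper are hypothetical, and treating the boundary itself as still inside the track is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoublePoleParams:
    track_length: float = 4.8        # m
    cart_mass: float = 1.0           # kg
    long_pole_length: float = 1.0    # m
    long_pole_mass: float = 0.5      # kg
    short_pole_length: float = 0.1   # m
    short_pole_mass: float = 0.05    # kg

    @property
    def track_limit(self) -> float:
        """Distance from the track centre to either extreme (2.4 m)."""
        return self.track_length / 2.0


def cart_out_of_track(x: float, params: DoublePoleParams = DoublePoleParams()) -> bool:
    """True when the cart position x (in metres) lies beyond a track extreme."""
    return abs(x) > params.track_limit


params = DoublePoleParams()
```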
Table 2.
Initialization of the state variables in each trial of the double-pole balancing problem.
| Trial | x | | | | | |
|---|---|---|---|---|---|---|
| 1 | -1.944 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1.944 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | -1.215 | 0 | 0 | 0 | 0 |
| 4 | 0 | 1.215 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | -0.10472 | 0 | 0 | 0 |
| 6 | 0 | 0 | 0.10472 | 0 | 0 | 0 |
| 7 | 0 | 0 | 0 | -0.135088 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0.135088 | 0 | 0 |
Table 3.
Encoding of the input () and output () neurons used for the 4-bit parity, double-pole balancing and grid navigation problems. Symbols are defined as follows: concerning 4-bit parity, (with ) stands for the generic bit of the input string, whereas indicates the network output used to check parity. As regards double-pole balancing, x refers to the cart position, and denote the angle of the long and short poles, respectively, and is a flag indicating whether the trial may be stopped prematurely because either the cart is going out of the track () or the pole angles are above , while is the force applied to the cart that determines its motion. Lastly, with respect to grid navigation, the symbols and represent, respectively, the position of the agent and of the target locations, is the grid size and is the direction of the agent in the grid (with ).
| Problem | | | | | |
|---|---|---|---|---|---|
| 4-bit parity | | | | | |
| Double-pole balancing | | | | alert | |
| Grid navigation | | | | | |
Table 4.
List of parameter settings used for the different algorithms. Symbol refers to the number of replications, indicates the number of evaluation steps (i.e., the length of evolution), denotes the number of solutions forming the population. Concerning GGA, and represent, respectively, the number of reproducing solutions (i.e., the ones that have been selected) and the number of offspring generated by each selected solution. The symbol is the mutation rate (i.e., the probability of modifying one gene), while refers to the probability of performing (asexual) crossover. With respect to OpenAI-ES, denotes the learning rate and is the number of samples extracted from the Gaussian distribution. The symbol indicates the number of refinement iterations performed by SSSHC. Lastly, the symbol refers to the range of connection weights.
| Parameter |
GGA |
OpenAI-ES |
SSS |
SSSHC |
|
30 |
|
|
|
100 |
1 |
50 |
50 |
|
10 |
- |
|
|
- |
|
Yes |
- |
Yes |
|
|
|
|
|
|
- |
|
|
- |
|
- |
|
- |
20 |
- |
|
- |
5 |
|
|
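To make the role of the mutation rate and of the weight range concrete, a per-gene mutation operator might look like the sketch below. It is not taken from the compared algorithms: the function `mutate`, the choice of replacing a mutated gene with a fresh uniform value, the symmetric range and the numeric values in the example are all assumptions introduced for illustration.

```python
import random


def mutate(genome, mut_rate, weight_range, rng):
    """Return a copy of `genome` in which each gene is replaced, with
    probability `mut_rate`, by a value drawn uniformly from
    [-weight_range, weight_range] (assumed symmetric range)."""
    return [
        rng.uniform(-weight_range, weight_range) if rng.random() < mut_rate else gene
        for gene in genome
    ]


rng = random.Random(0)                 # fixed seed for reproducibility
parent = [0.5, -1.2, 3.0, 0.0]         # hypothetical connection weights
child = mutate(parent, mut_rate=0.25, weight_range=5.0, rng=rng)
```

With a mutation rate of 0 the offspring is an exact copy of the parent, while a rate of 1 resamples every gene.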
Table 5.
Fitness analysis of the different algorithms. Data is the average of 30 replications of the experiments. Best performance is reported in bold.
| GGA | OpenAI-ES | SSS | SSSHC |
|---|---|---|---|
| 2915.823 [206.120] | 1291.433 [268.231] | 2581.040 [117.146] | 1384.418 [57.312] |
Table 6.
Statistical comparison between the considered methods according to the Mann-Whitney U test with Bonferroni correction, with significant differences indicated in bold. Table is symmetrical with respect to the main diagonal. The symbol “-” marks the absence of the corresponding entry. Data is the average of 30 replications of the experiments.
| | GGA | OpenAI-ES | SSS | SSSHC |
|---|---|---|---|---|
| GGA | - | | | |
| OpenAI-ES | | - | | |
| SSS | | | - | |
| SSSHC | | | | - |
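The pairwise comparison summarized in Table 6 can be illustrated with a self-contained sketch of the Mann-Whitney U test followed by a Bonferroni correction. The version below uses the normal approximation with tie and continuity corrections, which is a simplification chosen for this example; the significance level of 0.05 and the toy samples are assumptions, and the function names are hypothetical.

```python
import math
from itertools import combinations


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation).
    Returns (U, p), where U is the smaller of the two U statistics."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1                        # pooled[i:j] share the same value
        avg_rank = (i + 1 + j) / 2.0      # average of ranks i+1 .. j
        count_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += avg_rank * count_a
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_a - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0.0:
        return u, 1.0
    z = max(abs(u - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    return u, min(1.0, math.erfc(z / math.sqrt(2.0)))


def pairwise_bonferroni(samples, alpha=0.05):
    """Compare every pair of methods; multiply each p-value by the number
    of comparisons (Bonferroni) and flag adjusted values below alpha."""
    pairs = list(combinations(samples, 2))
    results = {}
    for first, second in pairs:
        _, p = mann_whitney_u(samples[first], samples[second])
        p_adj = min(1.0, p * len(pairs))
        results[(first, second)] = (p_adj, p_adj < alpha)
    return results


# Toy samples, not the paper's data: B is clearly shifted, C overlaps A.
samples = {
    "A": [float(v) for v in range(1, 11)],
    "B": [float(v) for v in range(11, 21)],
    "C": [v + 0.5 for v in range(1, 11)],
}
results = pairwise_bonferroni(samples)
```

With four methods there are six pairwise comparisons, so each raw p-value would be multiplied by six. In practice a library routine such as SciPy's `scipy.stats.mannwhitneyu` would normally be preferred over a hand-written approximation.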
Table 7.
Analysis of the performance collected by the different algorithms with regard to 4-bit parity, double-pole balancing, grid navigation and test function optimization. Bold values correspond to the best outcomes. Data is the average of 30 replications of the experiments.
| Problem | GGA | OpenAI-ES | SSS | SSSHC |
|---|---|---|---|---|
| 4-bit parity | 8.233 [2.246] | 6.633 [1.538] | 7.367 [2.331] | 7.800 [1.078] |
| Double-pole balancing | 994.654 [2.199] | 778.279 [384.148] | 993.708 [2.394] | 993.725 [1.601] |
| Grid navigation | 325.167 [57.660] | 306.142 [75.842] | 377.092 [97.762] | 188.908 [51.606] |
| Test function optimization | 1587.770 [223.944] | 200.379 [111.980] | 1202.873 [147.151] | 193.986 [43.340] |
Table 8.
Analysis of the performance collected by the different algorithms with regard to the Ackley, Griewank, Rastrigin, Rosenbrock and Sphere test functions. Last row reports the average fitness and the standard deviation. Bold values correspond to the best outcomes. Data is the average of 30 replications of the experiments.
| Test function | GGA | OpenAI-ES | SSS | SSSHC |
|---|---|---|---|---|
| Ackley | 4.331 [0.179] | 1.673 [0.586] | 2.107 [0.184] | 0.994 [0.282] |
| Griewank | 0.765 [0.094] | 0.217 [0.193] | 0.546 [0.146] | 0.120 [0.098] |
| Rastrigin | 1643.956 [66.687] | 332.113 [245.317] | 97.339 [53.305] | 39.996 [32.509] |
| Rosenbrock | 6207.401 [1112.739] | 650.228 [317.946] | 5868.017 [702.571] | 918.511 [209.908] |
| Sphere | 82.393 [12.567] | 17.665 [9.097] | 46.356 [6.654] | 10.307 [6.656] |
| Average | 1587.769 [2444.544] | 200.379 [314.326] | 1202.873 [2354.027] | 193.985 [374.800] |
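As a quick arithmetic check, the first value in each cell of the Average row of Table 8 can be reproduced as the plain mean of the five per-function values in the same column. The snippet below checks only those means and does not attempt to reproduce the bracketed values. Treating the lowest value as the best outcome is an assumption, consistent with Table 10 reporting large numbers as the worst fitness.

```python
FUNCTIONS = ["Ackley", "Griewank", "Rastrigin", "Rosenbrock", "Sphere"]

# Mean fitness per test function, copied from Table 8 (same order as FUNCTIONS).
TABLE8 = {
    "GGA": [4.331, 0.765, 1643.956, 6207.401, 82.393],
    "OpenAI-ES": [1.673, 0.217, 332.113, 650.228, 17.665],
    "SSS": [2.107, 0.546, 97.339, 5868.017, 46.356],
    "SSSHC": [0.994, 0.120, 39.996, 918.511, 10.307],
}

# Column-wise mean over the five test functions.
averages = {method: sum(vals) / len(vals) for method, vals in TABLE8.items()}

# Method with the lowest value on each function (assumes minimization).
best_per_function = {
    fn: min(TABLE8, key=lambda method: TABLE8[method][i])
    for i, fn in enumerate(FUNCTIONS)
}
```

The computed means agree with the Average row to within rounding of the third decimal place. Under the minimization assumption, SSSHC attains the lowest value on every function except Rosenbrock, where OpenAI-ES is lowest.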
Table 9.
Analysis of the weight size of the controllers evolved with the different EAs. Data is the average of 30 replications of the experiments.
| GGA | OpenAI-ES | SSS | SSSHC |
|---|---|---|---|
| 0.316 [0.047] | 0.083 [0.045] | 0.124 [0.045] | 0.018 [0.015] |
Table 10.
Worst fitness value that can be obtained in each considered problem.
| 4-bit parity | Double-pole balancing | Grid navigation | Test function optimization |
|---|---|---|---|
| 16 | 1000 | 500 | 6450690.655 |