Submitted: 30 April 2024
Posted: 01 May 2024
Abstract
Keywords:
MSC: 65H10; 68W10; 90C59
1. Introduction
2. Metaphor-less Optimization Algorithms
2.1. Jaya Algorithm
| Algorithm 1 Jaya pseudocode |
| Algorithm 2 Initial population generation |
| Algorithm 3 Boundary integrity control |
| Algorithm 4 Greedy selection |
2.2. Enhanced Jaya Algorithm
| Algorithm 5 EJAYA pseudocode |
2.3. Rao Optimization Algorithms
| Algorithm 6 Rao-1, Rao-2, and Rao-3 pseudocode |
2.4. Best-Worst-Play Algorithm
| Algorithm 7 BWP pseudocode |
2.5. Max–Min-Greedy-Interaction Algorithm
| Algorithm 8 MaGI pseudocode |
3. GPU Parallelization of Metaphor-Less Optimization Algorithms
3.1. Principles of CUDA Programming
3.2. Methodology Used for GPU-Based Parallelization
3.3. GPU-Based Acceleration of the Jaya Optimization Algorithm
| Algorithm 9 GPU-based parallel Jaya |
| Algorithm 10 Kernel for generating the initial population |
| Algorithm 11 Kernel for population update (Jaya) |
| Algorithm 12 Function to ensure boundary integrity |
| Algorithm 13 Kernel for greedy selection |
3.4. GPU-Based Acceleration of the Enhanced Jaya Algorithm
| Algorithm 14 GPU-based parallel EJAYA |
| Algorithm 15 Kernel for determining upper and lower attract points |
| Algorithm 16 Kernel for permuting the population |
| Algorithm 17 Kernel for determining the exploration strategy |
| Algorithm 18 Kernel for population update (EJAYA) |
3.5. GPU-Based Acceleration of the Rao Optimization Algorithms
| Algorithm 19 GPU-based parallel Rao-1, Rao-2, and Rao-3 |
| Algorithm 20 Kernel for selecting random solutions from the population |
| Algorithm 21 Kernel for population update (Rao algorithms) |
3.6. GPU-Based Acceleration of the BWP Algorithm
| Algorithm 22 GPU-based parallel BWP algorithm |
| Algorithm 23 Kernel for population update (BWP) |
3.7. A Novel GPU-Based Parallelization of the MaGI Algorithm
| Algorithm 24 GPU-based parallel MaGI algorithm |
| Algorithm 25 Kernel for population update (MaGI) |
4. Computational Experiments
4.1. Experimental Setting and Implementation
4.2. Hardware Used in the Computational Experiments
- NVIDIA GeForce RTX 3090 GPU (Ampere architecture) with 10 496 CUDA cores and 24 GB of GDDR6X VRAM;
- NVIDIA Tesla T4 GPU (Turing architecture) with 2560 CUDA cores and 16 GB of GDDR6 VRAM;
- NVIDIA Tesla V100S PCIe GPU (Volta architecture) with 5120 CUDA cores and 32 GB of HBM2 VRAM;
- NVIDIA A100 PCIe GPU (Ampere architecture) with 6912 CUDA cores and 80 GB of HBM2e VRAM.
4.3. Test Problems
5. Results and Discussion







6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest


| Pop. (Vars) | Prob. no. | Jaya CPU time | Jaya RTX 3090 time (gain) | Jaya A100 time (gain) | EJAYA CPU time | EJAYA RTX 3090 time (gain) | EJAYA A100 time (gain) |
|---|---|---|---|---|---|---|---|
| (500 ) | 1 | 39.90 | 0.59 (67.6 ) | 0.33 (121.9 ) | 54.98 | 1.56 (35.3 ) | 1.54 (35.8 ) |
| | 2 | 49.72 | 0.66 (74.8 ) | 0.33 (151.0 ) | 56.23 | 1.65 (34.1 ) | 1.54 (36.5 ) |
| | 3 | 44.69 | 0.53 (83.6 ) | 0.28 (160.7 ) | 56.78 | 1.54 (36.9 ) | 1.50 (37.7 ) |
| | 4 | 50.11 | 0.63 (79.2 ) | 0.28 (176.4 ) | 58.86 | 1.64 (35.9 ) | 1.51 (39.0 ) |
| | 5 | 49.33 | 0.70 (70.2 ) | 0.31 (158.9 ) | 63.58 | 1.72 (36.9 ) | 1.51 (42.2 ) |
| | 6 | 42.70 | 0.54 (79.6 ) | 0.31 (135.7 ) | 57.49 | 1.56 (36.8 ) | 1.54 (37.4 ) |
| | 7 | 40.87 | 0.55 (74.7 ) | 0.31 (129.9 ) | 52.99 | 1.56 (33.9 ) | 1.54 (34.4 ) |
| | 8 | 42.18 | 0.54 (78.2 ) | 0.29 (146.7 ) | 53.54 | 1.54 (34.7 ) | 1.51 (35.4 ) |
| | 9 | 55.34 | 0.73 (75.3 ) | 0.32 (175.4 ) | 63.19 | 1.78 (35.5 ) | 1.53 (41.2 ) |
| | 10 | 57.69 | 0.74 (77.8 ) | 0.32 (182.8 ) | 62.75 | 1.76 (35.6 ) | 1.54 (40.7 ) |
| (1000 ) | 1 | 152.28 | 1.07 (141.8 ) | 0.75 (202.5 ) | 195.02 | 3.31 (58.9 ) | 3.38 (57.6 ) |
| | 2 | 163.46 | 1.45 (112.5 ) | 0.75 (216.9 ) | 198.86 | 3.70 (53.8 ) | 3.39 (58.7 ) |
| | 3 | 150.11 | 0.98 (153.7 ) | 0.67 (225.1 ) | 212.96 | 3.25 (65.5 ) | 3.30 (64.6 ) |
| | 4 | 154.95 | 1.38 (112.4 ) | 0.69 (224.7 ) | 217.28 | 3.65 (59.6 ) | 3.32 (65.5 ) |
| | 5 | 169.35 | 1.65 (102.5 ) | 0.79 (213.4 ) | 230.65 | 3.93 (58.7 ) | 3.33 (69.3 ) |
| | 6 | 146.03 | 1.03 (142.1 ) | 0.75 (194.0 ) | 214.38 | 3.31 (64.8 ) | 3.39 (63.3 ) |
| | 7 | 129.48 | 1.07 (121.4 ) | 0.75 (172.1 ) | 200.85 | 3.34 (60.2 ) | 3.39 (59.3 ) |
| | 8 | 129.78 | 1.00 (129.7 ) | 0.69 (188.3 ) | 194.61 | 3.26 (59.6 ) | 3.31 (58.7 ) |
| | 9 | 196.08 | 1.83 (107.2 ) | 0.75 (260.4 ) | 241.98 | 4.15 (58.3 ) | 3.38 (71.6 ) |
| | 10 | 203.50 | 1.82 (111.8 ) | 0.75 (270.8 ) | 240.76 | 4.11 (58.5 ) | 3.38 (71.2 ) |
| (1500 ) | 1 | 370.98 | 1.92 (193.0 ) | 1.28 (289.7 ) | 459.26 | 5.65 (81.2 ) | 5.38 (85.4 ) |
| | 2 | 346.46 | 2.66 (130.4 ) | 1.28 (270.6 ) | 470.60 | 6.48 (72.6 ) | 5.40 (87.1 ) |
| | 3 | 324.86 | 1.65 (197.1 ) | 1.12 (290.9 ) | 497.39 | 5.51 (90.3 ) | 5.24 (94.9 ) |
| | 4 | 339.46 | 2.57 (132.1 ) | 1.15 (294.4 ) | 525.25 | 6.49 (81.0 ) | 5.28 (99.5 ) |
| | 5 | 381.45 | 3.17 (120.3 ) | 1.33 (285.9 ) | 549.86 | 7.00 (78.5 ) | 5.29 (103.9 ) |
| | 6 | 319.96 | 1.77 (180.4 ) | 1.28 (249.6 ) | 497.15 | 5.62 (88.5 ) | 5.40 (92.0 ) |
| | 7 | 287.13 | 1.85 (155.4 ) | 1.28 (223.8 ) | 474.74 | 5.71 (83.1 ) | 5.41 (87.7 ) |
| | 8 | 292.39 | 1.70 (171.7 ) | 1.15 (255.3 ) | 462.30 | 5.57 (83.1 ) | 5.27 (87.7 ) |
| | 9 | 439.23 | 3.56 (123.4 ) | 1.28 (342.4 ) | 565.03 | 7.50 (75.4 ) | 5.41 (104.5 ) |
| | 10 | 457.05 | 3.54 (129.0 ) | 1.28 (356.7 ) | 567.82 | 7.41 (76.7 ) | 5.41 (105.0 ) |
| (2000 ) | 1 | 737.74 | 2.82 (261.5 ) | 1.93 (383.0 ) | 949.89 | 8.74 (108.7 ) | 7.59 (125.1 ) |
| | 2 | 689.07 | 4.21 (163.6 ) | 1.93 (357.5 ) | 950.15 | 10.19 (93.2 ) | 7.63 (124.5 ) |
| | 3 | 729.06 | 2.46 (295.8 ) | 1.65 (443.0 ) | 977.18 | 8.48 (115.3 ) | 7.36 (132.7 ) |
| | 4 | 744.16 | 4.08 (182.5 ) | 1.71 (434.7 ) | 1030.45 | 10.21 (100.9 ) | 7.42 (138.9 ) |
| | 5 | 878.30 | 5.13 (171.1 ) | 1.97 (444.8 ) | 1119.75 | 11.01 (101.7 ) | 7.46 (150.1 ) |
| | 6 | 714.79 | 2.68 (267.2 ) | 1.93 (371.0 ) | 975.85 | 8.76 (111.5 ) | 7.63 (127.8 ) |
| | 7 | 674.34 | 2.82 (239.4 ) | 1.93 (350.0 ) | 919.19 | 8.83 (104.1 ) | 7.64 (120.4 ) |
| | 8 | 678.22 | 2.57 (264.3 ) | 1.69 (400.4 ) | 932.70 | 8.55 (109.1 ) | 7.40 (126.0 ) |
| | 9 | 974.28 | 5.83 (167.0 ) | 1.93 (505.3 ) | 1106.00 | 11.78 (93.9 ) | 7.64 (144.8 ) |
| | 10 | 904.68 | 5.81 (155.8 ) | 1.93 (468.5 ) | 1105.26 | 11.98 (92.2 ) | 7.64 (144.7 ) |
| Pop. (Vars) | Prob. no. | Rao-1 CPU time | Rao-1 RTX 3090 time (gain) | Rao-1 A100 time (gain) | Rao-2 CPU time | Rao-2 RTX 3090 time (gain) | Rao-2 A100 time (gain) | Rao-3 CPU time | Rao-3 RTX 3090 time (gain) | Rao-3 A100 time (gain) |
|---|---|---|---|---|---|---|---|---|---|---|
| (500 ) | 1 | 24.03 | 0.55 (43.3 ) | 0.32 (74.2 ) | 55.51 | 0.67 (82.6 ) | 0.43 (129.6 ) | 62.39 | 0.67 (93.4 ) | 0.42 (147.6 ) |
| | 2 | 27.01 | 0.66 (41.1 ) | 0.33 (82.0 ) | 50.78 | 0.76 (67.0 ) | 0.42 (120.5 ) | 76.75 | 0.75 (101.8 ) | 0.42 (183.2 ) |
| | 3 | 26.37 | 0.54 (49.2 ) | 0.28 (94.8 ) | 61.59 | 0.66 (93.9 ) | 0.40 (154.5 ) | 64.46 | 0.64 (100.0 ) | 0.40 (163.1 ) |
| | 4 | 30.90 | 0.63 (48.8 ) | 0.28 (109.1 ) | 56.52 | 0.75 (75.7 ) | 0.40 (143.0 ) | 79.82 | 0.75 (107.0 ) | 0.39 (206.2 ) |
| | 5 | 34.11 | 0.70 (48.8 ) | 0.31 (108.6 ) | 60.58 | 0.80 (76.0 ) | 0.42 (143.6 ) | 70.70 | 0.80 (87.9 ) | 0.42 (167.7 ) |
| | 6 | 25.76 | 0.55 (46.5 ) | 0.32 (81.7 ) | 58.38 | 0.66 (88.2 ) | 0.45 (129.4 ) | 59.20 | 0.66 (89.1 ) | 0.45 (132.7 ) |
| | 7 | 28.29 | 0.55 (51.3 ) | 0.32 (89.5 ) | 60.83 | 0.68 (90.0 ) | 0.44 (138.4 ) | 63.15 | 0.67 (93.9 ) | 0.44 (144.3 ) |
| | 8 | 29.80 | 0.55 (54.5 ) | 0.29 (103.8 ) | 61.48 | 0.65 (94.3 ) | 0.41 (151.4 ) | 65.34 | 0.66 (99.6 ) | 0.40 (162.8 ) |
| | 9 | 35.18 | 0.73 (47.9 ) | 0.31 (112.2 ) | 68.65 | 0.84 (81.8 ) | 0.38 (182.2 ) | 74.35 | 0.84 (88.1 ) | 0.37 (200.0 ) |
| | 10 | 32.55 | 0.74 (43.9 ) | 0.32 (103.1 ) | 65.59 | 0.84 (78.4 ) | 0.38 (173.7 ) | 73.43 | 0.83 (88.2 ) | 0.37 (197.0 ) |
| (1000 ) | 1 | 82.91 | 1.05 (78.6 ) | 0.75 (110.4 ) | 219.10 | 1.39 (157.5 ) | 1.11 (198.1 ) | 214.49 | 1.39 (153.8 ) | 1.10 (195.0 ) |
| | 2 | 90.39 | 1.41 (64.2 ) | 0.75 (119.9 ) | 204.32 | 1.75 (116.8 ) | 1.11 (183.9 ) | 283.81 | 1.76 (161.1 ) | 1.11 (256.4 ) |
| | 3 | 88.50 | 0.97 (90.8 ) | 0.67 (132.6 ) | 234.66 | 1.32 (177.7 ) | 1.03 (227.3 ) | 248.38 | 1.32 (188.7 ) | 1.03 (241.1 ) |
| | 4 | 101.40 | 1.38 (73.5 ) | 0.69 (146.9 ) | 227.08 | 1.72 (132.1 ) | 1.04 (218.1 ) | 292.75 | 1.72 (170.3 ) | 1.04 (281.8 ) |
| | 5 | 109.02 | 1.67 (65.3 ) | 0.79 (137.4 ) | 237.69 | 1.98 (120.2 ) | 1.07 (221.3 ) | 276.37 | 1.97 (140.1 ) | 1.07 (258.4 ) |
| | 6 | 86.80 | 1.04 (83.8 ) | 0.75 (115.9 ) | 227.92 | 1.36 (167.7 ) | 1.13 (202.5 ) | 231.80 | 1.37 (169.1 ) | 1.12 (206.7 ) |
| | 7 | 95.58 | 1.06 (90.0 ) | 0.75 (127.9 ) | 233.46 | 1.41 (165.1 ) | 1.12 (208.3 ) | 239.59 | 1.42 (169.2 ) | 1.12 (214.0 ) |
| | 8 | 101.57 | 1.00 (102.0 ) | 0.69 (147.4 ) | 236.84 | 1.34 (176.8 ) | 1.05 (225.9 ) | 248.91 | 1.34 (185.8 ) | 1.05 (237.9 ) |
| | 9 | 130.21 | 1.83 (71.2 ) | 0.75 (172.8 ) | 263.01 | 2.07 (126.8 ) | 0.97 (272.4 ) | 292.20 | 2.08 (140.6 ) | 0.97 (300.2 ) |
| | 10 | 112.72 | 1.84 (61.3 ) | 0.76 (149.2 ) | 242.10 | 2.11 (115.0 ) | 0.97 (249.8 ) | 290.69 | 2.11 (137.6 ) | 0.96 (301.3 ) |
| (1500 ) | 1 | 190.52 | 1.82 (104.7 ) | 1.28 (148.8 ) | 515.54 | 2.56 (201.1 ) | 1.94 (265.5 ) | 507.35 | 2.57 (197.4 ) | 1.94 (262.0 ) |
| | 2 | 210.81 | 2.65 (79.6 ) | 1.28 (164.6 ) | 472.17 | 3.34 (141.2 ) | 1.94 (243.6 ) | 620.31 | 3.34 (185.5 ) | 1.94 (320.4 ) |
| | 3 | 188.79 | 1.65 (114.7 ) | 1.12 (169.1 ) | 536.08 | 2.38 (225.0 ) | 1.78 (300.3 ) | 578.01 | 2.38 (242.6 ) | 1.78 (324.5 ) |
| | 4 | 221.14 | 2.58 (85.8 ) | 1.15 (191.7 ) | 520.62 | 3.25 (160.2 ) | 1.79 (290.6 ) | 651.36 | 3.26 (199.9 ) | 1.79 (364.6 ) |
| | 5 | 230.39 | 3.19 (72.3 ) | 1.34 (171.7 ) | 548.90 | 3.84 (142.8 ) | 1.87 (294.2 ) | 631.89 | 3.85 (164.1 ) | 1.86 (339.6 ) |
| | 6 | 189.74 | 1.77 (107.2 ) | 1.28 (147.8 ) | 525.59 | 2.49 (211.1 ) | 1.95 (269.4 ) | 539.88 | 2.49 (216.6 ) | 1.95 (277.2 ) |
| | 7 | 209.67 | 1.86 (113.0 ) | 1.28 (163.5 ) | 537.68 | 2.59 (207.3 ) | 1.96 (274.2 ) | 564.67 | 2.66 (212.6 ) | 1.96 (288.2 ) |
| | 8 | 218.59 | 1.69 (129.2 ) | 1.15 (190.9 ) | 543.32 | 2.42 (224.6 ) | 1.80 (301.1 ) | 558.39 | 2.42 (231.1 ) | 1.80 (310.4 ) |
| | 9 | 295.56 | 3.57 (82.9 ) | 1.28 (230.5 ) | 585.36 | 3.85 (152.1 ) | 1.48 (396.4 ) | 660.64 | 3.94 (167.5 ) | 1.52 (435.6 ) |
| | 10 | 252.91 | 3.56 (71.0 ) | 1.28 (197.4 ) | 575.19 | 3.86 (149.0 ) | 1.47 (392.4 ) | 663.25 | 3.85 (172.4 ) | 1.44 (461.2 ) |
| (2000 ) | 1 | 362.13 | 2.81 (128.8 ) | 1.93 (188.0 ) | 961.60 | 4.08 (235.5 ) | 2.99 (321.4 ) | 948.69 | 4.07 (232.8 ) | 2.99 (317.6 ) |
| | 2 | 426.73 | 4.26 (100.2 ) | 1.93 (221.4 ) | 882.08 | 5.42 (162.7 ) | 2.99 (295.3 ) | 1114.94 | 5.43 (205.2 ) | 2.98 (374.1 ) |
| | 3 | 351.16 | 2.46 (142.5 ) | 1.65 (213.4 ) | 1001.60 | 3.75 (267.3 ) | 2.72 (367.8 ) | 1043.39 | 3.75 (278.2 ) | 2.72 (383.4 ) |
| | 4 | 458.26 | 4.08 (112.2 ) | 1.71 (267.7 ) | 952.20 | 5.27 (180.6 ) | 2.76 (345.4 ) | 1113.26 | 5.27 (211.1 ) | 2.75 (404.2 ) |
| | 5 | 481.20 | 5.13 (93.7 ) | 1.97 (243.7 ) | 1067.28 | 6.32 (168.9 ) | 2.88 (370.2 ) | 1185.69 | 6.32 (187.7 ) | 2.88 (411.8 ) |
| | 6 | 363.18 | 2.67 (136.1 ) | 1.93 (188.5 ) | 1029.87 | 3.95 (260.8 ) | 2.99 (344.2 ) | 1008.06 | 3.95 (255.0 ) | 2.99 (337.1 ) |
| | 7 | 364.72 | 2.81 (129.9 ) | 1.93 (189.3 ) | 993.69 | 4.13 (240.6 ) | 3.01 (330.3 ) | 965.57 | 4.18 (230.9 ) | 3.01 (321.1 ) |
| | 8 | 370.64 | 2.56 (145.0 ) | 1.69 (218.8 ) | 978.14 | 3.83 (255.2 ) | 2.76 (354.5 ) | 989.94 | 3.83 (258.6 ) | 2.75 (359.7 ) |
| | 9 | 564.40 | 5.83 (96.9 ) | 1.93 (292.7 ) | 1171.44 | 6.48 (180.9 ) | 2.23 (524.3 ) | 1206.58 | 6.43 (187.5 ) | 2.21 (546.7 ) |
| | 10 | 525.56 | 5.83 (90.1 ) | 1.93 (272.2 ) | 1078.68 | 6.45 (167.2 ) | 2.21 (489.2 ) | 1258.08 | 6.50 (193.4 ) | 2.24 (561.8 ) |
| Pop. (Vars) | Prob. no. | BWP CPU time | BWP RTX 3090 time (gain) | BWP A100 time (gain) | MaGI CPU time | MaGI RTX 3090 time (gain) | MaGI A100 time (gain) |
|---|---|---|---|---|---|---|---|
| (500 ) | 1 | 83.04 | 1.00 (83.3 ) | 0.53 (156.0 ) | 107.64 | 1.23 (87.4 ) | 0.72 (148.7 ) |
| | 2 | 79.48 | 1.34 (59.4 ) | 0.65 (121.5 ) | 117.63 | 1.41 (83.2 ) | 0.76 (155.2 ) |
| | 3 | 83.83 | 1.03 (81.6 ) | 0.56 (150.4 ) | 94.22 | 1.03 (91.7 ) | 0.52 (180.4 ) |
| | 4 | 88.05 | 1.30 (68.0 ) | 0.58 (151.0 ) | 116.55 | 1.33 (87.7 ) | 0.63 (185.6 ) |
| | 5 | 84.39 | 1.37 (61.7 ) | 0.58 (145.6 ) | 123.60 | 1.43 (86.4 ) | 0.65 (191.4 ) |
| | 6 | 74.44 | 1.01 (73.9 ) | 0.54 (137.4 ) | 93.03 | 1.18 (78.6 ) | 0.69 (135.6 ) |
| | 7 | 83.18 | 1.02 (81.7 ) | 0.53 (157.0 ) | 90.15 | 1.20 (74.9 ) | 0.69 (130.1 ) |
| | 8 | 64.09 | 1.04 (61.4 ) | 0.54 (118.2 ) | 91.30 | 1.17 (78.2 ) | 0.65 (141.0 ) |
| | 9 | 92.16 | 1.28 (71.9 ) | 0.43 (214.1 ) | 120.75 | 1.46 (82.9 ) | 0.56 (216.4 ) |
| | 10 | 98.07 | 1.28 (76.5 ) | 0.44 (224.9 ) | 128.80 | 1.46 (88.0 ) | 0.58 (221.1 ) |
| (1000 ) | 1 | 266.94 | 1.91 (139.8 ) | 1.19 (223.4 ) | 318.06 | 2.51 (126.6 ) | 1.83 (173.6 ) |
| | 2 | 265.09 | 2.98 (89.0 ) | 1.59 (166.8 ) | 452.80 | 3.23 (140.0 ) | 1.87 (241.7 ) |
| | 3 | 280.87 | 1.99 (141.2 ) | 1.38 (204.1 ) | 317.79 | 2.03 (156.4 ) | 1.14 (278.1 ) |
| | 4 | 300.54 | 2.87 (104.8 ) | 1.44 (208.6 ) | 419.10 | 3.05 (137.4 ) | 1.51 (276.6 ) |
| | 5 | 291.19 | 3.25 (89.7 ) | 1.40 (207.4 ) | 447.20 | 3.65 (122.6 ) | 1.77 (253.2 ) |
| | 6 | 248.07 | 1.88 (132.3 ) | 1.22 (204.2 ) | 334.23 | 2.33 (143.3 ) | 1.62 (206.8 ) |
| | 7 | 291.98 | 1.96 (149.0 ) | 1.21 (241.2 ) | 320.02 | 2.47 (129.4 ) | 1.70 (187.7 ) |
| | 8 | 225.04 | 1.85 (121.8 ) | 1.10 (204.4 ) | 329.33 | 2.34 (140.7 ) | 1.58 (207.9 ) |
| | 9 | 293.84 | 3.22 (91.3 ) | 0.92 (318.8 ) | 409.07 | 3.74 (109.4 ) | 1.26 (325.9 ) |
| | 10 | 340.01 | 3.22 (105.5 ) | 0.89 (381.0 ) | 454.26 | 3.47 (130.8 ) | 1.27 (357.2 ) |
| (1500 ) | 1 | 550.07 | 3.39 (162.4 ) | 2.06 (267.5 ) | 789.59 | 4.50 (175.6 ) | 3.18 (247.9 ) |
| | 2 | 581.02 | 5.48 (106.0 ) | 2.65 (218.9 ) | 1008.38 | 6.14 (164.2 ) | 3.24 (310.8 ) |
| | 3 | 614.33 | 3.40 (180.8 ) | 2.30 (267.1 ) | 665.81 | 3.53 (188.5 ) | 2.10 (317.6 ) |
| | 4 | 658.30 | 5.32 (123.8 ) | 2.41 (273.6 ) | 878.04 | 5.77 (152.2 ) | 2.61 (336.7 ) |
| | 5 | 637.82 | 6.20 (102.8 ) | 2.34 (272.8 ) | 993.52 | 7.12 (139.5 ) | 3.15 (315.6 ) |
| | 6 | 563.17 | 3.28 (171.6 ) | 2.06 (273.3 ) | 750.12 | 4.09 (183.6 ) | 2.77 (270.5 ) |
| | 7 | 598.77 | 3.47 (172.3 ) | 2.06 (291.4 ) | 740.67 | 4.41 (167.8 ) | 3.00 (247.1 ) |
| | 8 | 484.28 | 3.18 (152.1 ) | 1.83 (264.7 ) | 746.81 | 4.12 (181.1 ) | 2.72 (274.7 ) |
| | 9 | 596.64 | 6.53 (91.4 ) | 1.69 (352.9 ) | 869.23 | 6.94 (125.2 ) | 2.17 (401.2 ) |
| | 10 | 755.36 | 6.59 (114.7 ) | 1.72 (439.1 ) | 1002.26 | 7.27 (137.8 ) | 2.23 (450.0 ) |
| (2000 ) | 1 | 1190.73 | 5.37 (221.9 ) | 3.20 (372.5 ) | 1704.56 | 7.07 (241.2 ) | 4.86 (350.4 ) |
| | 2 | 1280.47 | 8.78 (145.9 ) | 3.98 (321.9 ) | 1956.22 | 9.95 (196.6 ) | 4.94 (395.9 ) |
| | 3 | 1299.94 | 5.21 (249.4 ) | 3.40 (382.4 ) | 1442.29 | 5.75 (250.8 ) | 3.37 (427.5 ) |
| | 4 | 1338.01 | 8.50 (157.4 ) | 3.58 (373.4 ) | 1732.39 | 9.31 (186.1 ) | 4.00 (433.2 ) |
| | 5 | 1391.69 | 10.30 (135.1 ) | 3.57 (389.5 ) | 1929.49 | 11.68 (165.2 ) | 4.77 (404.7 ) |
| | 6 | 1231.50 | 5.09 (241.9 ) | 3.21 (383.9 ) | 1503.84 | 6.43 (233.9 ) | 4.33 (347.1 ) |
| | 7 | 1147.42 | 5.43 (211.4 ) | 3.21 (357.6 ) | 1515.59 | 6.95 (218.0 ) | 4.65 (326.1 ) |
| | 8 | 1160.41 | 4.92 (235.9 ) | 2.78 (418.0 ) | 1501.78 | 6.46 (232.4 ) | 4.11 (365.0 ) |
| | 9 | 1293.98 | 11.03 (117.3 ) | 3.01 (429.4 ) | 1690.29 | 11.69 (144.6 ) | 3.23 (523.4 ) |
| | 10 | 1436.48 | 11.04 (130.1 ) | 2.93 (490.7 ) | 1798.28 | 11.75 (153.1 ) | 3.28 (548.7 ) |
| Alg. | Pop. (vars) | CPU time | Tesla T4 time (gain) | RTX 3090 time (gain) | Tesla V100S time (gain) | A100 time (gain) |
|---|---|---|---|---|---|---|
| Jaya | 5000 (500 ) | 47.25 | 0.82 (57.6 ) | 0.62 (75.9 ) | 0.36 (132.2 ) | 0.31 (153.6 ) |
| | 10 000 (1000 ) | 159.50 | 2.72 (58.7 ) | 1.33 (120.1 ) | 0.80 (199.0 ) | 0.74 (216.9 ) |
| | 15 000 (1500 ) | 355.90 | 5.84 (61.0 ) | 2.44 (145.9 ) | 1.44 (247.7 ) | 1.24 (286.1 ) |
| | 20 000 (2000 ) | 772.46 | 10.16 (76.0 ) | 3.84 (201.1 ) | 2.28 (338.3 ) | 1.86 (415.5 ) |
| EJAYA | 5000 (500 ) | 58.04 | 2.27 (25.5 ) | 1.63 (35.6 ) | 1.40 (41.6 ) | 1.53 (38.0 ) |
| | 10 000 (1000 ) | 214.73 | 8.45 (25.4 ) | 3.60 (59.6 ) | 3.28 (65.4 ) | 3.36 (64.0 ) |
| | 15 000 (1500 ) | 506.94 | 23.61 (21.5 ) | 6.29 (80.5 ) | 5.76 (88.1 ) | 5.35 (94.8 ) |
| | 20 000 (2000 ) | 1006.64 | 40.25 (25.0 ) | 9.85 (102.2 ) | 8.85 (113.8 ) | 7.54 (133.5 ) |
| Rao-1 | 5000 (500 ) | 29.40 | 0.82 (35.8 ) | 0.62 (47.4 ) | 0.36 (82.3 ) | 0.31 (95.6 ) |
| | 10 000 (1000 ) | 99.91 | 2.71 (36.8 ) | 1.32 (75.4 ) | 0.80 (124.6 ) | 0.74 (135.9 ) |
| | 15 000 (1500 ) | 220.81 | 5.84 (37.8 ) | 2.43 (90.8 ) | 1.44 (153.7 ) | 1.24 (177.4 ) |
| | 20 000 (2000 ) | 426.80 | 10.16 (42.0 ) | 3.84 (111.0 ) | 2.28 (187.0 ) | 1.86 (229.6 ) |
| Rao-2 | 5000 (500 ) | 59.99 | 1.22 (49.2 ) | 0.73 (82.2 ) | 0.49 (123.3 ) | 0.41 (145.7 ) |
| | 10 000 (1000 ) | 232.62 | 4.46 (52.1 ) | 1.65 (141.4 ) | 1.25 (185.6 ) | 1.06 (219.6 ) |
| | 15 000 (1500 ) | 536.04 | 10.00 (53.6 ) | 3.06 (175.2 ) | 2.42 (221.7 ) | 1.80 (298.1 ) |
| | 20 000 (2000 ) | 1011.66 | 17.54 (57.7 ) | 4.97 (203.6 ) | 3.87 (261.2 ) | 2.75 (367.3 ) |
| Rao-3 | 5000 (500 ) | 68.96 | 1.22 (56.6 ) | 0.73 (94.6 ) | 0.49 (141.7 ) | 0.41 (169.2 ) |
| | 10 000 (1000 ) | 261.90 | 4.46 (58.7 ) | 1.65 (158.9 ) | 1.25 (209.5 ) | 1.06 (247.7 ) |
| | 15 000 (1500 ) | 597.58 | 10.00 (59.8 ) | 3.08 (194.3 ) | 2.42 (247.3 ) | 1.80 (332.7 ) |
| | 20 000 (2000 ) | 1083.42 | 17.52 (61.8 ) | 4.98 (217.8 ) | 3.87 (280.0 ) | 2.75 (393.7 ) |
| BWP | 5000 (500 ) | 83.07 | 1.59 (52.3 ) | 1.17 (71.3 ) | 0.62 (133.3 ) | 0.54 (154.2 ) |
| | 10 000 (1000 ) | 280.36 | 5.48 (51.2 ) | 2.51 (111.6 ) | 1.42 (197.1 ) | 1.23 (227.1 ) |
| | 15 000 (1500 ) | 603.98 | 11.95 (50.5 ) | 4.68 (128.9 ) | 2.63 (229.4 ) | 2.11 (286.1 ) |
| | 20 000 (2000 ) | 1277.06 | 20.95 (61.0 ) | 7.57 (168.8 ) | 4.31 (296.2 ) | 3.29 (388.6 ) |
| MaGI | 5000 (500 ) | 108.37 | 1.96 (55.2 ) | 1.29 (84.0 ) | 0.75 (144.2 ) | 0.64 (168.1 ) |
| | 10 000 (1000 ) | 380.19 | 7.00 (54.3 ) | 2.88 (131.8 ) | 1.81 (209.8 ) | 1.56 (244.3 ) |
| | 15 000 (1500 ) | 844.44 | 15.42 (54.7 ) | 5.39 (156.7 ) | 3.43 (246.1 ) | 2.72 (310.8 ) |
| | 20 000 (2000 ) | 1677.47 | 27.02 (62.1 ) | 8.70 (192.7 ) | 5.66 (296.5 ) | 4.15 (403.8 ) |
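
The tables do not define "gain". It is read here as a speedup factor, that is, CPU time divided by GPU time, and this reading is an assumption. A minimal Python sketch recomputes the gain for the Jaya row with 500 variables in the last table above. The function name `gain` and the dictionary layout are illustrative only. Because the tabulated times are rounded to two decimals, recomputed gains can differ slightly from the published ones.

```python
def gain(cpu_time: float, gpu_time: float) -> float:
    """Speedup factor, assumed to be CPU time divided by GPU time."""
    if gpu_time <= 0:
        raise ValueError("gpu_time must be positive")
    return cpu_time / gpu_time


# Rounded times copied from the Jaya row with 500 variables.
cpu_time = 47.25
gpu_times = {
    "Tesla T4": 0.82,
    "RTX 3090": 0.62,
    "Tesla V100S": 0.36,
    "A100": 0.31,
}

# Recompute the gain per GPU, rounded to one decimal as in the tables.
gains = {gpu: round(gain(cpu_time, t), 1) for gpu, t in gpu_times.items()}

for gpu, value in gains.items():
    print(f"{gpu}: {value}")
```

Under this reading, a larger gain means a shorter GPU run for the same CPU baseline, so the columns can be compared directly within a row.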
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).