Submitted:
02 April 2024
Posted:
02 April 2024
Abstract
Keywords:
1. Introduction
2. Strong Scaling Tests
2.1. DLR Combustor

2.2. Lid-Driven Cavity
3. Single Node Performance
4. Analysis Using Different Architectures
5. Profiling of L3 Cache and MPI Overhead
6. Modeling the FVOPS Behavior Using POP Metrics
7. Summary
Acknowledgments
References
- Axtmann, G., Rist, U. (2016). Scalability of OpenFOAM with Large Eddy Simulations and DNS on High-Performance Systems. In: Nagel, W.E., Kröner, D.H., Resch, M.M. (eds) High Performance Computing in Science and Engineering ´16. Springer, Cham. [CrossRef]
- C. M. Williams, E. Gedenk. HLRS Jahresbericht 2022, 2023. Available online: https://www.hlrs.de/de/about/profile/jahresbericht (accessed on 30 January 2024).
- exaFOAM project website. Available online: https://exafoam.eu (accessed on 30 January 2024).
- H. G. Weller, G. Tabor, H. Jasak, C. Fureby. A tensorial approach to computational continuum mechanics using object-oriented techniques. Comput. Phys. 12 (6): 620–631, 1998. [CrossRef]
- S. Ristov, R. Prodan, M. Gusev and K. Skala. Superlinear speedup in HPC systems: Why and when?, 2016 Federated Conference on Computer Science and Information Systems (FedCSIS), Gdansk, Poland, pp. 889-898, 2016.
- Severin, M., 2019. Analyse der Flammenstabilisierung intensiv mischender Jetflammen für Gasturbinenbrennkammern (PhD Thesis). Universität Stuttgart.
- Ax, H., Lammel, O., Lückerath, R., Severin, M., 2020. High-Momentum Jet Flames at Elevated Pressure, C: Statistical Distribution of Thermochemical States Obtained From Laser-Raman Measurements. Journal of Engineering for Gas Turbines and Power 142, 071011. [CrossRef]
- Gruhlke, P., Janbazi, H., Wlokas, I., Beck, C., Kempf, A.M., 2020. Investigation of a High Karlovitz, High Pressure Premixed Jet Flame with Heat Losses by LES. Combustion Science and Technology 192, 2138–2170. [CrossRef]
- Gruhlke, P., Janbazi, H., Wollny, P., Wlokas, I., Beck, C., Janus, B., Kempf, A.M., 2021. Large-Eddy Simulation of a Lifted High-Pressure Jet-Flame with Direct Chemistry. Combustion Science and Technology 1–25. [CrossRef]
- AbdelMigid, T.A.; Saqr, K.M.; Kotb, M.A.; Aboelfarag, A.A. Revisiting the lid-driven cavity flow problem: Review and new steady state benchmarking results using GPU accelerated code. Alexandria engineering journal 2017, 56, 123–135. [Google Scholar] [CrossRef]
- MB1 Microbenchmark - Lid-driven cavity 3D. Available online: https://develop.openfoam.com/committees/hpc/-/tree/develop/incompressible/icoFoam/cavity3D (accessed on 30 January 2024).
- GC2 Grand Challenge - DLR Confined Jet High Pressure Combustor. Available online: https://develop.openfoam.com/committees/hpc/-/tree/develop/combustion/XiFoam/DLRCJH (accessed on 30 January 2024).
- Lesnik, Sergey and Rusche, Henrik, exaFOAM Grand Challenge GC2 - DLR Confined Jet High Pressure Combustor, DaRUS, V1. [CrossRef]
- Available online: https://perf.wiki.kernel.org (accessed on 1 January 2024).
- Available online: https://icl.utk.edu/papi (accessed on 1 January 2024).
- Available online: https://github.com/LLNL/mpiP (accessed on 1 January 2024).
- Available online: https://pop-coe.eu/node/69 (accessed on 1 January 2024).
- Available online: https://tools.bsc.es/extrae (accessed on 1 January 2024).
- Ma, J.; Wu, K.; Jiang, Z.; Couples, G.D. SHIFT: An implementation for lattice Boltzmann simulation in low-porosity porous media, Phys. Rev. E 2010, 81, 056702. [Google Scholar] [CrossRef] [PubMed]







| Total grid elements | Runtime per time step [s] | Grid elements per rank | FVOPS per node | L3 ratio | MPI ratio | Parallel efficiency | IPC | IPC*PE |
|---|---|---|---|---|---|---|---|---|
| 27,000,000 | 8.651 | 210,938 | 3.1M | 0.823 | 0.056 | 0.96 | 0.31 | 0.299 |
| 15,625,000 | 4.267 | 122,070 | 3.6M | 0.703 | 0.067 | 0.95 | 0.4 | 0.379 |
| 8,000,000 | 1.599 | 62,500 | 5.0M | 0.520 | 0.089 | 0.92 | 0.73 | 0.643 |
| 5,359,375 | 0.646 | 41,870 | 8.3M | 0.284 | 0.187 | 0.82 | 1.41 | 1.096 |
| 3,375,000 | 0.261 | 26,367 | 12.9M | 0.083 | 0.279 | 0.77 | 1.72 | 1.238 |
| 1,953,125 | 0.144 | 15,259 | 13.6M | 0.054 | 0.353 | 0.71 | 1.95 | 1.214 |
| 1,000,000 | 0.075 | 7,813 | 13.4M | 0.050 | 0.469 | 0.65 | 2.01 | 1.088 |
| 421,875 | 0.040 | 3,296 | 10.5M | 0.095 | 0.625 | 0.54 | 2.07 | 0.876 |
| 125,000 | 0.025 | 977 | 5.0M | 0.266 | 0.792 | 0.37 | 1.82 | 0.536 |
| 15,625 | 0.017 | 122 | 0.9M | 0.342 | 0.875 | 0.19 | 1.33 | 0.324 |
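
The columns of the table above can be cross-checked with a few lines of arithmetic. The sketch below is a minimal illustration under two assumptions that this excerpt does not state: that FVOPS denotes finite volumes (grid elements) processed per second, computed as total grid elements divided by runtime per time step, and that every row was run on a single node with 128 MPI ranks (inferred from the ratio of total grid elements to grid elements per rank). The names `fvops_per_node` and `cells_per_rank` are hypothetical helpers, not part of any tool used in the study.

```python
# Rows copied from the table: (total grid elements, runtime per time step [s]).
ROWS = [
    (27_000_000, 8.651),
    (15_625_000, 4.267),
    (8_000_000, 1.599),
    (5_359_375, 0.646),
    (3_375_000, 0.261),
    (1_953_125, 0.144),
    (1_000_000, 0.075),
    (421_875, 0.040),
    (125_000, 0.025),
    (15_625, 0.017),
]

# Assumption: one node, 128 ranks (total elements / elements per rank in the table).
RANKS_PER_NODE = 128


def fvops_per_node(cells, seconds_per_step, nodes=1):
    """Assumed definition: grid elements processed per second per node."""
    return cells / seconds_per_step / nodes


def cells_per_rank(cells, ranks=RANKS_PER_NODE):
    """Average number of grid elements handled by each MPI rank."""
    return cells / ranks


if __name__ == "__main__":
    for cells, runtime in ROWS:
        print(
            f"{cells:>12,d} cells  "
            f"{cells_per_rank(cells):>10,.0f} cells/rank  "
            f"{fvops_per_node(cells, runtime) / 1e6:6.2f} MFVOPS/node"
        )
```

Because the runtimes in the table are rounded to three decimals, the recomputed values agree with the tabulated FVOPS only to about the first two significant digits.
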

| System | CPUs | Cores per node | Memory | L3 cache |
|---|---|---|---|---|
| Hawk | AMD EPYC 7742 @ 2.25 GHz | 2 x 64 | 256 GB | 256 MB |
| Vulcan | Intel Xeon Gold 6138 @ 2.0 GHz | 2 x 20 | 192 GB | 27.5 MB |
| Armida ARM | Cavium ThunderX2 @ 2.5 GHz | 2 x 32 | 256 GB | 32 MB |
| Armida AMD | AMD EPYC 7313 @ 3.0 GHz | 2 x 16 | 256 GB | 128 MB |
| Armida Intel 1 | Intel Xeon Gold 6330 @ 2.0 GHz | 2 x 28 | 256 GB | 42 MB |
| Armida Intel 2 | Intel Xeon Gold 6326 @ 2.9 GHz | 2 x 16 | 512 GB | 24 MB |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).