Preprint
Review

This version is not peer-reviewed.

Linear Solvers in OpenFOAM: A Technical Review and SIMPLE Convergence Study

Submitted:

21 May 2026

Posted:

22 May 2026

Read the latest preprint version here

Abstract
This article reviews the linear solvers available in OpenFOAM and assesses their impact on the convergence behaviour of the SIMPLE algorithm. The discretisation of transport equations in CFD results in large and sparse linear systems, for which the choice of linear solver strongly influences the computational time. Although the solver does not change the final discrete solution, the difference in speed and robustness between the solvers can be more than an order of magnitude. A brief overview is given of how the velocity and pressure fields are decoupled in OpenFOAM, followed by a detailed review of the main linear solver families, including direct methods, basic iterative methods, multigrid methods and Krylov subspace methods, with attention to their practical strengths and weaknesses. The performance of the most advanced solvers is evaluated on a full-scale non-reacting kiln case consisting of 2.3 million cells. The pressure-corrector equation is identified as the main bottleneck in the SIMPLE algorithm. The Conjugate Gradient (CG) solver with the Generalised Geometric–Algebraic MultiGrid (GAMG) preconditioner is found to be the fastest and most stable method, achieving speed-ups of up to a factor of 7 compared to the slower advanced methods. Using GAMG as preconditioner also improves the robustness of the Bi-CGStab method.
Keywords: 
;  ;  ;  ;  ;  

1. Introduction

Computational Fluid Dynamics (CFD) is a part of fluid mechanics that uses numerical analysis to solve advection–diffusion–reaction type equations related to a wide range of applications involving fluid flow, and heat and mass transfer. Discretisation is a technique in numerical analysis that transforms a continuous problem into a discrete counterpart, which can be solved numerically by means of numerical solution methods.
It first requires that the geometry is divided into small, non-overlapping finite elements (or finite volumes), usually called cells. The discretisation method then transforms the partial differential equations into discrete algebraic equations, which are integrated over these discrete cells. After the discretisation of each transport equation, the resulting set of discrete equations needs to be linearised, which leads to a linear system of equations in the form of A u = b . The resulting systems of equations can be solved numerically by linear solvers.
Although every CFD simulation depends heavily on solving such linear systems, the underlying solution methods typically receive little attention in everyday CFD work and are generally treated as a ’black box’. Since the choice of linear solver does not affect the final discrete solution but only the speed at which it is reached, the common approach is to rely on default linear solver settings or to compensate slow convergence with additional computational hardware. However, different linear solvers exhibit very different performance characteristics. Understanding these strengths and weaknesses can lead to significant speed-ups, often far exceeding what can be achieved by simply increasing computational resources. This is especially true with OpenFOAM, where the linear solver options are very diverse.
This paper provides a detailed review of the linear solver theory in OpenFOAM, in order to improve the general understanding and appreciation of the mathematical models, and to aid in speeding up the convergence without the need for stronger hardware. Details are provided in Section 2. The theory in this work partially follows Refs. [1,2,3,4,5].
The performance of the solution methods in OpenFOAM is demonstrated for an industrial case in Section 3.

1.1. Decoupling the Velocity and Pressure in CFD

Before discussing the linear solvers used in this work, it is important to briefly revisit how the velocity and pressure fields are decoupled in CFD. Once the velocity field is determined, solving the transport of a particular scalar (such as energy and the variables for turbulence) is relatively easy, as the flow field of that scalar is ’frozen’, and the main challenge is usually to determine the source terms. However, to calculate the velocity field requires solving the discretised Navier-Stokes equations numerically, which is difficult. Not necessarily because it is non-linear, but due to the coupling of the velocity and pressure in the momentum equation. Consider the incompressible steady-state Navier-Stokes equations with neglected gravity
· U = 0 ,
U · U = p ρ + · ν U ,
where the vector U (bold face) is the velocity, p is the pressure, ρ is the density, and ν is the kinematic viscosity. These equations consist of 4 equations and 4 unknowns: U x , U y , U z and p (note that the density here is not a variable but a reference value). This would intuitively mean that it is directly solvable, even if there is no explicit equation for p. However, the continuity equation is not really the ’closing’ equation, but acts as a restriction to the momentum equations in order to satisfy mass conservation.
The Semi-Implicit Method for Pressure-Linked Equations (SIMPLE) algorithm decouples the pressure and velocity fields, allowing for calculating them separately, then using one field to correct the other field iteratively until convergence is achieved.

1.2. The SIMPLE Algorithm

The SIMPLE algorithm is developed by Patankar [6] to solve the steady-state incompressible Navier-Stokes equations. The main idea is to use the continuity equation to derive an equation for the pressure after splitting it from the momentum equations. The pressure equation will then act as a corrector for the modified momentum equation. It starts by by writing the momentum equations (Eq. 2) in general matrix form:
A U = p ,
where A is the coefficient matrix resulting from discretising Eq. 2 using the Finite Volume Method. All these coefficients are known. Eq. 3 can be solved when the pressure is known from the previous iteration (starting from an initial guess at the first iteration), of which its gradient acts as the source term here. Also, the non-linearity appearing in the LHS of Eq. 2 is linearised by evaluating the mass fluxes using the velocity field from the previous iteration, resulting in a linear system for the velocity. The LHS of Eq. 3 is known in OpenFOAM as the velocity equation. The calculated velocity field does not yet satisfy continuity and is still a guess. This stage is called the momentum-predictor.
The derivation of the pressure equation starts from splitting the coefficient matrix A by extracting its diagonal into matrix D :
A U = D U H .
The reason is because a diagonal matrix can easily be inverted ( D 1 ) which is very useful. H is the residual vector containing the contributions of all the neighbour cells, and will be used as a source term for the pressure equation later. Combining Equations 3 and 4, and multiplying on both sides with D 1 results in an equation for the velocity field which is an explicit function of the pressure field:
U = D 1 H D 1 p .
In the OpenFOAM source code, the diagonal matrix D is defined as rAU, and the frequently recurring product D 1 H is known as HbyA. Now we can use the continuity equation to derive an explicit equation for the pressure. Substituting Eq. 5 into Eq. 1 leads to following Poisson equation, which is known as the pressure corrector:
· ( D 1 p ) = · ( D 1 H ) .
After solving for p in the above pressure equation, the velocity field (Eq. 5) can be corrected to satisfy continuity. Notice that the velocity is calculated explicitly and this is why this method is called Semi-Implicit. The problem now is that, once the velocity is corrected, the pressure equation is no longer satisfied because the matrix H depends on the velocity which has been updated. Therefore this process is iterated until both the velocity and pressure fields no longer change after correction.
The SIMPLE algorithm can be extended to problems beyond steady incompressible flows. In such cases, most transport variables such as chemical species and enthalpy are solved after the momentum predictor step in OpenFOAM, and the turbulence transport equations are solved after the pressure corrector step.

2. Solution Methods

2.1. Introduction

For each transport equation, the discretisation and linearisation process leads to a system of linear equations, which is written as the following general matrix equation:
A u = b ,
where A is the coefficient matrix resulting from the linearisation and mesh geometry, u is the vector that contains the unknown values of the dependent variable at the cell centroids which needs to be solved for, while b is the vector containing all the sources, constants and boundary conditions.
With the momentum-predictor in the previous section (Eq. 3), it is easy to see how it is translated into a general matrix equation:
  • A A ,
  • U u ,
  • p b ,
while in the pressure-corrector (Eq. 6) the translation is a bit less apparent:
  • D * A ,
  • p u ,
  • · ( D 1 H ) b ,
where D * is the resulting coefficient matrix after combining D 1 with the discretised Laplace operator coefficients of p.
The linear systems are solved by linear solvers, for which different solution methods exist, which are generally grouped into direct and iterative methods, each consisting of many sub-groups. Due to the linearisation process with the finite volume method, the coefficient matrix is very large (many cells required for accuracy), very sparse, and diagonal dominant. Iterative methods therefore have been more popular because they are more suited for this type of applications as they typically require lower computational cost per iteration and lower memory. The (modified) direct methods can, however, be used as a preconditioner, in which they are used to replace the coefficient matrix by a matrix for which the corresponding linear system of equations is easier to solve, while keeping the same solution. The solution methods that are used in OpenFOAM are discussed here.

2.2. Direct Methods

Even though direct methods, as stand-alone solvers, are not efficient for solving sparse systems of linear equations due to their high computational cost, their discussion will lead the way for introducing efficient iterative methods in the next sections.
Direct methods apply some form of Gaussian elimination techniques to solve the above mentioned system of linear equations. The modern version is the LU factorisation, which computes the lower and upper triangular matrix L and U, such that
A = L U .
For diagonal dominant matrices, the upper triangular matrix U is constructed by performing Gaussian elimination on matrix A, while the elements of L consist of the Gauss factors by which the rows of A are multiplied to get U. Furthermore, the diagonal elements of L are set equal to one, so for a 3x3 matrix, Eq. 8 becomes
a 11 a 12 a 13 a 21 a 22 a 23 a 31 a 32 a 33 = 1 0 0 l 21 1 0 l 31 l 32 1 u 11 u 12 u 13 0 u 22 u 23 0 0 u 33 .
When L and U are found, the problem A u = b is split into two problems that are much easier to solve
L y = b , U u = y .
To avoid rounding errors it might be necessary to apply some form of pivoting, i.e. reordering the rows of matrix A. However for diagonal dominant matrices this is not necessary [?]. If A is Symmetric Positive Definite (SPD) the LU-factorisation reduces to the Cholesky factorisation of A.
A = C C T ,
Where C is a lower triangular matrix. This results to even more memory and compuational savings as only one triangular matrix needs to be computed.
The efficiency of the direct methods however is lost due to the occurrence of fill-in after factorisation, where many zero entries in the sparse matrix become nonzero during the factorisation process. This significantly increases the memory requirements and computational cost, thereby disrupting the advantages of sparse matrix storage, especially for large practical CFD applications. This leaves an opening for iterative methods.
It will be shown that direct methods are very suitable as preconditioners for the Krylov subspace methods where the sparsity pattern is deliberately kept after factorisation, known as incomplete factorisation, so that A LU or A CC T .

2.3. Basic Iterative Methods

The linear solvers in most CFD codes make use of iterative methods. Iterative methods are techniques created to obtain an approximate solution of linear systems. For the implementation of these methods, successive approximations are used. Starting from an initial guess u 0 , a new approximation u k is obtained at each iteration k, until an approximated solution is found that is close enough to the exact solution u.
The error vector is defined as
e k = u u k .
Solving the error vector is just as difficult as solving the exact solution of the discrete linear system A u = b. A computable measure of the quality of the approximation is therefore obtained from the residual,
r k = b Au k ,
which represents the imbalance of the approximation of the conservation laws, with r k = 0 being the exact solution. A common stopping criterion or tolerance ( ϵ ) for iterative methods is the relative residual, defined as the 2-norm of the residual of the k-th iteration divided by the 2-norm of the right-hand side,
r k = | | r k | | 2 | | b | | 2 ϵ .
The idea of an iterative method is that the matrix A is decomposed into two matrices, M and N. Such that A = M N ; and the original linear system A u = b transforms into:
Au = ( M N ) u = b .
rearranging terms we obtain:
Mu = Nu + b = ( M A ) u + b .
The latter system is used to perform an iterative process, finding at each iteration (k) a more accurate solution. Most of the iterative methods are derived from the following recurrence relation:
u k + 1 = M 1 ( M A ) u k + M 1 b ,
Or after rearranging and using Eq. 13
u k + 1 = u k + M 1 r k ,
where the matrix M is chosen such that M 1 can be determined easily, so for example a diagonal or triangular matrix.
Basic iterative methods (BIM) are obtained by decomposing the system matrix as A = D E F (see Figure 1) . D being the diagonal of A, and E and F are the strictly lower and upper parts. Based on the choice of M and N, different iterative methods can be obtained. Some of the BIMs are presented in Table 1, of which Jacobi and Gauss-Seidel are the options for OpenFOAM’s BIM solver smoothSolver.
Out of presented methods, the Successive Over-Relaxation method is significantly faster than the others for optimal damping parameters ω , which value is not straightforward to find. Nevertheless, BIMs in general are too slow for engineering problems as stand alone solvers, as they are only effective in eliminating the high frequency components of the error, but very slow with reducing the low frequency components (Figure 2). This gets worse for finer meshes.
Nevertheless, it is the ability of eliminating the high frequency error components (smoothing) why the BIMs serve as the second set of building blocks for more efficient iterative methods, in addition to the direct methods. As smoothers for the multigrid method or as preconditioners for the Krylov subspace methods.

2.4. Multigrid Methods

The convergence of a BIM can be accelerated with a multigrid method, which is known to be among the most efficient solvers for elliptic problems [7] , such as the steady-state transport equation of a particular scalar ϕ
· ( ρ U ϕ ) advection = · Γ ϕ diffusion + Q ϕ source .
The basic idea of a multigrid method is to first smooth the error with a BIM, then transferring the remaining low frequency error components (recursively) to a coarser grid without losing information, until the problem is small enough to solve directly, after which a correction can be transferred back to the fine grid. As a result, the number of iterations required for convergence becomes nearly independent of the mesh element size, and the total computational cost scales approximately linearly with the number of unknowns. This high efficiency explains why multigrid methods are widely used in many CFD codes [3,8,9] . Depending on the size of the 2D or 3D mesh, the speed-up can be one or multiple orders of magnitude as compared to the BIM [2].
The detailed procedure is a follows:
1.
Presmoothing - Starting from an initial guess u 0 , apply a few number ν 1 of iterations with a BIM to eliminate the high frequency error components (recall from Eq. 18).
u n h = u n 1 h + M 1 r n 1 h ,
where n = 1 , . . , ν 1 and the superscript h denotes the fine grid size.
2.
Restriction - To transfer the residual to a coarser grid of grid size H, we use the restriction operator I h H
r H = I h H r h ,
where superscript H denotes the coarse grid size.
3.
Solve coarse grid error - The residual can now be used to determine the coarse grid error by combining Eqs. 12 and 13
A H e H = r H .
This is known as the residual equation. As mentioned before, solving this equation is just as hard as solving the original system of linear equations. However, at the current grid level the problem is small enough to solve it directly and the obtained error vector can be used as an approximation at the fine grid in the following step.
4.
Prolongation - The error e H is projected into the fine grid using the prolongation operator I H h (inverse of I h H ), and the vector u ν 1 h is corrected with the approximated error.
u ν 1 h = u ν 1 h + I H h e H
5.
Postsmoothing - Apply a few number of postsmoothing ν 2 iterations as described in step 1 to obtain u 1 = : u ν 1 + ν 2 h
In the above procedure, the sequence described by steps 2 to 4 is known as the defect correction, or coarse grid correction (CGC) and can be summarised by the following equation
u n h = I I H h ( A H ) 1 I h H A u n 1 h .
Although the BIM and the CGC are individually not efficient, the combination of the two as applied above is very efficient as they effectively reduce all the frequency components of the error. The described procedure above is essentially a two-grid cycle when going through the steps once. When steps 1-2 (hence also steps 4-5) are done recursively within one loop we arrive at the multigrid method. Most of the times this is necessary as the long wavelengths of the error modes become shorter in the coarser mesh and therefore presmoothing is needed to eliminate them quickly. Applying a certain number of the restriction-step will lead to a negligible cost of solving the residual equation compared to a smoothing sweep at the finest grid.
Applying step 3 only once is called a V-cycle, and applying it multiple times can result in a W-cycle or an F-cycle, which are more accurate but require also more work. A schematic is shown in Figure 3. There are many strategies and choices to tune the multigrid method, such as the amount of smoothing sweeps, grid-levels, and cycles, with varying trade-offs between speed of solving a single iteration and the overall rate of convergence.
One of the approaches to construct the coarser grid is by applying the restriction operator directly to the system matrix, which is known as the algebraic multigrid (AMG). This method is relatively easy to implement, but works as a black-box solver as it does not have any geometric interpretation of the mesh. The geometric multigrid (GMG) on the other hand obtains the coarser grid by clustering (or agglomerating) the cells of the mesh.
OpenFOAM works with the generalised method of geometric-algebraic multigrid (GAMG) which by default operates as a GMG solver, but also gives the option to enable the older AMG implementation (of its predecessor) instead [?]. The GMG solver agglomerates the cells by pairing each cell with an unpaired neighbour cell with the largest shared face area. This is repeated for every coarse grid level. An examples is shown in Figure 4. If a cells does not have a match, it will be added to the neighbouring group. The system matrix is adjusted accordingly.
The GAMG is a V-cycle method where the user can select a smoother (Gauss Seidel being most favoured) and adjust certain parameters, such as the mesh size at the coarsest level, the amount of presmoothing and postsmoothing sweeps. The solver calculates automatically the amount of intermediate coarse grid levels. A practical example of how the GAMG solver is set up is shown in Sec. Section 3.
Next to multigrid methods, another class of solvers called Krylov subspace methods can be adopted.

2.5. Krylov Subspace Methods

In this section a few other solvers are presented that are currently used in OpenFOAM, which are also one of the most efficient methods for solving large linear systems [1]. These methods are based on projection processes onto Krylov subspaces K i , that is
K i ( A , r 0 ) = s p a n { r 0 , Ar 0 , A 2 r 0 , . . . , A i 1 r 0 } ,
where K is called the Krylov space of dimension i corresponding to matrix A and initial residual r 0 (Eq. 13). The basic idea of Krylov subspace methods is that, for large and sparse matrices, it is more efficient to use A only in matrix–vector products. In this way, the approximate solution is constructed as a polynomial in A applied to the initial residual, without explicitly forming or inverting the matrix [10].
The Conjugate Gradient (CG) method is a Krylov subspace method that is developed to minimize the following quadratic equation
F ( u ) = 1 2 u T A u b T u ,
where A is an SPD matrix. Solving this minimization problem (which is finding u * such that F ( u ) = 0 ) is equivalent to solving A u = b , and falls under the type of gradient methods (see Figure 5).
The solution u * is approached recursively via the relation
u k + 1 = u k + α k d k ,
where d k is the search direction for the next iteration. One way is to choose the residual vector as the search direction, which is known as the steepestdescent method. However, this can lead to the same search directions being computed more than once and oscillations occurring around the local minima, leading to very slow convergence. Therefore in the CG method, every search direction should be in a unique direction. This is accomplished by selecting a set of search directions that are A -orthogonal (or A - conjugate ) to the previous directions. This means that they satisfy the condition d k T A d j = 0 , for k j .
The CG method starts with the residual vector being chosen as the first search direction,
d 0 = r 0 = b A u 0 .
The factor α k in Eq. 27 ensures that the minimum point is found along the current search direction, which can be derived to the following expression
α k = d k T r k d k T A d k .
After obtaining the new value for u (Eq. 27), the new residual is calculated,
r k + 1 = r k α k A d k .
To make sure that the next search direction is A -conjugate to the previous, the following coefficient is used which is simplified to,
β k = r k + 1 T r k + 1 r k T r k ,
so that finally the next search direction can be calculated, which completes the algorithm,
d k + 1 = r k + 1 + β k d k .
If A is SPD and the denominators of Eq. 29 and Eq. 31 are equal to zero, it means that the CG method breaks down when the problem is already solved ( r k = 0), which makes the method robust. The rate of convergence of the CG methods depends on the spectral properties (i.e. the eigenvalue distribution) of matrix A and is related to the condition number of matrix A ,
κ 2 ( A ) = λ max ( A ) λ min ( A ) ,
where λ m a x and λ m i n are the largest and smallest eigenvalues of A , respectively. The more clustered the eigenvalues are, the smaller the value of κ 2 ( A ) , hence faster convergence. For CFD applications, the condition number is about the square of the maximum number of grid points, hence the standard CG method is slow [2].

2.5.1. Preconditioners

Preconditioners are used to accelerate the convergence by transforming the problem into a similar one, but with a more clustered eigenvalue distribution. This is done by multiplying the original system of equations by the inverse of the preconditioned matrix P as follows
P 1 A u = P 1 b .
In order for the preconditioned matrix P to be effective it must fulfill the following requirements. P should approximate A , the computation of P 1 should be cheap, and the condition number of the transformed system should be smaller than that of the original system of equations,
κ 2 ( P 1 A ) = λ max ( P 1 A ) λ min ( P 1 A ) < < κ 2 ( A ) .
For the CG method, P must also be an SPD matrix. Therefore, a good and common choice is the Cholesky factorisation (Eq. 11), which yields P = CC T . But due to fill-in, it takes large amounts of work and memory to construct C . Hence, the non-zero fill-in elements are disregarded, leading to the incomplete Cholesky factorisation of P .
The algorithm of the CG method (Eqs. 27 - 32) can be transformed into that of the preconditioned CG method by replacing r k with z k = P 1 r k . The modification of the CG method is summarized in Algorithm 1.
Algorithm 1 Preconditioned Conjugate Gradient (PCG) method
r 0 = b A u 0 and d 0 = P 1 r 0 ▹ Choose starting direction
for k = 0 , 1 , . . . , until convergence do
     z k = P 1 r k
     α k = d k T z k d k T A d k
     u k + 1 = u k + α k d k
     r k + 1 = r k α k A d k
     β k = r k + 1 T z k + 1 r k T z k
     d k + 1 = z k + 1 + β k d k
end for
Due to its superlinear convergence and for being robust, the preconditioned CG (PCG) method is one of the best methods for solving the heat equation and the pressure corrector equation (Eq. 6), as the coefficient matrices A that result from discretising these diffusion equations on the mesh are SPD.
For the general transport equations which contain the advection term, the coefficient matrix A is not symmetric. Therefore the PCG method is not applicable. A significant amount of Krylov subspace methods are developed for general non-SPD matrices A, but none have all the following three nice properties which the CG method has for SPD matrices:
1.
It is a Krylov subspace method: u k K k ( A , r 0 )
2.
Optimality: the minimizer of the quadratic function (Eq. 26) exists, hence convergence is guaranteed.
3.
Short recurrences, hence requiring low computing work and storage.

2.5.2. General matrices

The Bi-Conjugate Gradient (BiCG) method is a Krylov subspace method for general matrices that transforms the unsymmetrical system into a symmetrical one so that it can by solved using a modified CG method. This is done by rewriting Au = b as
0 A A T 0 u ^ u = b 0 ,
where u ^ is a dummy variable which is not solved for, but only used to convert the system. The update relations in the CG method for the residuals and search directions are augmented by their shadow equivalents which are based on A T ,
r ^ k + 1 = r ^ k α k A T d ^ k
and
d ^ k + 1 = r ^ k + 1 + β k d ^ k .
The orthogonality of the residual and search direction with their shadow counterparts is ensured via the relation
r ^ i T r j = d ^ i T Ad j = 0 , i j ,
hence the name bi-orthogonal or bi-conjugate. The BiCG method is very close to the CG method and generates the same solution as the CG method if A is SPD. It does however require two matrix-vector multiplications (with A and with A T ) instead of one, such that every iteration is nearly twice as expensive as compared to the CG method.
Although the BiCG method shares two of the nice properties of the CG method, there is no optimality for general matrices A , hence convergence is not guaranteed. Moreover, the irregular convergence can lead to serious break-down due to large rounding errors. The Biconjugate Gradient STABilized (Bi-CGSTAB) method [11] is a more robust variant where the recurrence relations r ˜ k = P k ( A ) r 0 are modified to a form
r ˜ k = Q k ( A ) P k ( A ) r 0
such that the multiplication with A T is avoided. For this recurrence, a k -th degree polynomial is taken of the form
Q k ( A ) = ( I ω 1 A ) ( I ω 2 A ) . . . ( I ω k A )
with suitable constants ω k in the k-th iteration that minimizes residual r k with respect to ω k , giving the method a semi-optimality property and smoother convergence behaviour as compared to the BiCG method. The Bi-CGSTAB method is currently the most recent and advanced Krylov subspace method that is implemented in the standard library of OpenFOAM, and can be preconditioned using e.g. the incomplete LU factorisation which is appropriate for general matrices. It is much faster than its predecessor, but sudden breakdown due to rounding errors still occur for the more difficult problems in this paper.

2.5.3. Other Krylov Subspace Methods

Although not used in this work, it is worth to mention the Generalized Minimal RESidual (GMRES) method [1], which falls under another class of Krylov subspace methods for solving general systems of linear equations. It is a stable method with respect to rounding errors, and has an optimality property which guarantees convergence. While the CG method aims at minimizing the A-norm of the error u u k (only possible for SPD matrices), the GMRES method finds orthogonal vectors (search directions) that minimize the Euclidean norm of the residual r k , leading to a comparable superlinear convergence behavior as the CG method. The main drawback however is that the whole sequence of search directions from the previous iterates have to be stored and multiplied with, such that the computational and memory requirements quickly become prohibitive if the method does not converge after a few iterations. Restarting the method after a certain number of iterations overcomes this limitation of long recurrences, but comes at the cost of losing the nice convergence properties due to throwing away all search information [12]. The generalized conjugate residual (GCR) method is similar to the GMRES method but with somewhat more floating point operations per iteration. However, the GCR method gives the ability to truncate to the last few search directions that are saved for the next iteration, which in return performs better than restarting the GMRES method.
Hybrid methods provide a balance between the short recurrences of the BiCG-type methods on one side and the optimality property of the GMRES-type methods on the other. First on the list is the GMRES Recursive (GMRESR) method [12] that consists of an inner and outer loop. The solution is first approximated in an inner-loop with the restarted GMRES method, after which the found search directions are condensed to the outer-loop, where they are used to approximate the solution with the truncated GCR method. The GMRESR method generally speeds up the convergence rate of the restarted GMRES method. Later improvements of the GMRESR-type methods are the GCRO-type methods, where the inner iteration loop takes place in a Krylov subspace orthogonal to the subspace of the outer loop (so-called subspace recycling), yielding further acceleration of the convergence rate[13] [14].
The final hybrid method worth mentioning, which leans more towards the BiCG-type methods, is the Induced Dimension Reduction (IDR) method. This method was already proposed before the Bi-CGSTAB method but has been revived to a more generalized variant IDR(s) [15]. The IDR(s) method is closely related to the Bi-CGSTAB method in the sense that the matrix-vector multiplication with A T is avoided via a minimizing polynomial. The main difference is that the generated residuals are forced to be in subspaces of decreasing order until an s-dimensional space is reached. For the right value of s (typically s 10.), this robust and efficient method is at least as fast as the Bi-CGSTAB method when s = 1 (original IDR method) and significantly faster for increased values of s, especially for more difficult problems. When s is large enough, the rate of convergence is nearly as good as full GMRES [16]. Hybrid methods such as the IDR(s) method and the GCRO-type methods are good candidates for the next generation of linear solvers for general matrices to implement in OpenFOAM.

3. Impact Linear Solvers on Convergence of the SIMPLE Algorithm

The speed of convergence towards a solution depends not only on the type of linear solvers used, but also on the type of problem that is being solved. This section evaluates the performance of the advanced linear solvers in OpenFOAM by conducting several numerical experiments on different cases.

3.1. Relative Tolerance

The solver algorithms of OpenFOAM are constructed in ’inner’ and ’outer’ loops. In the inner loops, the linear systems of equations are solved sequentially, and this is where the linear solvers do the work, which are referred to as ’inner’ iterations. The algorithm proceeds to the next linear system of equations in the inner loop once the current relative tolerance target is reached. In the outer loop, the flow develops to the next (pseudo) time step, which may or may not require a time integration, depending on whether the SIMPLE algorithm is used for steady-state problems, or the PISO algorithm for transient problems, or combinations thereof (PIMPLE). As the algorithms are iterative themselves, the linear solvers do not need to iterate to the smallest possible error in each loop before the algorithm progresses.
The 2D steady-state heat equation is a problem which does not require time integration or a flow to be developed. As such, this problem is solved within 1 outer iteration, and even up to 1 inner iteration with the most efficient solver, as is shown in Figure 6 and Figure 7.
In a fluid-related problem, however, the flow is physically developing on each outer iteration. Aiming for a low relative tolerance per inner loop can be more stable, but it also makes it much more time-consuming. The famous Pitz and Daily wind tunnel tutorial case [5] demonstrates that using a relative tolerance of 10 1 instead of 10 3 for all equations will lead to about twice as fast convergence, while a relative tolerance of 10 3 leads to smoother convergence with less oscillations, as is shown in Figure 8.

3.2. Numerical Experiment: Non-Reacting Flow

This section evaluates the performance of the linear solvers on a three-dimensional full scale, non-reacting and non-rotating kiln. The used solver is rhoSimpleFOAM, which is the standard solver for turbulent non-reacting steady-state compressible flows.
The geometry is shown in Figs. Figure 9 and Figure 10, where the air flow enters the domain at atmospheric pressure through the primary and secondary inlets with 0.15 kg/s and 1.6 kg/s, respectively, while the fuel inlet flow is excluded. Further boundary conditions are shown in Table 2. The mesh contains 2.3 million cells and all simulations are run for 5000 outer iterations on 20 cores in a single node.
Figure 11 shows the basic stream pattern in the longitudinal cross-section, where the annular shaped recirculation zone surrounding the burner is spotted.

3.2.1. Solver Combinations

The most advanced standard linear solvers of OpenFOAM have been tested on this applications, in different combinations which are shown in Table 3. When a Krylov subspace method is used, the solver name is preceded with a preconditioner. Instead of the IC and ILU factorisations, OpenFOAM implements the simplified diagonal-based variants as preconditioner, respectively the DIC and DILU, where off-diagonal terms are dropped. The reciprocal of the preconditioned diagonal is calculated and stored. The MG solver can also be used as a preconditioner. The settings for OpenFOAM’s MG solver (GAMG) is shown in Appendix A, and for these type of problems with complex shape and flows, the geometric variant of the GAMG method (agglomerator: faceAreaPair) is preferred, which is about twice as fast as the algebraic variant.

3.2.2. Results

Before discussing the linear solver performance, it is observed that by using the advanced solvers of OpenFOAM usually leads to 1 or 2 inner iterations per outer iteration for all variables, except with the pressure equation, which usually requires 1 or more orders of magnitude of inner iterations, before moving to the next outer iteration. Therefore the pressure-corrector equation is the major bottleneck that delays the solution to converge quickly. This occurs for most non-reacting flow cases. Figure 12 shows that the CG method with the DIC preconditioner is the least efficient method for the pressure equation, requiring 2 orders of magnitude of inner iterations on average to reach the relative target of 10 3 . The MG method performs significantly better, and the most efficient method is the combination of the former two: the CG method with the MG as preconditioner.
Figure 13 shows the overall performance of the different linear solver combinations. Three critera are used to determine the speed of convergence:
  • The time (in hours) to reach 5000 outer iterations,
  • The amount of outer iterations to reach convergence,
  • The time (in hours) to reach convergence.
The time needed to reach 5000 iterations with the different combinations is in line with the amount of required inner iterations for the pressure equation as is discussed above, since the pressure equation is the bottleneck.
When looking at the amount of outer iterations required to converge, the CG method with MG preconditioner also requires the least amount of iterations. This further accelerates the convergence time, which is a factor 2 to 3 times faster than the other solver combinations. However, if the solver is forced to a strictly compressible flow environment by setting the option transonic to on, the pressure equation becomes non-symmetric and the MG method will not work.
With regard to the other variables, the fastest solver is the Bi-CGSTAB method with the DILU preconditioner. The MG method, as solver or as preconditioner, is a bit slower because the other variables only require 1 or 2 inner iterations to reach the relative target of 10 3 , and one MG operation is computationally more expensive than one Bi-CGSTAB operation, and much more expensive than one DILU operation.
The simulation can be accelerated further by relaxing the relative target from 10 3 to 10 1 , as is shown in Figure 14. This however may lead to less stability. In fact when using the DICCG-DILUBiCGSTAB method, the solution will even quickly diverge, as can be seen in Figure 15, while the MGCG-DILUBiCGSTAB method remains stable and the convergence is accelerated with 25 % as compared with the relative tolerance of 10 3 . Comparing these two extremes, with on one side the slowest of the selected methods that is restricted to a relative target of 10 3 , and on the other side the fastest method that is stable at a relative target of 10 1 , a speed-up of nearly a factor 7 is reached. This speed up does not even consider the basic iterative solvers of OpenFOAM (smoothSolver).
Table 4. Simulation time of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 3
Table 4. Simulation time of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 3
Method Time to 5000 iter Convergence Time to Convergence
MG-DILUBiCGSTAB 3.4 h 3000 iter 2.2 h
MG-MG 4.3 h 3500 iter 3.1 h
DICCG-MG 5.3 h 3000 iter 3.6 h
DICCG-DILUBiCGSTAB 6.4 h 5000 iter 6.4 h
MGCG-DILUBiCGSTAB 2.3 h 2500 iter 1.2 h
MGCG-MGBiCGSTAB 2.7 h 2500 iter 1.4 h
Table 5. Simulation time of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 1
Table 5. Simulation time of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 1
Method Time to 5000 iter Convergence Time to Convergence
MG-DILUBiCGSTAB 2.0 h 3000 iter 1.2 h
MG-MG 2.5 h 4000 iter 2.0 h
DICCG-MG 2.7 h 3000 iter 1.8 h
DICCG-DILUBiCGSTAB Diverged
MGCG-DILUBiCGSTAB 1.5 h 3000 iter 0.9 h

3.2.3. Further Advancements

As discussed in the end of Sec. Section 2, faster linear solvers do exist than the ones provided by OpenFOAM. The state-of-the-art linear solvers and preconditioners can be found in the open source library Portable, Extensible Toolkit for Scientific computation (PETSc), and a plug-in into OpenFOAM is developed in [17]. Their work also discusses the very poor scalability of the MG preconditioner of OpenFOAM for massively parallel clusters (103 cores), whereas the DIC preconditioner is shown to have outstanding superlinear scalability. It is reported that the DIC preconditioner overtakes the MG preconditioner in convergence time when using more than 2000 cores for a 3D laminar lid driven cavity flow problem consisting of 64 millions cells. The PETSc counterpart of OpenFOAM’s MG preconditioner solves the scalability issue.

4. Conclusion

This work has demonstrated the large differences in speed and robustness between the linear solvers available in OpenFOAM, with convergence times varying by more than an order of magnitude. The pressure-corrector equation was the major bottleneck that delayed the solution from converging quickly. Several of the more advanced standard linear solvers were tested with a 20-core high-performance computer on a 3D full-scale kiln, consisting of 2.3 million cells.
The most efficient and stable solution method turned out to be the Conjugate Gradient (CG) solver, combined with the Generalised Geometric-Algebraic MultiGrid (GAMG) as a preconditioner. A speed-up factor of 7 was reached as compared with the slowest of the advanced methods. Even higher speed-ups were reached compared with the basic linear solvers of OpenFOAM. Also, the stability is enhanced when the GAMG is used as preconditioner for the Bi-CG Stabilized (Bi-CGSTAB) method instead of the simple diagonal-based preconditioners.
Finally, due to the iterative nature of the SIMPLE algorithm, a faster linear solver than the MG-CG combination was redundant for this case as each time step required 1 or 2 iterations.

Author Contributions

Conceptualization, M.A.; methodology, M.A.; software, M.A.; validation, M.A. and C.V.; formal analysis, M.A.; investigation, M.A.; resources, C.V.; data curation, M.A.; writing—original draft preparation, M.A.; writing—review and editing, C.V.; visualization, M.A.; supervision, C.V.; project administration, C.V.; funding acquisition, C.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Appendix A

The settings for OpenFOAM’s MG solver (GAMG) is shown here:
solver GAMG;
tolerance 1e-9;
relTol e-3; //e-1;
smoother GaussSeidel;
cacheAgglomeration true;
nCellsInCoarsestLevel 1500;
agglomerator faceAreaPair; //algebraicPair;
mergeLevels 1;
nPreSweeps 1; //1 for p, 0 for all other
nPostSweeps 2;
nFinestSweeps 2;
maxIter 800;
directSolveCoarsest false;
where the number of cells at the coarsest level is set to 1000, which is roughly the square root of the total amount of cells.

References

  1. Saad, Y. Iterative methods for sparse linear systems; SIAM, 2003. [Google Scholar] [CrossRef]
  2. Ferziger, J.H.; Peri’c, M. Computational Methods for Fluid Dynamics, 3rd ed.; Springer: Berlin, 2002. [Google Scholar]
  3. Versteeg, H.K.; Malalasekera, W. An Introduction to Computational Fluid Dynamics: The Finite Volume Method, 2nd ed.; Pearson Education: Harlow, 2007. [Google Scholar]
  4. Moukalled, F.; Mangani, L.; Darwish, M. The Finite Volume Method in Computational Fluid Dynamics: An Advanced Introduction with OpenFOAM and Matlab; Springer: Cham, 2016. [Google Scholar] [CrossRef]
  5. Greenshields, C. OpenFOAM user guide; OpenFOAM Foundation Ltd., 2017. [Google Scholar]
  6. Patankar, S.V. Numerical Heat Transfer and Fluid Flow; Hemisphere Publishing; McGraw-Hill, 1980. [Google Scholar] [CrossRef]
  7. Tielen, R.; Möller, M.; Göddeke, D.; Vuik, C. p-multigrid methods and their comparison to h-multigrid methods within Isogeometric Analysis. Comput. Methods Appl. Mech. Eng. 2020, 372, 113347. [Google Scholar] [CrossRef]
  8. Trottenberg, U.; Oosterlee, C.; Schüller, A. Multigrid, 1st ed.; Academic Press: London, 2001. [Google Scholar]
  9. Wesseling, P.; Oosterlee, C. Geometric multigrid with applications to computational fluid dynamics. J. Comput. Appl. Math. 2001, 128, 311–334. [Google Scholar] [CrossRef]
  10. Wesseling, P. Principles of Computational Fluid Dynamics, 1st ed.; Springer: Berlin, 2001. [Google Scholar] [CrossRef]
  11. van der Vorst, H.A. Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CGfor the Solution of Nonsymmetric Linear Systems. SIAM J. Sci. Stat. Comput. 1992, 13, 631–644. [Google Scholar] [CrossRef]
  12. van der Vorst, H.A.; Vuik, C. GMRESR: a family of nested GMRES methods. Numer. Linear Algebr. With Appl. 1994, 1, 369–386. [Google Scholar] [CrossRef]
  13. Amritkar, A.; de Sturler, E.; Świrydowicz, K.; Tafti, D.; Ahuja, K. Recycling Krylov subspaces for CFD applications and a new hybrid recycling solver. J. Comput. Phys. 2015, 303, 222–237. [Google Scholar] [CrossRef]
  14. Thomas, S.; Baker, A.; Gaudreaulte, S. Augmented MGS-CGS Block-Arnoldi Recycling Solvers. SIAM J. Sci. Comput. 2025, 47, A1458–A1485. [Google Scholar] [CrossRef]
  15. Sonneveld, P.; van Gijzen, M. IDR(s): A Family of Simple and Fast Algorithms for Solving Large Nonsymmetric Systems of Linear Equations. SIAM J. Sci. Comput. 2009, 31, 1035–1062. [Google Scholar] [CrossRef]
  16. Sonneveld, P. On the convergence behaviour of IDR (s). Reports of the Department of Applied Mathematical Analysis, 2010; pp. 10–08. [Google Scholar]
  17. Bnà, S.; Spisso, I.; Olesen, M.; Rossi, G. PETSc4FOAM: A Library to plug-in PETSc into the OpenFOAM Framework. Pr. White Pap. 2020. [Google Scholar]
Figure 1. Splitting of the coefficient matrix A
Figure 1. Splitting of the coefficient matrix A
Preprints 214755 g001
Figure 2. Smoothing process of a BIM
Figure 2. Smoothing process of a BIM
Preprints 214755 g002
Figure 3. Different multigrid cycles starting from the finest grid at the top, towards the coarsest grid at the bottom and back.
Figure 3. Different multigrid cycles starting from the finest grid at the top, towards the coarsest grid at the bottom and back.
Preprints 214755 g003
Figure 4. Geometric agglomeration using face pairs in three coarse grid levels 1, 2 and 3.
Figure 4. Geometric agglomeration using face pairs in three coarse grid levels 1, 2 and 3.
Preprints 214755 g004
Figure 5. Comparison of the convergence between the steepest descent and CG method for an n x n matrix of size 2. In theory, the CG method should converge after n steps.
Figure 5. Comparison of the convergence between the steepest descent and CG method for an n x n matrix of size 2. In theory, the CG method should converge after n steps.
Preprints 214755 g005
Figure 6. Example 2D steady-state heat equation problem in a rectangular surface with 4000 equally divided cells, along with its boundary conditions.
Figure 6. Example 2D steady-state heat equation problem in a rectangular surface with 4000 equally divided cells, along with its boundary conditions.
Preprints 214755 g006
Figure 7. Residual plot of the 2D steady-state heat equation problem.
Figure 7. Residual plot of the 2D steady-state heat equation problem.
Preprints 214755 g007
Figure 8. Comparison between two simulations of the Pitz and Daily wind tunnel tutorial case using 10 3 and 10 1 as relative tolerance, respectively. The pressure equation is solved with the GAMG method , while the other equations are solved with the Bi-CGSTAB method and ILU preconditioner.
Figure 8. Comparison between two simulations of the Pitz and Daily wind tunnel tutorial case using 10 3 and 10 1 as relative tolerance, respectively. The pressure equation is solved with the GAMG method , while the other equations are solved with the Bi-CGSTAB method and ILU preconditioner.
Preprints 214755 g008
Figure 9. global dimensions of the kiln, with the inner and outer radii of the inlets
Figure 9. global dimensions of the kiln, with the inner and outer radii of the inlets
Preprints 214755 g009
Figure 10. Inside view of the burner side of the kiln with the three inlets.
Figure 10. Inside view of the burner side of the kiln with the three inlets.
Preprints 214755 g010
Figure 11. Stream pattern inside the kiln for the studied case in this section.
Figure 11. Stream pattern inside the kiln for the studied case in this section.
Preprints 214755 g011
Figure 12. amount of inner iterations of the pressure equation per outer iteration, using a relative tolerance of 10 3 .
Figure 12. amount of inner iterations of the pressure equation per outer iteration, using a relative tolerance of 10 3 .
Preprints 214755 g012
Figure 13. Simulations durations of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 3
Figure 13. Simulations durations of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 3
Preprints 214755 g013
Figure 14. Simulations durations of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 1
Figure 14. Simulations durations of the different methods w.r.t time and outer iterations, using a relative tolerance of 10 1
Preprints 214755 g014
Figure 15. Residuals of the studied case for the least efficient and most efficient solver combination.
Figure 15. Residuals of the studied case for the least efficient and most efficient solver combination.
Preprints 214755 g015
Table 1. Basic iterative methods.
Table 1. Basic iterative methods.
Method M N Iteration
Richardson I I A u k + 1 = ( I A ) u k + b
Jacobi D E + F u k + 1 = D 1 ( E + F ) u k + D 1 b
Damped Jacobi ( 1 / ω ) D E + F u k + 1 = ω D 1 ( E + F ) u k + ω D 1 b
Gauss-Seidel D E F u k + 1 = D 1 ( F u k + E u k + 1 ) + D 1 b
Successive Over-Relaxation D ω E ( 1 ω ) D + ω F u k + 1 = ω D 1 ( F u k + E u k + 1 ) + ( 1 ω ) u k + ω D 1 b
Table 2. Boundary conditions for the kiln model.
Table 2. Boundary conditions for the kiln model.
Variable Primary air Secondary air Walls Outlet
T[K] 293 773 zG* zG
Y C H 4 [-] 0 0 zG -
Y O 2 [-] 0.23 0.23 zG -
Y N 2 [-] 0.77 0.77 zG -
m ˙ [kg/s] 0.15 1.6 - -
*zG stands for the Neumann boundary condition zeroGradient.
Table 3. Tested solver combinations. The first solver of each method is used for the pressure equation, while the second solver is used for all other independent variables .
Table 3. Tested solver combinations. The first solver of each method is used for the pressure equation, while the second solver is used for all other independent variables .
Method p Eq. Other variables
MG-DILUBiCGSTAB Solver: MG Solver: Bi-CGSTAB, Prec: DILU
MG-MG Solver: MG Solver: MG
DICCG-MG Solver: CG, Prec: DIC Solver: MG
DICCG-DILUBiCGSTAB Solver: CG, Prec: DIC Solver: Bi-CGSTAB, Prec: DILU
MGCG-DILUBiCGSTAB Solver: CG, Prec: MG Solver: Bi-CGSTAB, Prec: DILU
MGCG-MGBiCGSTAB Solver: CG, Prec: MG Solver: Bi-CGSTAB, Prec: MG
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2026 MDPI (Basel, Switzerland) unless otherwise stated