ANNs-BASED METHODS FOR SOLVING PARTIAL DIFFERENTIAL EQUATIONS: A SURVEY

Abstract: Conventionally, partial differential equation (PDE) problems are solved numerically through a discretization process using finite difference approximations, and the algebraic systems generated by this process are then solved with an iterative method. Recently, researchers have developed a shortcut approach that avoids the discretization process: solving PDE problems with machine learning (ML). This development has the potential to make scientific machine learning a new sub-field of research. The interest in developing ML for solving PDEs has produced an abundance of easy-to-use methods that allow researchers to quickly set up and solve problems. In this review paper, we discuss three methods for solving high-dimensional PDEs, namely PyDEns, NeuroDiffEq, and Nangs, all of which are based on artificial neural networks (ANNs). ANNs are a family of ML methods that have been proven to be universal function approximators. We compare numerical results in solving classical PDEs such as the heat, wave, and Poisson equations to assess the accuracy and efficiency of the methods. The results show that the NeuroDiffEq and Nangs algorithms perform better than PyDEns in solving higher-dimensional PDEs.


Introduction
A partial differential equation (PDE) is an equation specifying a relation between the partial derivatives of a function of one or more variables and the function itself [1]. Many recent advances in modern science have been based on discovering the underlying PDE for the process in question [2]. Hence, the ability to solve PDEs quickly and accurately is an active field of research and industrial interest [3]. Conventionally, PDE problems are solved numerically through a discretization process using finite difference approximations [4]. A number of methods of this kind, including Runge-Kutta and finite difference schemes, are available to approximate the solution accurately and efficiently [5][6][7]. However, while these traditional methods are efficient and well-studied, they require substantial memory and time, which makes the approximation process computationally costly [7]. As an alternative, we can replace traditional numerical discretization with artificial neural networks (ANNs) to approximate the PDE solution [8].
ANNs, a core technique of machine learning, are suited to a large range of ML tasks [9,10]. The simplest form of neural network, the multilayer perceptron, has become a popular function estimator [11]. ANNs have been investigated since the early 1990s for solving PDE problems. Lee and Kang [12] used a parallel architecture to solve first-order ordinary differential equations (ODEs) by implementing Hopfield neural network models. Meade and Fernandez [13], on the other hand, solved linear and nonlinear ODEs using feed-forward neural networks. Lagaris, Likas, and Fotiadis [14] used artificial neural networks to solve ODEs and PDEs while accounting for the initial and boundary conditions.
After a long pause, the development of ANNs for PDE problems gained renewed momentum in the early 21st century. Malek et al. [15] used a hybrid of neural networks and the Nelder-Mead simplex method to find numerical solutions of high-order PDEs, while Hussian et al. [16] modified the neural network to solve PDEs. Later, the Deep Galerkin Method, which uses deep neural networks to solve high-dimensional problems, was introduced [17,18]. Then the Physics-Informed Neural Network (PINN) was introduced by Lu et al. [8] for solving PDEs and improved by Guo et al. [19]; the improved PINN treats the physical information in the PDE as a regularization term, which improves the performance of the neural network. These developments have the potential to make scientific machine learning a new sub-field of research.
Using ANNs to solve PDEs offers attractive advantages, including continuous and differentiable solutions, good interpolation properties, and lower memory requirements [20]. Compared with traditional methods, neural network approximation can take advantage of automatic differentiation [21] and can break the curse of dimensionality [22]. The interest in developing neural networks for solving PDEs has produced a number of easier-to-use methods that allow researchers to solve problems quickly [20]. We found several methods for solving ordinary and partial differential equations (ODEs and PDEs) with neural networks: Nangs [23], PyDEns [24], and NeuroDiffEq [20].
In this paper, we compare these three methods with regard to their capability and effectiveness in solving PDEs. In Section 2, we give an overview of the general ANN methodology for solving PDEs. In Section 3, the approximation theory and error analysis underlying each method are discussed. In Section 4, the algorithms of each method for solving PDEs are derived, and we demonstrate the results of each method on different types of PDEs in Section 5. Finally, we conclude the paper in Section 6.

Artificial Neural Networks
Artificial Neural Networks (ANNs) have been around for quite a while: they were first introduced back in 1943 by Warren McCulloch and Walter Pitts, who were inspired by how biological neurons in the animal brain might work together to perform complex computations [9]. The early success of ANNs faded as they fell behind other machine learning techniques, but the situation turned around in the 1990s with the tremendous increase in computational power and the huge quantities of data available to train ANNs. This technology has been successfully applied to a wide variety of real-world applications [25], including differential equation problems [12], as discussed in this paper.
The perceptron (see Fig. 1a) is one of the simplest ANN architectures and is heavily used today; a perceptron with more than one hidden layer is called a multilayer perceptron (see Fig. 1b). Here we also give an overview of the general neural network methodology for solving a PDE. Consider a PDE of the form

u_t(x, t) = Lu(x, t) + f(x, t),

with initial condition u(x, t = 0) = u_0(x) and boundary conditions u(x, t) = g(x, t), x ∈ ∂Ω, where L is a spatial differential operator and f is a known forcing function.
Let us assume a four-layer perceptron (Fig. 1b) with inputs x and t, whose two hidden layers consist of m and n hidden units (neurons), respectively. The aim is to obtain a trained multilayer perceptron that approximates u(x, t) by the function u_net(x, t; θ), where x and t are the inputs and θ contains the adjustable parameters (weights and biases). For each input pair (x, t), the process begins from the input layer to the first hidden layer:

z_i^(1) = w_xi x + w_ti t + b_i^(1), i = 1, …, m,

where w_xi and w_ti are the weights of the inputs x and t to the first hidden layer, respectively, and b_i^(1) are the first-hidden-layer biases. The result is then activated by the hyperbolic tangent function (tanh) [10]:

h_i^(1) = tanh(z_i^(1)).

The next step feeds the second hidden layer from the first hidden layer:

h_j^(2) = tanh( Σ_i v_ij h_i^(1) + b_j^(2) ), j = 1, …, n,

where v_ij are the weights from the first to the second hidden layer and b_j^(2) are the biases. At the output layer this becomes

u_net(x, t; θ) = Σ_j p_j h_j^(2),

where p_j are the weights from the second hidden layer to the output layer. This feeding process is also called a feed-forward neural network (FFNN). It is then straightforward to express the k-th derivatives of u_net(x, t; θ) with automatic differentiation (AD) for k = 1, 2, …, n.
The performance of the solution u_net(x, t; θ) is measured by computing a loss function, so the goal of all the methods for solving PDEs is to minimize this loss. Each method mentioned earlier builds its loss function differently. If the loss function reaches a near-zero value, we can assume that the ANN is indeed a solution of the PDE [20,23]. To minimize the loss function, the weights and biases are adjusted toward their optimal values, so any optimization technique can be used. The three methods discussed in this study use the Stochastic Gradient Descent (SGD) optimizer to speed up convergence.
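As a concrete illustration of the feed-forward pass and the derivative computation described above, the following NumPy sketch implements a small two-hidden-layer tanh network u_net(x, t; θ) and its exact partial derivative ∂u_net/∂x via the chain rule, which is what AD computes automatically. The layer sizes and random parameters are illustrative assumptions, not the architecture of any of the three libraries.

```python
import numpy as np

rng = np.random.default_rng(0)

# theta = (Wx, Wt, b1, V, b2, p): weights and biases of a 2-hidden-layer tanh MLP
m, n = 8, 8                                    # units in the two hidden layers
Wx, Wt, b1 = rng.standard_normal((3, m))
V = rng.standard_normal((n, m))
b2, p = rng.standard_normal((2, n))

def u_net(x, t):
    """Feed-forward pass: inputs -> tanh -> tanh -> linear output."""
    h1 = np.tanh(Wx * x + Wt * t + b1)         # first hidden layer
    h2 = np.tanh(V @ h1 + b2)                  # second hidden layer
    return p @ h2                              # scalar output u_net(x, t; theta)

def du_dx(x, t):
    """Exact d u_net / dx by the chain rule (what AD would produce)."""
    h1 = np.tanh(Wx * x + Wt * t + b1)
    h2 = np.tanh(V @ h1 + b2)
    # dU/dh2 = p, dh2/dz2 = 1 - h2^2, dz2/dh1 = V, dh1/dz1 = 1 - h1^2, dz1/dx = Wx
    return (p * (1 - h2**2)) @ V @ ((1 - h1**2) * Wx)

# Cross-check against a central finite difference
x0, t0, eps = 0.3, 0.5, 1e-6
fd = (u_net(x0 + eps, t0) - u_net(x0 - eps, t0)) / (2 * eps)
assert abs(du_dx(x0, t0) - fd) < 1e-5
```

Higher derivatives, such as the u_xx needed for the heat equation, follow by applying the same chain rule again; AD frameworks perform this bookkeeping automatically.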

Approximation theories and error analysis of the methods
The three methods for solving PDEs discussed here, PyDEns, NeuroDiffEq, and Nangs, are all based on ANNs. They differ in the way they generate the data points, set up the boundary conditions, and define the loss functions.

PyDEns
PyDEns is built on the Deep Galerkin Method introduced in [17,18,24]. To approximate the solution u(t, x) of (1) with a neural network u_net(t, x; θ), the loss function associated with the training problem is constructed as follows [18,24]:

1. Generate m points inside each of the batches b_1, b_2, b_3, drawn from the domains [0, T] × Ω, [0, T] × ∂Ω, and {0} × Ω respectively, according to the uniform distributions ν_1, ν_2, ν_3. Then, for each point (x, t), perform the following operations.
2. Compute the accuracy with which the approximate solution satisfies the differential operator (Eq. 10).
3. Compute how well the approximate solution satisfies the boundary conditions (Eq. 11).
4. Measure how well the approximate solution satisfies the initial condition (Eq. 12).

Combining the three terms (10), (11), and (12) gives the loss function (13) associated with training the neural network. In other words, u_net(t, x; θ) is fitted to satisfy all of the components (10)-(12). The next step is to minimize the loss function using stochastic gradient descent (SGD) [18].
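The steps above can be sketched numerically as follows. The `residual_*` callables are hypothetical stand-ins for the pointwise residuals of Eqs. (10)-(12); in PyDEns they are derived automatically from the PDE specification, so this is only an illustration of the batch sampling and the combined loss.

```python
import numpy as np

rng = np.random.default_rng(1)
T, m = 1.0, 128                               # time horizon and points per batch

# Hypothetical residual functions standing in for L u_net - f, u_net - g,
# and u_net(x, 0) - u0(x); a trained network would drive these toward zero.
residual_pde = lambda x, t: 0.10 * np.sin(np.pi * x)
residual_bc  = lambda x, t: 0.05 * np.ones_like(x)
residual_ic  = lambda x:    0.02 * np.ones_like(x)

# Step 1: draw m points per batch from the uniform distributions nu_1..nu_3
x1, t1 = rng.uniform(0, 1, m), rng.uniform(0, T, m)      # interior batch b1
x2, t2 = rng.choice([0.0, 1.0], m), rng.uniform(0, T, m) # boundary batch b2
x3 = rng.uniform(0, 1, m)                                # initial batch b3 (t = 0)

# Steps 2-4: mean-squared residuals, then the combined training loss
L1 = np.mean(residual_pde(x1, t1) ** 2)
L2 = np.mean(residual_bc(x2, t2) ** 2)
L3 = np.mean(residual_ic(x3) ** 2)
loss = L1 + L2 + L3
```

An SGD step would then differentiate `loss` with respect to the network parameters hidden inside the residuals and update them, repeating with fresh batches each iteration.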

NeuroDiffEq
The key idea of solving PDEs with NeuroDiffEq is to cast a trial approximate solution (TAS), u_T(x, t; θ) [20]. The NeuroDiffEq algorithm is defined as follows:

1. Generate a set of inputs on an m × n grid of points P = (x, t), where x = (x_1, …, x_m) and t = (t_1, …, t_n), inside the domain [0, T] × Ω, which are fed through the multilayer perceptron. The generated points may be drawn from any chosen distribution, e.g. uniformly at random or equally spaced.
2. Develop the TAS, a known function of the input P and the output u_net(x, t; θ). The TAS can be defined in the form [26]

u_T(x, t; θ) = A(x, t) + F(x, t; u_net),

where A(x, t) is chosen to satisfy the boundary conditions and F(x, t; u_net) is chosen to be zero for any (x, t) on the boundary. This produces a TAS that automatically satisfies the boundary conditions regardless of the ANN output. The approach is similar to the trial-function approach of [14], but with a different form of the trial function. The TAS can likewise be modified to satisfy initial conditions, and in general the transformed solutions take the form given in [20].
3. The TAS is trained to minimize the error function J [26], which consists of two terms. The first term is the residual of the PDE itself, evaluated on P, a finite set of points within the PDE domain, while the second term enforces the Dirichlet (D) and Neumann (M) boundary/initial conditions, where P_D and P_M are the sets of points at which the boundary values g_D and g_M are specified, respectively, and n(x, t) is the inwardly directed unit normal to the boundary at (x, t). The weighting factor η determines the relative importance of the two error components [26].
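A minimal numerical check of the TAS construction in step 2, using the heat-equation conditions u(x, 0) = sin(πx) and u(0, t) = u(1, t) = 0 solved later in this paper. The particular choice of A and F below (A = sin(πx), F = x(1 − x)(1 − e^{−t})) is one possible construction satisfying the stated requirements, not necessarily the one NeuroDiffEq builds internally.

```python
import numpy as np

rng = np.random.default_rng(2)

def u_raw(x, t):
    """Stand-in for the (untrained) network output u_net(x, t; theta)."""
    return np.sin(3 * x + 2 * t)   # any smooth function works for the check

def u_trial(x, t):
    """TAS: A(x, t) + F(x, t) * u_net, with A = sin(pi x) matching the IC/BC
    and F = x (1 - x) (1 - exp(-t)) vanishing on the boundary and at t = 0."""
    return np.sin(np.pi * x) + x * (1 - x) * (1 - np.exp(-t)) * u_raw(x, t)

x, t = rng.uniform(0, 1, (2, 100))
assert np.allclose(u_trial(x, 0 * t), np.sin(np.pi * x))   # initial condition
assert np.allclose(u_trial(0 * x, t), 0)                   # left boundary
assert np.allclose(u_trial(0 * x + 1, t), 0)               # right boundary
```

Because the conditions hold identically for any network output, the error function J only needs to penalize the PDE residual at the interior points.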

Nangs
The Nangs method for solving PDE problems is quite similar to PyDEns; for instance, it is not necessary to build a trial solution in order to minimize the loss function. The main difference is in how the set of points inside the domain is defined: rather than drawing them from a distribution, Nangs builds mesh points as done in the traditional methods. In more detail, the Nangs algorithm for solving a PDE is as follows:

1. Define a set of mesh points P = (x_m, t_n) ∈ [0, T] × Ω inside the domain. These mesh points are the combination of the internal, boundary, and initial points (see Fig. 2), and they are used as the input values fed to the ANN.
2. The ANN outputs at the internal points are compared with the right-hand side of the PDE itself using a loss function.
3. The ANN outputs at the initial and boundary points are compared with the initial and boundary conditions of the PDE (Eqs. 2 and 3, respectively) using a second loss function.
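Step 1 of the procedure above can be sketched as follows; the 200 × 200 mesh over [0, 1] × [0, 1] is an assumption chosen to match the experiments later in this paper, and the boolean edge mask is a simple illustration rather than the Nangs implementation.

```python
import numpy as np

Nx, Nt = 200, 200
x = np.linspace(0.0, 1.0, Nx)
t = np.linspace(0.0, 1.0, Nt)
X, Tm = np.meshgrid(x, t, indexing="ij")      # full mesh over [0, 1] x [0, T]

# Initial/boundary points lie exactly on the edge of the domain (x = 0, x = 1,
# or t = 0); every other mesh point is an internal collocation point.
edge = (X == 0.0) | (X == 1.0) | (Tm == 0.0)

P_internal = np.stack([X[~edge], Tm[~edge]], axis=1)   # fed to the PDE loss
P_edge     = np.stack([X[edge],  Tm[edge]],  axis=1)   # fed to the IC/BC loss

# The two sets partition the mesh
assert P_internal.shape[0] + P_edge.shape[0] == Nx * Nt
```

Each subset then feeds its own loss term (step 2 for `P_internal`, step 3 for `P_edge`), and the two terms are minimized together.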

PyDEns, NeuroDiffEq, and Nangs in solving heat equation
We now discuss how the three methods differ in solving PDE problems. As an illustration, we take an elliptic PDE problem from [24] and hyperbolic and parabolic PDE problems from [27]. Here we first solve the heat equation, a parabolic equation, using the three methods. We consider the heat equation with the initial condition u(x, 0) = sin(πx) and boundary conditions u(0, t) = u(1, t) = 0.
In order to compare the effectiveness of each method, we used the same ANN architecture, which contains 3 hidden layers with 32 neurons each. The accuracy of the ANN solutions of this PDE is evaluated on grids of up to 200 × 200 points for the inputs (x, t). All methods use the same optimization technique, stochastic gradient descent (SGD). The number of iterations used to solve the heat equation varies for each method, for reasons explained in the next section. For better results, several things can be done: change the ANN architecture, tune the hyperparameters of the ANN architecture, increase the number of iterations, or change the optimization method. The exception is PyDEns, which, as far as we know, can only use SGD.
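For the accuracy comparisons that follow, it helps to note the analytical baseline. Assuming the heat equation takes the standard form u_t = u_xx (the displayed equation is not reproduced in this text), the stated initial and boundary conditions give the closed-form solution u(x, t) = e^{−π²t} sin(πx). The sketch below evaluates it on a 200 × 200 grid and computes the MSE of a hypothetical solver output against it:

```python
import numpy as np

def u_exact(x, t):
    """Closed-form solution of u_t = u_xx with u(x,0) = sin(pi x) and
    u(0,t) = u(1,t) = 0 (assumed form of the heat equation)."""
    return np.exp(-np.pi**2 * t) * np.sin(np.pi * x)

# Evaluation grid matching the largest case in the experiments
x = np.linspace(0.0, 1.0, 200)
t = np.linspace(0.0, 1.0, 200)
X, T = np.meshgrid(x, t, indexing="ij")

# Hypothetical network output: exact solution plus a small perturbation
u_approx = u_exact(X, T) + 1e-3 * np.sin(5 * X)

# Mean squared error, the accuracy metric reported in the tables below
mse = np.mean((u_approx - u_exact(X, T)) ** 2)
```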

Solving heat equation using PyDEns
In each iteration of the algorithm, compute the network output and the loss, then optimize the weights and biases by applying the SGD optimizer; repeat until the final iteration is reached.

Solving heat equation using NeuroDiffEq
Solving a PDE with NeuroDiffEq starts by generating up to 200 × 200 data points inside the domain, as the other methods do; however, NeuroDiffEq does not separate them into batches. After generating the data points, we build a TAS that satisfies the boundary/initial conditions (23) and (24); for details about the TAS, refer to [26]. The details of solving the heat equation (22) are described in Algorithm 2: for each (x, t) ∈ P, compute the network output and build a TAS that satisfies the initial condition (24), which can be expressed as u_T(x, t; θ) = sin(πx) + (1 − e^{−t}) u_n(x, t; θ), reducing to sin(πx) at t = 0. Then compute the loss function as in (13) by comparing the output with Equations (22), (23), and (24), and minimize the loss function by updating the weights and biases with the SGD optimizer.

Solving heat equation using Nangs
To solve the heat equation (18) with the Nangs method, we first define a set of points used as the training data. These points are collected from the domain (x, t), built uniformly as mesh points over the entire domain. The grid points are then split into internal points and initial/boundary points. Note that the initial and boundary points lie exactly on the edge of the domain, while the internal points lie inside it (see Fig. 2); these point sets have different associated loss functions. The loss is computed inside the domain (x, t) ∈ [0, 1]. The procedure is given in Algorithm 3: compute the outputs; compare each internal point with the original PDE (22); compare each initial and boundary point with the original initial and boundary conditions of the PDE as in (23) and (24); then optimize the weights and biases with the SGD optimizer. The entire process is repeated until the final iteration is reached.
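The loop in Algorithm 3 can be sketched end to end with a toy stand-in: a two-parameter ansatz u(x, t) = a sin(πx) e^{−bt} replaces the ANN, a finite-difference gradient replaces backpropagation, and the heat equation is assumed to be u_t = u_xx. This illustrates the residual-plus-conditions loss and the iterative parameter update, not the Nangs implementation itself (the exact solution corresponds to a = 1, b = π²).

```python
import numpy as np

def u(theta, x, t):
    a, b = theta
    return a * np.sin(np.pi * x) * np.exp(-b * t)

# Mesh points: internal points for the PDE residual, t = 0 points for the IC
x_mesh = np.linspace(0.0, 1.0, 32)
t_mesh = np.linspace(0.0, 1.0, 32)
Xi, Ti = np.meshgrid(x_mesh[1:-1], t_mesh[1:], indexing="ij")

def loss(theta):
    a, b = theta
    # For this ansatz, u_t - u_xx = (pi^2 - b) * u at every internal point
    r = (np.pi**2 - b) * u(theta, Xi, Ti)
    # Initial-condition mismatch at the t = 0 mesh points
    ic = u(theta, x_mesh, 0.0) - np.sin(np.pi * x_mesh)
    return np.mean(r**2) + np.mean(ic**2)

theta = np.array([0.5, 5.0])       # deliberately wrong starting parameters
loss0 = loss(theta)
lr, eps = 0.05, 1e-6
for _ in range(500):
    # Finite-difference gradient in place of backpropagation, for brevity
    g = np.array([(loss(theta + eps * e) - loss(theta - eps * e)) / (2 * eps)
                  for e in np.eye(2)])
    theta -= lr * g                # SGD-style update of the parameters

assert loss(theta) < 0.5 * loss0   # the training loop reduces the loss
```

In the real method the ansatz is the multilayer perceptron and the gradients come from automatic differentiation, but the structure of the loop is the same.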

Comparison Results and Discussion
In this section, the three methods PyDEns, NeuroDiffEq, and Nangs are compared in solving the heat, wave, and Poisson equations. The details of the three algorithms for solving the heat equation were discussed in the previous section; we applied the three methods in the same way to the wave and Poisson equations. For better results, we did several things, such as changing the ANN architecture, performing hyperparameter tuning, and increasing the number of iterations. We compared the performance of the three methods against the analytical solutions, and all results are presented in several tables and figures. Table 1 shows the comparison of the methods for solving the heat equation as in (22), (23), and (24); the results are visualized in Figures 3a-3f to show the performance of each method clearly. As can be seen in Table 1, PyDEns solved lower-dimensional instances of the problem more accurately than the NeuroDiffEq and Nangs methods: the MSE values of the three methods for solving the 75-dimensional PDE were 2.2 × 10^-4, 0.0031, and 0.0081, respectively. On the other hand, we had to set a lower iteration count for PyDEns, since it gave NaN values at 10,000 iterations, and this affects its MSE results. Meanwhile, the other two methods consistently solved higher-dimensional problems, i.e. 200 dimensions, better than PyDEns, with MSE values for PyDEns, NeuroDiffEq, and Nangs of 0.0635, 1.5 × 10^-5, and 1.9 × 10^-5, respectively. In terms of computational time, however, the Nangs method is the best, consuming only 14.13 minutes to solve the 200-dimensional problem, compared with 27.35 minutes for PyDEns and 1.30 hours for NeuroDiffEq on the same problem.

Comparison between PyDEns, NeuroDiffEq, and Nangs methods in solving heat equation
The performance of the three methods for solving the heat equation can be seen clearly in Figures 3a-3f. The accuracy of each method compared with the analytical solution is shown in Figure 4; the NeuroDiffEq curve comes closest to the analytical solution curve when solving the 200-dimensional problem.

Comparison between PyDEns, NeuroDiffEq, and Nangs methods in solving Poisson equation
The Poisson equation used in this study has the form given in [24], together with its boundary conditions. We present all results of the Poisson solutions in Table 3 and Figures 7a-7f. As shown in Table 3, the MSE values of PyDEns, NeuroDiffEq, and Nangs for solving the 50-dimensional problem were 0.0056, 0.0087, and 0.0873, respectively, whereas for the 200-dimensional problem the MSE values were 0.0453, 0.0014, and 0.0018, respectively. In terms of computational time, by contrast, PyDEns consumed the least, only 6.37 minutes for 200 dimensions, compared with 1.29 hours for NeuroDiffEq and 22.59 minutes for Nangs on the same problem.

Advantages and disadvantages of PyDEns, NeuroDiffEq, and Nangs
Based on the results explained in the previous section, we can see that, in general, each method has its own advantages and disadvantages in different situations. In terms of accuracy, the Nangs method consistently produced low MSE values compared with PyDEns and NeuroDiffEq. A similar trend holds for computational time: Nangs is the fastest of the three, although in the case of the Poisson equation PyDEns is the fastest. The NeuroDiffEq method can produce a small MSE when solving high-dimensional problems, but it is very costly. PyDEns, on the other hand, performed well only on low-dimensional problems.
The three methods also have weaknesses. PyDEns gives NaN results if forced to solve high-dimensional PDE problems. NeuroDiffEq, as mentioned before, is very costly for PDEs of any dimension. Lastly, the Nangs method is a strong candidate for further modification: we believe that by changing the optimizer and increasing the number of iterations, Nangs could become the best method for solving high-dimensional PDE problems.

Conclusion
We have discussed the PyDEns, NeuroDiffEq, and Nangs methods for solving a variety of PDE problems, including the heat, wave, and Poisson equations, with dimensions ranging from 50 to 200. These methods allow us to train ANNs to approximately solve PDEs, and we compared them in terms of accuracy and efficiency. As a result, both NeuroDiffEq and Nangs performed well in solving the higher-dimensional PDE problems, whereas PyDEns performed worst; in fact, PyDEns can produce a NaN MSE when solving problems of more than 100 dimensions. To overcome this issue, the number of iterations must be cut down, but this in turn leads to a large MSE. Thus, we highly recommend PyDEns only for lower-dimensional problems, in contrast to NeuroDiffEq, which is more costly on the same problems. We also recommend changing the ANN architecture to maximize model performance, for instance by changing the number of layers, the number of neurons, the learning rate, the method for sampling points inside the domain, or even the optimization method.