Geomechanical Modelling Using Artificial Neural Networks Combined With Geostatistics

The principal minimum horizontal stress plays an important role in reservoir characterization, oil and gas reservoir modeling, drilling, production and well stimulation. However, it cannot currently be measured continuously along the wellbore in the way a logged geophysical parameter can. Minimum horizontal stress is measured by the leak-off test (LOT) at only a few points in a well. To obtain values all along the wellbore, empirical formulas have been established to determine the minimum horizontal stress for different fields. These formulas must then be calibrated with LOT data, which are usually limited and sometimes unavailable. Moreover, the empirical formulas of one field might not be accurate for another.


Introduction
Geomechanics and geophysics are essential tools for reservoir simulation, for assessing hydrocarbon reserves, and for drilling and production. In a geomechanical model, stress information is particularly important, especially the minimum horizontal stress (Shmin), which is a key input to many petroleum engineering tasks.
Theoretically, Shmin can be determined by an equation relating it to other geomechanical parameters such as pore pressure and vertical stress [1]. Empirical formulas have also been established to determine minimum horizontal stress indirectly from the sonic wave velocity measured by logging [2]. In practice, Shmin values determined from these empirical formulas still need to be calibrated with values obtained from the leak-off test (LOT). However, the main drawback of LOT is the limited amount of data: LOT is normally conducted at a few discrete points along the wellbore, and the number of wells with LOT data is often small. Consequently, the small amount of LOT data may reduce the accuracy of the calibration. Moreover, as each field has its own geological characteristics, applying the formulas of one field to another is likely to introduce errors.
With the development of artificial intelligence (AI), AI-based methods are increasingly applied in technical fields. As one of these tools, the artificial neural network (ANN) lets computers learn to perform tasks from data.
Dowd and Saraç (1994) presented a brief description of ANN and explained the basics of the feedforward back-propagation network and the use of ANN in geostatistics. The multilayered feedforward back-propagation algorithm was used to predict the desired variogram values. This purely mathematical research concluded that geostatistical simulation via neural network is a possibility [3].

M. Kanevsky et al. (1996) used ANN and geostatistical methods for environmental mapping. The authors mapped the distribution of nuclear radiation around a reactor where a nuclear accident happened in 1986. From the radioactivity data measured at various points, a geostatistical interpolation method was used to draw a map of the distribution of nuclear radiation over the whole region. A second map was made using artificial intelligence. The authors compared the geostatistical and ANN models using a cross-validation technique as well as validation data sets. The results showed that the ANN approach was very promising in environmental and Earth sciences [4].

Workflow
Due to the complexity of this study, a step-by-step schematic diagram of the research is presented in Fig. 1. The ANN models were generally built with the same steps, but the data were different for each case. Firstly, the influence of parameters such as depth, number of wells and number of stratigraphies on the ANN models was studied. Secondly, these results were used to develop a model combining ANN with a geostatistical method, which in this study is Kriging interpolation. The results of this combined model were also compared with the results given by the geostatistical method alone, so that the advantages of the new workflow could be identified.

Data set
This study collected logging data from 4 wells, denoted X1, X2, X3 and X4. These wells belong to the Hai Thach - Moc Tinh field on the continental shelf of Southeast Vietnam. Fig. 2 shows a view of these 4 wells in two-dimensional space.
The data set included the geomechanical parameters of the wells; however, within the scope of this study, only 4 geomechanical parameters were used: true vertical depth (TVD), pore pressure (PP), vertical stress (Sv) and minimum horizontal stress (Shmin).
As the depth interval of each stratigraphic sequence might not be the same in each well, mud windows were established (Fig. 4) in order to determine the range of data that should be taken so that the influence of stratigraphy could be studied. Three fundamental parameters govern the value of the mud density while drilling: the maximum pore pressure of the considered phase, the minimum fracturing pressure of the considered phase and the critical stability density of each formation. The mud window is the density range between pore and fracturing pressures (Chalez, 1999) [11]. However, a safe and intact mud window is the density range between the collapse pressure (or breakout pressure), at which the wellbore collapses due to the pressure difference between the formation and the well, and the fracturing pressure.
Two distinct stratigraphies were observed from the mud windows in Fig. 4. Hence the data were divided into two groups in order to study the influence of stratigraphic uniformity on the results:
• Data set 1: logging data of the wells from 2500 m to 3500 m. Data in this depth range belong to the transition between the 2 stratigraphies. Data set 1 has 2000 data points, taken every 0.5 m.
• Data set 2: logging data of the wells from 3300 m to 4000 m. Data in this depth range belong to the second stratigraphy from the top down. Data set 2 has 1400 data points, also taken every 0.5 m.
The number of data points in the two data sets is sufficient to build models and train artificial neural networks.
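As a quick consistency check on the two data sets, the sketch below rebuilds the depth grids from the stated intervals and the 0.5 m sampling step. Treating each interval as including its top depth and excluding its bottom depth is an assumption made here so that the counts match the reported 2000 and 1400 points; the function name `depth_grid` is hypothetical.

```python
import numpy as np

def depth_grid(top_m, bottom_m, step_m=0.5):
    """Depths from top_m (inclusive) to bottom_m (exclusive), every step_m metres."""
    n = int(round((bottom_m - top_m) / step_m))
    return top_m + step_m * np.arange(n)

data_set_1 = depth_grid(2500.0, 3500.0)   # transition between the two stratigraphies
data_set_2 = depth_grid(3300.0, 4000.0)   # second stratigraphy only

# Depths that appear in both data sets (3300 m up to, but excluding, 3500 m).
overlap = np.intersect1d(data_set_1, data_set_2)

print(len(data_set_1), len(data_set_2), len(overlap))
```

Under this assumption the two depth ranges share the interval from 3300 m to 3500 m, so the data sets are not disjoint.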

Estimation workflow of minimum horizontal stress using artificial neural networks
An ANN is an artificial information processing system inspired by how neural networks work in the human brain. It is one of the techniques of artificial intelligence applied to solve specific problems of classification, pattern recognition and data prediction. An ANN converts input data into output data through calculations performed in its neurons. However, for the output values to be accurate and reliable, the network needs to be built and trained so that it accurately describes the relationship between the output data and the input data.
A typical artificial neural network has three main parts: input layer, hidden layers and output layer (Fig. 5).
• Input layer: The input data is entered as vectors, and the number of vectors corresponds to the number of neurons of this input layer.
• Hidden layers: Hidden layers connect the input values to the output values. The neurons in the hidden layers are mainly responsible for processing the signals from the input layer's neurons and then sending the information to the output layer's neurons.
• Output layer: The output data is organized as vectors, and the number of vectors corresponds to the number of neurons of this output layer.
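The three-part structure above can be sketched as a small forward pass. This is a minimal illustration, not the authors' implementation: it assumes fully connected layers, three inputs (TVD, PP, Sv), hidden layers of 9 and 8 neurons (the layout reported for model C), hyperbolic tangent hidden activations and a single linear output neuron for Shmin. The linear output layer, the random initial weights and the helper names are assumptions.

```python
import numpy as np

def init_layers(sizes, rng):
    """Random weights and zero biases for consecutive layer sizes."""
    return [(rng.normal(0.0, 0.5, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Propagate inputs through tanh hidden layers and a linear output layer."""
    a = x
    for w, b in layers[:-1]:
        a = np.tanh(a @ w + b)          # hidden layers squash values into (-1, 1)
    w_out, b_out = layers[-1]
    return a @ w_out + b_out            # assumed linear output neuron

rng = np.random.default_rng(0)
layers = init_layers([3, 9, 8, 1], rng)  # input, two hidden layers, output
x = rng.normal(size=(5, 3))              # 5 synthetic, already scaled input rows
y = forward(x, layers)
print(y.shape)
```

With untrained random weights the outputs are meaningless; the sketch only shows how data flow from the input layer through the hidden layers to the output layer.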
The network is built through a training process that adjusts the weights over successive epochs so that the output values are reliable and deviate little from the actual values. At the end of training, the weight values are saved and used to forecast output data when other input data are available.
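The adjustment of weights over epochs can be illustrated with plain batch gradient descent on a single linear neuron. This is a sketch on synthetic, noise-free data; the learning rate, epoch count and target weights are arbitrary assumptions and do not come from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(200, 3))   # synthetic scaled inputs
true_w = np.array([0.4, -0.2, 0.7])          # arbitrary weights to be recovered
y = x @ true_w + 0.1                         # synthetic target with bias 0.1

w = np.zeros(3)
b = 0.0
learning_rate = 0.1
history = []
for epoch in range(500):
    error = x @ w + b - y
    history.append(np.mean(error ** 2))      # MSE before this epoch's update
    grad_w = 2.0 * x.T @ error / len(y)      # gradient of the MSE w.r.t. the weights
    grad_b = 2.0 * np.mean(error)            # gradient of the MSE w.r.t. the bias
    w -= learning_rate * grad_w              # step against the gradient
    b -= learning_rate * grad_b

print(history[0], history[-1])
```

The recorded error decreases from epoch to epoch, and the saved `w` and `b` can then be applied to new inputs, which mirrors the train-then-forecast use described above.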
Gradient descent is the most widely used algorithm for optimizing artificial neural networks by updating weights and biases. In its standard form, the gradient descent update is:
$$w_{k+1} = w_k - \eta \, \nabla E(w_k)$$
where $w_k$ is the vector of weights and biases at iteration $k$, $\eta$ is the learning rate and $E$ is the error function.
Based on the basic theory of ANN, the following workflow was proposed to develop ANN models in this study:
Step 1: Data collection and analysis
Data of the three wells X1, X2 and X3 were used to build the artificial neural networks. Parameters including TVD, PP and Sv are selected as input and the minimum horizontal stress is taken as the output of the networks. However, depending on the purpose of the network, different data sets are used for each model, as detailed in step 2. Once the ANN models are built, the data of the remaining well, X4, are used to estimate the minimum horizontal stress.
Each data set is randomly divided into 3 parts:
• Training data: accounts for 70% of the data set. These values are used during network training.
• Test data: accounts for 15% of the data set. These values are used to test the effectiveness of the network during and after network training.
• Validation data: accounts for 15% of the data set. These values are used to check whether overfitting has occurred or not.
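The random 70/15/15 division can be sketched as follows. The sample count of 2000 is used only as an example (it is the size quoted for data set 1); the function name and the fixed seed are assumptions.

```python
import numpy as np

def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition sample indices into training, test and validation parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    test = order[n_train:n_train + n_test]
    validation = order[n_train + n_test:]   # whatever remains
    return train, test, validation

train, test, validation = split_indices(2000)
print(len(train), len(test), len(validation))
```

Each index lands in exactly one part, so no sample is shared between training, test and validation.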
Step 2: ANN models building
In this study, four artificial neural network models, denoted A, B, C and D, were built.
-Models A and B: Model A takes PP and Sv as input data and model B takes TVD, PP and Sv. Both are built with data from two wells.
-Model C: Input data for this model is TVD, PP and Sv of the three wells X1, X2 and X3, and output data is the minimum horizontal stress of these three wells. Model C has two hidden layers with 9 and 8 neurons respectively.
Data set 1 is used for model C. Model C uses input data of three wells, hence one well more than models A and B. This is to assess the effect of the number of wells on the estimated Shmin.
-Model D: Input data for this model is TVD, PP and Sv of the three wells X1, X2 and X3, and output data is the minimum horizontal stress of these three wells. Model D has two hidden layers with 9 and 10 neurons respectively.
Data set 2 is used for model D. Model D uses data set 2 instead of data set 1 to assess the impact of stratigraphic consistency on the estimation.
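To give a feel for the size of models C and D, the sketch below counts the trainable parameters implied by their layer sizes. It assumes fully connected layers with one bias per neuron and a single output neuron for Shmin; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

model_c = [3, 9, 8, 1]    # TVD, PP, Sv -> 9 -> 8 -> Shmin
model_d = [3, 9, 10, 1]   # TVD, PP, Sv -> 9 -> 10 -> Shmin

print(count_parameters(model_c), count_parameters(model_d))
```

Under these assumptions model C has 125 parameters and model D has 147, both far fewer than the number of data points in either data set.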
As a reminder, data set 1 and data set 2 were described in Section 2.2. Fig. 6 shows the structure of the artificial neural networks. The number of neurons in each layer of the neural networks is summarized in Table 1.
Figure 6 - The perceptron network for ANN models.
The transfer function of the first and second hidden layers in the four models is the hyperbolic tangent sigmoid function because it maps its input to the range between -1 and 1, which is wider than the 0 to 1 range of the logistic sigmoid, and is symmetric about the origin. On average, its outputs are therefore more likely to be close to 0, which is beneficial when propagating forward to subsequent layers. The hyperbolic tangent sigmoid function is:
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
Step 3: ANN models training, test and validation
Network training is basically the process of adjusting weights and biases. By default, the weights are initialized with random values at the start of network training. During training, in each epoch, an algorithm adjusts the weight values until the desired error of the output value is reached. This process increases the ability of the network to predict reliable results when a different set of input data is used.
The results of the network training are reported as the mean square error (MSE) and regression coefficient ($R^2$) of the three data subsets: training, test and validation.
In their standard form, for $n$ samples with actual values $y_i$, predicted values $\hat{y}_i$ and mean actual value $\bar{y}$, MSE and $R^2$ are calculated as:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
Step 4: Reliability checking
After step 3, the artificial neural networks must be checked for accuracy and overfitting. The MSE and $R^2$ values from step 3 are displayed on a performance plot and regression plots.
An artificial neural network is considered reliable if it achieves the following results:
-The training, test and validation errors are low. In this study, the maximum acceptable error is about 5% of the mean value of the calculated parameter.
-The training error is stable, which means the error does not vary much over the last epochs.
-Overfitting is not significant. Overfitting occurs when the MSE on the training data is much smaller than the MSE on the test data.
Once the ANN has been shown to work well with high reliability, it can be used for further studies.
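The error metrics and reliability criteria can be sketched as below. The text does not say how the 5% error is measured or how much smaller the training MSE must be to count as overfitting, so two assumptions are made here: the error is taken as the root mean square error compared with 5% of the mean actual value, and overfitting is flagged when the test MSE exceeds the training MSE by a hypothetical factor of 10. $R^2$ is computed as the usual coefficient of determination. All values are synthetic.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def is_reliable(y_true, y_pred, train_mse, test_mse,
                max_rel_error=0.05, max_overfit_ratio=10.0):
    """Check the error limit (assumed: RMSE vs. 5% of the mean) and overfitting."""
    rmse = np.sqrt(mse(y_true, y_pred))
    error_ok = rmse <= max_rel_error * abs(np.mean(y_true))
    overfit_ok = test_mse <= max_overfit_ratio * train_mse  # hypothetical threshold
    return bool(error_ok and overfit_ok)

y_true = [1.0, 2.0, 3.0, 4.0]          # synthetic actual values
y_pred = [1.1, 1.9, 3.1, 3.9]          # synthetic predictions
print(mse(y_true, y_pred), r_squared(y_true, y_pred))
print(is_reliable(y_true, y_pred, train_mse=0.010, test_mse=0.012))
```

The stability criterion (error not varying much in the last epochs) is not covered here; it would need the per-epoch error history from training.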
Step 5: Application of the models to the well X4, whose minimum horizontal stress needs to be estimated.
Specifically, the input data for each model are as follows:
• Model A: Input data is PP and Sv of the well X4 in data set 1.
• Model B and model C: Input data is TVD, PP and Sv of the well X4 in data set 1.
• Model D: Input data is TVD, PP and Sv of the well X4 in data set 2.

Estimation workflow of minimum horizontal stress using a combination of artificial neural network and geostatistics
The geostatistical interpolation technique used in the study is Kriging (Oliver and Webster, 2014) [12]. In its basic form, the Kriging estimator is a weighted sum of the measured values:
$$\hat{Z}(x_0) = \sum_{i=1}^{N} \lambda_i Z(x_i)$$
where $\hat{Z}(x_0)$ is the estimate at the unsampled location $x_0$, $Z(x_i)$ are the $N$ measured values and $\lambda_i$ are the Kriging weights.
All data used in this section are from data set 2. The steps for building the model combining ANN and geostatistics are detailed as follows:
Step 1: An ANN model named E was built using the input data TVD, PP and Sv of the well X1, with the minimum horizontal stress of this same well X1 as the network's target. Fig. 6 and Table 1 show the structure of model E.
Step 2: After model E was trained, tested and validated in step 1, it was used to estimate the minimum horizontal stress of the wells X2 and X3.
Step 3: A minimum horizontal stress interpolation model (Model I) was developed by Kriging with the original data set 2 of the three wells X1, X2 and X3. Shmin of the well X4 was then interpolated with model I.
Step 4: A second minimum horizontal stress interpolation model (Model II) was developed by Kriging, in which Shmin of the well X1 is taken from the original data set 2 while Shmin of the wells X2 and X3 is generated by ANN model E from step 2. Shmin of the well X4 was then interpolated with model II.
Step 5: The results obtained from model I and model II were analysed.
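The contrast between model I and model II can be sketched with a small ordinary Kriging routine. This is an illustration only: the text does not state which Kriging variant or variogram model was used, so ordinary Kriging with an exponential variogram without nugget is assumed, and the well coordinates, Shmin values and variogram parameters are all synthetic or hypothetical.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, range_=500.0):
    """Exponential variogram model without nugget, so gamma(0) = 0."""
    return sill * (1.0 - np.exp(-h / range_))

def ordinary_kriging(coords, values, target, variogram=exponential_variogram):
    """Estimate the value at `target` from scattered samples by ordinary Kriging."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Distances between samples, and between each sample and the target.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=-1)
    # Kriging system with a Lagrange multiplier forcing the weights to sum to 1.
    a = np.ones((n + 1, n + 1))
    a[:n, :n] = variogram(d)
    a[n, n] = 0.0
    b = np.append(variogram(d0), 1.0)
    weights = np.linalg.solve(a, b)[:n]
    return float(weights @ values), weights

wells_xy = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0]])  # hypothetical X1, X2, X3
shmin_measured = np.array([10.0, 12.0, 11.0])   # model I: synthetic "measured" values
shmin_mixed = np.array([10.0, 12.1, 10.9])      # model II: X2, X3 stand in for ANN output
target_xy = np.array([600.0, 300.0])            # hypothetical X4 position

estimate_1, weights = ordinary_kriging(wells_xy, shmin_measured, target_xy)
estimate_2, _ = ordinary_kriging(wells_xy, shmin_mixed, target_xy)
print(estimate_1, estimate_2, weights.sum())
```

Because the Kriging weights depend only on the well geometry and the variogram, the two models share the same weights here; any difference between their estimates comes solely from how closely the ANN-generated values match the measured ones.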

Estimation of minimum horizontal stress using artificial neural networks
After network construction, training and reliability testing, the ANN training results for models A, B, C and D are presented in Table 2, Table 3, Table 4 and in Fig. 7, Fig. 8, Fig. 9 and Fig. 10, which show the evolution of the error during network training of models A, B, C and D respectively. These models were then used to estimate the minimum horizontal stress for the well X4.
The values of Shmin of the well X4 estimated by models A, B, C and D are presented in Fig. 11, Fig. 12, Fig. 13 and Fig. 14 respectively. The results indicated a close correspondence between the predicted and actual data. This observation is confirmed not only visually by these figures but also by the error values. Model D has MSE = $1.8952 \times 10^{-9}$, which is 42.18 times lower than that of model C. The uniformity of stratigraphy therefore greatly affects the results, so it is better to divide the data by stratigraphic layer where possible.

Neural Network and Geostatistics
As mentioned above in Section 2, two models I and II were developed: model I used Kriging interpolation alone, while model II combined ANN and geostatistics. The 2D minimum horizontal stress interpolation maps and the regression analyses are presented in Fig. 15 and Fig. 16 for model I, and in Fig. 17 and Fig. 18 for model II respectively. The results of the two models showed no significant difference. Although it used measured data from only one well, the method combining ANN and Kriging successfully built an interpolation model similar to the one obtained with the Kriging method alone, which used measured data from three wells.
The MSE and $R^2$ analysis of the two models I and II is summarized in Table 7. These results indicate that the combination of ANN and geostatistics worked well, giving acceptable estimates of minimum horizontal stress in comparison with the model that used geostatistics only. Hence, in this study, a model combining ANN and geostatistics built from the data of only one well could be used in place of a purely geostatistical model, which needs at least three wells. With the combined method, the number of wells with data available for modeling did not significantly affect the forecast results. In practice, wells with the necessary data are hard to come by. Instead of drilling multiple wells and making measurements to obtain data, engineers can work with fewer wells, build suitable models and thereby estimate the required parameters with high accuracy.

Conclusion
This paper proposed a workflow combining artificial neural networks and geostatistics to estimate the principal minimum horizontal stress for the four wells of the Hai Thach - Moc Tinh field in the Nam Con Son Basin. A purely geostatistical method requires data from at least three wells to develop an accurate interpolation model, whereas the proposed method combining ANNs and geostatistics needed data from only one well. This is a substantial practical advantage because the number of wells with data is often limited: with the proposed workflow, the data of an area can still be predicted with high accuracy without drilling additional wells.