Provision of a New Method to Improve the Detection of Micro Seismic Events

Natural events such as floods, fires, tsunamis and earthquakes cause serious damage to people and to the environment. The precise detection of these events, and of earthquakes in particular, has become a focus of many researchers in computer science and the geosciences, and machine learning algorithms have substantially advanced their early detection and prediction. In this article, a fuzzy method is first used to enhance the authenticity of the data on the basis of effective variables. The MLP (multilayer perceptron) and RBF (radial basis function) neural network algorithms are then combined in the form of a collective learning system in order to identify small-scale seismic events more accurately. Simulation shows that the proposed method improves on the basic methods according to the actual error and root-mean-square error (RMSE) criteria.


Introduction
The study of seismic parameters can lead to a better understanding of the mechanism by which seismic events are detected and can improve the ability to predict earthquakes. In many cases, a variation in seismic parameters correlates with the occurrence of an earthquake and can be used to predict stronger aftershocks in the future [1] [2] [3] [4] [5]. Hence, the analysis of seismic parameters based on the diversity of predictive signals before an earthquake is a common ground for research [6] [7] [8] [9] [10].

In data mining, neural networks and fuzzy inference systems have proven their power in science and engineering. Neural network algorithms can simulate a learning process similar to that of the human brain. These networks are sets of neurons that act in a way similar to biological nerves. Because they can learn both linear and nonlinear functions, neural networks are an effective tool for exploring complex relationships [11]. Hence, we use the MLP (Multi-Layer Perceptron) and RBF (Radial Basis Function) neural network algorithms in the form of an ensemble learning system in order to pinpoint micro seismic events and to detect important events more accurately.

The remainder of this article is organized as follows. Previous work is reviewed in section 2. The method of collecting and analyzing data is described in section 3. The proposed model and its architecture are presented in section 4. The results are reported in section 5 and the conclusion is given in section 6.

Previous work
Sharma et al. [18] presented several approaches for separating signals from noise, evaluated their accuracy and observed that the proposed approach can significantly increase the detection of seismic events. Fuxian et al. [19] proposed an array-based waveform correlation approach (matched filter) to increase the capability of detecting small seismic events whose mechanism and location are similar to those of a nearby major event. After identifying weak events, they used a modified spectrogram method to identify the phases. The technique was tested on downhole monitoring data from microseismic events induced by hydraulic fracturing. The article shows that an event with a signal-to-noise ratio of 6 dB, which is hardly detectable by the array-stacked short-time average / long-time average (STA/LTA) detector at a reasonable false-alarm rate, is readily detectable by array-stacked correlation. Ahad Zamani and his colleague [20] examined the spatiotemporal variations in seismic parameters for the September 10, 2008 Qeshm earthquake in southern Iran. For this purpose, artificial neural networks and adaptive neuro-fuzzy inference systems (ANFIS) were used. The ANFIS model and the radial basis function (RBF) model were implemented because of their efficiency in classification and forecasting. Franthey et al. [21] presented a new method for detecting seismic events and their position using the similarity of amplitudes corrected for the source mechanism. The method processes multichannel microseismic data recorded by large arrays. Javan et al. [22] provided an adaptive filtering method for denoising downhole microseismic data. The methodology uses the apex-shifted parabolic Radon transform and is implemented in two steps. In the first step, the transform is applied to the normalized root mean square of the microseismic data in order to detect an event. The Radon coefficients are calculated efficiently by limiting the integration paths of the Radon operator. In the second step, a new Radon transform is applied to each component in order to enhance the recorded signal.

Data gathering and analysis
In the present study, a limited experimental seismic method was used to carry out the experiments. The tools and devices used were as follows: the SPseise3 seismograph with sensors (geophones) connected to the device at a spacing of 2.5 meters from each other; a calibrated 20 kg weight; and a calibrated meter.
The experiment was conducted in a natural environment, on land with rugged soil and asymmetric geometry, in three steps:
In an area without slope.
In an area with positive slope: in this step, the sensor is located at a point higher than the landing level of the weight.
In an area with negative slope: in this step, the sensor is located at a point lower than the landing level of the weight.
The ground was struck and shaken by dropping the 20 kg weight from a height of 1.5 m, and three drops were performed on each of the three surfaces (no slope, positive slope of 10% and negative slope of 10%). Since three steps of the experiment were carried out on each of the surfaces and data were received from three sensors in each step, the total number of recorded signals is 3 * 3 * 3 = 27. A sample of the extracted signals is shown in the diagrams of this section.

Average velocity of p-wave
In this experiment, the average propagation velocity of the primary wave (P-wave), which is important for the analysis, is 500 m/s. It is obtained by dividing the distance between the sensor and the hypocenter (seismic focus) by the time at which the signal is received by the device.
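The velocity calculation described above can be sketched in a few lines. This is only an illustration: the function name and the example reading (a sensor 5 m from the impact point with a first arrival 10 ms after the impact) are hypothetical and are not taken from the recorded data.

```python
def average_p_wave_velocity(distance_m: float, travel_time_s: float) -> float:
    """Average velocity = source-to-sensor distance / first-arrival travel time."""
    if travel_time_s <= 0:
        raise ValueError("travel time must be positive")
    return distance_m / travel_time_s


# Hypothetical reading, chosen only so that the result matches the 500 m/s
# reported in the text: sensor at 5 m, first arrival after 10 ms.
velocity = average_p_wave_velocity(distance_m=5.0, travel_time_s=0.010)
print(f"{velocity:.1f} m/s")
```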

Sample rate
In this research, the continuous wave was sampled discretely with a period of 267 microseconds. The numbers in the Excel data sheet are the wave amplitudes in microvolts, i.e. one point of the signal is read every 267 microseconds. In the following tables, examples of the datasets are presented. Table (4) lists the recorded signals by drop height, surface slope and sensor.

After gathering data from the sensors at far, middle and near distances and under the no-slope, positive-slope and negative-slope conditions, the data had to be combined into a coherent dataset in order to determine the cluster of each sample. For this purpose, all of the gathered data were prepared in the form of an Excel file. In the present study, the experiments were therefore carried out on 12690 samples.
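As a small worked example of what the 267 microsecond sampling period implies, the sketch below converts the period into a sampling frequency and builds the time axis for a run of samples. The helper names are illustrative assumptions; only the 267 microsecond figure comes from the text.

```python
SAMPLE_PERIOD_S = 267e-6  # sampling period stated in the text (267 microseconds)


def sampling_frequency_hz(period_s: float = SAMPLE_PERIOD_S) -> float:
    """Sampling frequency is the reciprocal of the sampling period."""
    return 1.0 / period_s


def time_axis(n_samples: int, period_s: float = SAMPLE_PERIOD_S) -> list:
    """Time stamp (in seconds) of each amplitude value in a recorded signal."""
    return [k * period_s for k in range(n_samples)]


fs = sampling_frequency_hz()
print(f"sampling frequency: {fs:.1f} Hz")    # about 3745 Hz
print(f"Nyquist frequency:  {fs / 2:.1f} Hz")  # highest frequency that can be represented
print(time_axis(3))
```

The Nyquist frequency (half the sampling frequency) is the upper bound on the signal content that samples taken at this rate can represent.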

Describing the proposed method
This section first describes the flowchart and architecture of the proposed design, which detects micro seismic events more accurately with the help of fuzzy logic and a combination of the MLP and RBF neural network algorithms in the form of a collective learning system. The fuzzy logic and the method of combining the MLP and RBF neural network algorithms are then explained. Finally, the combination of the methods and the evaluation criteria are described. The architecture of the proposed method is shown in figure 5.

Figure (5): architecture of proposed method
As can be seen in figure (5), seismic events are first measured by the devices described in the previous section and recorded in the system. In the proposed method, once the recorded seismic events have been entered, the standard deviation, the gradient of the ground and the distance to the sensor are calculated according to (1) and (2). These parameters help to detect seismic events more accurately and to enhance the authenticity of the earthquake vibration signals.
After these parameters have been calculated, the results are applied to the fuzzy inference system (FIS). The fuzzy system has five functional blocks: 1) the rule base block (a set of if-then rules); 2) the database block (definition of the membership functions of the fuzzy sets used in the fuzzy rules); 3) the decision-making unit (inference based on the input criteria and the generated rules); 4) the fuzzification interface (conversion of the input parameters into fuzzy inputs); and 5) the defuzzification interface (conversion of the fuzzy results into real values). In the proposed fuzzy system, the level of elongation (kurtosis) is calculated from the inputs of the problem with the generated rules, so that these parameters are more effective in detecting and predicting seismic events. After the elongation of the recorded signals has been calculated, new data with a more complete set of variables are generated.
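The five blocks listed above can be illustrated with a very small fuzzy system. The sketch below is not the rule base used in this article, which is not given in the text: the triangular membership functions, their breakpoints, the two rules and the weighted-average (zero-order Sugeno style) defuzzification are all assumptions made purely for illustration.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_elongation(std_dev: float, distance_m: float) -> float:
    """Toy inference: map two inputs to an elongation level in [0, 1]."""
    # Fuzzification interface (hypothetical breakpoints).
    std_low = triangular(std_dev, -1.0, 0.0, 1.0)
    std_high = triangular(std_dev, 0.0, 1.0, 2.0)
    near = triangular(distance_m, 0.0, 2.5, 7.5)
    far = triangular(distance_m, 2.5, 7.5, 12.5)

    # Rule base and decision unit: (firing strength, crisp consequent).
    rules = [
        (min(std_high, near), 1.0),  # IF std is high AND sensor is near THEN elongation is high
        (min(std_low, far), 0.0),    # IF std is low AND sensor is far THEN elongation is low
    ]

    # Defuzzification interface: weighted average of the rule consequents.
    total = sum(strength for strength, _ in rules)
    if total == 0.0:
        return 0.5  # no rule fires: return a neutral value
    return sum(strength * out for strength, out in rules) / total


print(fuzzy_elongation(std_dev=0.6, distance_m=4.0))
```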
The data needed to identify and predict seismic events more accurately are divided into two parts, training events and test events. The events are divided according to balanced sampling.
Training events constitute 80% of the data and are used to train the algorithms; test events constitute the remaining 20% and are used to evaluate and validate the proposed method. After the events have been divided, the training data are applied to the MLP and RBF neural network algorithms so that each trains its model. In the method proposed in this study, these two neural network algorithms are combined in the form of a collective learning system in order to detect seismic events more accurately. The test events then enter the collective system, which produces for each sample a prediction that is intended to be more accurate than the prediction of a single algorithm. Each of the algorithms used in the collective learning system is described below.
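One way to realize an 80/20 division with balanced sampling is a per-class (stratified) split, sketched below. The text does not specify how the balancing was done, so treating it as a stratified split is an assumption, and the function name and toy labels are hypothetical.

```python
import random
from collections import defaultdict


def balanced_split(samples, labels, train_fraction=0.8, seed=0):
    """Split so that every class keeps roughly the same train/test ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])

    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


# Toy data: 10 samples of class "a" and 20 of class "b".
toy_labels = ["a"] * 10 + ["b"] * 20
toy_samples = list(range(30))
train_set, test_set = balanced_split(toy_samples, toy_labels)
print(len(train_set), len(test_set))
```

Splitting each class separately keeps the class proportions of the full dataset in both parts, which matters when some event types are rare.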

Multilayer Perceptron Neural Network (MLP) Algorithm
A multilayer perceptron is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. The MLP is trained with a supervised learning technique called backpropagation [12] [13]. As shown in figure 6, an MLP neural network consists of three layers: an input layer, a hidden layer and an output layer. Neurons in the input layer act as buffers that distribute the input signals $x_i$ to the neurons in the hidden layer. Each neuron j in the hidden layer sums its input signals $x_i$ after multiplying them by the strengths of the respective connection weights $w_{ji}$, and computes its output $y_j$ as a function of that sum. This relation is shown in equation 1.

$$y_j = f\left(\sum_i w_{ji}\, x_i\right) \qquad (1)$$

Here f is usually a sigmoidal or hyperbolic tangent function. The outputs of the neurons in the output layer are calculated in the same way. Figure 6 gives an overview of perceptron networks.
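Equation 1 can be written out directly as a forward pass. The following is a minimal sketch using only the Python standard library; the weight values are arbitrary placeholders, not trained parameters from this study.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, one common choice for the activation f."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(weights, inputs, activation=sigmoid):
    """y_j = f(sum_i w_ji * x_i) for every neuron j; weights[j][i] is w_ji."""
    return [activation(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def mlp_forward(hidden_weights, output_weights, inputs):
    """Input layer buffers x, the hidden layer and output layer both apply equation 1."""
    hidden = layer_output(hidden_weights, inputs)
    return layer_output(output_weights, hidden)


# Placeholder weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_w = [[0.5, -0.2], [0.1, 0.4]]
output_w = [[1.0, -1.0]]
print(mlp_forward(hidden_w, output_w, [1.0, 2.0]))
```

Passing `math.tanh` as `activation` gives the hyperbolic tangent variant mentioned in the text.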

Figure (6): Multilayer perceptron based on backpropagation technique [14]
Training a network consists of adjusting its weights with an algorithm. According to [14], the standard BP algorithm is much simpler than the QP, DBD and EDBD algorithms: its implementation needs only two parameters, whereas DBD requires three, QP four and EDBD nine. One reason for preferring BP is that a larger number of parameters increases the probability of a poor setting, which may explain the weaker performance of these algorithms compared to BP [15]. Therefore, the BP algorithm is used in this article to train the perceptron neural network. The BP algorithm [16] changes the connection weight between neurons i and j at iteration k by $\Delta w_{ji}(k)$ as follows:

$$\Delta w_{ji}(k) = \alpha\, \delta_j\, x_i + \mu\, \Delta w_{ji}(k-1) \qquad (2)$$

Here α is the learning coefficient, μ is the momentum coefficient, $\delta_j$ is the error term of neuron j and $\Delta w_{ji}(k-1)$ is the weight change in the previous iteration. Training an MLP with the BP algorithm involves presenting all members of a training sequence (input and target output) to the network. The difference between the target output $y_d(k)$ and the actual output $y(k)$ of the MLP is propagated back through the network in order to adjust the weights. A training cycle ends when every member of the training sequence has been presented to the network and the weights have been updated [15].
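The weight update with a learning coefficient and a momentum coefficient can be sketched for a single connection. To keep the example short it uses one linear neuron, for which the error term is simply the difference between target and output; that simplification, the parameter values and the function names are assumptions for illustration only.

```python
def weight_change(delta_j, x_i, previous_change, alpha=0.1, mu=0.9):
    """Weight change = alpha * delta_j * x_i + mu * (previous weight change)."""
    return alpha * delta_j * x_i + mu * previous_change


def train_single_weight(x, target, alpha=0.1, mu=0.9, iterations=200):
    """Fit output = w * x for one linear neuron using the momentum update."""
    w, previous_change = 0.0, 0.0
    for _ in range(iterations):
        delta = target - w * x  # error term of a linear output neuron
        change = weight_change(delta, x, previous_change, alpha, mu)
        w += change
        previous_change = change  # remembered for the momentum term
    return w


# The weight should approach target / x = 2.0.
print(train_single_weight(x=1.0, target=2.0))
```

The momentum term reuses part of the previous change, which smooths the trajectory of the weight across iterations.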

Radial Neural Network Algorithm (RBF)
The radial basis function neural network is a nonlinear statistical method that can be used to model complex relations between inputs and outputs or to find patterns in a data set. RBF is a type of feedforward neural network that consists of three layers: the input layer, the hidden layer and the output layer. Each of these layers has a different task [18]. A general block diagram of an RBF network is shown in figure 7.

Figure (7) : General structure of the RBF neural network
In RBF networks, the first stage computes the distance between the network inputs and the centers of the hidden layer. The second stage is linear: its outputs are weighted sums of the outputs of the first stage. Each neuron of the hidden layer has a parameter vector called its center. A general expression of the network is therefore as follows:

$$y_j = \sum_{i=1}^{I} w_{ij}\, \phi\left(\lVert x - c_i \rVert\right) + \beta_j \qquad (3)$$

The norm is usually taken to be the Euclidean distance and the radial basis function is taken to be the Gaussian function, defined as follows:

$$\phi(x) = \exp\left(-\alpha_i \cdot \lVert x - c_i \rVert^2\right) \qquad (4)$$

Here I is the number of neurons in the hidden layer, J is the number of neurons in the output layer, $w_{ij}$ is the weight between the i-th hidden neuron and the j-th output, $\phi$ is the radial basis function, $\alpha_i$ is the spread parameter of the i-th neuron, x is the input data vector, $c_i$ is the center vector of the i-th neuron, $\beta_j$ is the threshold of the j-th output neuron and $y_j$ is the network output for the j-th neuron [17].
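The RBF network output described above (Gaussian hidden units followed by a linear output stage with thresholds) can be sketched as follows. The centers, spread parameters, weights and thresholds in the example are arbitrary placeholders chosen for illustration, not values fitted in this study.

```python
import math


def gaussian_rbf(x, center, alpha):
    """phi = exp(-alpha * ||x - c||^2), using the squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-alpha * squared_distance)


def rbf_forward(x, centers, alphas, weights, thresholds):
    """y_j = sum_i w_ij * phi_i(x) + beta_j; weights[i][j] links hidden i to output j."""
    phi = [gaussian_rbf(x, c, a) for c, a in zip(centers, alphas)]
    return [
        sum(weights[i][j] * phi[i] for i in range(len(phi))) + thresholds[j]
        for j in range(len(thresholds))
    ]


# Placeholder network: 1 input, I = 2 hidden neurons, J = 1 output.
centers = [[0.0], [1.0]]
alphas = [1.0, 1.0]
weights = [[2.0], [0.0]]
thresholds = [0.5]
print(rbf_forward([0.0], centers, alphas, weights, thresholds))
```

A hidden neuron responds most strongly (value 1) when the input coincides with its center, and its response decays with distance at a rate set by its spread parameter.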

Detection of seismic events with collective learning system
In the collective learning system, the results obtained from the RBF and MLP neural network algorithms are combined by taking a weighted mean. Combining the individual predictions into a single, more accurate prediction is among the important features of this system. Equation (5) is used to obtain a precise prediction in the form of a collective learning system. In this equation, the left-hand side is the value identified for seismic event i by the collective learning system, M is the number of algorithms used in the collective learning system (M = 2), j is the index of the algorithms, $W_j$ is the weight assigned to algorithm j and $P_j$ is the value detected for the i-th event by the MLP or RBF neural network algorithm. The algorithms are given different levels of importance in this article. The accuracy of detecting seismic events was first calculated for each algorithm, and a weight between 0 and 1 was then assigned to each algorithm on the basis of its accuracy. In the example below, the detection of seismic events is shown for one sample. If the coefficients allocated by the specialist are not appropriate, the values obtained will be far from the real values, which increases the model error. Because the perceptron algorithm detects seismic events more accurately in this example, it is given more importance and the value it detects is multiplied by a greater weight. Hence, this method can be used to detect seismic events more accurately.
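The weighting scheme can be sketched as a normalized weighted mean of the two model outputs. Because equation (5) did not survive in the text, the normalization by the sum of the weights and the rule that derives weights in proportion to accuracy are assumptions, and all numbers below are made up for illustration.

```python
def weights_from_accuracy(accuracies):
    """Hypothetical rule: weights proportional to accuracy, scaled to sum to 1."""
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("accuracies must sum to a positive value")
    return [a / total for a in accuracies]


def ensemble_predict(predictions, weights):
    """Weighted mean of the individual predictions P_j with weights W_j."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Made-up outputs for one event: MLP predicts 10.0, RBF predicts 20.0.
# The MLP is assumed to be the more accurate model, so it gets the larger weight.
w_mlp, w_rbf = weights_from_accuracy([0.9, 0.6])
print(ensemble_predict([10.0, 20.0], [w_mlp, w_rbf]))
```

With badly chosen weights (for example a large weight on the less accurate model), the combined value moves toward the worse prediction, which is the failure mode the text warns about.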

Evaluation criteria
A review of the findings of scientific research from past to present shows that the most important statistics describing the random variable in the field of digital signal processing are as follows:

Mean
In statistics, the arithmetic mean is a measure of central tendency. It is calculated by dividing the sum of the values in a dataset by their number.

$$M = \frac{1}{n}\sum_{i=1}^{n} X_i$$

In the above equation, M represents the mean, n is the number of samples (seismic events) and $X_i$ are the sample values.

Standard deviation
The standard deviation, represented by the symbol σ, is one of the dispersion indicators in statistics and shows the average distance of the data from the mean value. A standard deviation close to zero shows that the data are close to the mean and have a small dispersion, while a large standard deviation reflects a significant dispersion of the data. The standard deviation is equal to the square root of the variance. An example of the standard deviation is shown in figure 8. The variance of the whole population is:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2$$

In the above equation, N is the number of samples, $X_i$ is the intended sample and $\bar{X}$ is the mean of the samples. When only a part of the information (a sample of the population) is available, the variance is calculated with the sample formula instead.

Kurtosis is classified into three types:
Positive kurtosis: kurtosis is positive when it is greater than zero. In this case, the distribution of the data is more peaked than the normal curve. The distribution is leptokurtic when the kurtosis is greater than 0.5. This type of kurtosis is also known as sharp kurtosis. In positive kurtosis, the data are close to the mean.
Negative kurtosis: kurtosis is negative when its value is less than zero. In this case, the curve is flatter than the normal curve and is called platykurtic.
Natural kurtosis: the curve is normal when its kurtosis is equal to zero. In this case, the data follow the normal distribution.
The more leptokurtic the shape of the probability density function, the higher its kurtosis index. The kurtosis of a normal distribution is equal to 3, so the positive, negative and zero values above refer to the excess over this value. Kurtosis is equal to the fourth standardized moment:

$$\mathrm{Kurt}[X] = \frac{E\left[(X - \mu)^4\right]}{\sigma^4}$$

where μ is the mean and σ is the standard deviation. Low and high kurtosis can be observed in figures 9 and 10. The level of kurtosis in this article is calculated on the basis of the generated fuzzy rules.
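The three statistics of this section (mean, standard deviation and kurtosis) can be computed with the standard definitions, as in the sketch below. It uses the population formulas and reports excess kurtosis (the fourth standardized moment minus 3), so that a normal distribution scores zero. The sample values are made up, and this direct computation is only an illustration; the article itself derives the kurtosis level through fuzzy rules.

```python
import math


def mean(values):
    """Arithmetic mean: sum of the values divided by their number."""
    return sum(values) / len(values)


def population_variance(values):
    """Average squared distance of the values from their mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def population_std(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(population_variance(values))


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a normal distribution)."""
    m = mean(values)
    variance = population_variance(values)
    if variance == 0:
        raise ValueError("kurtosis is undefined for constant data")
    fourth_moment = sum((x - m) ** 4 for x in values) / len(values)
    return fourth_moment / variance ** 2 - 3.0


amplitudes = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values
print(mean(amplitudes), population_std(amplitudes), excess_kurtosis(amplitudes))
```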

Evaluation of result
In general, in this research we first calculated the variance, the standard deviation, the slope, the skewness and other parameters, and then calculated the kurtosis using the fuzzy rules. Finally, the MLP and RBF neural network algorithms were combined in the form of a collective learning system so that the level of seismic events is detected more accurately. In all diagnostic and predictive algorithms, the actual error and the mean squared error are the most important evaluation criteria. The following equation shows how the detection error of a seismic event is calculated, where $y_i$ is the actual value and $\hat{y}_i$ is the detected value for event i:

$$e_i = y_i - \hat{y}_i$$

The following equations show how the mean actual error (MAE) and the root-mean-square error (RMSE) are calculated over n events:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|e_i\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^{2}}$$
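The error criteria can be computed as in the sketch below, which uses the standard definitions of the mean absolute error and the root-mean-square error. The relative-improvement helper shows one common way to express how much one method improves on another; whether the percentages reported in this article were computed in exactly this way is an assumption, and the sample values are made up.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def improvement_percent(baseline_error, new_error):
    """Relative reduction of an error measure with respect to a baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


actual = [1.0, 2.0, 3.0]          # made-up target values
single_model = [1.5, 2.5, 2.0]    # made-up predictions of one algorithm
combined = [1.2, 2.2, 2.6]        # made-up predictions of the combined system
print(mae(actual, combined), rmse(actual, combined))
print(improvement_percent(rmse(actual, single_model), rmse(actual, combined)))
```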
The method presented in this article was simulated with the MATLAB and RAPIDMINER software applications, and the results are evaluated below. Figure 11 shows the process of detecting seismic events using the combined method in comparison with the other methods.

Figure (11): the process of detecting seismic events using the combined method in comparison with other methods
As is clear from the above figure, since the MLP algorithm has a lower error than the RBF algorithm, the weight factor of this algorithm can be increased and the core of the collective learning system can thereby be improved. In general, the error rate of seismic event detection in the MLP-RBF method is lower than in the MLP and RBF methods. Figure 12 compares the actual error of earthquake detection in the proposed method with that of the other methods. As figure 12 shows, the error rate of seismic event detection in the combined method is about 6.8% and 29.5% better than that of the MLP and RBF methods, respectively. Figure 13 shows the comparison of MAE in the proposed method with the other two methods.

Figure (13): comparison of MAE in the proposed method with other two methods
As figure 13 shows, the MAE of seismic event detection in the combined method is about 13.15% and 50.14% better than that of the MLP and RBF methods, respectively. Figure 14 shows the comparison of RMSE in the proposed method with the other two methods.

Figure (14): comparison of squared errors in detecting seismic events in the proposed method and other methods
As figure 14 shows, the RMSE of seismic event detection in the combined method is about 3.46% and 15.97% better than that of the MLP and RBF methods, respectively.

Conclusion
In this article, the two popular MLP and RBF algorithms were combined in the form of a collective learning system in order to detect micro seismic events more accurately. Simulation of the proposed method showed that, in terms of both the detection error relative to the main vibrations and the squared error, the combined method detects seismic events and vibration authenticity more accurately than the MLP and RBF algorithms used alone. Hence, the combination of the MLP and RBF neural network algorithms can be used to detect seismic events accurately.

Figure (1): falling from a height of 1.5 m on a positive-sloping surface

In Table (4), the columns related to height indicate the heights from which the weight was dropped and the rows indicate the slope of the surface. S1 indicates the positive slope, i.e. the bottom-to-top route; S2 indicates the negative slope, i.e. the top-to-bottom route; and S3 indicates the flat, non-sloping surface. The columns of the second row indicate the three sensors used in the experiment. Sensor A is the farthest from the source (7.5 m), sensor B is the middle one (5 m) and sensor C is the nearest (2.5 m). For example, S7 indicates the signal received from sensor A, i.e. the farthest sensor from the source, in the first drop, i.e. from a height of 0.5 m on the non-sloping surface. The remaining labels follow the same pattern, e.g. S27 indicates the signal received from the nearest sensor to the source (sensor C) in the drop from a height of 1.5 m on the negative-slope surface (down to top route). The different states of the experiment, with different heights and sensors, are shown in the following figure.

Figure (8): An example of the calculation of standard deviation

Variance measures the distance from the normal level (the mean), and the standard deviation is a tool for measuring the amount of data dispersion. It is denoted by σ when the data cover the entire population and by s when the data consist of a part of the population. For a part of the population (a sample of n values), the variance is calculated as follows:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - x'\right)^2$$

In the above relation, x' represents the mean of the samples. Ultimately, the standard deviation is calculated as the square root of the variance:

$$s = \sqrt{s^2}$$

The standard deviation provides a standard for determining what is normal and what is more or less than normal.

Kurtosis
Kurtosis, in statistics and probability theory, describes the degree to which a probability distribution is peaked. Kurtosis is also among the dispersion indicators. It determines the level of concentration, dispersion and elongation of the data of a distribution. Three types of kurtosis are distinguished: positive, negative and normal.

Figure (9): an example of low kurtosis

Figure (12): the comparison of the actual error of the earthquake detection in the proposed method with other methods