Hybrid Deep Learning-Based Prediction of Tunnel Lining Thickness Under Seismic Loading in the Northwestern Himalayas (Jammu Region)

Kaustav Chatterjee; Abdullah Ansari

doi:10.20944/preprints202606.1741.v1

Submitted: 23 June 2026

Posted: 24 June 2026


Abstract

Tunnels are important components of modern underground infrastructure, facilitating the movement of people and goods and the conveyance of water from one place to another. The tunnel lining is an important structural component that bears the underground load from overburden rock and soil. Over the last two decades, several earthquakes have caused ground movement that damaged tunnel linings. This study aims to develop a preliminary design tool using hybrid deep learning models for determining the thickness of tunnel lining in earthquake-prone regions. The input parameters considered for modelling were earthquake micro-zonation, type of fault, fault length, peak ground acceleration, source-to-site distance, maximum observed earthquake, total tunnel length, overburden pressure, diameter of the tunnel, quality of rock, and whether the site is prone to landslides; the output of the model is the tunnel lining thickness. Two hybrid deep learning models leveraging Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) network and transformer architectures were developed to determine the thickness of tunnel lining. The Mean Absolute Error (MAE) of the CNN-transformer model (model 1) and the LSTM-transformer model (model 2) on the test dataset was 19.64 mm and 32.00 mm, respectively. Model 1 was selected for determining the thickness of tunnel lining because of its higher accuracy compared with model 2. The deep learning models showed significant potential for computing the lining thickness in earthquake-prone regions.

Keywords: CNN; LSTM; transformer; tunnel lining; earthquake

Subject: Engineering - Civil Engineering

Introduction

In recent times, the construction of tunnels has increased significantly to cater to the need for transportation infrastructure (Ansari et al. [1]) and to growing urbanization in different parts of the world. Highway tunnels help reduce congestion in urban areas and decrease fuel consumption and greenhouse gas emissions by creating an optimal path for vehicle movement between different locations. Moreover, tunnels help relieve land scarcity in urban areas and improve urban aesthetics. One important component related to the structural safety of tunnels is the tunnel lining. The tunnel lining serves as the primary structural component that stabilizes underground excavations and ensures long-term serviceability under both static and dynamic loading conditions.

Design of the tunnel lining thickness is an important component of tunnel design and construction. Conventionally, design and analysis of structural parameters of tunnel linings are performed using numerical techniques (Nunes [2], Katebi et al. [3], Zhang et al. [4], Do and Dias [5]). Nunes [2] estimated soil-tunnel interaction in multi-layered ground using experimental and numerical studies; the numerical studies were performed in Plaxis using plane-strain finite element modelling. Katebi et al. [3] performed 3D finite element analysis using ABAQUS to determine the effect of ground stratification, depth of tunnel and surface building specification on tunnel lining loads. Zhang et al. [4] developed a 2D plane-strain finite element model in ABAQUS to understand the effect of multi-layered soil stratification on the structural behavior of the tunnel lining.

Recently, Artificial Intelligence (AI)-based techniques such as machine learning have been applied to predict different structural parameters of the tunnel lining (Yin et al. [6], Zhang et al. [7], Huang et al. [8], Ye et al. [9]). Zhang et al. [7] predicted the maximum bending moment within the tunnel lining using multivariate adaptive regression splines and a decision tree regressor. Yin et al. [6] predicted the compressive strength of sprayed concrete lining from four input variables (water-binder ratio, superplasticizer, coarse aggregate and fine aggregate) using Back Propagation Neural Network (BPNN), support vector regression, extreme learning machine, and random forest algorithms. Ye et al. [9] predicted the upward movement of tunnel lining from shield operational parameters, geometric parameters, geological conditions and anomalous conditions using BPNN, general regression neural network, extreme learning model and support vector machine. Moreover, deep learning has been applied to tunnel maintenance. These studies show that AI-based techniques have taken a pivotal role in the tunneling industry and can be used for predicting different structural parameters of tunnels.

In earthquake-prone regions, tunnel linings are the structural members that resist complex soil-structure interaction effects, including transient ground deformation, racking deformation, ovaling, and stress redistribution induced by earthquake shaking [10]. The performance of tunnel linings depends on multiple interrelated factors such as ground stiffness, in-situ stress state, overburden depth, groundwater conditions, and lining material properties. Traditionally, design approaches are based on elastic ground reaction models [5], empirical correlations [11,12], and code-based safety factors [13], which often simplify the nonlinear behavior of soil and the variability of seismic loading. However, increasing urbanization and the expansion of transportation networks in earthquake-prone areas demand more accurate and adaptive methods for estimating lining thickness, ensuring structural safety while optimizing material usage and construction cost.

Past earthquakes across different countries including India, China, Taiwan, Japan, and Italy showed that damage to tunnel lining is generally caused by deformation of the ground, such as sudden racking and ovaling deformation, differential settlement, slope failures near portals, and permanent ground or fault displacement, rather than by shaking of the tunnel itself. Case histories from the 2005 Muzaffarabad earthquake in India documented portal instabilities and lining distress linked to widespread landslides in steep terrain (Ansari et al. [14]), while investigations after the 2008 Wenchuan earthquake report recurrent patterns such as longitudinal or transverse cracking, spalling, and, in severe cases, local collapse, especially near faulted and highly fractured rock zones [15].

In the tunneling industry, there was a long-standing assumption that mountain tunnels are earthquake-safe; however, the 1999 Chi-Chi earthquake in Taiwan overturned this assumption with documented lining cracking, spalling, deformation, and heightened vulnerability in sections influenced by active faulting [16]. A study from Japan on the 2011 Tohoku earthquake (Aydan [17]) highlights that underground systems can still suffer functional and structural issues when strong shaking triggers settlement, joint distress, and deformation demands in connected tunnel-track or utility components. Seismic events in Italy demonstrated that well-built tunnels may experience notable cracking and local damage when coseismic faulting or strong ground deformation intersects the alignment [18,19], reinforcing that lining failure is often the structural fingerprint of ground discontinuity and deformation incompatibility rather than simple strength deficiency.

Throughout the world, mitigation strategies for protecting tunnel linings against seismic forces focus on accommodating ground deformation rather than increasing structural strength of the lining. Modern industry practices emphasize performance-based design approaches that account for soil-structure interaction, nonlinear ground behavior, and fault-crossing effects [20]. Flexible segmental linings with ductile reinforcement detailing are widely adopted to enhance deformation capacity, while seismic joints and compressible layers are introduced near portals and fault zones to absorb differential movement [21]. In high-risk areas, ground improvement techniques such as jet grouting, deep soil mixing, and rock bolting are employed to stabilize surrounding strata and reduce deformation demand on the lining [22]. Base isolation concepts and energy-dissipating elements have also been explored in underground metro systems to limit stress transfer during strong shaking. Additionally, advanced numerical modeling tools and probabilistic seismic hazard assessments are increasingly integrated into design workflows to optimize lining thickness and reinforcement [23]. Collectively, these mitigation measures reflect a global shift from purely strength-based design toward resilience-oriented strategies that prioritize controlled deformation, redundancy, and rapid post-earthquake service recovery.

State of the art research on tunnel lining design in seismic regions has moved beyond simplified elastic analytical solutions toward integrated, performance-based and data-driven frameworks that explicitly consider nonlinear soil-structure interaction, fault rupture effects, and spatial variability of geotechnical properties. Advanced three-dimensional finite element and finite difference simulations are now routinely used to capture racking deformation, ovaling response, and stress redistribution under multi-directional ground motion inputs [24]. Probabilistic approaches incorporating seismic hazard uncertainty and reliability-based design concepts have further refined thickness optimization by linking performance levels to acceptable damage states [25]. More recently, machine learning and hybrid deep learning models have emerged as powerful tools for predicting structural response parameters, enabling rapid mapping between geotechnical-seismic input features and optimal lining characteristics [26]. These approaches integrate large simulation datasets, feature selection techniques, and metaheuristic optimization algorithms to enhance prediction accuracy and generalization capability. Collectively, contemporary research reflects a paradigm shift toward intelligent, data-driven, and resilience-oriented tunnel design methodologies that aim to balance structural safety, economic efficiency, and long-term serviceability in earthquake-prone regions.

This study aims to develop a preliminary design tool, leveraging hybrid deep learning models consisting of CNN, LSTM and transformer architectures, for determining the thickness of tunnel lining in earthquake-prone regions. Data on different parameters related to tunnel construction were acquired from the literature [1,14,27,28], focusing mainly on the Jammu region in the NW Himalayas. Two hybrid deep learning models were developed using 7 input parameters including earthquake micro-zonation, fault type, maximum observed earthquake, observed earthquake, length of tunnel, and quality of rock; the output from the deep learning models is the tunnel lining thickness. Figure 1 shows the overall methodology adopted in this study.

Study Area

The Jammu region of Jammu and Kashmir represents a geo-dynamically active segment of the northwestern Himalayas, forming a transitional zone between the Shivalik Hills, Pir Panjal Range, and the Great Himalaya. The region is tectonically controlled by major thrust systems such as the Main Boundary Thrust (MBT) and Main Central Thrust (MCT), resulting in high seismicity, complex stress regimes, and active fault reactivation [29]. Its geomorphology is governed by major river systems including the Chenab River, Tawi River, and Ravi River, which contribute to slope instability, erosion, and variable subsurface conditions. From an infrastructure perspective, the region hosts critical lifeline projects such as the Udhampur Srinagar Baramulla Rail Link (USBRL), incorporating iconic structures like the Chenab Bridge, along with hydropower installations such as the Salal Dam as shown in Figure 2.

Notably, Jammu has evolved into a tunnel-dominated engineering corridor, with numerous long tunnels including the Pir Panjal Railway Tunnel as well as several ongoing tunnels in the Ramban-Banihal-Reasi sectors. These tunnels traverse heterogeneous lithology, faulted zones, and high overburden conditions, often encountering squeezing ground, shear zones, and groundwater ingress. The combination of active tectonics, complex geology, and dense underground infrastructure makes the Jammu region critically important for earthquake engineering studies of tunnels [14]. It provides a real-world testbed for analyzing seismic wave propagation effects, fault-tunnel interaction, lining response under dynamic loading, and performance-based design approaches. Consequently, this region is highly suitable for developing fragility models [30], resilience frameworks, and AI-integrated predictive methodologies for underground infrastructure in seismically active mountainous terrains.

Data

Earthquake Parameters

The earthquake parameters considered in this study are fault/thrust, source-to-site distance (SSD), fault length, microzonation, maximum observed earthquake, and peak ground acceleration. Among these, micro-zonation and the fault governing the earthquake at the tunnel were modelled as categorical variables, and the other parameters, including SSD, fault length, peak ground acceleration, and maximum observed earthquake, were modelled as continuous variables.

The northwestern Himalayan region, particularly Jammu and Kashmir, represents a highly active seismotectonic domain governed by the ongoing convergence between the Indian and Eurasian plates. This tectonic interaction is manifested through a series of major compressional and transpressional fault systems, making seismic microzonation a critical prerequisite for the design and assessment of underground infrastructure. The regional seismicity is primarily controlled by prominent thrust systems, including the Panjal Thrust (PT), Jwalamukhi Thrust (JT), Reasi Thrust (RT), Balapur Thrust (BT), Main Boundary Thrust (MBT), Main Central Thrust (MCT), and segments of the Himalayan Frontal Thrust (HFT), which collectively define the deformation regime and seismic hazard distribution [14]. The above-mentioned thrusts are considered as an input variable (seven different categories) for determining the lining thickness.

From an earthquake engineering perspective, among the different thrusts considered in this study, the MBT and MCT are identified as the most seismically significant structures due to their high slip rates, active deformation characteristics, and potential to generate moderate to large magnitude earthquakes. These fault systems, with rupture lengths ranging from tens to several hundreds of kilometers (~100-300 km), are capable of producing earthquakes in the magnitude range of 5.50 to 7.8, with maximum observed magnitudes (M_max) approaching 7.0 to 7.5. Such large-magnitude events result in substantial seismic energy release, leading to elevated ground motion intensity and deformation demand in surrounding geological media. The observed earthquake magnitude (M_obs) considered in this study lies between 5.50 and 7.83, with mean, standard deviation, median, 25th percentile and 75th percentile of 7.24, 0.92, 7.81, 6.30 and 7.83, respectively.

Fault geometry and source characteristics play a fundamental role in governing seismic response. Longer fault lengths are directly associated with larger rupture areas and higher magnitude events, which in turn induce significant ground deformation, including increased shear strain and racking distortion in underground structures. This is particularly critical for tunnels intersecting or located near major fault corridors, where fault-structure interaction, differential ground movement, and stress concentration effects become dominant. Consequently, tunnel linings in these regions are subjected to higher deformation demands, necessitating enhanced design considerations such as increased lining thickness, flexibility optimization, and improved reinforcement detailing. The fault length considered in this study ranged between 88.75 km and 325.25 km, with mean, standard deviation, median, 25th percentile and 75th percentile of 217.64 km, 95.46 km, 245.86 km, 91.41 km and 325.25 km, respectively.

In addition to fault characteristics, SSD exerts a strong influence on seismic demand. Near-field conditions, characterized by short epicentral distances, generate high-amplitude ground motions and velocity pulses, resulting in amplified Peak Ground Acceleration (PGA), often exceeding 0.3-0.5 g, and increased deformation demand on tunnel linings. This leads to pronounced shear deformation, higher flexibility ratios, and elevated lining stresses, increasing the likelihood of cracking, spalling, and structural distress. Conversely, with increasing distance, seismic waves attenuate, reducing both PGA and deformation demand, thereby lowering the probability of moderate to extensive damage. The SSD considered in this study varied between 4.10 km and 38.10 km, with mean, standard deviation, median, 25th percentile and 75th percentile of 19.70 km, 8.54 km, 19.10 km, 12.33 km and 26.55 km, respectively. The PGA lies between 0.18 g and 0.83 g, with mean, standard deviation, median, 25th percentile and 75th percentile of 0.67 g, 0.18 g, 0.71 g, 0.69 g and 0.82 g, respectively.

To capture this spatial variability in seismic hazard, microzonation frameworks have been developed by integrating multiple seismic and geotechnical parameters. These include PGA, spectral acceleration, shear wave velocity (V_s30), amplification ratio, resonance frequency, liquefaction susceptibility, overburden thickness, groundwater conditions, and local geological characteristics. These parameters are systematically combined using multi-criteria decision-making approaches such as the Analytic Hierarchy Process (AHP) within a GIS environment. Based on the resulting Seismic Hazard Index (SHI), the region is classified into distinct hazard zones A-C (e.g., high, moderate, and low), reflecting variations in site response and seismic vulnerability.

From a tunneling perspective, these zonation outcomes are crucial, as they directly influence deformation mechanisms and structural performance. Soft soil deposits and sedimentary basins amplify seismic waves, increasing racking deformation, while fractured and weathered rock masses induce localized stress concentrations. Furthermore, fault proximity introduces additional complexities such as fault rupture effects and differential displacement across tunnel alignments. Therefore, microzonation-informed design enables the identification of critical zones, including high-amplification regions, fault-controlled corridors, and weak subsurface conditions, facilitating more reliable estimation of tunnel lining forces, deformation demand, and overall seismic resilience in this complex Himalayan tectonic setting.

Tunneling and Geomaterial Parameters

The tunneling parameters considered in this study are tunnel length, depth of overburden rock, and diameter of tunnel, all modelled as continuous variables. The tunnel length varied between 255.00 m and 12750 m, with mean, standard deviation, median, 25th percentile and 75th percentile of 4385.72 m, 3708.65 m, 3038.50 m, 1483 m and 5959.00 m, respectively. The diameter of the tunnel lies between 5 m and 8 m, with mean, standard deviation, median, 25th percentile and 75th percentile of 6.49 m, 1.10 m, 6.00 m, 6.00 m, and 7.00 m, respectively. The overburden depth of rock above the tunnel varies between 98 m and 1320 m, with mean, standard deviation, median, 25th percentile and 75th percentile of 488.17 m, 284.92 m, 386.00 m, 256.00 m and 654 m, respectively. The tunnel lining thickness varies between 300 mm and 750 mm, with mean, standard deviation, median, 25th percentile and 75th percentile of 536.07 mm, 127.66 mm, 600.00 mm, 450.00 mm, and 600 mm, respectively. Figure 3 shows a schematic representation of a mountain tunnel and the tunneling parameters.

In this study, Rock Quality Designation (RQD) was used for modelling geomaterial properties. In the Jammu and Kashmir region of the NW Himalayas, rock quality is highly variable and strongly controlled by tectonic disturbance along major thrust systems such as the MBT and MCT. Field and borehole investigations reported in regional studies, including those referenced by Ansari et al. [28], indicate that rock mass conditions range from fair to very poor in faulted and sheared zones, with low RQD values, closely spaced discontinuities, weathered schist and phyllite formations, and crushed gouge material near fault traces. Such tectonically fractured rock masses exhibit reduced stiffness, low shear strength parameters, and high deformability, increasing susceptibility to seismic amplification and lining distress in tunnels. The rock quality values used here are reported on the Geological Strength Index (GSI) scale and varied between 33 and 54 (mean = 40.40, SD = 5.06, median = 40, 25th percentile = 36, 75th percentile = 44).

Additionally, the steep Himalayan topography combined with intense seasonal rainfall and seismic shaking significantly elevates landslide potential, particularly in highly jointed metamorphic rocks and colluvial deposits [31]. Earthquake-induced slope failures and rockfalls are common secondary hazards, especially where tunnels intersect unstable slopes or portal zones. The combination of weak rock mass quality, active tectonics, high relief gradients, and groundwater infiltration creates a geotechnical environment where both static and dynamic instabilities can occur, directly influencing tunnel alignment selection, support design, excavation method, and long-term serviceability of underground transportation infrastructure. The susceptibility to landslides in the vicinity of a tunnel was modelled as a categorical variable with two categories, yes and no.

Data Augmentation

Deep learning models generally provide accurate solutions to complex engineering problems. However, developing deep learning models requires large volumes of training data. Obtaining large volumes of data is generally arduous in civil engineering [32,33,34], so civil engineers often resort to data augmentation techniques when developing deep learning models. Chatterjee et al. [32] performed data augmentation using a normal distribution to generate profile data of highway-railway grade crossings. In this study, data augmentation was performed on the tunnel geometric parameters, namely tunnel length, overburden depth and tunnel diameter, using a normal distribution with zero mean and a standard deviation of ten percent of the value. Using this technique, 42840 samples were created for the training dataset.
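The augmentation step can be illustrated with a minimal sketch. It assumes the noise is additive, that is, each geometric value is perturbed by a draw from a normal distribution with zero mean and a standard deviation equal to ten percent of that value; the exact procedure used in the study is not spelled out, so the function `augment`, the field names and the sample record below are all hypothetical.

```python
import random


def augment(sample, keys, n_copies, rel_std=0.10, seed=0):
    """Return n_copies noisy copies of `sample`.

    For every key in `keys`, zero-mean Gaussian noise with standard
    deviation rel_std * |value| is added to the value. All other fields
    are copied unchanged. Additive noise is an assumption of this sketch.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    copies = []
    for _ in range(n_copies):
        noisy = dict(sample)
        for key in keys:
            value = sample[key]
            noisy[key] = value + rng.gauss(0.0, rel_std * abs(value))
        copies.append(noisy)
    return copies


# Hypothetical record: the values are illustrative, not taken from the dataset.
record = {
    "tunnel_length_m": 3000.0,
    "overburden_m": 400.0,
    "diameter_m": 6.0,
    "fault": "MBT",
}
geometric_keys = ["tunnel_length_m", "overburden_m", "diameter_m"]
augmented = augment(record, geometric_keys, n_copies=1000)
mean_diameter = sum(r["diameter_m"] for r in augmented) / len(augmented)
```

Only the geometric fields are perturbed; the categorical field is carried over as is, which mirrors the description above that augmentation was applied to tunnel length, overburden depth and diameter.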

Data Preparation

In this study, advanced sequence-to-sequence deep learning architectures were used for modelling. To employ the sequence-to-sequence models, the input data were converted into a sequence. The input sequence starts with the earthquake parameters and ends with the tunnel geometric parameters. This arrangement was motivated by the concept of a sentence, in which several words are combined; similarly, the different input parameters were combined to form a sequence.
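A minimal sketch of this conversion is given below. Only the ordering rule (earthquake parameters first, tunnel geometric parameters last) comes from the text; the field names, the position of the rock and landslide fields, and the integer codes for the categorical variables are assumptions made for illustration.

```python
# Assumed field order: earthquake parameters first, tunnel geometry last.
FEATURE_ORDER = [
    "microzonation", "fault", "fault_length_km", "pga_g", "ssd_km", "m_obs",
    "rock_quality", "landslide_prone",
    "tunnel_length_m", "overburden_m", "diameter_m",
]

# Hypothetical integer codes for the categorical variables.
CATEGORY_CODES = {
    "microzonation": {"A": 0, "B": 1, "C": 2},
    "fault": {"PT": 0, "JT": 1, "RT": 2, "BT": 3, "MBT": 4, "MCT": 5, "HFT": 6},
    "landslide_prone": {"no": 0, "yes": 1},
}


def to_sequence(record):
    """Turn one record (a dict) into an ordered list of floats."""
    sequence = []
    for name in FEATURE_ORDER:
        value = record[name]
        if name in CATEGORY_CODES:
            value = CATEGORY_CODES[name][value]  # category label -> code
        sequence.append(float(value))
    return sequence


# Hypothetical record: the values are illustrative only.
example = {
    "microzonation": "A", "fault": "MBT", "fault_length_km": 245.86,
    "pga_g": 0.71, "ssd_km": 19.10, "m_obs": 7.81,
    "rock_quality": 40, "landslide_prone": "yes",
    "tunnel_length_m": 3038.5, "overburden_m": 386.0, "diameter_m": 6.0,
}
sequence = to_sequence(example)
```

Each element of `sequence` plays the role of a word in the sentence analogy: the model sees the parameters as ordered tokens rather than as an unordered feature vector.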

Deep Learning

Deep learning is a type of neural network with many hidden layers, inspired by the functioning of the human brain. In the human brain, different neurons work together to solve a complex problem; in a neural network, different artificial neurons work together in the same way. In the last decade, deep learning has become a state-of-the-art data-driven technique and has found applications in different fields including natural language processing, autonomous driving, image captioning, and speech recognition. Figure 4 shows a schematic representation of a deep neural network. The light blue circles show the input layer, which receives the input data; the orange circles represent the hidden layers, which perform different computations on the input data; and the green circle represents the output layer, which delivers the output of the network. The growth of deep learning was fueled by advances in computer hardware such as graphics processing units and tensor processing units, the availability of large volumes of data from different sensors and the Internet of Things (IoT), and advancements in data processing algorithms.

In this study, hybrid deep learning models were used for determining the tunnel lining thickness from different parameters. Hybrid deep learning models are built from two different types of deep learning architectures and leverage the processing capabilities of both. For example, a Convolutional Neural Network (CNN) can efficiently detect objects in images, and the transformer architecture can process language data efficiently. Combining these architectures draws on the strengths of both, and the combination was used for determining the thickness of tunnel lining from earthquake-related parameters, tunnel geometric parameters and rock properties. Moreover, a second deep learning model was developed using LSTM and transformer architectures. The LSTM architecture determines the relationship between the input and output parameters using the LSTM cell, and the transformer architecture determines the relationship using the multi-head attention layer. This model was developed on the philosophy that the LSTM layer performs feature extraction and the transformer layer processes the extracted features using multi-head attention to determine the relationship between the input and output data.

LSTM Architecture

LSTM [35] is one of the popular deep learning architectures widely employed for sequence-to-sequence modelling and time series analysis. The LSTM architecture performs its operation using the LSTM cell, which facilitates storing information over an extended period of time. The cell performs the job of discarding irrelevant information, incorporating new information and producing output to subsequent stages using three different gates namely: (a) forget gate, (b) input gate and (c) output gate, respectively. Figure 5 shows the schematic representation of LSTM architecture.

The forget gate discards irrelevant information from the LSTM cell, the input gate adds new information to the cell, and the output gate delivers information from the cell.

Equation 1 represents the computation performed in the forget gate, and equation 2 defines the sigmoid activation function it uses. The forget gate receives the input at the current state and the hidden state from the preceding cell to compute the forget gate parameter.

h_{x} = \sigma(W_{x} \cdot [k_{x-1}, i_{x}] + b_{x}) \qquad (1)

\sigma(x) = \frac{1}{1 + e^{-x}} \qquad (2)

Where h_{x} represents the forget gate parameter, W_{x} represents the weight matrix of the forget gate, [k_{x-1}, i_{x}] represents the concatenation of the hidden state from the preceding cell k_{x-1} and the input at the current state i_{x}, b_{x} represents the bias vector, x in equation 2 represents the variable under consideration, and \sigma represents the sigmoid activation function.

The input gate of the LSTM cell incorporates new information into the cell. Its computation is given in equation 3; equation 4 gives the candidate cell state {A'}_{x}, equation 5 defines the \tanh activation function, and equation 6 gives the updated cell state. The input gate receives the input at the current state and the hidden state from the previous cell and computes the input gate parameter j_{x}.

j_{x} = \sigma(W_{y} \cdot [k_{x-1}, i_{x}] + b_{y}) \qquad (3)

{A'}_{x} = \tanh(W_{z} \cdot [k_{x-1}, i_{x}] + b_{z}) \qquad (4)

\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (5)

A_{x} = h_{x} \times A_{x-1} + j_{x} \times {A'}_{x} \qquad (6)

Where j_{x} represents the input gate parameter, W_{y} and W_{z} represent the weight matrices of the input gate and candidate cell state, respectively, b_{y} and b_{z} represent the bias vectors of the input gate and candidate cell state, respectively, \tanh represents the hyperbolic tangent activation function, and A_{x} is the updated cell state.

The output gate in the LSTM cell regulates the information passed to the next cell. The computation performed in the output gate is shown in equation 7 and the computation of the hidden state is shown in equation 8. The output gate receives the input at the current state and the hidden state from the preceding cell.

B_{x} = \sigma(W_{h} \cdot [k_{x-1}, i_{x}] + b_{a}) \qquad (7)

k_{x} = B_{x} \times \tanh(A_{x}) \qquad (8)

Where B_{x} represents the output gate parameter, b_{a} represents the bias vector of the output gate, and W_{h} represents the weight matrix of the output gate.
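As an illustration of how equations 1 to 8 fit together, the following sketch performs a single LSTM cell step in plain Python. It follows the standard LSTM formulation, in which the candidate cell state is scaled by the input gate parameter j_{x} in the cell-state update. The function names, the dictionary layout of `params` and the all-zero demonstration weights are assumptions chosen so the result can be checked by hand; they are not the trained model.

```python
import math


def sigmoid(v):
    """Sigmoid activation (equation 2)."""
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, vec, b):
    """Return W . vec + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + bias
            for row, bias in zip(W, b)]


def lstm_step(i_x, k_prev, A_prev, params):
    """One LSTM cell step; returns (hidden state k_x, cell state A_x)."""
    concat = k_prev + i_x  # [k_{x-1}, i_x]
    # Forget gate (equation 1)
    h_x = [sigmoid(v) for v in affine(params["W_x"], concat, params["b_x"])]
    # Input gate (equation 3) and candidate cell state (equation 4)
    j_x = [sigmoid(v) for v in affine(params["W_y"], concat, params["b_y"])]
    A_cand = [math.tanh(v)
              for v in affine(params["W_z"], concat, params["b_z"])]
    # Updated cell state (equation 6)
    A_x = [h * a + j * c for h, a, j, c in zip(h_x, A_prev, j_x, A_cand)]
    # Output gate (equation 7) and hidden state (equation 8)
    B_x = [sigmoid(v) for v in affine(params["W_h"], concat, params["b_a"])]
    k_x = [b * math.tanh(a) for b, a in zip(B_x, A_x)]
    return k_x, A_x


# Demonstration with hidden size 2 and input size 1. All weights are zero,
# so every gate equals 0.5 and the candidate cell state equals 0.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
params = {name: zeros for name in ("W_x", "W_y", "W_z", "W_h")}
params.update({name: [0.0, 0.0] for name in ("b_x", "b_y", "b_z", "b_a")})
k_x, A_x = lstm_step(i_x=[0.7], k_prev=[0.0, 0.0], A_prev=[1.0, -2.0],
                     params=params)
```

With zero weights the forget gate halves the previous cell state, giving A_x = [0.5, -1.0], and the hidden state is half of tanh(A_x). Trained weights would make the gates depend on the input and the previous hidden state.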

CNN Architecture

Convolutional Neural Network (CNN) is a popular deep learning framework used in image processing and similar tasks. The CNN architecture consists of three types of layers, namely the convolutional layer, the max-pooling layer and the fully connected layer. The convolutional layer extracts information from the input data, the max-pooling layer reduces the dimension of the data, and the fully connected layer predicts the output of the network. Figure 6 shows the schematic representation of the CNN architecture.
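The three layer types can be sketched for a one-dimensional input such as a parameter sequence. This is a toy illustration with a single hand-picked kernel and hand-picked dense weights, not the network used in the study; a ReLU activation after the convolution is also an assumption of the sketch.

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation, the operation used in CNN layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """Fully connected layer with a single output neuron."""
    return sum(v * w for v, w in zip(values, weights)) + bias


signal = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]   # illustrative input sequence
kernel = [1.0, -1.0]                       # hand-picked filter
features = relu(conv1d(signal, kernel))    # convolutional layer
pooled = max_pool1d(features, size=2)      # max-pooling layer
prediction = dense(pooled, weights=[0.5, 0.5], bias=0.1)  # output layer
```

Here the convolution turns six inputs into five features, pooling reduces them to two, and the dense layer maps those two values to a single output, which is the role each layer plays in the description above.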

Transformer Architecture

The transformer architecture was developed by Vaswani et al. [36] for machine translation tasks and is the building block of large language models. In civil engineering, transformer architectures have been used for identifying cracks on road surfaces [35] and determining profiles of highway-railway grade crossings [32], among other applications. This architecture offers an advantage over the conventional sequence-to-sequence architecture, LSTM, due to its ability to capture long-term dependencies in the data. Another advantage of the transformer architecture over the LSTM architecture is its faster processing of input data, owing to its capability for parallel processing [36,39,40].

The original transformer architecture presented by Vaswani et al. [36] consists of encoder and decoder blocks; however, different studies [32,40] used an encoder-only transformer architecture. In this work, an encoder-only transformer architecture was used for determining the thickness of tunnel lining. Figure 7 shows a schematic representation of the transformer architecture. The encoder block of the transformer architecture consists of a multi-head attention layer, a position-wise feed-forward neural network, a dropout layer, layer normalization, and residual connections. Another important component of the transformer architecture is the positional encoding, which provides information about the position of different components in the input sequence. This block is important because the transformer architecture otherwise has no mathematical operator that tracks the position of components in the input sequence. Equations 9 and 10 show the mathematical representation of the positional encoding layer.

PE(position, 2a) = \sin(position / 10000^{2a/b}) \qquad (9)

PE(position, 2a + 1) = \cos(position / 10000^{2a/b}) \qquad (10)

Where position represents the position of the token in the input sequence, a represents the dimension index of the position vector, and b represents the dimension of the representation.
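Equations 9 and 10 can be evaluated directly, as in the sketch below. It assumes an even representation dimension b so that sine and cosine terms pair up; the function name `positional_encoding` and the chosen sizes are illustrative.

```python
import math


def positional_encoding(seq_len, b):
    """Sinusoidal positional encoding of shape seq_len x b (b assumed even)."""
    pe = [[0.0] * b for _ in range(seq_len)]
    for position in range(seq_len):
        for a in range(b // 2):
            angle = position / (10000 ** (2 * a / b))
            pe[position][2 * a] = math.sin(angle)       # equation 9
            pe[position][2 * a + 1] = math.cos(angle)   # equation 10
    return pe


# Encoding for a sequence of 11 tokens with a 4-dimensional representation.
pe = positional_encoding(seq_len=11, b=4)
```

The encoding is added element-wise to the embedded input sequence, so two tokens with the same value at different positions receive different representations.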

The multi-head self-attention layer provides information about the relationships between components of the input data. The computations in the multi-head attention layer are presented in Equations 11 and 12. The feed-forward neural network layer applies a nonlinear transformation to the output of the multi-head self-attention layer, helping the model detect complex relationships in the data. It is defined in Equation 13.

$$\mathrm{Attention}(H) = \sigma\left(\frac{E F^{T}}{\sqrt{b_{k}}}\right) G \tag{11}$$

$$E = H W_{e}; \quad F = H W_{f}; \quad G = H W_{g} \tag{12}$$

where $H$ represents the position-encoded input, $\sigma$ represents the Softmax function, $E$, $F$, and $G$ represent the query, key, and value, respectively, $b_{k}$ represents the dimension of the key, $W_{e}$, $W_{f}$, and $W_{g}$ represent the weight matrices, and $T$ denotes the matrix transpose.
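The computation in Equations 11 and 12 can be sketched for a single attention head as follows. This is an illustrative NumPy sketch under assumed sizes (11 positions, representation and key dimension of 32) with randomly initialized weights; it is not the trained layer from this study, and the function names are hypothetical.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def self_attention(H, W_e, W_f, W_g):
    """Single-head self-attention following Equations 11 and 12."""
    E, F, G = H @ W_e, H @ W_f, H @ W_g        # query, key, value (Equation 12)
    b_k = F.shape[-1]                          # dimension of the key
    weights = softmax(E @ F.T / np.sqrt(b_k))  # one row of weights per position
    return weights @ G, weights                # Equation 11


rng = np.random.default_rng(0)
H = rng.normal(size=(11, 32))                  # hypothetical position-encoded input
W_e, W_f, W_g = (rng.normal(size=(32, 32)) * 0.1 for _ in range(3))
out, weights = self_attention(H, W_e, W_f, W_g)
print(out.shape, weights.shape)
```

Each row of `weights` sums to one and states how strongly one position attends to every other position. A multi-head layer runs several such heads in parallel with separate weight matrices and concatenates their outputs.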

$$\mathrm{FFN}(h_{j}) = W_{3} \cdot \mathrm{ReLU}(h_{j} \cdot W_{4} + c_{4}) + c_{3} \tag{13}$$

where $h_{j}$ represents the input vector at position $j$, $W_{4}$ and $W_{3}$ represent the weight matrices of the first and second linear layers, respectively, $c_{4}$ and $c_{3}$ represent the bias vectors of the first and second linear layers, respectively, and $\mathrm{ReLU}$ represents the rectified linear unit activation function.
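A minimal NumPy sketch of Equation 13 is given below. It uses the row-vector convention, so the second weight matrix is applied on the right of the activated hidden vector. The layer sizes (32 and 64) and the random weights are hypothetical values chosen for illustration only.

```python
import numpy as np


def ffn(h, W_4, c_4, W_3, c_3):
    """Position-wise feed-forward network of Equation 13 (row-vector convention)."""
    hidden = np.maximum(0.0, h @ W_4 + c_4)    # first linear layer followed by ReLU
    return hidden @ W_3 + c_3                  # second linear layer


rng = np.random.default_rng(0)
b, d_hidden = 32, 64                           # hypothetical layer sizes
W_4, c_4 = rng.normal(size=(b, d_hidden)), np.zeros(d_hidden)
W_3, c_3 = rng.normal(size=(d_hidden, b)), np.zeros(b)
H = rng.normal(size=(11, b))                   # 11 positions, one row per position

out = ffn(H, W_4, c_4, W_3, c_3)
# "Position-wise": every row is transformed independently with the same weights.
same = np.allclose(out[3], ffn(H[3], W_4, c_4, W_3, c_3))
print(out.shape, same)
```

Because the same weights are applied to every position, the layer mixes information across features but not across positions; mixing across positions is left to the attention layer.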

The dropout layer helps prevent overfitting of the model, the residual connections ease training by mitigating the vanishing gradient problem, and layer normalization stabilizes the training of the network.
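How these three components wrap a sublayer (attention or feed-forward) can be sketched as below. The sketch assumes the ordering of the original transformer, in which layer normalization is applied after the residual addition; the ordering used in this study is not stated, so this is an assumption. The learnable scale and shift of layer normalization are omitted, and the `tanh` sublayer in the usage lines is only a stand-in.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize each position to zero mean and unit variance over its features."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def dropout(x, rate, rng, training):
    """Inverted dropout: zero a random fraction of activations, rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


def residual_sublayer(x, sublayer, rate, rng, training=True):
    """Sublayer, dropout, residual addition, then layer normalization."""
    return layer_norm(x + dropout(sublayer(x), rate, rng, training))


rng = np.random.default_rng(0)
x = rng.normal(size=(11, 32))                  # hypothetical encoder input
# np.tanh stands in for an attention or feed-forward sublayer.
y = residual_sublayer(x, np.tanh, rate=0.1, rng=rng, training=False)
print(y.shape)
```

The residual term `x` gives gradients a direct path around the sublayer, which is why deeper stacks of encoder blocks remain trainable.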

Results and Discussion

Model Development

Two hybrid deep learning models were developed: (a) a CNN-transformer model and (b) an LSTM-transformer model. Figure 8a shows the architecture of the CNN-transformer model. In the CNN-transformer hybrid model, the CNN layer receives the input data and performs feature extraction. Subsequently, the transformer blocks process the extracted features using the multi-head attention layer. Figure 8b shows the architecture of the LSTM-transformer hybrid model. The LSTM layer receives the input data and processes it using LSTM cells, and the transformer architecture subsequently processes the result using the multi-head attention technique. Hyperparameter tuning was performed as part of the model development process. The learning rate, the number of units in the LSTM layer, the number of filters in the CNN layer, and the number of transformer encoder blocks were varied to determine the optimum architecture. Table 2 shows the different hyperparameters tuned during the process of model development.
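The feature extraction step of the CNN-transformer model can be illustrated with a one-dimensional convolution written in NumPy. This is a sketch under several assumptions that are not stated in the text: the 11 input parameters of one sample are arranged as a sequence of length 11 with one channel, the kernel size is 3, zero padding keeps the sequence length, and the activation is ReLU. Only the number of filters (32) is taken from Table 2.

```python
import numpy as np


def conv1d_same(x, kernels, bias):
    """1D convolution with zero padding so the output keeps the input length.

    x: (seq_len, in_channels); kernels: (k, in_channels, filters) with odd k;
    bias: (filters,).
    """
    k = kernels.shape[0]
    pad = k // 2
    x_pad = np.pad(x, ((pad, pad), (0, 0)))
    # One window of k consecutive steps per output position: (seq_len, k, in_channels)
    windows = np.stack([x_pad[i:i + k] for i in range(x.shape[0])])
    out = np.einsum("ski,kif->sf", windows, kernels) + bias
    return np.maximum(0.0, out)                # ReLU activation (assumed)


rng = np.random.default_rng(0)
sample = rng.normal(size=(11, 1))              # 11 input parameters as a length-11 sequence (assumed layout)
kernels = rng.normal(size=(3, 1, 32))          # kernel size 3 is assumed; 32 filters as in Table 2
features = conv1d_same(sample, kernels, np.zeros(32))
print(features.shape)
```

Under these assumptions the convolution turns each sample into 11 positions with 32 features each, which is the kind of sequence the transformer encoder blocks then process with multi-head attention.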

Model Performance

Table 3 shows the performance of the deep learning models on the training, validation and test datasets. Mean absolute error (MAE) and root mean squared error (RMSE) are defined in Equations 14 and 15, respectively. The CNN-transformer model achieved an MAE and RMSE of 19.64 mm and 25.28 mm, respectively, on the test dataset. The closeness of the test MAE and RMSE indicates that the individual errors on the test dataset are of similar magnitude. The LSTM-transformer model achieved an MAE and RMSE of 32.00 mm and 54.61 mm, respectively, on the test dataset. The larger gap between its RMSE and MAE indicates the presence of some large errors, because RMSE penalizes large errors more heavily than MAE. Figure 9 shows the scatter plot of predicted and actual tunnel lining thickness values on the test dataset. The CNN-transformer model performed satisfactorily on the test dataset, with the majority of predicted points lying close to the perfect fit line. In contrast, the performance of the LSTM-transformer model on the test dataset was not satisfactory, with errors of large magnitude.

$$\mathrm{MAE} = \frac{\sum_{j=1}^{M} \left| x_{j} - \hat{x}_{j} \right|}{M} \tag{14}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{j=1}^{M} \left( x_{j} - \hat{x}_{j} \right)^{2}}{M}} \tag{15}$$

where $\mathrm{MAE}$ and $\mathrm{RMSE}$ represent the mean absolute error and root mean squared error, respectively, $x_{j}$ is the predicted tunnel lining thickness from the deep learning model, $\hat{x}_{j}$ is the actual tunnel lining thickness, and $M$ is the number of data points.
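The two metrics, and the reason a gap between them points to large individual errors, can be illustrated with a short NumPy sketch. The thickness values below are hypothetical and are not taken from the study dataset; they are chosen so that both prediction sets have the same MAE.

```python
import numpy as np


def mae(x, x_hat):
    """Mean absolute error, Equation 14."""
    return float(np.mean(np.abs(np.asarray(x) - np.asarray(x_hat))))


def rmse(x, x_hat):
    """Root mean squared error, Equation 15."""
    return float(np.sqrt(np.mean((np.asarray(x) - np.asarray(x_hat)) ** 2)))


actual = np.array([300.0, 350.0, 400.0, 450.0])          # hypothetical thicknesses, mm
uniform = actual + np.array([20.0, -20.0, 20.0, -20.0])  # every error is 20 mm
outlier = actual + np.array([0.0, 0.0, 0.0, 80.0])       # a single 80 mm error

print(mae(uniform, actual), rmse(uniform, actual))       # 20.0 20.0
print(mae(outlier, actual), rmse(outlier, actual))       # 20.0 40.0
```

When all errors are equal, RMSE equals MAE; when the same total error is concentrated in a few points, RMSE grows while MAE does not. For the test results in Table 3, the RMSE-to-MAE ratio is about 1.29 for the CNN-transformer model (25.28/19.64) and about 1.71 for the LSTM-transformer model (54.61/32.00).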

Conclusions

This study developed a preliminary design tool leveraging hybrid deep learning models for determining the thickness of tunnel linings in earthquake-prone regions using earthquake parameters, tunnel geometric parameters, rock quality and occurrence of landslides. Data from different tunnel construction projects in the Jammu region of India were selected because of the high seismic activity in that region, which can be attributed to the presence of seven different faults. Earthquakes of magnitude between 5.50 and 7.83 and PGA ranging between 0.33 g and 0.83 g were observed in that location. Moreover, the region is susceptible to landslides due to the presence of fractured rock.

The earthquake parameters considered for the deep learning models are fault type, SSD, fault length, PGA_max, and M_obs, and the tunnel geometric parameters considered are tunnel length, depth of overburden rock, and diameter of tunnel. Two hybrid deep learning models, namely a CNN-transformer hybrid model and an LSTM-transformer hybrid model, were developed to predict the tunnel lining thickness. On the test dataset, the CNN-transformer model achieved MAE, RMSE and R² of 19.64 mm, 25.28 mm, and 0.92, respectively. The closeness of its RMSE and MAE indicates errors of similar magnitude. The LSTM-transformer model achieved MAE, RMSE and R² of 32.00 mm, 54.61 mm, and 0.64, respectively, on the test dataset. The gap between the RMSE and MAE of the LSTM-transformer model reveals errors of large magnitude. Between the two models, the CNN-transformer model performed better, with lower MAE and RMSE and higher R² on the test dataset. Therefore, the CNN-transformer model can be used as a preliminary design tool.

Author Contributions

Study conception and design: KC, AA; Data collection: KC, AA; Data analysis: KC, AA; Manuscript preparation: KC, AA. All authors reviewed the manuscript.

Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Data Availability Statement

The data supporting this study's findings are available from the corresponding authors upon reasonable request.

Declaration of Conflicting Interests

The authors have no relevant financial or non-financial interests to disclose.

References

1. Ansari, A.; Rao, K. S.; Jain, A. K. Seismic response and fragility evaluation of circular tunnels in the Himalayan region: Implications for post-seismic performance of transportation infrastructure projects in Jammu and Kashmir. Tunn. Undergr. Space Technol. 2023b, 137, 105118.
2. Nunes, M. A. An investigation of soil-tunnel interaction in multi-layer ground. 2008.
3. Katebi, H.; Rezaei, A. H.; Hajialilue-Bonab, M.; Tarifard, A. Assessment the influence of ground stratification, tunnel and surface buildings specifications on shield tunnel lining loads (by FEM). Tunn. Undergr. Space Technol. 2015, 49, 67–78.
4. Zhang, D. M.; Huang, H. W.; Hu, Q. F.; Jiang, F. Influence of multi-layered soil formation on shield tunnel lining behavior. Tunn. Undergr. Space Technol. 2015, 47, 123–135.
5. Do, N. A.; Dias, D. Tunnel lining design in multi-layered grounds. Tunn. Undergr. Space Technol. 2018, 81, 103–111.
6. Yin, X.; Gao, F.; Wu, J.; Huang, X.; Pan, Y.; Liu, Q. Compressive strength prediction of sprayed concrete lining in tunnel engineering using hybrid machine learning techniques. Undergr. Space 2022, 7(5), 928–943.
7. Zhang, W.; Li, Y.; Wu, C.; Li, H.; Goh, A. T. C.; Liu, H. Prediction of lining response for twin tunnels constructed in anisotropic clay using machine learning techniques. Undergr. Space 2022, 7(1), 122–133.
8. Huang, H.; Wang, C.; Zhou, M.; Qu, L. Compressive strength detection of tunnel lining using hyperspectral images and machine learning. Tunn. Undergr. Space Technol. 2024, 153, 105979.
9. Ye, X. W.; Zhang, X. L.; Zhang, H. Q.; Ding, Y.; Chen, Y. M. Prediction of lining upward movement during shield tunneling using machine learning algorithms and field monitoring data. Transp. Geotech. 2023, 41, 101002.
10. Liu, S.; Shi, Y.; Sun, R.; Yang, J. Damage behavior and maintenance design of tunnel lining based on numerical evaluation. Eng. Fail. Anal. 2020, 109, 104209.
11. Hashash, Y. M.; Hook, J. J.; Schmidt, B.; Yao, J. I. C. Seismic design and analysis of underground structures. Tunn. Undergr. Space Technol. 2001, 16(4), 247–293.
12. Kroetz, H. M.; Do, N. A.; Dias, D.; Beck, A. T. Reliability of tunnel lining design using the hyperstatic reaction method. Tunn. Undergr. Space Technol. 2018, 77, 59–67.
13. Zhang, Z.; Gong, R.; Zhang, H.; He, W. The Sustainability performance of reinforced concrete structures in tunnel lining induced by long-term coastal environment. Sustainability 2020, 12(10), 3946.
14. Ansari, A.; Rao, K. S.; Jain, A. K. Seismic vulnerability of tunnels in Jammu and Kashmir for post seismic functionality. Geotech. Geol. Eng. 2023a, 41(2), 1371–1396.
15. Wang, Z. Z.; Zhang, Z. J. S. D. Seismic damage classification and risk assessment of mountain tunnels with a validation for the 2008 Wenchuan earthquake. Soil Dyn. Earthq. Eng. 2013, 45, 45–55.
16. Lu, C. C.; Hwang, J. H. Damage of new Sanyi railway tunnel during the 1999 Chi-Chi earthquake. In Geotechnical Earthquake Engineering and Soil Dynamics; 2008; Volume IV, pp. 1–10.
17. Aydan, Ö. Crustal stress changes and characteristics of damage to geo-engineering structures induced by the Great East Japan Earthquake of 2011. Bull. Eng. Geol. Environ. 2015, 74(3), 1057–1070.
18. Romeo, S.; Di Matteo, L.; Melelli, L.; Cencetti, C.; Dragoni, W.; Fredduzzi, A. Seismic-induced rockfalls and landslide dam following the October 30, 2016 earthquake in Central Italy. Landslides 2017, 14(4), 1457–1465.
19. Durante, M. G.; Di Sarno, L.; Zimmaro, P.; Stewart, J. P. Damage to roadway infrastructure from 2016 Central Italy earthquake sequence. Earthq. Spectra 2018, 34(4), 1721–1737.
20. Guo, Q.; Yu, Q.; Yin, S.; Vo Thanh, H.; Soltanian, M. R.; Liu, D.; Dai, Z. Seismic stability assessment methodology and application for deep-buried tunnels considering seismic source mechanisms. Int. J. Geotech. Eng. 2026, 20(1), 113–129.
21. Arnau, O.; Molins, C. Three dimensional structural response of segmental tunnel linings. Eng. Struct. 2012, 44, 210–221.
22. Xu, Q.; Zhu, Y.; Lei, S.; Liu, Z.; Zhao, W. Underpassing Underpinning Method with Active–Passive Disturbance Control: A Case Study of High-Risk Metro Station Construction in Shallow Urban Strata. Arab. J. Sci. Eng. 2025, 1–19.
23. Alao, J. O. The emerging roles of 3D and 4D geophysical and geological modelling in evaluating seismic risks: a critical review. Earthq. Res. Adv. 2025, 100399.
24. Neuner, M.; Schreter, M.; Gamnitzer, P.; Hofstetter, G. On discrepancies between time-dependent nonlinear 3D and 2D finite element simulations of deep tunnel advance: A numerical study on the Brenner Base Tunnel. Comput. Geotech. 2020, 119, 103355.
25. Li, Y.; Conte, J. P.; Gill, P. E. Probabilistic performance-based optimum seismic design framework: illustration and validation. Comput. Model. Eng. Sci. 2019, 120(3), 517–543.
26. Paul, R.; Mishra, S.; Khatti, J. Role of artificial intelligence (AI) techniques in tunnel engineering-a scientific review. Indian Geotech. J. 2025, 1–31.
27. Ansari, A.; Seshagiri Rao, K.; Jain, A. K. Seismic microzonation of the himalayan region considering site characterization: Application toward seismic risk assessment for sustainable tunneling projects. Nat. Hazards Rev. 2024, 25(1), 04023052.
28. Ansari, A.; Zahoor, F.; Rao, K. S.; Rathod, G. W.; Mir, B. A. Integrating MHVSR and MSOR techniques with JFIM for seismic vulnerability assessment of sites and buildings in Jammu and Kashmir, NW Himalayas. Phys. Chem. Earth Parts A/B/C 2025, 104062.
29. Ansari, A.; Rao, K. S.; Jain, A. K. Application of microzonation towards system-wide seismic risk assessment of railway network. Transp. Infrastruct. Geotechnol. 2024, 11(3), 1119–1142.
30. Ansari, A.; Zahoor, F.; Rao, K. S.; Jain, A. K. Liquefaction hazard assessment in a seismically active region of Himalayas using geotechnical and geophysical investigations: a case study of the Jammu Region. Bull. Eng. Geol. Environ. 2022, 81(9), 349.
31. Aziz, K.; Mir, R. A.; Ansari, A. Precision modeling of slope stability for optimal landslide risk mitigation in Ramban road cut slopes, Jammu and Kashmir (J&K) India. Model. Earth Syst. Environ. 2024, 10(3), 3101–3117.
32. Chatterjee, K.; Li, J. Q.; Ansari, F.; Munna, M. R.; Parajulee, K.; Schwennesen, J. Hybrid LSTM-Transformer Models for Profiling Highway–Railway Grade Crossings. J. Transp. Eng. Part A Syst. 2026, 152(2), 04025138.
33. Chatterjee, K.; Desai, M.; Li, J. Application of Large Language Models in Geotechnical Engineering: A Movement Towards Safe and Sustainable Future. Geotechnics 2026, 6(2), 38.
34. Desai, M.; Chatterjee, K. Application of Machine Learning Techniques for Prediction of Soil Water Characteristics Curve: A State of the Art Review. 2026.
35. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9(8), 1735–1780.
36. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
37. Ansari, F.; Chatterjee, K.; Li, J. Q.; Wang, K.; Golalipour, A. Multi-Object Pavement Surface Feature Detection with CNN and Transformer Deep Learning Architecture. In Airfield and Highway Pavements; 2025; pp. 350–359.
38. Islam, S.; Elmekki, H.; Elsebai, A.; Bentahar, J.; Drawel, N.; Rjoub, G.; Pedrycz, W. A comprehensive survey on applications of transformers for deep learning tasks. Expert Syst. With Appl. 2024, 241, 122666.
39. Zheng, Y.; Li, X.; Xie, F.; Lu, L. Improving end-to-end speech synthesis with local recurrent neural network enhanced transformer. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE, May 2020; pp. 6734–6738.
40. Anik, B. T. H.; Islam, Z.; Abdel-Aty, M. A time-embedded attention-based transformer for crash likelihood prediction at intersections using connected vehicle data. Transp. Res. Part C Emerg. Technol. 2024, 169, 104831.

Figure 1. Methodological framework for the prediction of tunnel lining thickness, illustrating the integration of literature review, selection of seismic and geotechnical input parameters, development of hybrid deep learning models (CNN, LSTM, and Transformer architectures), and evaluation of model performance in terms of accuracy and overfitting control.

Figure 2. Regional geological, tectonic, and infrastructure characteristics of the Jammu region, Jammu and Kashmir, highlighting its relevance for tunnel engineering (Source: The map was prepared by the authors).

Figure 3. Schematic representation of highway tunnel showing different geometric parameters.

Figure 4. Schematic representation of the neural network with input layer, three hidden layers and output layer.

Figure 5. Schematic representation of the LSTM architecture.

Figure 6. Schematic representation of the CNN architecture.

Figure 7. Representation of transformer architecture.

Figure 8. Architecture of hybrid deep learning models (a) CNN-transformer model, (b) LSTM-transformer model.

Table 2. Hyperparameters of developed models.

Hyperparameters	CNN-transformer model	LSTM-transformer model
Learning rate	0.00001	0.00001
Batch size	64	64
Number of units in LSTM layer	-	32
Number of filters in the CNN block	32	-
Optimization function	Adam	Adam
Number of transformer encoder blocks	2	2
Transformer encoder (number of heads)	2	2
Transformer encoder (head size)	32	32

Table 3. Performance of the deep learning models.

Model	Training MAE (mm)	Training RMSE (mm)	Training R²	Validation MAE (mm)	Validation RMSE (mm)	Validation R²	Testing MAE (mm)	Testing RMSE (mm)	Testing R²
CNN-transformer deep learning model	25.14	41.23	0.90	9.47	11.52	0.98	19.64	25.28	0.92
LSTM-transformer deep learning model	69.33	107.42	0.38	54.99	86.18	0.28	32.00	54.61	0.64

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
