A Hybrid Physics-Augmented Neural Network for Dynamic System Modeling with Partially Known Dynamics

Laurin Ludmann; Jaeyoun Choi; Jens Neubeck; Andreas Wagner; Chuchu Fan

doi:10.20944/preprints202606.0534.v1

Submitted: 03 June 2026

Posted: 08 June 2026


Abstract

This paper introduces the Hybrid Physics-Augmented Neural Network (HyPA-Net), a hybrid modeling framework that integrates physics-based linear time-invariant models with artificial neural networks (ANNs) to address dynamic system modeling when only partial physical knowledge is available. The approach leverages the interpretability and robustness of established physical models while using ANNs, such as Long Short-Term Memory architectures, to capture unknown or nonlinear system behaviors. The methodology normalizes state and input variables for compatibility with ANN training and expands the traditional recursive state-space equations for efficient backpropagation over sequences. The framework is validated on vehicle dynamics, using a vehicle with rear wheel steering as the test case. Various HyPA-Net configurations are benchmarked against pure physics-based and pure data-driven models, demonstrating improved prediction accuracy over the physics-based model and added model flexibility. The experimental results confirm that hybrid models outperform the purely physical approach and can implicitly learn submodel dynamics within a unified yet modular architecture, opening avenues for applications in domains where partial physics-based knowledge is available but insufficient on its own.

Keywords: dynamic system modeling; hybrid modeling; physics-augmented neural networks; vehicle dynamics; state space models

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

1. Introduction

The field of dynamic system modeling has seen a significant shift towards data-driven approaches in recent years. This is mainly due to the increasing availability of computing power and the success of machine learning techniques in various application domains. In this context, artificial neural networks (ANNs) have emerged as a powerful tool for modeling complex systems [1,2,3,4,5,6,7]. ANNs are capable of learning complex patterns from data and can be used to model systems with unknown dynamics.

Traditional ANNs process inputs through interconnected layers of neurons with nonlinear activation functions, excelling at modeling nonlinear relationships through weighted connections and backpropagation [8,9]. For sequential data, recurrent neural networks (RNNs) introduced memory via hidden states that propagate temporal information, enabling applications in speech recognition and time series forecasting [10,11]. Long short-term memory (LSTM) networks, introduced in [12], addressed the vanishing gradient limitations of RNNs through gated mechanisms, achieving breakthroughs in machine translation and long-context tasks by selectively retaining information across long sequences. LSTM networks are well suited for time series data and have been shown to capture long-term dependencies in the input sequence [13,14,15,16]. More recent architectures such as Transformers fundamentally changed sequence modeling through self-attention mechanisms that capture global dependencies without recurrence, though their quadratic complexity prompts adaptations for time series applications [17,18,19].

Although primary and popular fields of research are natural language processing with large language models (LLMs), image processing, and general agents, ANNs have also been successfully applied in the field of dynamic system modeling [6,20]. However, since ANNs generally only model the input-output relationship without providing insights into the underlying physics or internal dynamics of the system, they are considered black-box models, which limits their applicability in safety-critical systems, where understanding the underlying dynamics is crucial.

In this paper, we propose a novel approach to dynamic system modeling that combines the strengths of physics-based models and artificial neural networks. The approach, called Hybrid Physics-Augmented Neural Network (HyPA-Net), uses a physical model as a foundation and augments it with an artificial neural network to model unknown parts or nonlinearities of the system. In theory, any physical representation of the system can be used. In this work, we focus on linear time-invariant (LTI) state-space models for the known physics, while extensions to nonlinear representations are also possible. By explicitly modeling known system dynamics and complementing them with ANNs for unknown nonlinearities, HyPA-Net provides a hybrid modeling framework that combines the adaptability and interpretability of physics-based models with the flexibility of ANNs.

2. Related Work

In this section, we review related work in the field of dynamic system modeling with a focus on approaches that combine physical modeling with artificial neural networks.

2.1. State Space Models with Neural Networks

The class of structured state space sequence models (S4) recently emerged as a promising approach to sequence modeling inspired by the mathematical foundations of LTI systems and related to RNNs. In the seminal High-Order Polynomial Projection Operators (HiPPO) framework, a principled approach to online signal compression through dynamic memory mechanisms was introduced by [21]. The subsequent Mamba architecture demonstrates that state space models can outperform attention-based transformers in both accuracy and computational efficiency [22]. It has further been shown by [23] that S4 models in general are a variation of the transformer architecture, which is more efficient and effective for long sequences, leveraging the implicit knowledge of the system dynamics. This line of work has been extended by many others, leading to different variations of the S4 architecture, such as H3 [24], Hyena [25], RetNet [26] and others. However, these models all learn the dynamics of the system from data, without explicit incorporation of known physical laws or constraints. This generally requires more and better data as well as more computing time.

2.2. Physics-Informed Neural Network Approaches

Physics-Informed Neural Networks represent a class of neural networks that incorporate physical laws, such as partial differential equations, directly into their training process. They can handle a variety of complex equations by minimizing a loss function that combines data fitting and physics residuals [27,28]. These approaches have been successfully applied to a wide range of problems, including fluid dynamics [29], heat transfer [5], and power systems [30,31]. In some settings, they have been shown to be a practical alternative to classical numerical methods. Training such hybrid models presents unique challenges, particularly in balancing the contribution of physical constraints with data-driven learning objectives. While the physics is integrated, e.g., in the loss function, the ANN itself still remains a black box.

2.3. Neural Ordinary Differential Equations

Neural Ordinary Differential Equations (Neural ODEs) have introduced a novel approach to modeling dynamic systems by approximating continuous-time dynamics using neural networks. Instead of specifying a discrete sequence of hidden layers, the derivative of the hidden state is parameterized using a neural network [32]. The advantage of Neural ODE approaches lies in their ability to learn complex temporal dependencies while maintaining temporal consistency across different sampling rates. Variations such as Graph Neural ODEs [25], Stiff Neural ODEs [33], and Bayesian Neural ODEs [34] have been proposed to address specific problems. Challenges remain, for example, regarding the numerical integration of ODEs, particularly in terms of stability and convergence [35], and their robustness is still unclear [36]. Furthermore, Neural ODEs remain a black box.

2.4. Koopman Neural Networks

In Koopman theory, nonlinear dynamical systems are represented as linear dynamics on functions of the observable state, and their spectral decomposition characterizes the behavior of the nonlinear system. Finding finite-dimensional representations of the Koopman operator means finding coordinate transformations where the nonlinear dynamics appear linear [37]. Dynamic Mode Decomposition is a numerical method to approximate the spectral objects of the Koopman operator, e.g. eigenvalues and modes, whereas a Koopman neural network learns these representations [38,39,40,41]. All methods recast nonlinear dynamics as approximately linear evolution in an enlarged space of observables or latent coordinates, but they typically treat the underlying physics only implicitly, if at all, and operate in abstract feature spaces rather than in the original physical state space. In contrast, our method begins from an explicit LTI state-space model of the known physics and augments only the unknown or nonlinear residual dynamics with a neural network, so that the core representation remains firmly anchored in physically meaningful states and inputs. This design yields several important advantages. The physics grounding enforces structural consistency with known laws, the separation between modeled and learned components enhances interpretability of both the global model and embedded submodels, and the focus on learning only residual dynamics can substantially improve data efficiency compared to approaches that must infer the entire dynamical map from data alone.

3. Problem Statement

The aim of this paper is to develop a hybrid modeling framework that integrates partially known physical dynamics with data-driven learning in order to accurately capture the behavior of complex dynamic systems. We specifically consider systems where the governing equations can be described in a state-space formulation, but where only a subset of the dynamics is analytically available. In many real-world cases, the system dynamics can be decomposed into a known component and an unknown residual component. Traditional physics-based models often only approximate the real dynamics, which leads to systematic errors, whereas purely data-driven methods attempt to learn the full mapping, often require large datasets, generalize poorly outside the training domain, and offer no insight into the internal dynamics of the system unless these are explicit training targets. The central problem is therefore to jointly exploit partial physical knowledge and data-driven learning to approximate the residual dynamics while retaining interpretability, stability, and data efficiency. In particular, this work aims to design a hybrid model based on a first-principles physical model in combination with a parameterized neural augmentation mechanism that can

faithfully reproduce the observed trajectories,
generalize to unseen inputs and operating regimes,
remain consistent with the known physical structure, and
implicitly model subsystems in a dynamic system without the need to isolate the subsystem.

4. Methodology

In this section, we present our approach and the general network architecture of HyPA-Net, consisting of two main components: a physics-based model and an ANN. Those two components can be combined in different ways to achieve the desired balance between interpretability and flexibility, always taking into account the specific requirements of the application domain. The physics-based model represents the known system dynamics, while the ANN complements the physics-based model by modeling the unknown or neglected nonlinearities in the system.

An example of a general network architecture is shown in Figure 1. In this case, an arbitrary input $u$ is passed to two different sub-models. The first is known and modeled using a first-principles physical model. The second, however, is unknown or too complex to model directly, so an ANN is introduced to capture its dynamics. The respective outputs $\gamma_{1}$ and $\gamma_{2}$ of both sub-models are then passed on to another sub-model, which is again modeled using a first-principles physical model and generates the overall output $x$.

While there are various ways to represent dynamic systems, we focus on the discrete state space representation of LTI systems.

Equation (1) describes the transition of the system’s state $x$ from one time step to the next, indicated by the index $k$ [42]. The state matrix $A$ describes the system’s dynamics, while the input matrix $B$ describes how the input $u$ influences the state.

$$x_{k + 1} = A x_{k} + B u_{k} \tag{1}$$

Discrete-time state space models are found in a variety of literature, e.g. [42,43,44,45]. Linear state space models have proven particularly valuable in a wide range of applications from economic forecasting [46,47] to aerospace systems [48,49]. These models excel at representing higher-order linear systems, predicting future system states, and serving as foundational components in more complex estimation frameworks such as Kalman filtering [50]. However, their linear nature inherently limits their ability to capture nonlinear phenomena that are ubiquitous in real-world systems.

With a given initial state $x_{0}$ and a sequence of inputs $u_{0}, u_{1}, \dots$, the state trajectory $x_{1}, x_{2}, \dots$ can be computed iteratively. However, two problems arise when using Equation (1) to compute the state trajectory:

1. The state space equations are based on physical units, i.e., the state and input variables have physical meanings. ANNs, in contrast, operate on normalized data, e.g., input and output variables scaled to zero mean and unit variance.
2. The state trajectory is computed recursively, i.e., the state at time step $k + 2$ depends on the state at time step $k + 1$. This recursion interferes with the backpropagation algorithm used to train the ANN.
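The recursive evaluation of Equation (1) described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the function name `rollout` and the numerical values of `A`, `B`, and the inputs are hypothetical and chosen only to make the step-by-step dependence visible.

```python
import numpy as np

def rollout(A, B, x0, inputs):
    """Iterate x_{k+1} = A x_k + B u_k and return the states x_1 ... x_N."""
    x = np.asarray(x0, dtype=float)
    states = []
    for u in inputs:
        # Each step needs the previous state, so the loop cannot be
        # evaluated for all time steps at once.
        x = A @ x + B @ u
        states.append(x)
    return np.stack(states)

# Hypothetical 2-state, 1-input system (illustrative values only).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
x0 = np.zeros(2)
inputs = np.ones((3, 1))  # constant unit input for three steps

trajectory = rollout(A, B, x0, inputs)
```

The loop makes the second problem concrete: the state at step $k + 2$ cannot be formed before the state at step $k + 1$ is available.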

To address these problems, we propose a novel approach that combines the state space representation of LTI systems with ANNs while solving the issues mentioned above.

The key idea is to normalize the state and input variables as well as the state space equations before feeding them into the ANN and to compute the state trajectory in parallel using the state space equations. As a first step, we normalize the state and input variables using the mean and standard deviation of the training data

$$\hat{x} = \frac{x - \mu_{x}}{\sigma_{x}} \quad \text{and} \quad \hat{u} = \frac{u - \mu_{u}}{\sigma_{u}} \tag{2}$$

with $\mu_{x}$ and $\sigma_{x}$ being the mean and standard deviation of the state variables, and $\mu_{u}$ and $\sigma_{u}$ those of the input variables. Within this paper, all variables with a hat $\hat{\square}$ are normalized variables.

Using the back-transformation from Equation (2) and inserting it into the state space Equation (1) yields the state space equation in terms of the normalized variables

$$x_{k + 1} = A (\hat{x}_{k} \sigma_{x} + \mu_{x}) + B (\hat{u}_{k} \sigma_{u} + \mu_{u}) \tag{3}$$

With the help of this equation, the physical states can be calculated from the normalized states and inputs. To obtain the normalized states, Equation (2) also has to be applied to the left-hand side of Equation (3), which after rearranging yields

$$\hat{x}_{k + 1} = \frac{A (\hat{x}_{k} \sigma_{x} + \mu_{x}) + B (\hat{u}_{k} \sigma_{u} + \mu_{u}) - \mu_{x}}{\sigma_{x}} \tag{4}$$

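Because the normalization is an element-wise operation, the associativity and distributivity laws apply, and Equation (4) can be rewritten as

$$\begin{aligned} \hat{x}_{k + 1} &= \frac{A \sigma_{x}}{\sigma_{x}} \hat{x}_{k} + \frac{A \mu_{x}}{\sigma_{x}} + \frac{B \sigma_{u}}{\sigma_{x}} \hat{u}_{k} + \frac{B \mu_{u}}{\sigma_{x}} - \frac{\mu_{x}}{\sigma_{x}} \\ &= \hat{A} \hat{x}_{k} + \hat{B} \hat{u}_{k} + \hat{H} \end{aligned} \tag{5}$$

with

$$\hat{A} = \frac{A \sigma_{x}}{\sigma_{x}} \tag{6}$$

$$\hat{B} = \frac{B \sigma_{u}}{\sigma_{x}} \tag{7}$$

$$\hat{H} = \frac{A \mu_{x}}{\sigma_{x}} + \frac{B \mu_{u}}{\sigma_{x}} - \frac{\mu_{x}}{\sigma_{x}} \tag{8}$$

where $\hat{H}$ represents a constant normalization offset. In these expressions, the multiplication by $\sigma_{x}$ or $\sigma_{u}$ scales the columns and the division by $\sigma_{x}$ scales the rows of the respective matrix, so the factors in Equation (6) do not cancel; in matrix notation, $\hat{A} = \operatorname{diag}(\sigma_{x})^{-1} A \operatorname{diag}(\sigma_{x})$ and $\hat{B} = \operatorname{diag}(\sigma_{x})^{-1} B \operatorname{diag}(\sigma_{u})$. Equation (5) is the final form of the normalized state space equation. It takes normalized states and inputs and calculates the normalized states at the next time step. It is therefore able to compute the state trajectory in the scale of the ANN, which solves the scale problem mentioned above.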
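As a numerical check of Equations (5) to (8), the following sketch builds the normalized matrices and confirms that one normalized step agrees with normalizing one physical step. It assumes that the fractions in Equations (6) to (8) are read as element-wise scaling (columns by $\sigma_{x}$ or $\sigma_{u}$, rows by $1/\sigma_{x}$); the function name `normalize_state_space` and all numerical values are hypothetical.

```python
import numpy as np

def normalize_state_space(A, B, mu_x, sigma_x, mu_u, sigma_u):
    """Return (A_hat, B_hat, H_hat) following Equations (6)-(8).

    Assumption: the element-wise products and divisions are interpreted
    as diagonal scalings, i.e. A_hat = diag(sigma_x)^-1 A diag(sigma_x).
    """
    A_hat = A * sigma_x[None, :] / sigma_x[:, None]
    B_hat = B * sigma_u[None, :] / sigma_x[:, None]
    H_hat = (A @ mu_x + B @ mu_u - mu_x) / sigma_x
    return A_hat, B_hat, H_hat

# Hypothetical system and normalization statistics (illustrative only).
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
B = rng.normal(size=(2, 1))
mu_x, sigma_x = np.array([0.1, -0.4]), np.array([0.5, 2.0])
mu_u, sigma_u = np.array([0.3]), np.array([4.0])

A_hat, B_hat, H_hat = normalize_state_space(A, B, mu_x, sigma_x, mu_u, sigma_u)

# One step in physical units, normalized afterwards ...
x, u = rng.normal(size=2), rng.normal(size=1)
x_next_normalized = (A @ x + B @ u - mu_x) / sigma_x

# ... and the same step taken directly on normalized quantities.
x_hat, u_hat = (x - mu_x) / sigma_x, (u - mu_u) / sigma_u
x_hat_next = A_hat @ x_hat + B_hat @ u_hat + H_hat
```

Both routes give the same normalized state, which is the property the normalized model needs so that the ANN and the physical model can exchange values on a common scale.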

For the second problem, the recursive nature of the LTI state space equation, we expand the state space equation to a fixed number of time steps and precalculate the recursion. Introducing a target sequence length $N$ and stacking the normalized states and inputs into

$$\hat{X} = \begin{bmatrix} \hat{x}_{k + 1} \\ \hat{x}_{k + 2} \\ \vdots \\ \hat{x}_{k + N} \end{bmatrix} \in \mathbb{R}^{N \cdot n} \quad \text{and} \quad \hat{U} = \begin{bmatrix} \hat{u}_{k} \\ \hat{u}_{k + 1} \\ \vdots \\ \hat{u}_{k + N - 1} \end{bmatrix} \in \mathbb{R}^{N \cdot m} \tag{9}$$

where $n$ and $m$ denote the number of states and inputs, requires Equation (5) to be expanded to

$$\begin{aligned} \hat{X} &= \begin{bmatrix} \hat{A} \\ \hat{A}^{2} \\ \vdots \\ \hat{A}^{N} \end{bmatrix} \hat{x}_{k} + \begin{bmatrix} \hat{B} \\ \hat{A} \hat{B} \\ \vdots \\ \hat{A}^{N - 1} \hat{B} \end{bmatrix} \hat{u}_{k} + \begin{bmatrix} 0 \\ \hat{B} \\ \vdots \\ \hat{A}^{N - 2} \hat{B} \end{bmatrix} \hat{u}_{k + 1} + \dots + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ \hat{B} \end{bmatrix} \hat{u}_{k + N - 1} + \begin{bmatrix} \hat{H} \\ \hat{A} \hat{H} + \hat{H} \\ \vdots \\ \hat{A}^{N - 1} \hat{H} + \hat{A}^{N - 2} \hat{H} + \dots + \hat{H} \end{bmatrix} \\ &= \begin{bmatrix} \hat{A} \\ \hat{A}^{2} \\ \vdots \\ \hat{A}^{N} \end{bmatrix} \hat{x}_{k} + \begin{bmatrix} \hat{B} & 0 & \dots & 0 \\ \hat{A} \hat{B} & \hat{B} & \dots & 0 \\ \hat{A}^{2} \hat{B} & \hat{A} \hat{B} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \hat{A}^{N - 1} \hat{B} & \hat{A}^{N - 2} \hat{B} & \dots & \hat{B} \end{bmatrix} \hat{U} + \begin{bmatrix} \hat{H} \\ \hat{A} \hat{H} + \hat{H} \\ \vdots \\ \hat{A}^{N - 1} \hat{H} + \hat{A}^{N - 2} \hat{H} + \dots + \hat{H} \end{bmatrix} \end{aligned} \tag{10}$$

or in short

$$\hat{X} = \underline{\hat{A}} \hat{x}_{k} + \underline{\hat{B}} \hat{U} + \underline{\hat{H}} \tag{11}$$

Equation (11) represents the final form of the normalized state space equation for a fixed number of time steps. This enables the parallel computation of the state trajectory in the scale of the ANN and solves the recursion problem mentioned above.
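The stacked matrices of Equation (11) can be precomputed once and then applied to a whole window with a single matrix product. The sketch below is one possible way to assemble them and checks the result against the step-by-step recursion of Equation (5); the function name `lift` and the random test system are assumptions made for illustration, not the authors' code.

```python
import numpy as np

def lift(A_hat, B_hat, H_hat, N):
    """Build the stacked matrices of Equation (11) for N time steps."""
    n, m = B_hat.shape
    powers = [np.eye(n)]
    for _ in range(N):
        powers.append(A_hat @ powers[-1])        # powers[i] = A_hat^i

    A_bar = np.vstack(powers[1:])                # rows A_hat, A_hat^2, ..., A_hat^N
    B_bar = np.zeros((N * n, N * m))             # lower block-triangular
    for i in range(N):                           # block row i    -> x_{k+i+1}
        for j in range(i + 1):                   # block column j -> u_{k+j}
            B_bar[i * n:(i + 1) * n, j * m:(j + 1) * m] = powers[i - j] @ B_hat
    # Offset of block row i: (A_hat^i + ... + A_hat + I) H_hat
    H_bar = np.concatenate([sum(powers[:i + 1]) @ H_hat for i in range(N)])
    return A_bar, B_bar, H_bar

# Hypothetical normalized system (illustrative values only).
rng = np.random.default_rng(1)
n, m, N = 2, 1, 5
A_hat = 0.5 * rng.normal(size=(n, n))
B_hat = rng.normal(size=(n, m))
H_hat = rng.normal(size=n)
x_k = rng.normal(size=n)
U = rng.normal(size=(N, m))

A_bar, B_bar, H_bar = lift(A_hat, B_hat, H_hat, N)
X_parallel = (A_bar @ x_k + B_bar @ U.reshape(-1) + H_bar).reshape(N, n)

# Reference: the recursion of Equation (5), one step at a time.
x, reference = x_k, []
for u in U:
    x = A_hat @ x + B_hat @ u + H_hat
    reference.append(x)
X_recursive = np.stack(reference)
```

Since the stacked matrices depend only on the model and the window length, they can be built once before training, so that each forward pass is a single affine map of the initial state and the input window.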

5. Experiments

This paper uses vehicle dynamics as a test case to demonstrate the effectiveness of the proposed approach. With modern vehicles featuring an increasing number of active systems that influence their behavior, deriving an exact model becomes increasingly difficult. The modeling technique itself, however, can easily be adapted to other systems and robots. As it is only a test case, the formulation of the vehicle dynamics is described only briefly in this section. We chose the single track model of a passenger car from [51], as it is a simple, well-established, and widely used model in the field of vehicle dynamics that is validated for small side slip angles or lateral accelerations of up to around $a_{y} = 4\ \mathrm{m/s^{2}}$. For further details, the reader is referred to the literature, e.g., [52,53] or [54].

As our test vehicle has rear wheel steering, we extend the standard bicycle model from [54] to include rear wheel steering. Where the rear wheel steering is modeled physically, we assume a linear relationship between the steering wheel angle and the rear wheel angle. The bicycle model is a two-degree-of-freedom model that describes the vehicle dynamics in the lateral and yaw directions, with the yaw rate $\dot{\psi}$ and the side slip angle $\beta$ as states, $x = [\dot{\psi}, \beta]^{\top}$. The side slip angle is the angle between the direction of movement of the vehicle and the longitudinal axis of the body frame coordinate system, and the yaw rate is the rate of turning around the vertical axis. The input to the model is the steering angle of both axles, $u = [\delta_{f}, \delta_{r}]^{\top}$, if the vehicle has front wheel steering (FWS) and rear wheel steering (RWS). The state space matrices of the continuous-time vehicle dynamics model are given by

$$A = \begin{bmatrix} - \frac{c_{ff0,f} l_{f}^{2} + c_{ff0,r} l_{r}^{2}}{v I_{zz}} & - \frac{c_{ff0,f} l_{f} - c_{ff0,r} l_{r}}{I_{zz}} \\ - 1 - \frac{c_{ff0,f} l_{f} - c_{ff0,r} l_{r}}{m v^{2}} & - \frac{c_{ff0,r} + c_{ff0,f}}{v m} \end{bmatrix} \tag{12}$$

$$B = \begin{bmatrix} \frac{c_{ff0,f} l_{f}}{I_{zz}} & - \frac{c_{ff0,r} l_{r}}{I_{zz}} \\ \frac{c_{ff0,f}}{v m} & \frac{c_{ff0,r}}{v m} \end{bmatrix} \tag{13}$$

with $c_{ff0,f}$ and $c_{ff0,r}$ being the cornering stiffnesses of the front and rear axles, $l_{f}$ and $l_{r}$ the distances from the center of gravity to the front and rear axles, $v$ the vehicle velocity, $I_{zz}$ the yaw moment of inertia, and $m$ the vehicle mass. For the implementation, the resulting state space representation is discretized using a zero-order hold approach. As we compare different variations of the model, we also use a variant without rear wheel steering. In this case, the second column of the input matrix $B$ is removed, resulting in a single-input system that uses only the front wheel angle $u = \delta_{f}$ as input.
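The construction of the single track model and its zero-order hold discretization can be sketched as follows. All parameter values are hypothetical placeholders, not the test vehicle's data, and `c_f` and `c_r` are illustrative names for the front and rear cornering stiffnesses. The yaw rate coupling term in the side slip equation is written with $m v^{2}$ in the denominator, which is the dimensionally consistent form. The matrix exponential is approximated by a truncated Taylor series to keep the example dependency-free; a library routine would normally be preferred.

```python
import numpy as np

def single_track_matrices(c_f, c_r, l_f, l_r, v, I_zz, m):
    """Continuous-time A, B for states [yaw rate, side slip angle]
    and inputs [delta_f, delta_r]."""
    A = np.array([
        [-(c_f * l_f**2 + c_r * l_r**2) / (v * I_zz), -(c_f * l_f - c_r * l_r) / I_zz],
        [-1.0 - (c_f * l_f - c_r * l_r) / (m * v**2), -(c_f + c_r) / (m * v)],
    ])
    B = np.array([
        [c_f * l_f / I_zz, -c_r * l_r / I_zz],
        [c_f / (m * v), c_r / (m * v)],
    ])
    return A, B

def zero_order_hold(A, B, dt, terms=30):
    """Discretize via exp([[A, B], [0, 0]] * dt) = [[A_d, B_d], [0, I]]."""
    n, m = B.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n] = A * dt
    M[:n, n:] = B * dt
    E = np.eye(n + m)
    term = np.eye(n + m)
    for i in range(1, terms):        # truncated Taylor series of exp(M)
        term = term @ M / i
        E = E + term
    return E[:n, :n], E[:n, n:]

# Hypothetical passenger-car parameters (SI units, illustrative only).
A, B = single_track_matrices(c_f=120e3, c_r=160e3, l_f=1.3, l_r=1.5,
                             v=33.3, I_zz=3000.0, m=1800.0)
dt = 1.0 / 32.0
A_d, B_d = zero_order_hold(A, B, dt)

# Variant without rear wheel steering: keep only the delta_f column of B.
A_fws_d, B_fws_d = zero_order_hold(A, B[:, :1], dt)
```

A useful sanity check on the discretization is that the discrete and continuous models share the same steady-state response to a constant steering input.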

Vehicle dynamics data were captured on the Handling Roadway (HRW) test bench [55,56]. The HRW, depicted in Figure 2, allows for the simultaneous capture of longitudinal, lateral, and vertical dynamics of the vehicle. For this study, the vehicle is set to a constant velocity, while the steering wheel robot excites the lateral degree of freedom. In total, three dedicated experiments were conducted for training and validation of the network, resulting in three datasets: two for training and one for validation. Both training datasets consisted of stochastic excitation with an excitation spectrum typical for a human driver. The maximum steering wheel angle is $\delta_{sw,\max} = 35^{\circ}$, which results in a static lateral acceleration of around $a_{y} = 7\ \mathrm{m/s^{2}}$ at the vehicle velocity of $v_{x} = 33.3\ \mathrm{m/s}$. The power spectral density of the steering wheel excitation realizes $\delta_{sw,\max}$ up to the break frequency of 1 Hz and subsequently exhibits a $1/f^{2}$ roll-off up to the cut-off frequency of 5 Hz. The validation set also used stochastic excitation with a different random seed and a frequency of 0.8 Hz, but additionally included specific deterministic maneuvers such as a lane change.

Three purely data-driven approaches were used as baseline benchmarks: an LSTM, a Transformer, and a Mamba network, all of which were trained end-to-end. Additionally, the same data were used to parametrize a first-principles physics-based LTI model.

In addition, three different HyPA-Net setups were trained and tested. In all cases, the steering wheel angle $\delta_{sw}$ is used as input to the network. For the first variant (V1), the FWS model calculates the corresponding front wheel steering angle $\delta_{f}$ as a function of $\delta_{sw}$ using an LTI system. Since the FWS in the test vehicle is a mechanical transmission with a fixed gear ratio, a linear relationship with a constant steering ratio is a very close approximation and can be verified experimentally. The RWS model, on the other hand, calculates the rear wheel steering angle $\delta_{r}$ as a function of $\delta_{sw}$ using an ANN. The rear wheel steering in the test vehicle is an active system controlled by an electronic control unit with no mechanical connection to the steering wheel. It mainly uses the steering wheel angle, but also changes its behavior based on other influences such as vehicle velocity or yaw rate. Furthermore, the exact control algorithm is unknown, so a simple linear relationship is not sufficient to model the rear wheel steering. Instead, an LSTM, a Transformer, and a Mamba network are used to model the unknown dynamics of the rear wheel steering, so V1 is tested with three different ANN architectures. However, the ANN for the rear wheel steering is not trained explicitly; it is trained implicitly by training the whole model. A flow chart of this variant is shown in Figure 3a.
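The data flow of variant V1 can be illustrated with a structural sketch. It is deliberately simplified: it works in physical units with a plain recursion instead of the normalized, stacked form, and the learned RWS network is replaced by an arbitrary callable. The names `hypa_v1_forward`, `steering_ratio`, and `rws_model`, as well as all numbers, are hypothetical.

```python
import numpy as np

def hypa_v1_forward(delta_sw, x_k, steering_ratio, rws_model, A_d, B_d):
    """Structural sketch of V1 over one window of steering wheel angles.

    delta_sw  : (N,) steering wheel angles
    rws_model : callable mapping (N,) -> (N,), stand-in for the learned RWS network
    Returns the predicted states (N, 2) and the intermediate rear wheel angle (N,).
    """
    delta_f = delta_sw / steering_ratio        # known FWS: constant gear ratio
    delta_r = rws_model(delta_sw)              # unknown RWS: learned sub-model
    U = np.stack([delta_f, delta_r], axis=1)   # inputs of the single track model

    x, states = x_k, []
    for u in U:                                # discrete single track model
        x = A_d @ x + B_d @ u
        states.append(x)
    return np.stack(states), delta_r

# Hypothetical discrete model and a placeholder RWS model (illustrative only).
A_d = np.array([[0.9, 0.1],
                [0.0, 0.8]])
B_d = np.array([[0.5, -0.2],
                [0.1, 0.1]])
no_rws = lambda sw: 0.0 * sw                   # placeholder: rear wheels not steered

states, delta_r = hypa_v1_forward(np.array([16.0, 16.0]), np.zeros(2),
                                  steering_ratio=16.0, rws_model=no_rws,
                                  A_d=A_d, B_d=B_d)
```

In such a structure, a training loss would be evaluated only on the returned states, while `delta_r` stays an internal signal. This is what makes the RWS training implicit, and it is also why the internal signal can later be compared against a measured rear wheel angle.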

As the training data contain lateral accelerations of up to $a_{y} = 7\ \mathrm{m/s^{2}}$, the small side slip angle assumption is violated, meaning the single track model (STM) is not necessarily valid anymore. Deviations between the ground truth and the prediction are therefore to be expected. It is furthermore expected that the ANN modeling the rear wheel steering will absorb these deviations, so that it represents not only the behavior of the true RWS but also the deviations of the STM. To address this, a correction model can be introduced. This correction model is placed after the single track model to ensure a better fit to the training data. In this case, only LSTM networks are used. A flow chart of this version of the HyPA-Net is shown in Figure 3b. For this variant (V2), training is done in two stages: first the RWS LSTM network is trained, then the correction model.

The third variant (V3) uses a physical RWS model as a simplification and only corrects for the unknown dynamics (see Figure 3c). Because the test setup uses a fixed velocity, a ratio between the steering wheel angle $\delta_{sw}$ and the rear wheel steering angle $\delta_{r}$ can be calculated and incorporated in the LTI system. However, the data clearly show a temporal relationship between the steering wheel angle and the rear wheel steering angle that this ratio neglects. Therefore, an LSTM network is used after the STM to correct for the unknown parts of the vehicle dynamics.

The training process of the HyPA-Net is the same as for a standard ANN. It is implemented in PyTorch, and training is done on a GeForce RTX 3090 GPU. The training and validation data, recorded at a sampling rate of 1024 Hz, were downsampled to 32 Hz to ensure larger differences between two consecutive time steps. The NAdam optimizer [57] was used with a learning rate of $5 \times 10^{-4}$, and the MSE is used as the loss function. The output of the network is the yaw rate $\dot{\psi}$ and the side slip angle $\beta$ based on the input steering wheel angle $\delta_{sw}$. The sequence length was set to 100, meaning that the network processes 100 time steps at once, with a batch size of 32. In all setups, training ran for a maximum of 50 epochs with early stopping based on the validation loss.
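The data preparation implied by these settings can be sketched as follows. The text does not state how the downsampling was performed or whether windows overlap, so plain decimation without an anti-aliasing filter and non-overlapping windows are assumptions here; the function names and the placeholder signal are hypothetical.

```python
import numpy as np

def downsample(signal, factor):
    """Keep every `factor`-th sample (assumption: no anti-aliasing filter)."""
    return signal[::factor]

def make_windows(signal, seq_len):
    """Cut a (T, d) signal into non-overlapping windows of shape (seq_len, d)."""
    n_windows = len(signal) // seq_len
    return signal[: n_windows * seq_len].reshape(n_windows, seq_len, signal.shape[1])

raw = np.zeros((1024 * 20, 1))               # 20 s of placeholder data at 1024 Hz
low_rate = downsample(raw, 1024 // 32)       # 1024 Hz -> 32 Hz
windows = make_windows(low_rate, seq_len=100)
```

At 32 Hz, a window of 100 samples spans roughly 3.1 s of driving, which is the horizon over which the stacked state space equation is evaluated in one pass.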

6. Results

To evaluate the performance of the different setups, several metrics are used. Table 1 shows the results of the different setups for the validation data sets. The setups are compared using the following metrics: mean squared error (MSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the $R^{2}$ score. As expected, the physical single track model shows the worst performance across all metrics, as it is not capable of capturing the full vehicle dynamics. On the other hand, the end-to-end LSTM model shows the best performance, while Transformer and Mamba also perform well. The hybrid models (V1, V2, V3) fall in between and achieve very similar and competitive performance.

V1 is the model with the implicit RWS training. The results show that the implicit RWS model is able to capture the dynamics of the rear wheel steering without explicit training and thereby improves the overall performance. Notably, combining V1 with the Transformer or Mamba network yields performance similar to the V1 LSTM combination: MAE and MSE are better for the Transformer and Mamba combinations, while MAPE and $R^{2}$ are better for the LSTM combination, although the Transformer and Mamba perform worse than the LSTM as end-to-end networks. Because of the similar performance and the lower training effort of the LSTM architecture in this scenario, Transformer and Mamba are disregarded in the following.

V2 is the model that uses LSTM networks for both the RWS and the correction model. This model performs best among the hybrid models and is even on par with the end-to-end LSTM model for the $R^{2}$ score. An ANN at the output of the model improves the performance as expected, because it can fit the training data closely. Similar behavior is observable with other ANN architectures in place of the LSTM. V3 is the model with the explicit RWS and a correction LSTM model. This variant also shows improved performance compared to the first-principles physics model, because the LSTM at the end is able to capture all neglected dynamics and also compensate for an ill-defined rear wheel steering angle. While this variant certainly performs better, it offers no benefit with regard to characterizing the potentially unknown RWS subsystem.

In general, the results show that the hybrid approach improves on the purely physics-based model because of the added residual neural network, while still benefiting from the interpretability of the physical model. Furthermore, the hybrid approach is competitive with end-to-end ANNs with regard to prediction accuracy.
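For reference, the four metrics can be computed as in the following sketch, which uses the standard textbook definitions. How the authors aggregate over the two outputs, or treat near-zero targets in the MAPE, is not stated, so the function `regression_metrics` and the toy numbers are purely illustrative.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, MAPE (in percent) and the R^2 score."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    mae = np.mean(np.abs(err))
    # MAPE divides by the target and is undefined where y_true is zero.
    mape = 100.0 * np.mean(np.abs(err / y_true))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MSE": mse, "MAE": mae, "MAPE": mape, "R2": r2}

metrics = regression_metrics([1.0, 2.0, 4.0], [1.1, 1.9, 4.2])
```

Because the MAPE weights errors by the inverse of the target magnitude, it can rank models differently from MSE and MAE on signals that pass close to zero, which is one possible reason why the metrics do not agree on a single best V1 configuration.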

Although no information about the measured rear wheel steering angle $\delta_{r}$ is used for training the networks, it is used as an additional variable when evaluating the setups, in order to assess the implicit training of submodels within the HyPA-Net framework. Figure 4 shows the comparison of the ground truth and the different predictions for the rear wheel steering angle. The plot shows that the implicit RWS model (V1, orange line) is able to predict the rear wheel steering angle quite well. The same is true for V2 (red line). This confirms that the HyPA-Net framework is able to implicitly learn the dynamics of a submodel, in this case the rear wheel steering, without explicit training of the submodel and within the context of a larger model. However, it is also visible that both learned RWS models incorporate dynamic behavior that is not related to the RWS but still improves the overall performance in the context of the larger model. This is because the overall vehicle dynamics are not fully captured by the single track model, owing to the neglected tire dynamics and the violation of the small side slip angle assumption. The RWS model captures some of these neglected dynamics, which is why it is able to improve the overall performance of the model. Incorporating a correction model or even a dedicated submodel for the tire partially addresses this issue, but isolating the tire dynamics is difficult, as the required measurements of forces and torques at the tire are not available in the current setup.

In the domain of vehicle dynamics, transfer functions are often used to characterize a vehicle’s dynamic behavior. The yaw transfer function describes the relationship between the steering input $\delta_{sw}$ and the yaw rate $\dot{\psi}$ of the vehicle. Figure 5 shows the yaw transfer function for the different setups. The transfer function of the single track model shows a more damped behavior than the ground truth; the model is not able to capture the full dynamics of the system. The learned RWS (V1 LSTM) improves on this, resulting in smaller deviations from the ground truth. The other hybrid models with the correction (V2, V3), as well as the baseline ANNs, show a transfer function that is even closer to the ground truth, supporting the above results. This indicates that the correction part at the end is able to adjust the model’s predictions to better match the true system dynamics.
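For an LTI model such as the single track model, a transfer function can be evaluated directly from the continuous-time state space matrices as $G(j\omega) = C (j\omega I - A)^{-1} B$, where $C$ selects the output of interest (the yaw rate for the yaw transfer function). How the curves in Figure 5 were obtained is not stated, so the sketch below only illustrates this standard relation; the function name and the first-order demo system are hypothetical.

```python
import numpy as np

def frequency_response(A, B, C, freqs_hz):
    """Evaluate G(j*omega) = C (j*omega*I - A)^-1 B for each frequency in Hz."""
    n = A.shape[0]
    response = []
    for f in freqs_hz:
        s = 2j * np.pi * f
        response.append(C @ np.linalg.solve(s * np.eye(n) - A, B))
    return np.array(response)

# Hypothetical first-order lag with gain 2 and corner frequency 1 Hz,
# i.e. G(s) = 2 / (1 + s / (2*pi)).
A = np.array([[-2.0 * np.pi]])
B = np.array([[4.0 * np.pi]])
C = np.array([[1.0]])

G = frequency_response(A, B, C, freqs_hz=[0.0, 1.0])
gain = np.abs(G[:, 0, 0])                    # magnitude at 0 Hz and at 1 Hz
phase_deg = np.degrees(np.angle(G[:, 0, 0]))
```

For the learned models, which have no closed-form transfer function, an estimate from input and output time series would be needed instead; that step is not covered by this sketch.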

7. Conclusions

This study presents a novel way to combine first-principles models with ANNs such that the ANN operates in the normalized data domain, the physical meaning of the values is preserved, and subsystem modeling is possible through implicit training of the ANN. The results show that the hybrid approach improves on the purely physics-based model through a data-driven component, while still benefiting from the interpretability of the physical model. While ANN models need to be retrained if the underlying system dynamics change, traditional methods can often be adapted to new conditions with minimal adjustments. However, those methods are often found to be less accurate than ANNs, especially in complex systems with nonlinear dynamics. The hybrid approach in this paper combines the strengths of both ANNs and classical methods, allowing for a more flexible and accurate modeling of complex systems. However, the results shown in this study only reflect the capabilities of the current implementation with one specific architecture and one specific dataset. This limits the generalization of the findings, so further research is needed to validate and extend them. In ongoing work, we aim to generalize the HyPA-Net framework to incorporate nonlinear and time-variant physical models, allowing for a more flexible and accurate modeling of complex systems with nonlinear dynamics. Furthermore, we plan to apply the introduced method to different systems.

Author Contributions

Conceptualization, Laurin Ludmann; methodology, Laurin Ludmann; software, Laurin Ludmann; validation, Laurin Ludmann and Jaeyoun Choi; formal analysis, Laurin Ludmann; resources, Andreas Wagner and Chuchu Fan; writing—original draft preparation, Laurin Ludmann; writing—review and editing, Jaeyoun Choi, Jens Neubeck and Andreas Wagner; visualization, Laurin Ludmann; supervision, Jens Neubeck, Andreas Wagner and Chuchu Fan; project administration, Jens Neubeck; funding acquisition, Laurin Ludmann, Andreas Wagner and Chuchu Fan. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the MIT International Science and Technology Initiatives Seed Fund.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

ANN	Artificial Neural Network
FWS	Front Wheel Steering
HRW	Handling Roadway
HyPA-Net	Hybrid Physics-Augmented Neural Network
LLM	Large Language Model
LSTM	Long Short-Term Memory
LTI	Linear Time Invariant
MAE	Mean Absolute Error
MAPE	Mean Absolute Percentage Error
MSE	Mean Squared Error
ODE	Ordinary Differential Equation
RNN	Recurrent Neural Network
RWS	Rear Wheel Steering
S4	Structured State Space Sequence

Figure 1. One general example variant of the HyPA-Net architecture. Two known components of the system are modeled using LTI systems (sharp edges with "phys" label), while the unknown part is modeled using an ANN (rounded edges).

Figure 2. Schematic of the Handling Roadway test bench. The vehicle is fastened at its center of gravity. The roll, pitch, and vertical degrees of freedom are free, while the longitudinal, lateral, and yaw degrees of freedom are fixed and addressed virtually using load cells. Each corner module can individually apply a vertical excitation and set a steer angle and a road speed. Body force actuators can superimpose virtual forces such as aerodynamic loads.

Figure 3. Different variants for the HyPA-Net with LTI systems (sharp edges) and ANNs (rounded edges): a) V1: FWS and STM as physical models, RWS as ANN; b) V2: additional correction ANN; c) V3: FWS, RWS and STM as physical models, only a correction model as ANN.

Figure 4. The rear wheel steering angle over time. The ground truth is shown as a black dashed line, while the different predictions are solid lines.

Figure 5. The yaw transfer function for the different setups. The ground truth is shown as a dashed black line, while the different predictions are solid lines.

Table 1. Result metrics of the different setups.

Setup	MAE	MSE	MAPE	$R^{2}$
physical	$2.45 \times 10^{-3}$	$2.1 \times 10^{-5}$	$3.44$	$0.962$
LSTM	$1.09 \times 10^{-3}$	$0.3 \times 10^{-5}$	$2.90$	$0.993$
Transformer	$1.43 \times 10^{-3}$	$0.8 \times 10^{-5}$	$2.65$	$0.981$
Mamba	$1.48 \times 10^{-3}$	$1.4 \times 10^{-5}$	$3.03$	$0.973$
V1 LSTM	$2.17 \times 10^{-3}$	$1.7 \times 10^{-5}$	$2.67$	$0.984$
V1 Transformer	$1.95 \times 10^{-3}$	$1.5 \times 10^{-5}$	$2.71$	$0.974$
V1 Mamba	$1.95 \times 10^{-3}$	$1.3 \times 10^{-5}$	$3.35$	$0.980$
V2	$1.12 \times 10^{-3}$	$0.4 \times 10^{-5}$	$2.56$	$0.993$
V3	$1.34 \times 10^{-3}$	$0.5 \times 10^{-5}$	$3.07$	$0.991$
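
For reference, the four metrics reported in Table 1 can be computed from a measured and a predicted signal as in the following minimal sketch. It uses the standard textbook definitions; the function name `regression_metrics` is hypothetical, and it is an assumption of this example that MAPE is expressed in percent. The exact definitions and any preprocessing used for Table 1 are not restated here.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, MAPE (in percent, assumed) and R^2 for two signals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    mae = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    # MAPE is undefined where y_true == 0; such samples must be excluded
    # or handled separately before calling this function.
    mape = 100.0 * np.mean(np.abs(err / y_true))
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "MAPE": mape, "R2": r2}


if __name__ == "__main__":
    metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

MAE and MSE carry the units of the signal (or its square), whereas MAPE and $R^{2}$ are dimensionless. This is why the ranking of two setups can differ between columns of the table.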

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
