Submitted: 31 July 2024
Posted: 01 August 2024
Abstract
Keywords:
1. Introduction
2. Neural Ordinary Differential Equations (NODE): Basic Properties and Uses
- (i) The quantity is a time-like independent variable which parameterizes the dynamics of the hidden/latent neuron units; the initial value is denoted as (which can be considered to be an initial measurement time) while the stopping value is denoted as (which can be considered to be the next measurement time).
- (ii) The -dimensional vector-valued function represents the hidden/latent neural networks. In this work, all vectors are considered to be column vectors and the dagger “†” symbol will be used to denote “transposition.” The symbol “” signifies “is defined as” or, equivalently, “is by definition equal to.”
- (iii) The -dimensional vector-valued nonlinear function models the dynamics of the latent neurons, with learnable scalar adjustable weights represented by the components of the vector , where denotes the total number of adjustable weights in all of the latent neural nets.
- (iv) The -dimensional vector-valued function represents the “encoder,” which is characterized by “inputs” and “learnable” scalar adjustable weights , where denotes the total number of “inputs” and denotes the total number of “learnable encoder weights” that define the “encoder.”
- (v) The -dimensional vector-valued function represents the vector of “system responses.” The vector-valued function represents the “decoder,” with learnable scalar adjustable weights represented by the components of the vector , where denotes the total number of adjustable weights that characterize the “decoder.” Each component can be represented in integral form as follows:
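Because the symbols in items (i)–(v) were lost in extraction, a minimal runnable sketch may help fix the structure being described: an encoder maps the inputs to the initial latent state, the latent dynamics are integrated from the initial to the final measurement time, and a decoder maps the final latent state to the responses. All dimensions, weight shapes, and the tanh nonlinearity below are illustrative assumptions, not this paper's specification.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)

# Hypothetical dimensions (assumptions): 3 inputs, 4 latent neurons, 2 responses.
n_x, n_h, n_r = 3, 4, 2
W_enc = rng.normal(size=(n_h, n_x))   # encoder weights
W_lat = rng.normal(size=(n_h, n_h))   # latent-dynamics weights
W_dec = rng.normal(size=(n_r, n_h))   # decoder weights

def encoder(x):
    # Initial latent state: the encoder maps the inputs to h at the initial time.
    return np.tanh(W_enc @ x)

def latent_rhs(t, h):
    # Nonlinear latent dynamics dh/dt = f(h; theta), integrated over [t0, tf].
    return np.tanh(W_lat @ h)

def decoder(h):
    # System responses: the decoder maps the final latent state to the outputs.
    return W_dec @ h

x = rng.normal(size=n_x)
t0, tf = 0.0, 1.0  # initial and next "measurement times" (illustrative values)
sol = solve_ivp(latent_rhs, (t0, tf), encoder(x), rtol=1e-8, atol=1e-10)
r = decoder(sol.y[:, -1])  # one value per response component
```

The forward pass is thus an initial-value problem sandwiched between two algebraic maps, which is why the sensitivities of the responses with respect to all weights can be obtained from a single adjoint ODE solve.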
3. Illustrative Paradigm Application: NODE Conceptual Modeling of the Nordheim-Fuchs Phenomenological Reactor Dynamics/Safety Model
- 1. The time-dependent neutron balance (point kinetics) equation for the neutron flux :
- 2. The energy production equation:
- 3. The energy conservation equation:
- 4. The reactivity-temperature feedback equation: , where denotes the changed multiplication factor following the reactivity insertion at , denotes the magnitude of the negative temperature coefficient, denotes the reactor’s temperature, and denotes the reactor’s initial temperature at time . For illustrating the application of the 1st-FASAM methodology, it suffices to consider the special case of a “prompt-critical transient,” when the reactor becomes prompt critical after the reactivity insertion, i.e., when , so that the reactivity-temperature feedback equation takes on the following particular form:
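The four balance equations themselves were lost in extraction. In the standard textbook form of the Nordheim-Fuchs model (cf. Lamarsh; Hetrick), and with symbols assumed here for illustration ($\varphi$: flux, $E$: released energy per cm³, $T$: temperature, $k$: multiplication factor, $\ell$: prompt-neutron lifetime, $\beta$: delayed-neutron fraction, $w_f\Sigma_f$: energy release per unit flux, $c_p$: volumetric heat capacity, $\alpha$: magnitude of the negative temperature coefficient), they read:

```latex
\begin{aligned}
\frac{d\varphi(t)}{dt} &= \frac{k(t)-1-\beta}{\ell}\,\varphi(t), & \varphi(t_0)&=\varphi_0, &&\text{(neutron balance)}\\
\frac{dE(t)}{dt} &= w_f\,\Sigma_f\,\varphi(t), & E(t_0)&=0, &&\text{(energy production)}\\
E(t) &= c_p\,\bigl[T(t)-T_0\bigr], & & &&\text{(energy conservation)}\\
k(t) &= k_0-\alpha\,\bigl[T(t)-T_0\bigr]. & & &&\text{(temperature feedback)}
\end{aligned}
```

In the prompt-critical special case the first and last relations combine, with $\Delta k_0 \triangleq k_0-1-\beta>0$, into $k(t)-1-\beta=\Delta k_0-\alpha\,[T(t)-T_0]$; this is a reconstruction of the standard model form, not necessarily the paper's own Eqs. (11)–(14) notation.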
- (i) Eliminating the function from Eqs. (13) and (14) yields a nonlinear differential equation which can be integrated directly to obtain the following relation:
- (ii) Using Eq. (16) in Eq. (14) yields the following nonlinear equation for the released energy :
- (iii) Replacing Eq. (18) into Eq. (16) yields the following closed-form expression for :
- (iv) Replacing Eq. (18) into Eq. (11) yields the following closed-form expression for :
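For orientation, the closed-form expressions obtained in steps (i)–(iv) are, in the standard Fuchs-model treatment (neglecting the initial flux relative to the excursion peak), logistic in the released energy and bell-shaped in the flux. Assuming the notation of the sketch above (not necessarily the paper's Eqs. (16)–(20)), with $\Delta k_0 \triangleq k_0-1-\beta$ and $t_m$ denoting the time of the power peak:

```latex
E(t)=\frac{c_p\,\Delta k_0}{\alpha}
     \left[1+\tanh\!\left(\frac{\Delta k_0\,(t-t_m)}{2\ell}\right)\right],
\qquad
\varphi(t)=\frac{1}{w_f\,\Sigma_f}\,\frac{dE(t)}{dt}
          =\frac{c_p\,\Delta k_0^{2}}{2\,\alpha\,\ell\,w_f\,\Sigma_f}\,
           \operatorname{sech}^{2}\!\left(\frac{\Delta k_0\,(t-t_m)}{2\ell}\right).
```

These imply the familiar excursion bounds $E_{\max}=2c_p\Delta k_0/\alpha$ and $T_{\max}-T_0=2\Delta k_0/\alpha$.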
- (i) The neutron flux in the reactor at a “final time” instance denoted as , after the initiation at of the prompt-critical power transient, which can be defined mathematically as follows:
- (ii) The total energy per cm³, , released at a user-chosen “final time” instance denoted as , after the initiation at of the prompt-critical power transient, which can be defined mathematically as follows:
- (iii) The reactor’s temperature at a “final time” instance denoted as , after the initiation at of the prompt-critical power transient, which can be defined mathematically as follows:
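The defining expressions for these three responses were lost in extraction. In adjoint-sensitivity frameworks of this kind, point-valued responses are commonly written as integral functionals by means of a Dirac delta; assuming that convention (and the flux/energy/temperature symbols used above for illustration), the three responses would read:

```latex
\varphi(t_f)=\int_{t_0}^{t_f}\varphi(t)\,\delta(t-t_f)\,dt,\qquad
E(t_f)=\int_{t_0}^{t_f}E(t)\,\delta(t-t_f)\,dt,\qquad
T(t_f)=\int_{t_0}^{t_f}T(t)\,\delta(t-t_f)\,dt .
```

Writing the responses as integrals over the whole interval is what allows their G-differentials to be expressed as inner products with the forward variational functions.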
4. First-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations (1st-CASAM-NODE): Mathematical Framework
5. Illustrative Application of the 1st-CASAM-NODE Methodology to Compute First-Order Sensitivities of Nordheim-Fuchs Model Responses with respect to the Underlying Parameters
5.1. First-Order Sensitivities of the Flux Response
- 1. Consider that the 1st-level variational function , is an element in a Hilbert space denoted as , , comprising elements of the form , , and being endowed with the inner product introduced in Eq. (50), which takes on the following particular form for the Nordheim-Fuchs model:
- 2. Use Eq. (79) to form the inner product of Eqs. (69)–(71) with a yet-undefined function , to obtain the following relation, which is the particular form taken on by Eq. (51) for the Nordheim-Fuchs model:
- 3. Integrating by parts the terms on the left side of Eq. (80) yields the following relation:
- 4. The definition of the function is now completed by requiring that: (i) the integral term on the right side of Eq. (81) represent the G-differential defined in Eq. (62); and (ii) the unknown values of the components of be eliminated from Eq. (81). These requirements will be satisfied if the function is the solution of the following “1st-Level Adjoint Sensitivity System (1st-LASS)”:
- 5. Using Eqs. (84), (85), (80), (62), (72), (73), and (74) in Eq. (81) yields the following expression for the first G-differential of the response under consideration:
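The five-step construction above can be mimicked on a small surrogate problem. The sketch below is an illustration only, not the Nordheim-Fuchs 1st-LASS itself: it uses an assumed scalar nonlinear ODE dh/dt = f(h; w) = −w·h² with response r = h(t_f). The adjoint function satisfies dψ/dt = −(∂f/∂h)ψ, integrated backward from ψ(t_f) = 1, which eliminates the unknown terminal variation exactly as in step 4, and the first-order sensitivity is recovered from the inner product ∫ψ(∂f/∂w)dt as in step 5.

```python
import numpy as np
from scipy.integrate import solve_ivp, trapezoid

# Surrogate scalar problem (an assumption, not the Nordheim-Fuchs system):
# dh/dt = f(h; w) = -w*h**2, response r = h(t_f).
w, h0, t0, tf = 0.5, 2.0, 0.0, 1.0

# Steps 1-2: forward solve, with dense output so the adjoint solve can query h(t).
fwd = solve_ivp(lambda t, h: [-w * h[0]**2], (t0, tf), [h0],
                dense_output=True, rtol=1e-10, atol=1e-12)

# Steps 3-4 (1st-LASS): dpsi/dt = -(df/dh)*psi = 2*w*h(t)*psi,
# integrated backward from the terminal condition psi(tf) = 1,
# which eliminates the unknown variation of h at tf.
adj = solve_ivp(lambda t, p: [2.0 * w * fwd.sol(t)[0] * p[0]],
                (tf, t0), [1.0], dense_output=True, rtol=1e-10, atol=1e-12)

# Step 5: first-order sensitivity dr/dw = integral of psi(t)*(df/dw) over [t0, tf],
# with df/dw = -h(t)**2; evaluated by fine trapezoidal quadrature.
ts = np.linspace(t0, tf, 2001)
integrand = adj.sol(ts)[0] * (-fwd.sol(ts)[0]**2)
sens_adjoint = trapezoid(integrand, ts)

# Analytic check: h(t) = h0/(1 + w*h0*t)  =>  dr/dw = -h0**2*tf/(1 + w*h0*tf)**2.
sens_exact = -h0**2 * tf / (1.0 + w * h0 * tf)**2
```

Note that the adjoint solve is independent of which parameter is varied: one backward integration yields the sensitivities with respect to all parameters, which is the efficiency argument the 1st-CASAM-NODE framework exploits.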
5.2. First-Order Sensitivities of the Energy Released Response
5.3. First-Order Sensitivities of the Temperature Response
5.4. First-Order Sensitivities of the Thermal Conductivity Response
5.5. Most Efficient Computation of First-Order Sensitivities: Application of the 1st-FASAM-N
6. Discussion and Conclusions
Funding
Conflicts of Interest
References
- Haber, E.; Ruthotto, L. Stable architectures for deep neural networks. Inverse Problems, 2017, 34, 014004.
- Lu, Y.; Zhong, A.; Li, Q.; Dong, B. Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations. International Conference on Machine Learning, PMLR, 2018, 3276–3285.
- Ruthotto, L.; Haber, E. Deep neural networks motivated by partial differential equations. Journal of Mathematical Imaging and Vision, 2018, 352–364.
- Chen, R.T.Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. Advances in Neural Information Processing Systems, 31, 2018, 6571–6583. arXiv:1806.07366.
- Dupont, E.; Doucet, A.; Teh, Y.W. Augmented neural ODEs. Advances in Neural Information Processing Systems, 32, 2019, 14–15.
- Kidger, P. On Neural Differential Equations. arXiv e-prints, 2022, arXiv:2202.02435.
- Kidger, P.; Morrill, J.; Foster, J.; Lyons, T. Neural controlled differential equations for irregular time series. Advances in Neural Information Processing Systems, 33, 2020, 6696–6707.
- Morrill, J.; Salvi, C.; Kidger, P.; Foster, J. Neural rough differential equations for long time series. International Conference on Machine Learning, PMLR, 2021, 7829–7838.
- Grathwohl, W.; Chen, R.T.Q.; Bettencourt, J.; Sutskever, I.; Duvenaud, D. FFJORD: Free-form continuous dynamics for scalable reversible generative models. International Conference on Learning Representations, 2019.
- Zhong, Y.D.; Dey, B.; Chakraborty, A. Symplectic ODE-Net: Learning Hamiltonian dynamics with control. International Conference on Learning Representations, 2020.
- Tieleman, T.; Hinton, G. Lecture 6.5—RMSProp: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 2012.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. International Conference on Learning Representations, 2015.
- Pontryagin, L.S. Mathematical Theory of Optimal Processes. CRC Press, Boca Raton, FL, USA, 1987.
- LeCun, Y.; Touresky, D.; Hinton, G.; Sejnowski, T. A theoretical framework for back-propagation. In Connectionist Models Summer School, 1988.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11), 2278–2324.
- Norcliffe, A.; Deisenroth, M.P. Faster training of neural ODEs using Gauss–Legendre quadrature. Transactions on Machine Learning Research, 08/2023. Code available at: https://github.com/a-norcliffe/torch_gq_adjoint.
- Lamarsh, J.R. Introduction to Nuclear Reactor Theory. Addison-Wesley Publishing Co., Reading, MA, USA, 1966; pp. 491–492.
- Hetrick, D.L. Dynamics of Nuclear Reactors. American Nuclear Society, Inc., La Grange Park, IL, USA, 1993; pp. 164–174.
- Cacuci, D.G. Computation of high-order sensitivities of model responses to model parameters. II: Introducing the Second-Order Adjoint Sensitivity Analysis Methodology for Computing Response Sensitivities to Functions/Features of Parameters. Energies, 2023, 16, 6356.
- Tukey, J.W. The Propagation of Errors, Fluctuations and Tolerances; Technical Reports No. 10–12; Princeton University: Princeton, NJ, USA, 1957.
- Cacuci, D.G. The nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology (nth-CASAM): Overcoming the Curse of Dimensionality in Sensitivity and Uncertainty Analysis, Volume I: Linear Systems. Springer: Cham, Switzerland, 2022.
- Cacuci, D.G. The Fourth-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (4th-CASAM-N): I. Mathematical Framework. Journal of Nuclear Engineering.
- Cacuci, D.G. Sensitivity theory for nonlinear systems: I. Nonlinear functional analysis approach. J. Math. Phys., 1981, 22, 2794–2802.
- Cacuci, D.G. Introducing the nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N): I. Mathematical Framework. Am. J. Comp. Math., 2024, 14, 11–42.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
