Submitted: 13 April 2026
Posted: 14 April 2026
Abstract
Keywords:
1. Introduction
- Manifold projection – projects the gradient onto the tangent space of the input coordinates (approximated by PCA).
- Physics gate – scales the step by $\exp(-\rho \cdot R_{\mathrm{PDE}})$, where $R_{\mathrm{PDE}}$ is the mean squared PDE residual.
- Uncertainty gate – scales the step by $\exp(-\alpha \cdot \psi(\lambda E_{\mathrm{local}}))$, where $E_{\mathrm{local}}$ is the local entropy of the model output and $\psi(t) = \log(1+t) - t/(1+t)$ is non-negative and grows only logarithmically for $t \ge 0$ (it is not bounded above).
- A mathematically grounded, lightweight optimiser that integrates geometry, physics, and uncertainty.
- Empirical validation on four PDE benchmarks showing that GPAGD outperforms Adam and L-BFGS on elliptic problems (Darcy 2D: 66% error reduction relative to Adam, p = 0.005; Poisson 1D: 49% reduction, p = 0.052).
- An open-source implementation with reproducibility scripts and visualisation tools.
- A clear discussion of when GPAGD works best (elliptic PDEs) and where it offers no improvement over Adam (hyperbolic/chaotic problems).
2. Related Work
3. Mathematical Formalism
3.1. Manifold Projection
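The introduction describes manifold projection as projecting the gradient onto a PCA approximation of the tangent space of the input coordinates. The following is a minimal pure-Python sketch of that idea for a one-dimensional tangent space. The function names, the use of power iteration, and the restriction to a single principal direction are illustrative assumptions, not the paper's implementation.

```python
import math

def top_principal_direction(points, iters=100):
    """Estimate the leading PCA direction of `points` by power iteration.

    Assumption: a single principal direction stands in for the tangent
    space; a real implementation would likely keep several components.
    """
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centred = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Sample covariance matrix of the centred coordinates.
    cov = [[sum(c[i] * c[j] for c in centred) / n for j in range(d)]
           for i in range(d)]
    # Fixed start vector; it must not be orthogonal to the leading direction.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project_onto_direction(grad, direction):
    """Keep only the component of `grad` along the unit vector `direction`."""
    coeff = sum(g * u for g, u in zip(grad, direction))
    return [coeff * u for u in direction]

# Toy data lying exactly on the line y = 2x.
points = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
tangent = top_principal_direction(points)
along = project_onto_direction([1.0, 2.0], tangent)    # already tangent
across = project_onto_direction([2.0, -1.0], tangent)  # normal to the line
```

In this toy case the gradient that lies along the data line passes through unchanged, while the gradient normal to it is removed, which is the behaviour a tangent-space projection is meant to have.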
3.2. Physics Gate
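The physics gate is described in the introduction as scaling the step by $\exp(-\rho \cdot R_{\mathrm{PDE}})$, with $R_{\mathrm{PDE}}$ the mean squared PDE residual. A minimal sketch of that factor is given below. The function names and the sample residual values are hypothetical, and $\rho = 0.1$ is taken from the Poisson/Darcy setting reported in the experiments.

```python
import math

def mean_squared_residual(residuals):
    """R_PDE: mean of the squared pointwise PDE residuals."""
    return sum(r * r for r in residuals) / len(residuals)

def physics_gate(residuals, rho):
    """Multiplicative step factor exp(-rho * R_PDE), which lies in (0, 1] for rho >= 0."""
    return math.exp(-rho * mean_squared_residual(residuals))

# Hypothetical residuals: a converged state and a poorly converged one.
gate_converged = physics_gate([0.0, 0.0, 0.0], rho=0.1)  # R_PDE = 0
gate_early = physics_gate([3.0, -4.0], rho=0.1)          # R_PDE = 12.5
```

With zero residual the gate equals 1 and the step is untouched; as the residual grows the gate shrinks the step exponentially.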
3.3. Uncertainty Gate
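The uncertainty gate is described in the introduction as the factor $\exp(-\alpha \cdot \psi(\lambda E_{\mathrm{local}}))$ with $\psi(t) = \log(1+t) - t/(1+t)$. The sketch below evaluates that expression directly. The function names are hypothetical, and the value $\lambda = 1$ is an assumption made only for illustration, since the source does not report it.

```python
import math

def psi(t):
    """psi(t) = log(1 + t) - t / (1 + t); zero at t = 0 and increasing for t >= 0."""
    return math.log1p(t) - t / (1.0 + t)

def uncertainty_gate(local_entropy, alpha, lam):
    """Multiplicative step factor exp(-alpha * psi(lam * E_local))."""
    return math.exp(-alpha * psi(lam * local_entropy))

# alpha = 1.0 is the Poisson/Darcy value from the experiments; lam = 1.0 is assumed.
gate_certain = uncertainty_gate(0.0, alpha=1.0, lam=1.0)
gate_uncertain = uncertainty_gate(1.0, alpha=1.0, lam=1.0)
```

Because $\psi(0) = 0$, a zero-entropy output leaves the step unchanged, and higher local entropy damps it. The damping is gentle: $\psi$ grows only logarithmically, so the gate decays polynomially rather than exponentially in the entropy.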
3.4. Capacity Scaling
3.5. GPAGD Update Rule
4. Algorithm and Implementation
5. Experiments
5.1. Benchmarks and Setup
- Poisson 1D: $-u_{xx} = \sin(2\pi x)$, Dirichlet BCs, exact $u = \sin(2\pi x)/(4\pi^2)$.
- Burgers 1D: manufactured solution $u = \sin(\pi x)$, residual $u u_x - \nu u_{xx}$, $\nu = 0.01/\pi$.
- Darcy 2D: $-\nabla \cdot (a \nabla u) = f$ with heterogeneous $a = 1 + \sin(2\pi x)\cos(2\pi y)$, exact $u = \sin(2\pi x)\sin(2\pi y)$.
- Taylor-Green 2D: incompressible Navier-Stokes with exact vortex solution $(u, v, p)$, $\mathrm{Re} = 100$.
- Adam (baseline)
- L-BFGS (second-order baseline)
- GPAGD_Full (all gates active)
- Poisson/Darcy: $\eta_0 = 10^{-3}$, $\rho = 0.1$, $\alpha = 1.0$
- Burgers: $\eta_0 = 5 \times 10^{-4}$, $\rho = 0.05$, $\alpha = 0.5$
- Taylor-Green: $\eta_0 = 10^{-4}$, $\rho = 0.02$, $\alpha = 0.2$
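The Poisson 1D benchmark listed above pairs the equation $-u_{xx} = \sin(2\pi x)$ with the exact solution $u = \sin(2\pi x)/(4\pi^2)$. As a quick sanity check, the sketch below confirms this pairing with second-order central differences. The unit interval $[0, 1]$ and the grid size are assumptions, since the domain is not stated in the benchmark description.

```python
import math

def u_exact(x):
    """Stated exact solution of the Poisson 1D benchmark."""
    return math.sin(2.0 * math.pi * x) / (4.0 * math.pi ** 2)

def forcing(x):
    """Right-hand side of -u_xx = sin(2*pi*x)."""
    return math.sin(2.0 * math.pi * x)

def max_residual(n=200):
    """Largest |-u_xx - f| over interior grid points of [0, 1] (assumed domain)."""
    h = 1.0 / n
    worst = 0.0
    for i in range(1, n):
        x = i * h
        # Second-order central difference approximation of u_xx.
        u_xx = (u_exact(x - h) - 2.0 * u_exact(x) + u_exact(x + h)) / h ** 2
        worst = max(worst, abs(-u_xx - forcing(x)))
    return worst

residual = max_residual()
```

The residual is limited only by the $O(h^2)$ discretisation error, and on $[0, 1]$ the stated solution vanishes at both ends, so it satisfies homogeneous Dirichlet conditions there.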
5.2. Results
| Benchmark | Adam | L-BFGS | GPAGD_Full | p-value (Adam vs GPAGD) |
|---|---|---|---|---|
| Poisson 1D | 19.95 ± 2.32 | 32.39 ± 19.61 | 10.25 ± 1.99 | 0.0515 |
| Burgers 1D | 0.997 ± 0.005 | 1.140 ± 0.122 | 1.127 ± 0.105 | 0.2360 |
| Darcy 2D | 2.988 ± 0.198 | 36.57 ± 16.65 | 1.0002 ± 0.0002 | 0.0049 |
| Taylor-Green 2D | 1.0045 ± 0.0038 | 1.0217 ± 0.0091 | 1.0847 ± 0.0230 | 0.0337 (GPAGD worse) |
- Darcy 2D: GPAGD reduces the error by about 66% compared to Adam (from 2.99 to 1.00), and its error is lower than that of L-BFGS by a factor of roughly 36 (1.0002 vs 36.57). The p-value (0.0049) is well below 0.05, though with only 3 seeds it should be read with caution. The error is also highly consistent across seeds (±0.0002), which suggests that GPAGD effectively regularises the solution towards the exact PDE.
- Poisson 1D: GPAGD also gives a large reduction (error 10.25 vs 19.95 for Adam, about 49%). The p-value (0.0515) is marginal, just above the 0.05 threshold; more seeds (e.g., 10) would be needed to establish significance.
- Burgers 1D and Taylor-Green 2D: GPAGD performs somewhat worse than Adam (error about 13% higher on Burgers and 8% higher on Taylor-Green; absolute differences of 0.13 and 0.08). The Burgers gap is not statistically significant (p = 0.2360), but the Taylor-Green gap is (p = 0.0337). GPAGD therefore brings no benefit on these problems, and the modest degradation should be weighed before using it as a drop-in replacement.
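The percentage changes quoted in these bullets follow directly from the mean errors in the results table. The short script below reproduces that arithmetic; the dictionary layout and function name are illustrative choices.

```python
# Mean errors (Adam, GPAGD_Full) copied from the results table.
results = {
    "Poisson 1D": (19.95, 10.25),
    "Burgers 1D": (0.997, 1.127),
    "Darcy 2D": (2.988, 1.0002),
    "Taylor-Green 2D": (1.0045, 1.0847),
}

def relative_change_percent(adam, gpagd):
    """Signed change of GPAGD's error relative to Adam's, in percent."""
    return 100.0 * (gpagd - adam) / adam

changes = {name: relative_change_percent(adam, gpagd)
           for name, (adam, gpagd) in results.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Negative values are error reductions (the two elliptic problems); positive values are increases (Burgers and Taylor-Green).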

5.3. Ablation Insight
5.4. Hyperparameter Sensitivity


5.5. Computational Cost
6. Discussion
6.1. What Makes GPAGD Different?
6.2. Why Does GPAGD Excel on Elliptic PDEs?
6.3. Limitations
- Manifold projection uses linear PCA, which only approximates the true data manifold. For curved manifolds, local tangent spaces or learned autoencoders would be better.
- Uncertainty gate uses a sliding-window entropy; on unstructured grids it falls back to global variance, losing local information.
- Hyperparameter tuning (ρ, α, λ) is problem-dependent; we provide heuristics but no automatic method.
- Statistical power is limited by the use of only 3 seeds; 10 seeds would give more reliable p-values.
- Performance on non-elliptic problems is not improved over Adam, and on Taylor-Green 2D it is significantly worse (p = 0.0337); GPAGD is not a universal solution.
6.4. Future Work
- Implement local tangent space projection using a pre-trained autoencoder or nearest-neighbour PCA.
- Improve the uncertainty gate with learnable noise estimation (e.g., a small auxiliary network).
- Automatically tune ρ and α using meta-learning or Bayesian optimisation.
- Extend GPAGD to time-dependent PDEs and multi-physics problems.
- Evaluate on real-world sensor data (e.g., sea surface temperature reconstruction) with a weak physics constraint.
7. Conclusion
Supplementary Materials
Funding
Acknowledgments
Conflicts of Interest
References
- Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks. Journal of Computational Physics, 378, 686–707.
- Wang, S., Teng, Y., & Perdikaris, P. (2021). Understanding and mitigating gradient pathologies in physics-informed neural networks. SIAM J. Sci. Comput., 43(5), A3315–A3345.
- Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. ICLR.
- Bronstein, M. M., Bruna, J., LeCun, Y., Szlam, A., & Vandergheynst, P. (2017). Geometric deep learning. IEEE Signal Processing Magazine, 34(4), 18–42.
- Absil, P.-A., Mahony, R., & Sepulchre, R. (2008). Optimization Algorithms on Matrix Manifolds. Princeton University Press.
- Bian, Y., et al. (2025). RAdaGrad and RAdamW: Riemannian adaptive optimizers for low-rank manifolds. arXiv preprint.
- Hwang, B., et al. (2024). Dual cone gradient descent for physics-informed neural networks. ICLR.
- Müller, J., & Zeinhofer, M. (2023). Energy natural gradient descent for PINNs. ICML.
- Blundell, C., Cornebise, J., Kavukcuoglu, K., & Wierstra, D. (2015). Weight uncertainty in neural networks. ICML.
- Gal, Y., & Ghahramani, Z. (2016). Dropout as a Bayesian approximation. ICML.
- Liu, D. C., & Nocedal, J. (1989). On the limited memory BFGS method. Mathematical Programming, 45(1), 503–528.
- Mostafa, M. (2026). EPANG-Gen: A curvature-aware optimizer with uncertainty quantification. Preprint.
- Anonymous (2026). Twin-Boot gradient descent. Under review.
- Mostafa, M. (2026). Bayesian R-LayerNorm: Uncertainty-aware adaptive normalization. Preprint.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

