Preprint
Article

This version is not peer-reviewed.

Hamilton-Jacobi-Bellman Equations and Reinforcement Learning: A Theoretical Framework and Empirical Study for Dynamic Credit Decision-Making

Submitted: 01 April 2026

Posted: 02 April 2026


Abstract
Traditional credit scoring models reduce decisions to static classification, ignoring dynamic risk evolution and long-term profit. This paper integrates the Hamilton-Jacobi-Bellman (HJB) equation with deep reinforcement learning, reformulating credit risk as a discrete-time stochastic optimal control problem. Theoretically, we establish the equivalence between discrete Markov decision processes and the HJB equation, prove existence and uniqueness of the optimal value function, derive the closed-form Riccati solution under linear-quadratic assumptions, and show that neural network value iteration is an effective numerical scheme with separable errors. Empirically, using LendingClub data (2016–2018), the HJB-based PPO model significantly outperforms all static baseline models considered (e.g., logistic regression, random forest, XGBoost) in average profit (1.5167) and total profit (786,700.4682). Ablation experiments that replace the policy network with a linear mapping reduce profit by 34.7%, confirming the necessity of nonlinear approximation. Theoretical validation gives a mean squared error of 0.0006 between the neural value function and the Riccati solution. This work provides a rigorous mathematical foundation for reinforcement learning in financial risk control and a path from static classification to dynamic optimization in credit scoring.
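
To make the linear-quadratic benchmark mentioned above concrete, the following is a minimal sketch of how a discounted discrete-time Riccati fixed point can be computed by value iteration. It is not the paper's implementation: the function name `riccati_value_iteration`, the matrices `A`, `B`, `Q`, `R`, the discount `gamma`, and the cost-minimization convention are all assumptions chosen for illustration.

```python
import numpy as np


def riccati_value_iteration(A, B, Q, R, gamma=0.99, tol=1e-10, max_iter=10_000):
    """Iterate the discounted discrete-time Riccati map to a fixed point.

    Assumed (hypothetical) setting: dynamics x' = A x + B u + noise and
    stage cost x^T Q x + u^T R u, so the value function is quadratic,
    V(x) = x^T P x + const. Additive zero-mean noise only shifts the
    constant, so it does not appear in the recursion for P.
    """
    P = np.zeros_like(Q, dtype=float)
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(max_iter):
        BtP = B.T @ P
        # Gain of the greedy linear policy u = -K x for the current P.
        K = np.linalg.solve(R + gamma * BtP @ B, gamma * BtP @ A)
        # One Bellman backup restricted to quadratic value functions.
        P_next = Q + gamma * A.T @ P @ A - gamma * A.T @ P @ B @ K
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, K
        P = P_next
    return P, K


# Illustrative two-state system (values are made up, not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])
P, K = riccati_value_iteration(A, B, Q, R, gamma=0.95)
```

A matrix `P` obtained this way could serve as the reference against which a learned value function is compared, for example by evaluating the mean squared error between a network's output and `x @ P @ x` on sampled states.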
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated