Omnichannel Supply Chains Amid Demand Shocks: A Centralized Hierarchical Reinforcement Learning Framework

Panagiotis G. Giannopoulos; Thomas K. Dasaklis

doi:10.20944/preprints202603.1244.v1

Submitted: 14 March 2026

Posted: 17 March 2026

Abstract

Background: The rapid evolution of omnichannel retailing has reshaped retail supply chains (SCs) by tightly coupling replenishment, fulfillment, and service decisions across multiple demand channels under inventory, lead-time, and capacity constraints. These interdependencies create complex coordination challenges, particularly when demand shocks interact with limited operational capacity. Methods: To address these challenges, this study develops a centralized Hierarchical Reinforcement Learning (HRL) control framework that makes decision timing explicit: replenishment and allocation are optimized weekly, while fulfillment and lateral inventory rebalancing are controlled daily. Policies are learned using Proximal Policy Optimization (PPO) in an actor–critic architecture with bounded stochastic policies suitable for constrained action spaces. To mitigate the curse of dimensionality often encountered in HRL, we introduce a capacity-aware state–action encoding mechanism that compresses the control interface into structured summary signals. Demand shocks are modeled using two specifications: a mixed regime where half the products follow uniform demand and half follow a Merton-type jump-diffusion process, and a fully shock-driven regime. Results: The framework is evaluated against forecast-driven base-stock and greedy fulfillment heuristics, as well as a perfect-information oracle. Results show that the proposed encoding improves learning efficiency and scalability, achieving higher profit and service performance than the full-observation alternative. Conclusions: Overall, hierarchically timed control outperforms heuristic baselines while remaining below the oracle bound, with the largest gains observed when demand shocks coincide with binding fulfillment and transfer capacities.
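To make the shock specification concrete, the following is a minimal sketch of how a Merton-type jump-diffusion demand path could be simulated for a single product. It uses the standard textbook log-Euler discretization, not necessarily the authors' implementation; the abstract does not state the parameterization, so every function name, parameter name, and numeric value below is an illustrative assumption.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's multiplication method (suited to small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_jump_diffusion_demand(days, base_demand, drift, vol,
                                   jump_rate, jump_mean, jump_std, seed=0):
    """Simulate a daily demand level following a Merton-type jump-diffusion.

    All parameters are hypothetical placeholders:
      drift, vol           -- per-day drift and volatility of the diffusion part
      jump_rate            -- expected number of shocks per day (Poisson intensity)
      jump_mean, jump_std  -- mean and std of the normally distributed log-jump size
    """
    rng = random.Random(seed)
    dt = 1.0  # one step equals one day
    # Compensator that keeps the expected growth rate equal to `drift`.
    compensator = jump_rate * (math.exp(jump_mean + 0.5 * jump_std ** 2) - 1.0)
    log_level = math.log(base_demand)
    path = []
    for _ in range(days):
        diffusion = ((drift - 0.5 * vol ** 2 - compensator) * dt
                     + vol * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        n_jumps = sample_poisson(jump_rate * dt, rng)
        jumps = sum(rng.gauss(jump_mean, jump_std) for _ in range(n_jumps))
        log_level += diffusion + jumps
        path.append(math.exp(log_level))
    return path


if __name__ == "__main__":
    # Assumed values: roughly one demand shock every 20 days, mostly upward.
    demand = simulate_jump_diffusion_demand(
        days=28, base_demand=100.0, drift=0.0, vol=0.05,
        jump_rate=0.05, jump_mean=0.4, jump_std=0.2, seed=42)
    print([round(d, 1) for d in demand[:7]])
```

Working in log space keeps the simulated demand level strictly positive, and setting `jump_rate` to zero reduces the process to a plain geometric diffusion, which makes it easy to compare shock-driven and shock-free products under the same code path.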

Keywords: omnichannel supply chain; demand shocks; hierarchical reinforcement learning; proximal policy optimization; lateral transshipment; resilience; capacity constraints; jump-diffusion demand

Subject: Business, Economics and Management - Business and Management

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
