Background: The rapid evolution of omnichannel retailing has reshaped retail supply chains (SCs) by tightly coupling replenishment, fulfillment, and service decisions across multiple demand channels under inventory, lead-time, and capacity constraints. These interdependencies create complex coordination challenges, particularly when demand shocks interact with limited operational capacity. Methods: To address these challenges, this study develops a centralized Hierarchical Reinforcement Learning (HRL) control framework that makes decision timing explicit: replenishment and allocation are optimized weekly, while fulfillment and lateral inventory rebalancing are controlled daily. Policies are learned using Proximal Policy Optimization (PPO) in an actor–critic architecture with bounded stochastic policies suitable for constrained action spaces. To mitigate the curse of dimensionality often encountered in HRL, we introduce a capacity-aware state–action encoding mechanism that compresses the control interface into structured summary signals. Demand shocks are modeled using two specifications: a mixed regime where half the products follow uniform demand and half follow a Merton-type jump-diffusion process, and a fully shock-driven regime. Results: The framework is evaluated against forecast-driven base-stock and greedy fulfillment heuristics, as well as a perfect-information oracle. Results show that the proposed encoding improves learning efficiency and scalability, achieving higher profit and service performance than the full-observation alternative. Conclusions: Overall, hierarchically timed control outperforms heuristic baselines while remaining below the oracle bound, with the largest gains observed when demand shocks coincide with binding fulfillment and transfer capacities.
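As an illustration of the two demand specifications named above, the following is a minimal sketch of how a mixed regime and a fully shock-driven regime could be simulated. It is not the study's implementation: the function names (`merton_demand_path`, `simulate_demand`), the daily time step, the log-scale formulation, and every numeric parameter (drift, volatility, jump intensity, jump-size distribution, uniform bounds) are assumptions chosen only for illustration, since the abstract does not report them.

```python
import numpy as np


def merton_demand_path(d0, mu, sigma, jump_rate, jump_mean, jump_std, n_days, rng):
    """Simulate a daily demand level as a Merton-type jump-diffusion on the log scale.

    All parameters are hypothetical; the abstract does not specify them.
    """
    dt = 1.0  # one step per day (assumption)
    # Diffusion part: Gaussian log-increments with the usual drift correction.
    diffusion = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_days)
    # Jump part: Poisson number of jumps per day, each with a Normal log-size.
    # The sum of k Normal(jump_mean, jump_std) draws is Normal(k*jump_mean, jump_std*sqrt(k)).
    n_jumps = rng.poisson(jump_rate * dt, size=n_days)
    jumps = jump_mean * n_jumps + jump_std * np.sqrt(n_jumps) * rng.standard_normal(n_days)
    return d0 * np.exp(np.cumsum(diffusion + jumps))


def simulate_demand(n_products, n_days, shock_fraction=0.5, seed=0):
    """Return an (n_products, n_days) demand array.

    shock_fraction=0.5 mimics the mixed regime, shock_fraction=1.0 the fully
    shock-driven regime. Products not in the shock group draw uniform demand.
    """
    rng = np.random.default_rng(seed)
    n_shock = int(round(shock_fraction * n_products))
    demand = np.empty((n_products, n_days))
    for i in range(n_products):
        if i < n_shock:
            demand[i] = merton_demand_path(
                d0=20.0, mu=0.0, sigma=0.05,
                jump_rate=0.05, jump_mean=0.3, jump_std=0.2,
                n_days=n_days, rng=rng,
            )
        else:
            demand[i] = rng.uniform(10.0, 30.0, size=n_days)
    return demand


if __name__ == "__main__":
    mixed = simulate_demand(n_products=4, n_days=28, shock_fraction=0.5)
    shocked = simulate_demand(n_products=4, n_days=28, shock_fraction=1.0)
    print(mixed.shape, shocked.shape)
```

In a setup like this, the daily series would feed the fulfillment and rebalancing decisions directly, while the weekly replenishment and allocation decisions would see aggregates over seven-day windows; how the study actually exposes demand to each level is not stated in the abstract.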