This paper presents a practical hybrid control architecture for industrial deployment that augments the widely used 49-rule Mamdani fuzzy supervisory PID controller with a lightweight online meta-tuner based on Soft Actor-Critic (SAC) reinforcement learning. The inner 1 kHz fuzzy-PID loop remains fully deterministic and identical to the industrial baseline, while a separate 10 Hz SAC agent, executed through an ONNX Runtime inference engine, autonomously adapts the three output scaling factors (α_Kp, α_Ki, α_Kd ∈ [0.5, 2.5]) of the fuzzy layer. The complete controller is implemented and experimentally validated on a real Siemens S7-1214C PLC (6ES7214-1AG40-0XB0) in a hardware-in-the-loop (HIL) setup with a high-fidelity 5-DoF manipulator model incorporating measured friction, backlash, sensor noise, and payload variation (0–2.5 kg). Across four demanding scenarios (sinusoidal tracking, sudden payload jumps, sustained disturbances up to 0.76 Nm, and high-speed motions), the proposed method consistently achieves 46–52 % lower RMSE and 28–30 % less control energy than the fixed-scaling industrial baseline, while meeting strict real-time constraints (inner-loop cycle time 0.68–0.89 ms, SAC inference < 0.6 ms). The full PLC program (SCL/FBD), HIL environment, and trained SAC policies will be released open-source as a preprint supplement.
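The two-rate structure described above can be sketched as follows. This is a minimal illustrative example only, not the authors' released PLC code: a fast deterministic PID loop whose nominal gains are multiplied by three bounded scaling factors, updated at a slower rate by a meta-tuner. The plant model, gain values, and placeholder alpha values below are assumptions for illustration; a trained SAC policy (e.g., queried via ONNX Runtime) would supply the alphas in the actual system.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Deterministic PID whose gains are rescaled by external factors."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_err: float = 0.0

    def step(self, err: float, dt: float,
             a_p: float, a_i: float, a_d: float) -> float:
        # Scaling factors multiply the nominal gains, mirroring how the
        # outer meta-tuner adapts the fuzzy layer's output scaling.
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        return (a_p * self.kp * err
                + a_i * self.ki * self.integral
                + a_d * self.kd * deriv)


def clamp_alpha(a: float, lo: float = 0.5, hi: float = 2.5) -> float:
    # Bound each scaling factor to the stated range [0.5, 2.5].
    return max(lo, min(hi, a))


# Two-rate loop: inner loop at 1 kHz (dt = 1 ms); alphas refreshed every
# 100 inner steps, i.e. at 10 Hz. Fixed placeholder alphas stand in for
# the SAC policy output here.
pid = PID(kp=2.0, ki=0.5, kd=0.1)
dt = 1e-3
alphas = (1.0, 1.0, 1.0)
x, setpoint = 0.0, 1.0
for k in range(1000):
    if k % 100 == 0:
        # Placeholder meta-tuner update; a trained policy would act here.
        alphas = tuple(clamp_alpha(a) for a in (1.5, 1.0, 0.8))
    u = pid.step(setpoint - x, dt, *alphas)
    x += dt * (u - x)  # toy first-order plant, not the 5-DoF manipulator
```

The key design point this illustrates is the separation of concerns: the inner loop never branches on learning logic and remains deterministic, while adaptation enters only through three clamped multipliers, so a misbehaving policy can at worst scale the baseline gains within a pre-certified range.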