Fault-tolerant control in safety-critical industrial systems demands adaptive responses to equipment degradation, parameter drift, and sensor failures while maintaining strict operational constraints. Traditional model-based controllers struggle under these conditions, requiring extensive retuning and dense instrumentation. This work presents a safety-aware multi-agent deep reinforcement learning framework for adaptive fault-tolerant control in sensor-lean industrial environments, addressing three critical deployment barriers: formal safety guarantees, simulation-to-reality transfer, and instrumentation dependency. The framework integrates four synergistic mechanisms: (1) a multi-layer safety architecture combining constrained action projection, prioritized experience replay, conservative training margins, and curriculum-embedded verification, achieving zero constraint violations; (2) multi-agent coordination via decentralized execution with learned complementary policies; (3) curriculum-driven sim-to-real transfer through progressive four-stage learning, achieving 85--92\% performance retention without fine-tuning; and (4) offline Extended Kalman Filter validation enabling a 70\% instrumentation reduction (91--96\% reconstruction accuracy) while maintaining regulatory compliance. Validated through sustained deployment in commercial beverage-manufacturing Clean-In-Place (CIP) systems---a representative safety-critical testbed with hard flow constraints ($\geq$1.5 L/s), harsh chemical environments, and zero-tolerance contamination requirements---the framework demonstrates superior control precision (coefficient of variation: 2.9--5.3\% versus the 10\% industrial standard) across three hydraulic configurations spanning a complexity range of 2.1--8.2 on a 10-point scale.
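To make mechanism (1) concrete, the constrained action projection can be sketched as a clip of the policy's proposed setpoint into a certified-safe interval. This is a minimal illustrative sketch, not the paper's implementation: the margin and upper bound values below are assumptions, and only the 1.5 L/s flow floor comes from the abstract.

```python
# Illustrative safety-layer sketch. Only the 1.5 L/s hard floor is from the
# text; the margin and actuator bound are hypothetical placeholder values.
FLOW_FLOOR = 1.5      # hard flow constraint from the CIP spec, L/s
SAFETY_MARGIN = 0.1   # assumed conservative training margin, L/s
FLOW_MAX = 5.0        # assumed actuator upper bound, L/s

def project_action(flow_setpoint: float) -> float:
    """Project a raw RL flow setpoint onto the feasible interval."""
    lo = FLOW_FLOOR + SAFETY_MARGIN
    return min(max(flow_setpoint, lo), FLOW_MAX)

print(project_action(0.8))   # unsafe proposal raised to 1.6
print(project_action(3.2))   # already-safe proposal passes through: 3.2
```

Because the projection is applied at execution time, constraint satisfaction does not depend on the learned policy ever being exactly right, which is what permits a zero-violation guarantee.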
Comprehensive validation comprising 37+ controlled stress-test campaigns and hundreds of production cycles (July--December 2025) confirms zero safety violations, high reproducibility (CV variation $<$ 0.3\% across replicates), predictable complexity--performance scaling ($R^2 = 0.89$), and zero-retuning cross-topology transferability. The system has operated autonomously in active production since July 2025, establishing a reproducible methodology for industrial reinforcement-learning deployment in safety-critical, sensor-lean manufacturing environments.
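The instrumentation-reduction idea in mechanism (4) rests on reconstructing unmeasured states from the sensors that remain. A minimal scalar EKF conveys the principle; this is a sketch under assumed models (random-walk flow dynamics, a quadratic orifice relation $p = c\,q^2$ as the measurement map, and placeholder noise values), not the paper's validated filter.

```python
# Minimal scalar EKF sketch (illustrative assumptions throughout): estimate an
# unmeasured flow q from noisy pressure readings z via p = C * q^2.
C = 0.4    # assumed orifice coefficient
Q = 1e-3   # assumed process-noise variance (random-walk flow)
R = 1e-2   # assumed pressure measurement-noise variance

def ekf_step(x, P, z):
    """One predict/update cycle; x is the flow estimate, P its variance."""
    # Predict: random-walk dynamics leave the mean unchanged.
    P = P + Q
    # Update: linearize h(x) = C*x^2 around the prediction.
    H = 2.0 * C * x                  # Jacobian dh/dx
    K = P * H / (H * P * H + R)      # Kalman gain
    x = x + K * (z - C * x * x)      # innovation correction
    P = (1.0 - K * H) * P
    return x, P

x, P = 1.0, 1.0                      # deliberately poor initial guess
for z in [1.7, 1.58, 1.63, 1.61]:    # noisy samples around C * 2.0**2 = 1.6
    x, P = ekf_step(x, P, z)
print(x)                             # estimate approaches the true 2.0 L/s
```

Running the reconstruction offline against ground-truth sensors, as the abstract describes, is what lets the accuracy claim (91--96\%) be audited before any physical sensor is removed.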