A graph attention network-enhanced multi-agent proximal policy optimization (GAT-MAPPO) framework is proposed for cooperative guidance in counter-attack/defense scenarios. At every decision epoch, a dynamic heterogeneous interaction graph is constructed over the interceptors and targets. A multi-head graph attention encoder adaptively aggregates relational features that capture both inter-interceptor cooperation and target threat dynamics. The resulting graph-enriched observations are processed by a centralized-training, decentralized-execution (CTDE) MAPPO architecture, trained with a hierarchical reward function that encourages miss-distance minimization, simultaneous-arrival consensus, multi-directional encirclement, and smooth control effort. A three-stage curriculum learning strategy progresses from rectilinear to highly adaptive evasion patterns, yielding robust cooperative policies without explicit rule engineering. Extensive Monte Carlo simulations show that GAT-MAPPO achieves an interception success rate above 95% in 4-vs-4 scenarios and reduces the mean simultaneity error by 41.4% relative to the MAPPO baseline. Comprehensive ablation studies confirm the contributions of graph attention encoding, hierarchical reward design, and progressive curriculum staging.
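
To illustrate the attention aggregation step, a minimal single-head sketch in the style of a standard graph attention layer is given below. It is not the implementation used in this work: the node names, feature vectors, weight matrix `W`, and attention vector `a` are hypothetical placeholders, and the multi-head and heterogeneous-edge aspects of the encoder are omitted. Each node is assumed to have at least one neighbor (here, a self-loop).

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z

def gat_head(features, neighbors, W, a):
    """One graph attention head (illustrative sketch only).

    features:  {node: feature vector}
    neighbors: {node: non-empty list of nodes it attends to}
    W:         shared linear map, shape (d_out, d_in)
    a:         attention vector of length 2 * d_out
    Returns ({node: aggregated vector}, {node: {neighbor: weight}}).
    """
    z = {n: matvec(W, h) for n, h in features.items()}
    out, weights = {}, {}
    for i, nbrs in neighbors.items():
        # Unnormalised scores e_ij = LeakyReLU(a . [z_i || z_j]).
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Softmax over the neighbourhood, max-shifted for numerical stability.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        weights[i] = dict(zip(nbrs, alpha))
        # Attention-weighted sum of the transformed neighbour features.
        out[i] = [
            sum(al * z[j][k] for al, j in zip(alpha, nbrs))
            for k in range(len(z[i]))
        ]
    return out, weights

# Hypothetical 3-node graph: two interceptors and one target, with self-loops.
features = {"I1": [1.0, 0.0], "I2": [0.0, 1.0], "T1": [2.0, 2.0]}
neighbors = {n: ["I1", "I2", "T1"] for n in features}
W = [[1.0, 0.0], [0.0, 1.0]]  # identity map, for illustration only
a = [0.0, 0.0, 1.0, 0.0]      # scores each neighbour by its first feature
encoded, attention = gat_head(features, neighbors, W, a)
```

In this toy setting the attention weights of each node sum to one, and the neighbor with the largest first feature (the target `T1`) receives the largest weight. A trained encoder would instead learn `W` and `a` jointly with the policy.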