This paper presents a multi-agent, data-driven reinforcement learning (RL) approach to developing and using a thermal equivalent network model that represents the motor's thermal dynamics. A multi-agent RL framework is designed and trained to adjust the model parameters using data from several motor driving cycles. Before the pre-trained RL agents are deployed, offline statistical analysis and clustering techniques are used to verify that the incoming driving cycle matches the historical data. Numerical simulations show that the RL agents develop strategies that effectively handle the variability of driving cycles, and that the proposed framework accurately reproduces the motor's thermal behavior under various driving conditions.
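To make the cycle-matching step concrete, the sketch below clusters historical driving cycles by simple summary statistics and assigns an incoming cycle to its nearest cluster, whose pre-trained agent would then be selected. All names, the feature set (mean and standard deviation of speed), and the two-cluster k-means routine are illustrative assumptions, not the paper's actual method.

```python
import statistics

def cycle_features(speeds):
    """Reduce one driving cycle to summary statistics (assumed feature set)."""
    return (statistics.mean(speeds), statistics.pstdev(speeds))

def d2(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans2(points, iters=20):
    """Plain two-cluster k-means, seeded with the first point and the
    point farthest from it so the sketch stays deterministic."""
    c0 = points[0]
    c1 = max(points, key=lambda p: d2(p, c0))
    centers = [c0, c1]
    for _ in range(iters):
        groups = [[], []]
        for p in points:
            groups[0 if d2(p, centers[0]) <= d2(p, centers[1]) else 1].append(p)
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def nearest_cluster(point, centers):
    """Index of the historical cluster an incoming cycle matches."""
    return min(range(len(centers)), key=lambda i: d2(point, centers[i]))

# Two synthetic cycle families standing in for historical driving data:
# low-speed urban cycles and high-speed highway cycles.
urban = [[20 + j + (i % 5) for i in range(60)] for j in range(10)]
highway = [[90 + j + (i % 7) for i in range(60)] for j in range(10)]
feats = [cycle_features(c) for c in urban + highway]
centers = kmeans2(feats)

# An incoming low-speed cycle is matched to the urban cluster,
# so the agent pre-trained on that cluster would be deployed.
incoming = cycle_features([22 + (i % 4) for i in range(60)])
cluster = nearest_cluster(incoming, centers)
```

In a full pipeline the per-cycle statistics would be richer (torque, ambient temperature, duty ratios) and a statistical test would gate deployment when the incoming cycle falls far from every cluster, but the matching logic follows the same pattern.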