Elastic couplings and flexible joints introduce lightly damped vibration modes that significantly complicate stabilization of nonlinear, underactuated systems. This paper studies a spring-coupled cart–inverted-pendulum benchmark inspired by the Quanser Linear Flexible Joint with Inverted Pendulum platform, where a motor-driven cart excites a passive cart through a spring–damper connection and the pendulum is mounted on the passive cart. The control objective is to stabilize the pendulum near the upright equilibrium while simultaneously regulating spring deflection and suppressing vibration. To avoid manual derivation of high-order analytical dynamics for this coupled system, we adopt a model-based reinforcement learning framework that learns task-oriented latent dynamics and performs online receding-horizon planning. Concretely, we implement Task-Oriented Latent Dynamics (TOLD) for learning a compact latent model and Temporal-Difference Model Predictive Control (TD-MPC) for MPPI-style trajectory optimization in latent space. We evaluate TD-MPC in a high-fidelity Isaac Sim / Isaac Lab simulation and compare it against a model-free PPO baseline under the same observation and action interfaces. Training curves of physical variables and returns show that TD-MPC learns coordinated balancing and spring regulation with stable convergence behavior, while PPO achieves competitive balancing performance with more pronounced non-monotonic training dynamics and transient regressions. The study highlights when online planning with learned latent models is advantageous for elastically coupled mechanisms.
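The MPPI-style receding-horizon planning step over a learned latent model can be sketched as follows. This is an illustrative NumPy sketch, not the paper's implementation: the `dynamics` and `reward` callables stand in for the learned TOLD components, and TD-MPC's additional ingredients (policy-proposed trajectories and a learned terminal value) are omitted for brevity.

```python
import numpy as np

def mppi_plan(z0, dynamics, reward, horizon=10, samples=256,
              action_dim=1, sigma=0.5, temperature=0.5, rng=None):
    """One MPPI-style planning step in latent space.

    z0        : current latent state, shape (latent_dim,)
    dynamics  : batched latent transition model, (z, a) -> next z
    reward    : batched reward model, (z, a) -> per-sample reward
    Returns the first action of the return-weighted action sequence.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample candidate action sequences from a zero-mean Gaussian prior.
    actions = sigma * rng.standard_normal((samples, horizon, action_dim))
    returns = np.zeros(samples)
    z = np.repeat(z0[None, :], samples, axis=0)
    # Roll every candidate sequence forward through the latent model.
    for t in range(horizon):
        returns += reward(z, actions[:, t])
        z = dynamics(z, actions[:, t])
    # Exponentially weight trajectories by return (softmax over samples).
    w = np.exp((returns - returns.max()) / temperature)
    w /= w.sum()
    # Execute only the weighted first action, then replan (receding horizon).
    return (w[:, None] * actions[:, 0]).sum(axis=0)
```

As a toy sanity check, with scalar dynamics `z + a` and reward `-z**2`, the planner starting from a positive latent state should select a negative first action that drives the state toward zero.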