Reinforcement Learning with Verifiable Rewards (RLVR) has become a central paradigm for post-training large language models, yet group-relative methods often suffer from zero-advantage failures, in which all rollouts in a group receive identical rewards and the policy-gradient signal vanishes. A growing body of work addresses this bottleneck by intervening in rollout-group construction to restore learnable contrasts. Within this work, methods that introduce external textual signals from beyond the model’s own distribution, such as reference trajectories, abstract scaffolds, and reusable experience, have emerged as a key branch, because they can restore learnable contrasts while also expanding the model’s capability boundary. This paper presents the first systematic survey of this branch. We introduce Hint as a unifying concept for such external textual signals and organize hint-based RL methods into two levels: sample-level hints, covering trajectory-based and scaffold-based guidance, and task-level hints, covering static and evolving experience bases. Beyond the taxonomy, we clarify the boundaries of hint-based RL, provide a cross-level analysis of hint construction and utilization, and outline future directions. We maintain an up-to-date resource list at https://github.com/WYRipple/Awesome-Hint-Based-RL
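
To make the zero-advantage failure concrete, the following is a minimal sketch, assuming a GRPO-style normalization in which each rollout's advantage is its reward minus the group mean, divided by the group standard deviation. The function name `group_relative_advantages`, the `eps` stabilizer, and the binary rewards are illustrative assumptions, not the formulation of any specific method covered by the survey.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Return one advantage per rollout in a single rollout group.

    Assumption: GRPO-style normalization, (r_i - mean) / (std + eps).
    `eps` is a hypothetical stabilizer that avoids division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Mixed outcomes: correct and incorrect rollouts give a usable contrast.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# Uniform outcomes: every rollout fails (or every rollout succeeds),
# so each reward equals the group mean and every advantage is zero.
all_fail = group_relative_advantages([0.0, 0.0, 0.0, 0.0])
all_pass = group_relative_advantages([1.0, 1.0, 1.0, 1.0])

print("mixed:   ", mixed)
print("all fail:", all_fail)
print("all pass:", all_pass)
```

In both uniform groups every advantage is zero, so the group contributes no gradient regardless of how many rollouts were sampled. Under these assumptions, a hint that lets even one rollout in a failing group succeed turns the group into the mixed case and restores a nonzero signal.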