Preprint
Article

This version is not peer-reviewed.

Repulsive Guidance for Memorization Mitigation in Text-to-Music Diffusion Models

Submitted: 11 March 2026

Posted: 12 March 2026


Abstract
Recent progress in text-to-music generation has enabled high-quality audio synthesis from natural language prompts. However, such models are at risk of unintended replication, raising concerns regarding originality and intellectual property. While training-time mitigation strategies can address this issue, they typically require retraining or curated datasets, limiting their practicality for largescale systems. Inference-time methods provide a more lightweight alternative but often involve a trade-off between fidelity and memorization risk. This work introduces Repulsive Guidance (RG), a systematic inference-time mitigation strategy that reduces memorization without disrupting the intended conditional guidance from the text prompt. RG operates by enforcing divergence between dual diffusion trajectories through a repulsive term applied only during early denoising steps, without reversing the conditional guidance from the prompt. Experiments on MusicBench with the TANGO model demonstrate that RG offers a complementary mitigation strategy, providing new insights into balancing fidelity and memorization risk.
Keywords: 
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
