Preprint
Article

This version is not peer-reviewed.

Repulsive Guidance for Memorization Mitigation in Text-to-Music Diffusion Models

Submitted: 11 March 2026

Posted: 12 March 2026


Abstract
Recent progress in text-to-music generation has enabled high-quality audio synthesis from natural language prompts. However, such models are at risk of unintended replication, raising concerns regarding originality and intellectual property. While training-time mitigation strategies can address this issue, they typically require retraining or curated datasets, limiting their practicality for largescale systems. Inference-time methods provide a more lightweight alternative but often involve a trade-off between fidelity and memorization risk. This work introduces Repulsive Guidance (RG), a systematic inference-time mitigation strategy that reduces memorization without disrupting the intended conditional guidance from the text prompt. RG operates by enforcing divergence between dual diffusion trajectories through a repulsive term applied only during early denoising steps, without reversing the conditional guidance from the prompt. Experiments on MusicBench with the TANGO model demonstrate that RG offers a complementary mitigation strategy, providing new insights into balancing fidelity and memorization risk.
Keywords: 
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
