Pitch is a powerful cue for segregating sound sources in complex acoustic scenes, yet the neural mechanisms through which it guides selective attention remain unclear. In this review, we synthesise behavioural and neurophysiological evidence from humans and animal models to examine how pitch supports selective listening through a two-stage process: bottom-up pitch-based feature binding, followed by top-down enhancement of the attended sound source. Behavioural studies demonstrate that even modest pitch differences substantially improve listeners’ segregation of harmonic sounds, tone streams, and competing talkers. Human EEG, MEG, fMRI, and ECoG studies show enhancement of target sound representations in auditory cortex during selective listening, but understanding this process at the level of individual neurons will require further studies in animals trained to perform pitch-based selective listening tasks. Other key open questions include the relative roles of resolved and unresolved harmonic cues, the neural circuit mechanisms underlying target enhancement versus masker suppression, and how attention can target distributed cortical pitch representations. We argue that cross-species, naturalistic paradigms are essential for answering these questions and for addressing the listening difficulties associated with ageing and hearing loss.