Detectability of auditory signals in built environments is a critical issue in architectural acoustics, particularly in public spaces where notification sounds must be perceived reliably under background noise. This study investigated reaction times (RTs) to amplitude-modulated pure tones under silent, white-noise, and bandpass-noise conditions. Twenty young and twenty elderly participants responded to 1- and 2-kHz tones with flat, gentle, and steep onset envelopes. To describe perceptual detection in physically interpretable terms, a time-integrated sound-exposure level model, LAE(t), was applied. In this model, the predicted RT was defined as the moment when cumulative acoustic energy exceeded a criterion value relative to the hearing threshold. In silent conditions, RTs were accurately predicted by LAE(t), with onset-envelope shape influencing early energy accumulation. In noise conditions, RTs increased systematically with spectral proximity between target and masker, consistent with auditory filter theory. When spectral separation exceeded approximately four ERB numbers, masking effects were minimal and RTs approached silent-condition values. These findings demonstrate that perceptual detection timing is governed by cumulative acoustic energy and spectral masking rather than by instantaneous sound pressure level. The LAE(t) model provides a detection-oriented metric that complements conventional room-acoustic parameters and may support evidence-based design of perceptually robust auditory signals in architectural environments.
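The threshold-crossing idea behind LAE(t) can be illustrated with a minimal Python sketch. It is not the study's implementation and rests on several assumptions: A-weighting is omitted (its gain is 0 dB at 1 kHz by definition, but not at 2 kHz), no sensory or motor latency is added to the crossing time, and the function names, the linear onset ramp, and the numerical values for tone level, hearing threshold, and criterion are all hypothetical choices made for illustration.

```python
import numpy as np

P_REF = 20e-6  # reference sound pressure, Pa
T_REF = 1.0    # reference duration for sound exposure level, s


def cumulative_exposure_level(pressure, fs):
    """Running sound-exposure level in dB re (20 uPa)^2 * 1 s.

    Assumption: no A-weighting is applied to the pressure signal.
    """
    energy = np.cumsum(pressure ** 2) / fs  # running integral, Pa^2 * s
    with np.errstate(divide="ignore"):      # log10(0) -> -inf before onset
        return 10.0 * np.log10(energy / (P_REF ** 2 * T_REF))


def predicted_detection_time(pressure, fs, threshold_db, criterion_db):
    """Time (s) at which the running level first reaches threshold + criterion.

    Returns None if the level never reaches that value.
    """
    level = cumulative_exposure_level(pressure, fs)
    crossed = np.nonzero(level >= threshold_db + criterion_db)[0]
    return None if crossed.size == 0 else (crossed[0] + 1) / fs


def ramped_tone(freq_hz, level_db, ramp_s, dur_s, fs):
    """Pure tone with a linear onset ramp (hypothetical envelope shape)."""
    t = np.arange(int(dur_s * fs)) / fs
    amplitude = np.sqrt(2.0) * P_REF * 10.0 ** (level_db / 20.0)
    envelope = np.ones_like(t) if ramp_s <= 0 else np.clip(t / ramp_s, 0.0, 1.0)
    return amplitude * envelope * np.sin(2.0 * np.pi * freq_hz * t)


if __name__ == "__main__":
    fs = 48_000
    # Illustrative values only: 60 dB SPL tone, 10 dB threshold, 20 dB criterion.
    for label, ramp in [("no ramp", 0.0), ("50 ms ramp", 0.05), ("200 ms ramp", 0.2)]:
        tone = ramped_tone(1000.0, 60.0, ramp, 1.0, fs)
        t_detect = predicted_detection_time(tone, fs, threshold_db=10.0, criterion_db=20.0)
        print(label, t_detect)
```

In this sketch a slower onset delays the predicted crossing because energy accumulates more slowly during the ramp, even though the steady-state level of the tone is unchanged.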