Which sensor modalities carry genuine discriminative signal for CNC monitoring, and how many can be removed before performance degrades, is a practical question that prior work rarely answers quantitatively under leakage-resistant evaluation. We address it through a systematic cross-validated ablation study on a 9-class CNC toolpath and condition classification task that combines three toolpath strategies (adaptive, face, pocket) with three conditions (air-cutting, active-cutting, and damaged-spindle). Using 120 operation files from a desktop CNC mill with six consistently active sensors (17 channels per sensor) plus 8 machine-level electrical features, we evaluate six model families across 690 cross-validated runs spanning five cumulative feature-ablation levels (from 110 down to 56 features) and ten temporal resolutions. To fuse these modalities, we introduce MM-DTAE-LSTM, a multi-modal denoising temporal attention encoder with unidirectional LSTM-based classification that combines learned modality gates, cross-modal attention, and a self-supervised denoising objective. Key findings on this single-machine, single-material dataset: (1) MM-DTAE-LSTM reaches 96.3% ± 4.7% accuracy at a 98-feature configuration (excluding proximity and pressure), leading all baselines by 3.1–5.2 points, though the differences are not statistically significant at n=5 folds; (2) reducing the feature set by 49% (to accelerometer, gyroscope, temperature, RMS audio, and electrical channels) retains 92.5% ± 8.3% for the encoder, while XGBoost drops to 84.4%, a loss of 10.7 points from its full-feature peak; (3) at full features, baselines are competitive (Random Forest: 95.6%, XGBoost: 95.1%); and (4) one-way ANOVA reveals that pressure channels encode session-level barometric confounds (F > 2,000) rather than machining dynamics, which explains why baselines degrade when confound-prone channels are removed.
These results suggest that core inertial, acoustic, and machine-level modalities may be sufficient for effective CNC operation classification on similar platforms, providing sensor-selection and temporal-configuration guidance for cost-effective monitoring deployments. Generalization to industrial machines, diverse materials, and production environments requires further validation.
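The session-level confound check in finding (4) can be illustrated with a minimal sketch. It assumes each channel's per-file values are grouped by recording session and that a very large F statistic marks a channel that tracks the session rather than the machining operation. The function name, channel names, and all numeric values below are hypothetical and are not drawn from the dataset or from the authors' analysis code.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups.

    Each group is a list of floats (here: one channel's values within one
    hypothetical recording session).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more samples than groups")

    grand_mean = fmean(x for g in groups for x in g)

    # Between-group variation: how far each session mean sits from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: spread of samples around their own session mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        return float("inf")
    return ms_between / ms_within


# Hypothetical values: a pressure-like channel that shifts between sessions
# and a vibration-like channel whose sessions overlap.
pressure_by_session = [
    [1013.1, 1013.2, 1013.0],
    [1009.8, 1009.9, 1010.0],
    [1016.4, 1016.5, 1016.3],
]
vibration_by_session = [
    [0.9, 1.4, 1.1],
    [1.2, 0.8, 1.3],
    [1.0, 1.5, 0.9],
]

f_pressure = one_way_anova_f(pressure_by_session)
f_vibration = one_way_anova_f(vibration_by_session)
print(f"pressure F = {f_pressure:.1f}, vibration F = {f_vibration:.3f}")
```

In this toy data the pressure-like channel separates sessions almost perfectly while the vibration-like channel does not, which is the pattern that would flag a channel as confound-prone. The threshold for "very large" is a judgment call and is not specified here.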