Liquid chromatography-tandem mass spectrometry (LC-MS/MS) is widely employed in pesticide residue analysis. Machine learning methods for automated spectral interpretation depend on large, well-curated training datasets; however, publicly available pesticide mass spectrometry data are fragmented across heterogeneous repositories, lack standardized preprocessing, and suffer from incomplete metadata. We introduce PAID (Pesticide AI-ready Dataset), comprising two curated LC-MS/MS spectral collections derived from 15 public sources (GNPS, MassIVE, MoNA, and MassBank). Starting from 91,420 raw spectra and following an initial pesticide-directed screening, a reproducible seven-step pipeline (multi-source integration, spectral cleaning, deduplication, metadata standardization, quality scoring, stratified splitting, and feature engineering) yields PAID-Strict (7,527 spectra; 3,197 compounds) and PAID-Extended (21,292 spectra; 3,224 compounds). Both versions cover eight pesticide categories across QTOF and Orbitrap platforms, with each core metadata field (SMILES, InChIKey, molecular formula) exceeding 98% completeness. A feature suite of 32 chemoinformatic descriptors, 2,214 molecular fingerprints, and 30 spectral features is provided alongside the spectra. Benchmark classification of the eight pesticide categories using XGBoost and LightGBM achieved 81.5% accuracy. The dataset, code, and pre-computed features are publicly available under CC BY 4.0 and MIT licenses (DOI: 10.57760/sciencedb.35414).