The effectiveness of machine learning and deep learning methods for network anomaly detection depends strongly on the quality and representativeness of the datasets used for training and evaluation. However, many publicly available benchmarks rely on synthetic traffic or outdated attack scenarios, or underrepresent encrypted communications. This work presents a network traffic dataset derived from operational firewall logs collected in a heterogeneous institutional environment dominated by HTTPS/TLS traffic. A structured data-centric pipeline was implemented, comprising preprocessing, behavioral feature engineering, unsupervised pseudo-labeling with the EFMS-KMeans algorithm, class balancing with SMOTE, and temporal sequence generation for sequential analysis. The resulting dataset contains large-scale flow-level records describing volumetric, behavioral, and temporal traffic characteristics, with privacy preserved through anonymization procedures. Technical validation was conducted using statistical analysis, entropy-based measurements, clustering quality metrics, and dimensionality reduction techniques, and the results support the consistency, diversity, and class separability of the data. The dataset is publicly available through the Mendeley Data repository, together with metadata and documentation, to support anomaly detection research, encrypted traffic analysis, and the evaluation of machine learning and deep learning approaches in realistic cybersecurity environments.
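As an illustration of two of the steps named above, the following minimal sketch computes an entropy-based measurement over a categorical flow attribute and generates fixed-length temporal sequences with a sliding window. It is not the pipeline used to build the dataset: the record layout `(timestamp, destination_port, bytes_sent)`, the function names `shannon_entropy` and `sliding_windows`, and the window length and stride are all hypothetical choices made for this example, and the EFMS-KMeans and SMOTE steps are not reproduced.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sliding_windows(records, window, stride=1):
    """Return consecutive fixed-length windows over time-ordered records."""
    return [records[i:i + window]
            for i in range(0, len(records) - window + 1, stride)]


# Hypothetical flow records: (timestamp, destination_port, bytes_sent).
# These values are invented for illustration and are not from the dataset.
flows = [
    (0, 443, 1200), (1, 443, 900), (2, 53, 80), (3, 443, 1500),
    (4, 8080, 300), (5, 443, 1100),
]
flows.sort(key=lambda r: r[0])  # sequences assume chronological order

# Entropy of the destination-port distribution: higher means more diverse.
port_entropy = shannon_entropy(port for _, port, _ in flows)

# Overlapping sequences of three consecutive flows (assumed window and stride).
sequences = sliding_windows(flows, window=3, stride=1)

print(f"port entropy: {port_entropy:.3f} bits")
print(f"number of sequences: {len(sequences)}")
```

A low entropy value indicates traffic concentrated on a few values (for example, mostly port 443 in an HTTPS-dominated network), while the window length controls how much temporal context each sequence gives a sequential model.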