Preprint
Article

This version is not peer-reviewed.

A Comparative Study of an AI Model’s Robustness to Synthetic Data in Solving the Problem of Color Image Classification

Submitted:

03 April 2026

Posted:

07 April 2026

You are already at the latest version

Abstract
This study examines the impact of data augmentation on machine learning perfor-mance, focusing on how synthetic data influences various neural network architec-tures. Common issues such as limited data, class imbalance, and poor coverage often lead to low model metrics, and data augmentation is frequently used to address these problems. The research aims to identify the optimal proportion of synthetic data, assess its effects across different architectures, and analyze the impact of augmenting only specific classes in a multi-class medical image classification task. Twelve widely used architectures were selected for the experiments, including classical convolutional networks, visual transformers, and the hybrid ConvNeXt model. Results showed that no universal optimal augmentation ratio exists, as model robust-ness to synthetic data varies, even within the same architecture family. Transformer and hybrid models demonstrated greater stability, while convolutional networks exhibited inconsistent behavior, likely due to higher sensitivity to data bias.
Keywords: 
;  ;  ;  ;  ;  ;  ;  ;  
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2026 MDPI (Basel, Switzerland) unless otherwise stated