This paper presents an end-to-end approach to grayscale clothing image classification using a lightweight Convolutional Neural Network (CNN) trained on the Fashion-MNIST dataset. The architecture consists of three convolutional layers, with batch normalization to stabilize training, dropout to reduce overfitting, max pooling to reduce spatial dimensions, and data augmentation (random rotation, shifting, zooming, and flipping) to enlarge the effective training set. An early-stopping callback terminated training once validation performance plateaued. The model reached a test accuracy of 88.63%, which indicates that a carefully designed lightweight CNN can perform competitively on Fashion-MNIST without resorting to complex, heavyweight architectures. Precision and F1-scores were high for categories with distinct visual characteristics (trousers, sandals, bags), whereas categories with similar textures and outlines (T-shirts, pullovers, coats) were more often misclassified. The paper also places these findings in the context of the evolution of CNN architectures from LeNet-5 to AlexNet and VGGNet, and discusses what the results imply for the effective use of AI in resource-constrained settings.
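For readers unfamiliar with the per-class metrics mentioned above, the following minimal sketch shows how precision, recall, and F1-score can be derived from a confusion matrix. It is an illustration only and not the evaluation code used in this work: the function name `per_class_metrics`, the three-class subset, and every count in `TOY_CONFUSION` are hypothetical and were chosen merely to mimic the qualitative pattern of a visually distinct class next to two classes that are easily confused.

```python
def per_class_metrics(confusion):
    """Return {class_index: (precision, recall, f1)} for a square confusion matrix.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    metrics = {}
    for k in range(n):
        tp = confusion[k][k]                                   # correctly predicted as k
        predicted_k = sum(confusion[i][k] for i in range(n))   # column sum
        actual_k = sum(confusion[k])                           # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[k] = (precision, recall, f1)
    return metrics


# Hypothetical counts for illustration only; NOT results from the paper.
# Rows are true classes, columns are predicted classes.
LABELS = ["trouser", "pullover", "coat"]
TOY_CONFUSION = [
    [98, 1, 1],
    [2, 80, 18],
    [0, 20, 80],
]

if __name__ == "__main__":
    for k, (p, r, f1) in per_class_metrics(TOY_CONFUSION).items():
        print(f"{LABELS[k]:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this toy matrix, the off-diagonal mass between the pullover and coat rows lowers both their precision and their recall, which in turn lowers their F1-scores relative to the trouser class.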