Preprint
Article

This version is not peer-reviewed.

Extreme Prototype Reduction for k-Nearest Neighbour Classification via a Simple Genetic Algorithm

Submitted: 19 January 2026

Posted: 21 January 2026


Abstract
Nearest-neighbour classifiers are simple and effective, but their performance and inference cost depend strongly on the size and quality of the reference (design) set. This work studies an evolutionary prototype selection strategy for k-nearest neighbour (k-NN) classification, in which a genetic algorithm (GA) evolves compact, class-balanced prototype banks from the design partition and the selected prototypes are then used by a 1-NN classifier. Each individual encodes a fixed number of prototype indices per class, and fitness is defined as the number of correctly classified test samples. We evaluate the approach on two synthetic 2D Gaussian benchmarks with different overlap levels, a synthetic 3D “three moons” dataset, and three real datasets (Breast Cancer Wisconsin, Wine, and a reduced MNIST setting using 8 × 8 digit images). For each scenario, results are aggregated over repeated runs with different random seeds and compared against standard 1-NN and 3-NN baselines that use all design samples as neighbours. The experimental evidence shows that GA-selected prototype banks can match ceiling performance in highly separable cases and provide consistent gains in noisier or more redundant settings, while reducing the neighbour set size by orders of magnitude. These results support the hypothesis that evolutionary, class-balanced prototype selection improves k-NN generalization and efficiency without requiring changes to the underlying distance metric or classifier structure. The method is also well suited to applications in which memory or latency budgets impose a hard upper bound on the number of prototypes that can be stored or consulted. In such cases, a simple single-objective algorithm like the proposed one is a natural choice, and the results reported here provide a baseline against which more complex methods can be fairly compared.
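The encoding and fitness described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract only fixes the class-balanced index encoding, the 1-NN classifier, and the count-of-correct-samples fitness. Everything else here is an assumption made for illustration, including the function names (`one_nn_predict`, `fitness`, `evolve`), truncation selection with elitism, class-wise crossover, single-slot mutation, the hyperparameter values, and the toy Gaussian data.

```python
import numpy as np

def one_nn_predict(bank_X, bank_y, X):
    # Squared Euclidean distance from every query to every prototype.
    d = ((X[:, None, :] - bank_X[None, :, :]) ** 2).sum(axis=2)
    return bank_y[d.argmin(axis=1)]

def fitness(individual, X_design, y_design, X_eval, y_eval):
    # Number of evaluation samples classified correctly by 1-NN on the bank.
    idx = individual.ravel()
    pred = one_nn_predict(X_design[idx], y_design[idx], X_eval)
    return int((pred == y_eval).sum())

def evolve(X_design, y_design, X_eval, y_eval, m=2, pop_size=20,
           generations=30, p_mut=0.2, seed=0):
    # All operators and hyperparameters below are illustrative assumptions.
    rng = np.random.default_rng(seed)
    classes = np.unique(y_design)
    class_indices = [np.flatnonzero(y_design == c) for c in classes]
    # An individual is a (n_classes, m) array: m distinct design indices per class.
    pop = [np.stack([rng.choice(ci, size=m, replace=False) for ci in class_indices])
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = np.array([fitness(ind, X_design, y_design, X_eval, y_eval)
                           for ind in pop])
        order = np.argsort(-scores)
        elite = [pop[i] for i in order[:pop_size // 2]]  # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(len(elite), size=2, replace=False)
            # Class-wise crossover: each class row is copied whole from one parent,
            # so the child stays class-balanced with distinct indices per class.
            mask = rng.random(len(classes)) < 0.5
            child = np.where(mask[:, None], elite[a], elite[b])
            # Mutation: swap one slot for an unused index of the same class.
            for c, ci in enumerate(class_indices):
                if rng.random() < p_mut:
                    free = np.setdiff1d(ci, child[c])
                    if free.size:
                        child[c, rng.integers(m)] = rng.choice(free)
            children.append(child)
        pop = elite + children
    scores = np.array([fitness(ind, X_design, y_design, X_eval, y_eval)
                       for ind in pop])
    return pop[int(scores.argmax())], int(scores.max())

# Toy usage on two 2D Gaussian classes (synthetic data invented for this sketch).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.repeat([0, 1], 100)
perm = rng.permutation(200)
Xd, yd = X[perm[:140]], y[perm[:140]]   # design partition
Xe, ye = X[perm[140:]], y[perm[140:]]   # samples used to score fitness
best, score = evolve(Xd, yd, Xe, ye)
print(best.shape, score, "of", len(ye))
```

Here `best` holds two design indices per class, so the resulting 1-NN classifier consults four prototypes instead of all 140 design samples. Because the fitness is computed on the same samples that the search optimises, the returned score is an optimistic estimate, and a separate held-out set would be needed to measure generalization.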
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated