Preprint Article
This version is not peer-reviewed.

An Information-Theoretic Approach to Optimal Training Set Construction for Neural Networks

Submitted: 18 January 2026
Posted: 20 January 2026


Abstract
We present cEntMax, an information-theoretic framework for training set optimization that selects classwise informative samples via cross-entropy divergence to prototype pivots. Under a noisy-channel generative view and local linearity of deep networks, the method connects predictive entropy, Fisher information, and G-optimal coverage. Experiments on EMNIST and KMNIST show faster convergence, lower validation loss, and greater stability than random sampling, especially for moderate sampling fractions.
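As an illustration only (the abstract does not specify the exact scoring rule), the following minimal Python sketch shows one plausible reading of classwise selection via cross-entropy divergence to prototype pivots: each class's pivot is taken as the mean predictive distribution over that class, and the samples whose distributions diverge most from the pivot are retained. The function name, the choice of pivot, and the top-k rule are assumptions for exposition, not the authors' implementation.

import numpy as np

def select_informative_subset(probs, labels, frac=0.1):
    """Hypothetical sketch of a cEntMax-style selection step.

    probs:  (N, C) array of per-sample predictive distributions
    labels: (N,) integer class labels
    frac:   fraction of each class to retain

    Assumption: the per-class "prototype pivot" is the mean predictive
    distribution of that class, and a sample's informativeness is the
    cross-entropy H(p_i, pivot_c) of its distribution against the pivot.
    """
    keep = []
    eps = 1e-12  # numerical floor inside the logarithm
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        pivot = probs[idx].mean(axis=0)  # prototype pivot for class c
        # Cross-entropy of each sample's distribution against the pivot.
        ce = -(probs[idx] * np.log(pivot + eps)).sum(axis=1)
        k = max(1, int(frac * len(idx)))
        keep.extend(idx[np.argsort(ce)[-k:]])  # keep the most divergent samples
    return np.sort(np.asarray(keep))

Keeping only the top-ranked divergent samples is one natural choice here; a full implementation might instead trade divergence off against coverage of the feature space, as the abstract's connection to G-optimal design suggests.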
Keywords:
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.