Submitted: 04 April 2026
Posted: 08 April 2026
Abstract
Keywords:
1. Introduction
2. Related Work
3. Dataset Design
3.1. Label Space
3.2. Rationale
3.3. Quality Controls
4. Data Summary and Distribution
5. Data Records
5.1. Access and Organization
5.2. Manifest and Metadata
5.3. Versioning and Label Normalization
6. Technical Validation
6.1. Label Consistency Checks
6.2. Duplicate and Near-Duplicate Screening
6.3. File Integrity
6.4. Manual Inspection
7. Usage Notes
7.1. Recommended Splits
7.2. Metrics and Reporting
7.3. Imbalance-Aware Training
7.4. Data Augmentation Considerations
7.5. Reference Model Configurations
7.6. Reproducible Data Access
8. Ethical Considerations
9. Limitations and Future Work
Code Availability
Data Availability Statement
Conflicts of Interest
References


| Illumination | Count | Percent |
|---|---|---|
| indoor | 3,342 | 27.65% |
| indoorNight | 3,337 | 27.61% |
| fluorescentLight | 2,985 | 24.70% |
| sunLight | 2,422 | 20.04% |

| Color | Count | Percent |
|---|---|---|
| Orange | 2,323 | 19.22% |
| Pink | 1,995 | 16.51% |
| Black | 1,674 | 13.85% |
| Blue | 1,205 | 9.97% |
| Purple | 1,194 | 9.88% |
| Gray | 1,128 | 9.33% |
| White | 882 | 7.30% |
| Yellow | 845 | 6.99% |
| Skyblue | 840 | 6.95% |
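The color counts above span a ratio of roughly 2.8:1 between the largest class (Orange, 2,323) and the smallest (Skyblue, 840). As one possible starting point for imbalance-aware training, the sketch below derives inverse-frequency class weights from those counts. The weighting scheme and the function name `inverse_frequency_weights` are assumptions made for illustration, not the authors' reported configuration; the formula is the same one scikit-learn uses for its "balanced" class-weight heuristic.

```python
# Image counts per color class, copied from the color distribution table.
COLOR_COUNTS = {
    "Orange": 2323,
    "Pink": 1995,
    "Black": 1674,
    "Blue": 1205,
    "Purple": 1194,
    "Gray": 1128,
    "White": 882,
    "Yellow": 845,
    "Skyblue": 840,
}


def inverse_frequency_weights(counts):
    """Return w_c = N / (K * n_c) for each class c.

    N is the total number of images, K the number of classes, and n_c the
    count of class c. A perfectly balanced dataset would give every class
    a weight of 1.0. (Hypothetical helper, not part of the dataset release.)
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


weights = inverse_frequency_weights(COLOR_COUNTS)
total_images = sum(COLOR_COUNTS.values())

for name, weight in weights.items():
    share = 100 * COLOR_COUNTS[name] / total_images
    print(f"{name:8s} share={share:5.2f}%  weight={weight:.3f}")
```

Weights of this form could be passed to a weighted cross-entropy loss. Whether they help on this dataset is an empirical question, so they are best compared against an unweighted baseline using per-class metrics.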

| Bucket | Count | Note |
|---|---|---|
| Orange / indoorNight | 1,237 | Largest Orange subset |
| Pink / indoor | 912 | Large indoor subset |
| Gray / sunLight | 461 | Relatively high sunLight subset |
| Skyblue / (all) | 210 each | Approximately uniform across conditions |
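One way to read the bucket table against the two marginal tables is to compare each observed bucket count with the count expected if color and illumination were statistically independent, that is, (color count × illumination count) / total. The sketch below does this for the three buckets whose counts are listed. The independence baseline and the helper name `expected_count` are assumptions introduced here as a diagnostic; they are not an analysis taken from the paper.

```python
# Marginal counts copied from the illumination and color tables.
ILLUMINATION_COUNTS = {
    "indoor": 3342,
    "indoorNight": 3337,
    "fluorescentLight": 2985,
    "sunLight": 2422,
}
# Only the colors that appear in the listed buckets are needed here.
COLOR_COUNTS = {"Orange": 2323, "Pink": 1995, "Gray": 1128}

# Observed (color, illumination) bucket counts copied from the bucket table.
OBSERVED = {
    ("Orange", "indoorNight"): 1237,
    ("Pink", "indoor"): 912,
    ("Gray", "sunLight"): 461,
}

TOTAL = sum(ILLUMINATION_COUNTS.values())


def expected_count(color_n, illum_n, total):
    """Expected bucket size if color and illumination were independent."""
    return color_n * illum_n / total


ratios = {}
for (color, illum), observed in OBSERVED.items():
    expected = expected_count(
        COLOR_COUNTS[color], ILLUMINATION_COUNTS[illum], TOTAL
    )
    ratios[(color, illum)] = observed / expected
    print(
        f"{color} / {illum}: observed={observed}, "
        f"expected={expected:.1f}, ratio={observed / expected:.2f}"
    )
```

Under this baseline, Orange / indoorNight and Gray / sunLight each hold roughly twice the expected number of images, and Pink / indoor about 1.65 times. This suggests that color and illumination are correlated in these buckets, which is one reason to consider splits stratified on both labels.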

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.