Submitted:
21 May 2025
Posted:
21 May 2025
Abstract

Keywords:
1. Introduction
2. Related Works
2.1. Ground-Based Cloud Detection
2.2. Loss Functions for Object Detection
2.3. Curriculum Learning
3. The Proposed Method
3.1. Loss Computing and Analysis
3.1.1. Unified Batch Loss
3.1.2. Loss Sliding Window Queue
3.2. Dynamic Loss Threshold
- Relative Difficulty Assessment: Unlike a static threshold, the dynamic threshold automatically adapts to the current state of the model. As performance improves, the threshold decreases, following the downward shift of the loss distribution and maintaining an appropriate level of challenge.
- Architectural Adaptability: The framework naturally accommodates detectors with different convergence characteristics: identical parameter settings produce detector-specific thresholds that match each model's individual optimization trajectory.
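The sliding-window queue (Section 3.1.2) and the dynamic threshold above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the threshold is taken as the m-quantile of the most recent w batch losses, so the threshold falls automatically as the loss distribution shifts downward; the class name and quantile indexing are hypothetical.

```python
from collections import deque


class DynamicLossThreshold:
    """Sketch of a loss-based dynamic threshold: recent per-batch losses
    are kept in a fixed-size sliding window, and the threshold is the
    m-quantile of that window (hypothetical formulation)."""

    def __init__(self, window_size: int, m: float):
        # Loss sliding window queue: oldest losses are evicted automatically.
        self.window = deque(maxlen=window_size)
        self.m = m  # relative quantile level, e.g. 0.5, 0.7, 0.9

    def update(self, batch_loss: float) -> None:
        self.window.append(batch_loss)

    def threshold(self) -> float:
        # An infinite m (or an empty window) admits every sample,
        # matching the warm-up phases in the training schedule.
        if not self.window or self.m == float("inf"):
            return float("inf")
        ordered = sorted(self.window)
        idx = min(int(self.m * len(ordered)), len(ordered) - 1)
        return ordered[idx]
```

Because the threshold is relative to the detector's own recent losses, the same m value produces different absolute thresholds for detectors with different convergence behavior, which is the adaptability property described above.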
3.3. Loss-Based Dynamic Curriculum Learning Scheduling
4. Experiments
4.1. Datasets
4.2. Evaluation Metrics
4.3. Implementation Details
4.4. Results
4.5. Ablation Study
4.6. Comparative Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgements
Conflicts of Interest
Abbreviations
| Abbreviation | Full Term |
|---|---|
| WMO | World Meteorological Organization |
| CNN | Convolutional Neural Network |
| CL | Curriculum Learning |
| SPL | Self-Paced Learning |
| SPCL | Self-Paced Curriculum Learning |
| UBL | Unified Batch Loss |
| ULFG-FD | Ultra-Light-Fast-Generic-Face-Detector-1MB |
References
- Li, S.; Wang, M.; Shi, M.; Wang, J.; Cao, R. Leveraging Deep Spatiotemporal Sequence Prediction Network with Self-Attention for Ground-Based Cloud Dynamics Forecasting. Remote Sens. 2025, 17, 18. [Google Scholar] [CrossRef]
- Lu, Z.; Zhou, Z.; Li, X.; Zhang, J. STANet: A Novel Predictive Neural Network for Ground-Based Remote Sensing Cloud Image Sequence Extrapolation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4701811. [Google Scholar] [CrossRef]
- Wei, L.; Zhu, T.; Guo, Y.; Ni, C.; Zheng, Q. Cloudprednet: An ultra-short-term movement prediction model for ground-based cloud image. IEEE Access 2023, 11, 97177–97188. [Google Scholar] [CrossRef]
- Deng, F.; Liu, T.; Wang, J.; Gao, B.; Wei, B.; Li, Z. Research on Photovoltaic Power Prediction Based on Multimodal Fusion of Ground Cloud Map and Meteorological Factors. Proc. CSEE 2025. Available online: https://link.cnki.net/urlid/11.2107.TM.20250220.1908.019 (accessed on 21 February 2025).
- World Meteorological Organization. International Cloud Atlas (WMO-No.407); WMO: Geneva, Switzerland, 2017. [Google Scholar]
- Rachana, G.; Satyasai, J.N. Cloud Detection in Satellite Images with Classical and Deep Neural Network Approach: A Review. Multimed. Tools Appl. 2022, 81, 31847–31880. [Google Scholar] [CrossRef]
- Neto, S.L.M.; Wangenheim, R.V.; Pereira, R.B.; Comunello, R. The Use of Euclidean Geometric Distance on RGB Color Space for the Classification of Sky and Cloud Patterns. J. Atmos. Ocean. Technol. 2010, 27, 1504–1517. [Google Scholar] [CrossRef]
- Liu, S.; Wang, C.H.; Xiao, B.H.; Zhang, Z.; Shao, Y.X. Salient Local Binary Pattern for Ground-Based Cloud Classification. Acta Meteorol. Sin. 2013, 27, 211–220. [Google Scholar] [CrossRef]
- Cheng, H.Y.; Yu, C.C. Block-Based Cloud Classification with Statistical Features and Distribution of Local Texture Features. Atmos. Meas. Tech. 2015, 8, 1173–1182. [Google Scholar] [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]
- Wang, C.Y.; Liao, H.Y.M. YOLOv1 to YOLOv10: The Fastest and Most Accurate Real-Time Object Detection Systems. arXiv 2024, arXiv:2405.14458. [Google Scholar] [CrossRef]
- Wang, A.; Chen, H.; Liu, L.H.; Chen, K.; Lin, Z.J.; Han, J.G.; Ding, G.G. YOLOv10: Real-Time End-to-End Object Detection. In Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 10–15 December 2024; pp. 107984–108011. [CrossRef]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Computer Vision–ECCV 2016; Springer: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar] [CrossRef]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Computer Vision–ECCV 2020; Springer: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar] [CrossRef]
- Lv, W.; Zhao, Y.; Xu, S.; Wei, J.; Wang, G.; Cui, C.; Du, Y.; Dang, Q.; Liu, Y. DETRs Beat YOLOs on Real-Time Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 16965–16974. [Google Scholar] [CrossRef]
- Wang, S.; Chen, Y. Ground Nephogram Object Detection Algorithm Based on Improved Loss Function. Comput. Eng. Appl. 2022, 58, 169–175. [Google Scholar] [CrossRef]
- Hu, J.; Wei, Y.; Chen, W.; Zhi, X.; Zhang, W. CM-YOLO: Typical Object Detection Method in Remote Sensing Cloud and Mist Scene Images. Remote Sens. 2025, 17, 125. [Google Scholar] [CrossRef]
- Wang, M.; Zhuang, Z.H.; Wang, K.; Zhang, Z. Intelligent Classification of Ground-Based Visible Cloud Images Using a Transfer Convolutional Neural Network and Fine-Tuning. Opt. Express 2021, 29, 150455. [Google Scholar] [CrossRef]
- Zhou, Z.; Zhang, F.; Xiao, H.; Wang, F.; Hong, X.; Wu, K.; Zhang, J. A Novel Ground-Based Cloud Image Segmentation Method by Using Deep Transfer Learning. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4004705. [Google Scholar] [CrossRef]
- Bengio, Y.; Louradour, J.; Collobert, R.; Weston, J. Curriculum Learning. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 41–48. [Google Scholar] [CrossRef]
- Soviany, P.; Ionescu, R.T.; Rota, P.; Sebe, N. Curriculum Learning: A Survey. Int. J. Comput. Vis. 2022, 130, 1526–1565. [Google Scholar] [CrossRef]
- Wang, X.; Chen, Y.; Zhu, W. A Survey on Curriculum Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 1–20. [Google Scholar] [CrossRef]
- Xiang, H.Y.; Han, L.L.; Shi, C.J.; Zhang, K.; Li, X.K.; Yang, S.F. Research Progress of Ground-Based Cloud Images Classification in Machine Learning. Laser Infrared 2023, 53, 1795–1809. [Google Scholar] [CrossRef]
- Zhang, X.; Jia, K.B.; Liu, J.; Zhang, L. Ground Cloud Image Recognition and Segmentation Technology Based on Multi-Task Learning. Meteorol. Mon. 2023, 49, 454–466. [Google Scholar] [CrossRef]
- Wang, M.; Zhou, S.D.; Yang, Z.; Zhang, Z.; Liu, Z.H. CloudA: A Ground-Based Cloud Classification Method with a Convolutional Neural Network. J. Atmos. Ocean. Technol. 2020, 37, 1661–1668. [Google Scholar] [CrossRef]
- Liu, S.; Duan, L.L.; Zhang, Z.; Cao, X.Z.; Durrani, T.S. Ground-Based Remote Sensing Cloud Classification via Context Graph Attention Network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5602711. [Google Scholar] [CrossRef]
- Li, Z.F.; Zhou, H.; Zhang, Y.J.; Tao, H.J.; Yu, H.C. An Improved YOLOv8 Network for Multi-Object Detection with Large-Scale Differences in Remote Sensing Images. Int. J. Pattern Recognit. Artif. Intell. 2024, 38, 2455017. [Google Scholar] [CrossRef]
- Yin, Z.; Yang, B.; Chen, J.; Zhu, C.; Chen, H.; Tao, J. Lightweight Small Object Detection Algorithm Based on STD-DETR. Laser Optoelectron. Prog. 2025, 62, 0815002. [Google Scholar] [CrossRef]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2999–3007. [Google Scholar] [CrossRef]
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar] [CrossRef]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection over Union: A Metric and a Loss for Bounding Box Regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 658–666. [Google Scholar] [CrossRef]
- Allo, N.T.; Indrabayu; Zainuddin, Z. A Novel Approach of Hybrid Bounding Box Regression Mechanism to Improve Convergence Rate and Accuracy. Int. J. Intell. Eng. Syst. 2024, 17, 57–68. [Google Scholar] [CrossRef]
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 9626–9635. [Google Scholar] [CrossRef]
- Ionescu, R.T.; Alexe, B.; Leordeanu, M.; Popescu, M.; Papadopoulos, D.P.; Ferrari, V. How Hard Can It Be? Estimating the Difficulty of Visual Search in an Image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2157–2166. [Google Scholar] [CrossRef]
- Shi, M.; Ferrari, V. Weakly Supervised Object Localization Using Size Estimates. In Computer Vision–ECCV 2016; Springer: Cham, Switzerland, 2016; pp. 105–121. [Google Scholar] [CrossRef]
- Kumar, M.P.; Packer, B.; Koller, D. Self-Paced Learning for Latent Variable Models. In Proceedings of the 23rd International Conference on Neural Information Processing Systems (NIPS 2010), Vancouver, Canada, 6-9 December 2010; Volume 23, pp. 1189–1197. [Google Scholar]
- Jiang, L.; Meng, D.; Zhao, Q.; Shan, S.; Hauptmann, A.G. Self-Paced Curriculum Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 2694–2700. [Google Scholar] [CrossRef]
- Soviany, P.; Ionescu, R.T.; Rota, P.; Sebe, N. Curriculum Self-Paced Learning for Cross-Domain Object Detection. Comput. Vis. Image Underst. 2021, 204, 103166. [Google Scholar] [CrossRef]
- Croitoru, F.A.; Ristea, N.C.; Ionescu, R.T.; Sebe, N. Learning Rate Curriculum. Int. J. Comput. Vis. 2025, 133, 1–23. [Google Scholar] [CrossRef]
| Index | Class Name | WMO Name | Base Height | Proportion |
|---|---|---|---|---|
| 0 | High | Cirrus (Ci), Cirrocumulus (Cc), Cirrostratus (Cs) | >20k ft | 37.07% |
| 1 | Low | Cumulus (Cu), Stratocumulus (Sc), Stratus (St) | <6.5k ft | 46.14% |
| 2 | AcTra | Altocumulus translucidus (AcTra) | 6.5k–20k ft | 5.94% |
| 3 | Len | Altocumulus lenticularis (AcLen) | 6.5k–20k ft | 5.38% |
| 4 | Ma | Mammatus clouds (part of Cumulonimbus (Cb)) | <6.5k ft | 5.49% |
| Detector Model | Training Phase (Epochs) | m | w |
|---|---|---|---|
| YOLOv10s | 1–9 | ∞ | 10 |
| | 10–49 | 0.5 | |
| | 50–99 | 0.7 | |
| | 100–149 | 0.9 | |
| | 150– | ∞ | |
| RT-DETR-R50 | 1–9 | ∞ | 20 |
| | 10–29 | 0.7 | |
| | 30–49 | 0.8 | |
| | 50–99 | 0.9 | |
| | 100– | ∞ | |
| ULFG-FD | 1–9 | ∞ | 20 |
| | 10–29 | 0.7 | |
| | 30–49 | 0.8 | |
| | 50–99 | 0.9 | |
| | 100– | ∞ | |
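The staged threshold schedule in the table above can be expressed as a simple epoch-to-m lookup. The helper below is a hypothetical sketch, assuming half-open epoch ranges and using the YOLOv10s values from the table; the function and constant names are illustrative, not part of the authors' code.

```python
import math

# Epoch-indexed threshold schedule for YOLOv10s, taken from the table above.
# Each entry is (start_epoch, end_epoch, m); ranges are half-open [start, end),
# and math.inf for m means "accept all samples" (warm-up / final phase).
YOLOV10S_SCHEDULE = [
    (1, 10, math.inf),
    (10, 50, 0.5),
    (50, 100, 0.7),
    (100, 150, 0.9),
    (150, math.inf, math.inf),
]


def quantile_level(epoch: int, schedule=YOLOV10S_SCHEDULE) -> float:
    """Return the quantile level m active at the given training epoch."""
    for start, end, m in schedule:
        if start <= epoch < end:
            return m
    raise ValueError(f"epoch {epoch} not covered by schedule")
```

The other detectors would use the same lookup with their own phase boundaries and m values (and window size w) from the table.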
| Model Category | Model | Training Regime | mAP50 | Precision | Recall |
|---|---|---|---|---|---|
| CNN Based | YOLOv10s | Conventional | 0.800 | 0.943 | 0.820 |
| | | CurriCloud (Ours) | 0.821 | 0.955 | 0.787 |
| | ULFG-FD | Conventional | 0.442 | 0.878 | 0.557 |
| | | CurriCloud (Ours) | 0.556 | 0.835 | 0.593 |
| Transformer Based | RT-DETR-R50 | Conventional | 0.863 | 0.977 | 0.780 |
| | | CurriCloud (Ours) | 0.875 | 0.975 | 0.813 |
| Model | Training Regime | w | m₁ | m₂ | m₃ | mAP50 | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| YOLOv10s | Conventional | — | — | — | — | 0.800 | 0.943 | 0.820 |
| | loss10579 | 10 | 0.5 | 0.7 | 0.9 | 0.821 | 0.955 | 0.787 |
| | loss10789 | 10 | 0.7 | 0.8 | 0.9 | 0.820 | 0.963 | 0.724 |
| | loss20579 | 20 | 0.5 | 0.7 | 0.9 | 0.831 | 0.939 | 0.756 |
| | loss20789 | 20 | 0.7 | 0.8 | 0.9 | 0.810 | 0.942 | 0.799 |
| | loss40579 | 40 | 0.5 | 0.7 | 0.9 | 0.813 | 0.964 | 0.745 |
| | loss40789 | 40 | 0.7 | 0.8 | 0.9 | 0.772 | 0.910 | 0.801 |
| ULFG-FD | Conventional | — | — | — | — | 0.442 | 0.878 | 0.557 |
| | loss10579 | 10 | 0.5 | 0.7 | 0.9 | 0.554 | 0.845 | 0.550 |
| | loss10789 | 10 | 0.7 | 0.8 | 0.9 | 0.500 | 0.836 | 0.621 |
| | loss20579 | 20 | 0.5 | 0.7 | 0.9 | 0.519 | 0.860 | 0.621 |
| | loss20789 | 20 | 0.7 | 0.8 | 0.9 | 0.556 | 0.835 | 0.593 |
| | loss40579 | 40 | 0.5 | 0.7 | 0.9 | 0.488 | 0.811 | 0.593 |
| | loss40789 | 40 | 0.7 | 0.8 | 0.9 | 0.496 | 0.823 | 0.609 |
| RT-DETR-R50 | Conventional | — | — | — | — | 0.863 | 0.977 | 0.780 |
| | loss10579 | 10 | 0.5 | 0.7 | 0.9 | 0.865 | 0.966 | 0.810 |
| | loss10789 | 10 | 0.7 | 0.8 | 0.9 | 0.851 | 0.971 | 0.789 |
| | loss20579 | 20 | 0.5 | 0.7 | 0.9 | 0.857 | 0.977 | 0.794 |
| | loss20789 | 20 | 0.7 | 0.8 | 0.9 | 0.875 | 0.975 | 0.813 |
| | loss40579 | 40 | 0.5 | 0.7 | 0.9 | 0.839 | 0.968 | 0.773 |
| | loss40789 | 40 | 0.7 | 0.8 | 0.9 | 0.824 | 0.984 | 0.705 |
| Model | Training Regime | mAP50 | Precision | Recall |
|---|---|---|---|---|
| YOLOv10s | Conventional (Baseline) | 0.800 | 0.943 | 0.820 |
| | loss10579 (CurriCloud) | 0.821 | 0.955 | 0.787 |
| | pre12-579 (Static CL) | 0.802 | 0.953 | 0.766 |
| RT-DETR-R50 | Conventional (Baseline) | 0.863 | 0.977 | 0.780 |
| | loss20789 (CurriCloud) | 0.875 | 0.975 | 0.813 |
| | pre12-579 (Static CL) | 0.857 | 0.985 | 0.770 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).