Submitted: 13 April 2026
Posted: 14 April 2026
Abstract
Keywords:
1. Introduction
2. Related Work
2.1. Autonomous Navigation for Nano-UAVs
2.2. Efficient CNN Architectures
2.3. Knowledge Distillation and Model Compression for Drones
2.4. Training Strategies for Small Models
3. Methodology
3.1. Baseline Architecture: PULP-Dronet v3
3.2. Proposed Architecture: Stem-Optimized D+P CNN
3.3. Training Configuration
3.4. Loss Function
4. Results
4.1. Main Results
4.2. Training Dynamics
4.3. Efficiency Analysis
4.4. Ablation Study
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| CNN | Convolutional Neural Network |
| D+P | Depthwise + Pointwise (Separable Convolution) |
| DW | Depthwise |
| PW | Pointwise |
| MAC | Multiply-Accumulate Operation |
| UAV | Unmanned Aerial Vehicle |
| RMSE | Root Mean Squared Error |
| MSE | Mean Squared Error |
| BCE | Binary Cross-Entropy |
| BN | Batch Normalization |
| FC | Fully Connected |
| IoT | Internet of Things |
References
1. Labib, N.S.; Brust, M.R.; Danoy, G.; Bouvry, P. The Rise of Drones in Internet of Things: A Survey on the Evolution, Prospects and Challenges of Unmanned Aerial Vehicles. IEEE Access 2021, 9, 115466–115487.
2. Wei, Z.; Zhu, M.; Zhang, N.; Wang, L.; Zou, Y.; Meng, Z.; Wu, H.; Feng, Z. UAV-Assisted Data Collection for Internet of Things: A Survey. IEEE Internet Things J. 2022, 9, 15460–15483.
3. Cereda, E.; Crupi, L.; Risso, M.; Burrello, A.; Benini, L.; Giusti, A.; Jahier Pagliari, D.; Palossi, D. Deep Neural Network Architecture Search for Accurate Visual Pose Estimation aboard Nano-UAVs. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 6065–6071.
4. Lamberti, L.; Bompani, L.; Kartsch, V.J.; Rusci, M.; Palossi, D.; Benini, L. Bio-inspired Autonomous Exploration Policies with CNN-based Object Detection on Nano-drones. In Proceedings of the 2023 Design, Automation & Test in Europe Conference (DATE), Antwerp, Belgium, 17–19 April 2023; pp. 1–6.
5. Shakhatreh, H.; Sawalmeh, A.H.; Al-Fuqaha, A.; Dou, Z.; Almaita, E.; Khalil, I.; Othman, N.S.; Khreishah, A.; Guizani, M. Unmanned Aerial Vehicles (UAVs): A Survey on Civil Applications and Key Research Challenges. IEEE Access 2019, 7, 48572–48634.
6. Hossein Motlagh, N.; Taleb, T.; Arouk, O. Low-Altitude Unmanned Aerial Vehicles-Based Internet of Things Services: Comprehensive Survey and Future Perspectives. IEEE Internet Things J. 2016, 3, 899–922.
7. Niculescu, V.; Lamberti, L.; Conti, F.; Benini, L.; Palossi, D. Improving Autonomous Nano-Drones Performance via Automated End-to-End Optimization and Deployment of DNNs. IEEE J. Emerg. Sel. Top. Circuits Syst. 2021, 11, 1–1.
8. Lamberti, L.; Bellone, L.; Macan, L.; Natalizio, E.; Conti, F.; Palossi, D.; Benini, L. Distilling Tiny and Ultra-fast Deep Neural Networks for Autonomous Navigation on Nano-UAVs. IEEE Internet Things J. 2024, 71, 1–13.
9. Loquercio, A.; Maqueda, A.I.; Del-Blanco, C.R.; Scaramuzza, D. DroNet: Learning to Fly by Driving. IEEE Robot. Autom. Lett. 2018, 3, 1088–1095.
10. Palossi, D.; Loquercio, A.; Conti, F.; Flamand, E.; Scaramuzza, D.; Benini, L. A 64-mW DNN-Based Visual Navigation Engine for Autonomous Nano-Drones. IEEE Internet Things J. 2019, 6, 8357–8371.
11. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861.
12. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.
13. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the 36th International Conference on Machine Learning (ICML), Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114.
14. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856.
15. Radosavovic, I.; Kosaraju, R.P.; Girshick, R.; He, K.; Dollár, P. Designing Network Design Spaces. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10428–10436.
16. Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 11976–11986.
17. Hinton, G.; Vinyals, O.; Dean, J. Distilling the Knowledge in a Neural Network. arXiv 2015, arXiv:1503.02531.
18. Jacob, B.; Kligys, S.; Chen, B.; Zhu, M.; Tang, M.; Howard, A.; Adam, H.; Kalenichenko, D. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 2704–2713.
19. He, Y.; Liu, P.; Wang, Z.; Hu, Z.; Yang, Y. Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 4340–4349.
20. Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jegou, H. Training Data-Efficient Image Transformers & Distillation through Attention. In Proceedings of the 38th International Conference on Machine Learning (ICML), Virtual, 18–24 July 2021; pp. 10347–10357.
21. Loshchilov, I.; Hutter, F. SGDR: Stochastic Gradient Descent with Warm Restarts. In Proceedings of the 5th International Conference on Learning Representations (ICLR), Toulon, France, 24–26 April 2017.
22. Shorten, C.; Khoshgoftaar, T.M. A Survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60.
23. Zhang, N.; Nex, F.; Vosselman, G.; Kerle, N. End-to-End Nano-Drone Obstacle Avoidance for Indoor Exploration. Drones 2024, 8, 33.


| Hyperparameter | Baseline [8] | Ours |
|---|---|---|
| Optimizer | Adam | Adam |
| Learning rate | (fixed) | (cosine annealing) |
| LR minimum | N/A | |
| Weight decay | | |
| Batch size | 32 | 32 |
| Epochs | 100 | 100 |
| Online augmentation | None | ColorJitter (, ) |
| Dropout rate | 0.5 (before flatten) | 0.3 (after flatten) |
| Loss function | MSE + BCE | MSE + BCE |
| Weight initialization | Xavier uniform | Xavier uniform |
| Yaw normalization | | |

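
The learning-rate schedule and loss rows of the table above can be illustrated with a minimal sketch. It assumes an SGDR-style cosine rule without restarts and an unweighted sum of the two loss terms. The learning rates `LR_MAX` and `LR_MIN` are placeholders, since the actual values are not recoverable from the table, and all function names are hypothetical.

```python
import math

EPOCHS = 100     # from the table
LR_MAX = 1e-3    # placeholder: the table's value is not recoverable
LR_MIN = 1e-5    # placeholder: the table's value is not recoverable


def cosine_annealed_lr(epoch, lr_max=LR_MAX, lr_min=LR_MIN, total_epochs=EPOCHS):
    """Cosine annealing from lr_max (epoch 0) down to lr_min (final epoch)."""
    progress = epoch / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def mse(predictions, targets):
    """Mean squared error, used here for the steering (yaw) regression output."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def bce(probabilities, labels, eps=1e-7):
    """Binary cross-entropy, used here for the collision probability output."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(labels)


def combined_loss(steer_pred, steer_true, coll_prob, coll_true):
    """Assumed unweighted sum of the two terms ("MSE + BCE" in the table)."""
    return mse(steer_pred, steer_true) + bce(coll_prob, coll_true)


# Learning rate at the start, midpoint, and end of training.
schedule_samples = [cosine_annealed_lr(e) for e in (0, EPOCHS // 2, EPOCHS)]
```

Under these placeholder values the rate starts at `LR_MAX`, passes through the mean of the two bounds at epoch 50, and ends at `LR_MIN`.
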
| Model | Channel divisor | Acc (%) | RMSE | MACs | Params | Size (B) |
|---|---|---|---|---|---|---|
| PULP-Dronet v3 [8] | /1 | 84 | 0.350 | 12M | 51k | 204k |
| PULP-Dronet v3 [8] | /2 | 84 | 0.367 | 5.2M | 17k | 69k |
| PULP-Dronet v3 [8] | /4 | 81 | 0.373 | 2.4M | 6.6k | 26k |
| Tiny-PULP-Dronet v3 [8] | /8 | 78 | 0.379 | 1.1M | 2.9k | 12k |
| Ours (stem-opt D+P) | /4 | 83.97 | 0.372 | 540K | 6.4k | 25k |

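
As a reading aid, the trade-off in the table above can be expressed relative to the proposed model. The sketch below only reuses the MAC counts and accuracies listed in the table; the dictionary layout and the names `BASELINES`, `OURS`, and `compare` are illustrative.

```python
# (MACs, accuracy in %) copied from the results table.
BASELINES = {
    "PULP-Dronet v3 /1": (12_000_000, 84.0),
    "PULP-Dronet v3 /2": (5_200_000, 84.0),
    "PULP-Dronet v3 /4": (2_400_000, 81.0),
    "Tiny-PULP-Dronet v3 /8": (1_100_000, 78.0),
}
OURS = (540_000, 83.97)


def compare(baselines, ours):
    """Return {model: (MAC ratio vs. ours, accuracy gain of ours in points)}."""
    ours_macs, ours_acc = ours
    return {
        name: (round(macs / ours_macs, 1), round(ours_acc - acc, 2))
        for name, (macs, acc) in baselines.items()
    }


comparison = compare(BASELINES, OURS)
for name, (mac_ratio, acc_delta) in comparison.items():
    print(f"{name}: {mac_ratio}x the MACs of ours, accuracy delta {acc_delta:+.2f} points")
```

Against the /4 baseline, for instance, this gives a MAC ratio of about 4.4 and an accuracy difference of 2.97 points in favour of the proposed model.
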
| Component | Baseline MACs | Ours MACs | Reduction |
|---|---|---|---|
| Stem convolution | ∼2.0M | ∼170K | ∼11.8× |
| D+P Block 1 | ∼105K | ∼105K | 1× |
| D+P Block 2 | ∼68K | ∼68K | 1× |
| D+P Block 3 | ∼130K | ∼130K | 1× |
| FC layer | ∼3K | ∼3K | 1× |
| Total | ∼2.4M | ∼540K | ∼4.4× |
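
The stem row of the table above follows from the standard MAC formulas for a dense convolution and a depthwise + pointwise pair. The sketch below is a worked example under assumed shapes that are not stated in the table: a 100×100 stem output, one input channel, 8 output channels, a 5×5 dense kernel for the baseline, and a 3×3 depthwise kernel followed by a 1×1 pointwise convolution for the proposed stem. These values were chosen only because they reproduce the approximate figures in the table; the function names are hypothetical.

```python
def conv_macs(h_out, w_out, kernel, c_in, c_out):
    """MACs of a dense kernel x kernel convolution (bias ignored)."""
    return h_out * w_out * kernel * kernel * c_in * c_out


def depthwise_pointwise_macs(h_out, w_out, kernel, c_in, c_out):
    """MACs of a depthwise kernel x kernel conv followed by a 1x1 pointwise conv."""
    depthwise = h_out * w_out * kernel * kernel * c_in
    pointwise = h_out * w_out * c_in * c_out
    return depthwise + pointwise


# Assumed shapes (not given in the table; see the note above).
H_OUT = W_OUT = 100
C_IN, C_OUT = 1, 8

baseline_stem = conv_macs(H_OUT, W_OUT, kernel=5, c_in=C_IN, c_out=C_OUT)
proposed_stem = depthwise_pointwise_macs(H_OUT, W_OUT, kernel=3, c_in=C_IN, c_out=C_OUT)
stem_reduction = baseline_stem / proposed_stem

print(baseline_stem, proposed_stem, round(stem_reduction, 1))
```

With these assumptions the dense stem costs 2.0M MACs and the separable stem 170K, a ratio of roughly 11.8, in line with the stem row. Because the remaining blocks are unchanged, the overall saving is smaller than the stem-only ratio.
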

| Configuration | Stem | LR Schedule | Augmentation | Test Acc (%) |
|---|---|---|---|---|
| Baseline [8] | conv | Fixed | Offline | 81.0 |
| Proposed (Full) | DW+PW | Cosine | ColorJitter | 83.97 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).