Submitted: 10 August 2025
Posted: 11 August 2025
Abstract
Keywords:
1. Introduction
2. Literature Review and Research Gap Analysis
3. Materials and Methods
3.1. Integrated System Architecture
3.2. Mathematical Formulation and Variable Notation
3.3. Queueing Theory Integration
3.4. Integrated Algorithm Implementation
Algorithm 1. ML-CALMO Integrated Optimization Algorithm
4. Results
4.1. Experimental Setup
4.2. Performance Comparison Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ML-CALMO | Machine Learning-Enhanced Cloud-Assisted Last-Mile Optimization |
| VRP | Vehicle Routing Problem |
| LSTM | Long Short-Term Memory |
| CNN | Convolutional Neural Network |
| DQN | Deep Q-Network |

| Variable | Description | Type/Range |
|---|---|---|
| Decision Variables | | |
| | Assignment of order j to vehicle i at time t | |
| | Optimal action selected by DQN | Policy output |
| Actual System Parameters | | |
| | Actual arrival rate of priority orders for vehicle i | |
| | Actual arrival rate of regular orders for vehicle i | |
| | Actual travel time from vehicle i to order j | |
| | Actual service time for order j | |
| | Actual loading time for order j | |
| | Deadline for completing priority order j | |
| | Penalty weight for priority order j | |
| | Demand volume of order j | |
| | Available capacity of vehicle i at time t | |
| Predicted Values (ML Outputs) | | |
| | LSTM-predicted arrival rate of priority orders | |
| | LSTM-predicted arrival rate of regular orders | |
| | CNN-predicted travel time from vehicle i to order j | |
| | Predicted waiting time for order j | |
| | Predicted utilization factor for vehicle i | |
| | Predicted total cost of assigning j to i | |
| | Preference matrix for order j at time t | vector |
| Performance Metrics | | |
| | Service rate of vehicle i for priority orders | |
| | Service rate of vehicle i for regular orders | |
| | Actual utilization factor for vehicle i | |
| | Success probability for assigning order j to vehicle i | |
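
As a minimal sketch of how the utilization factor listed in the table above could be evaluated, the Python snippet below assumes the textbook two-class relation for a single vehicle: utilization equals the priority arrival rate divided by the priority service rate, plus the same ratio for regular orders. That relation, the function names, and the numeric rates are illustrative assumptions; the source's own formula is not reproduced in this extract.

```python
def utilization(lam_priority, lam_regular, mu_priority, mu_regular):
    """Utilization of one vehicle serving priority and regular orders.

    Assumed (not taken from the source): rho = lam_p / mu_p + lam_r / mu_r,
    with all rates expressed in the same unit (for example orders per hour).
    """
    if mu_priority <= 0 or mu_regular <= 0:
        raise ValueError("service rates must be positive")
    if lam_priority < 0 or lam_regular < 0:
        raise ValueError("arrival rates must be non-negative")
    return lam_priority / mu_priority + lam_regular / mu_regular


def is_stable(rho):
    """A single-server queue is stable only when utilization is below 1."""
    return rho < 1.0


# Hypothetical rates, chosen only for illustration.
rho = utilization(lam_priority=1.5, lam_regular=2.0,
                  mu_priority=5.0, mu_regular=4.0)
print(f"utilization = {rho:.2f}, stable = {is_stable(rho)}")
```

The same function can be called with actual rates or with LSTM-predicted rates, which gives the actual and predicted utilization values that the table distinguishes.
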

| Method | Service Efficiency (%) | Service Success Rate (%) | Avg. Delivery Time (min) | Priority Success (%) | Total Cost |
|---|---|---|---|---|---|
| OR-Tools VRP | | | | | |
| Attention-VRP | | | | | |
| RL-VRP | | | | | |
| Dynamic-VRP | | | | | |
| Queueing-VRP | | | | | |
| ML-CALMO | | | | | |

| ML Component | MAPE (%) | R-squared | Training Time (min) | Prediction Accuracy (%) |
|---|---|---|---|---|
| LSTM Demand | | | | |
| CNN Traffic | | | | |
| DQN Route Optimization | − | − | | |
| Success Probability | | | | |
| Integrated ML-CALMO | | | | |

| Performance Metric | Analytical Prediction | Simulation Result | Deviation (%) |
|---|---|---|---|
| System Utilization (predicted vs. actual) | 78.4 | 79.1 | 0.9 |
| Average Response Time (min) | 38.7 | 39.2 | 1.3 |
| Priority Queue Length (orders) | 2.14 | 2.23 | 4.2 |
| System Throughput (ord/hr) | 4.73 | 4.81 | 1.7 |
| Regular Queue Length (orders) | 3.92 | 4.07 | 3.8 |
| Preemption Probability | 0.089 | 0.093 | 4.5 |
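
The Deviation column appears consistent with the absolute difference between the simulation result and the analytical prediction, taken relative to the analytical prediction. The sketch below checks that reading against the six rows of the table; the formula is inferred from the reported numbers rather than stated by the source, and the variable and function names are hypothetical.

```python
# (analytical prediction, simulation result, reported deviation in %)
ROWS = {
    "System Utilization": (78.4, 79.1, 0.9),
    "Average Response Time (min)": (38.7, 39.2, 1.3),
    "Priority Queue Length (orders)": (2.14, 2.23, 4.2),
    "System Throughput (ord/hr)": (4.73, 4.81, 1.7),
    "Regular Queue Length (orders)": (3.92, 4.07, 3.8),
    "Preemption Probability": (0.089, 0.093, 4.5),
}


def deviation_pct(analytical, simulated):
    """Relative deviation in percent, using the analytical value as reference."""
    return abs(simulated - analytical) / abs(analytical) * 100.0


# Recompute each deviation and compare it with the reported value.
for name, (analytical, simulated, reported) in ROWS.items():
    computed = round(deviation_pct(analytical, simulated), 1)
    status = "matches" if computed == reported else "differs from"
    print(f"{name}: computed {computed}% {status} reported {reported}%")
```

Under this assumed definition, all six recomputed values agree with the reported column to one decimal place.
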
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).