Submitted: 05 August 2025
Posted: 06 August 2025
Abstract
Keywords:
1. Introduction
2. Benchmark Dataset
3. Methods
3.1. Calibration Model for Traffic Surveillance Cameras
3.2. Single-Image Calibration with DeepCalib
3.2.1. Backbone
3.2.2. Multi-Task Detection Head
3.2.3. Multi-Task Loss Function
3.3. Training Details
4. Experiments
4.1. Ablation Studies
4.2. Transfer Learning
4.3. Camera Calibration
4.4. Time Consumption
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| PnP | Perspective-n-point |
| PTZ | Pan-tilt-zoom |
| CNNs | Convolutional neural networks |
| TAM | Triplet attention module |
| MVCCD | Multi-view camera calibration dataset |
| FOV | Field of view |
| CH | Channel height |
| CW | Channel width |
| VP | Vanishing point |
| EP | Extrinsic parameter |
References
- Sochor, J.; Juránek, R.; Špaňhel, J.; Maršík, L.; Široký, A.; Herout, A.; Zemčík, P. Comprehensive data set for automatic single camera visual speed measurement. IEEE Transactions on Intelligent Transportation Systems 2018, 20, 1633–1643.
- Revaud, J.; Humenberger, M. Robust automatic monocular vehicle speed estimation for traffic surveillance. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021, pp.
- Zhang, W.; Song, H.; Liu, L.; Li, C.; Mu, B.; Gao, Q. Vehicle localisation and deep model for automatic calibration of monocular camera in expressway scenes. IET Intelligent Transport Systems 2022, 16, 459–473.
- Qin, L.; Lin, C.; Huang, S.; Yang, S.; Zhao, Y. Camera calibration for the surround-view system: a benchmark and dataset. The Visual Computer 2024, 40, 7457–7470.
- Hu, Z.; Lam, W.H.; Wong, S.; Chow, A.H.; Ma, W. Turning traffic surveillance cameras into intelligent sensors for traffic density estimation. Complex & Intelligent Systems 2023, 9, 7171–7195.
- Wang, Z.; Huang, X.; Hu, Z. Attention-Based LiDAR–Camera Fusion for 3D Object Detection in Autonomous Driving. World Electric Vehicle Journal 2025, 16, 306.
- Hu, X.; Chen, T.; Zhang, W.; Ji, G.; Jia, H. MonoAMP: Adaptive Multi-Order Perceptual Aggregation for Monocular 3D Vehicle Detection. Sensors 2025, 25, 787.
- Lepetit, V.; Moreno-Noguer, F.; Fua, P. EPnP: An accurate O(n) solution to the PnP problem. International Journal of Computer Vision 2009, 81, 155–166.
- Li, S.; Xu, C.; Xie, M. A robust O(n) solution to the perspective-n-point problem. IEEE Transactions on Pattern Analysis and Machine Intelligence 2012, 34, 1444–1450.
- Zheng, Y.; Sugimoto, S.; Sato, I.; Okutomi, M. A general and simple method for camera pose and focal length determination. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014, pp.
- Penate-Sanchez, A.; Andrade-Cetto, J.; Moreno-Noguer, F. Exhaustive linearization for robust camera pose and focal length estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013, 35, 2387–2400.
- Li, S.; Yoon, H.S. Enhancing camera calibration for traffic surveillance with an integrated approach of genetic algorithm and particle swarm optimization. Sensors 2024, 24, 1456.
- Guo, S.; Yu, X.; Sha, Y.; Ju, Y.; Zhu, M.; Wang, J. Online camera auto-calibration appliable to road surveillance. Machine Vision and Applications 2024, 35, 91.
- Bhardwaj, R.; Tummala, G.K.; Ramalingam, G.; Ramjee, R.; Sinha, P. Autocalib: Automatic traffic camera calibration at scale. ACM Transactions on Sensor Networks (TOSN) 2018, 14, 1–27.
- Bartl, V.; Herout, A. Optinopt: Dual optimization for automatic camera calibration by multi-target observations. In Proceedings of the 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). IEEE, Taipei, Taiwan, 18–21 September 2019; pp. 1–8.
- Bartl, V.; Juranek, R.; Špaňhel, J.; Herout, A. Planecalib: Automatic camera calibration by multiple observations of rigid objects on plane. In Proceedings of the 2020 Digital Image Computing: Techniques and Applications (DICTA). IEEE, Melbourne, Australia, 29 November–02 December 2020; pp. 1–8.
- Bartl, V.; Špaňhel, J.; Dobeš, P.; Juranek, R.; Herout, A. Automatic camera calibration by landmarks on rigid objects. Machine Vision and Applications 2021, 32, 2.
- Alvarez, S.; Llorca, D.F.; Sotelo, M. Camera auto-calibration using zooming and zebra-crossing for traffic monitoring applications. In Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013). IEEE, The Hague, Netherlands, 06–09 October 2013; pp. 608–613.
- Dubská, M.; Herout, A.; Juránek, R.; Sochor, J. Fully automatic roadside camera calibration for traffic surveillance. IEEE Transactions on Intelligent Transportation Systems 2014, 16, 1162–1171.
- Wang, N.; Du, H.; Liu, Y.; Tang, Z.; Hwang, J.N. Self-calibration of traffic surveillance cameras based on moving vehicle appearance and 3-D vehicle modeling. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, Athens, Greece, 07–10 October 2018; pp. 3064–3068.
- Kocur, V.; Ftáčnik, M. Traffic camera calibration via vehicle vanishing point detection. In Proceedings of the International Conference on Artificial Neural Networks. Springer, 2021; pp. 628–639.
- Zhang, W.; Song, H.; Liu, L. Automatic calibration for monocular cameras in highway scenes via vehicle vanishing point detection. Journal of Transportation Engineering, Part A: Systems 2023, 149, 04023050.
- Tong, X.; Ying, X.; Shi, Y.; Wang, R.; Yang, J. Transformer based line segment classifier with image context for real-time vanishing point detection in Manhattan world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022, pp.
- Wildenauer, H.; Hanbury, A. Robust camera self-calibration from monocular images of Manhattan worlds. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, Providence, RI, USA, 16–21 June 2012; pp. 2831–2838.
- Itu, R.; Borza, D.; Danescu, R. Automatic extrinsic camera parameters calibration using Convolutional Neural Networks. In Proceedings of the 2017 13th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP).
- Borji, A. Vanishing point detection with convolutional neural networks. arXiv, 2016; arXiv:1609.00967.
- Chang, C.K.; Zhao, J.; Itti, L. Deepvp: Deep learning for vanishing point detection on 1 million street view images. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, Brisbane, QLD, Australia, 21–25 May 2018; pp. 4496–4503.
- Lee, S.; Kim, J.; Shin Yoon, J.; Shin, S.; Bailo, O.; Kim, N.; Lee, T.H.; Seok Hong, H.; Han, S.H.; So Kweon, I. Vpgnet: Vanishing point guided network for lane and road marking detection and recognition. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017, pp.
- Workman, S.; Greenwell, C.; Zhai, M.; Baltenberger, R.; Jacobs, N. Deepfocal: A method for direct focal length estimation. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP). IEEE, Quebec City, QC, Canada, 27–30 September 2015; pp. 1369–1373.
- Hold-Geoffroy, Y.; Sunkavalli, K.; Eisenmann, J.; Fisher, M.; Gambaretto, E.; Hadap, S.; Lalonde, J.F. A perceptual measure for deep single image camera calibration. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018, pp.
- Workman, S.; Zhai, M.; Jacobs, N. Horizon lines in the wild. arXiv, 2016; arXiv:1604.02129.
- Lee, J.; Sung, M.; Lee, H.; Kim, J. Neural geometric parser for single image camera calibration. In Proceedings of Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020, Proceedings, Part XII. Springer, 2020; pp. 541–557.
- Jin, L.; Zhang, J.; Hold-Geoffroy, Y.; Wang, O.; Blackburn-Matzen, K.; Sticha, M.; Fouhey, D.F. Perspective fields for single image camera calibration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023, pp.
- Dosovitskiy, A.; Ros, G.; Codevilla, F.; Lopez, A.; Koltun, V. CARLA: An Open Urban Driving Simulator. In Proceedings of the 1st Annual Conference on Robot Learning, Mountain View, United States, 2017, pp.
- Misra, D.; Nalamada, T.; Arasanipalai, A.U.; Hou, Q. Rotate to attend: Convolutional triplet attention module. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 03–08 January 2021, pp.
- Kanhere, N.K.; Birchfield, S.T. A taxonomy and analysis of camera calibration methods for traffic monitoring applications. IEEE Transactions on Intelligent Transportation Systems 2010, 11, 441–452.
- Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A convnet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022, pp.
- Mousavian, A.; Anguelov, D.; Flynn, J.; Kosecka, J. 3d bounding box estimation using deep learning and geometry. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017, pp.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020, 42, 318–327.
- Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. arXiv, 2017; arXiv:1711.05101.

| Dataset | Sample Size | | | h | Format | Resolution |
| MVCCD_R [3] | 8,765 | [−18.4°, 0°] | [−29.3°, 29°] | [10.4 m, 13.9 m] | RGB | 1920×1080 |
| MVCCD_S | 336,249 | [−28°, 0°] | [−40°, 40°] | [10 m, 14.5 m] | RGB | 1920×1080 |

| Methods | Straight Roads | | | Curved Roads | | |
| | Minimum | Maximum | Mean | Minimum | Maximum | Mean |
| DeepVP [27] | 8.58 | 235.22 | 91.08 | 20.39 | 336.06 | 119.43 |
| DeepCN [3] | 3.37 | 112.54 | 57.14 | 12.66 | 189.45 | 73.16 |
| DeepCalib | 1.34 | 58.73 | 13.11 | 7.07 | 79.21 | 34.29 |

| Methods | 6 m | | | 9 m | | | 15 m | | |
| | Minimum | Maximum | Mean | Minimum | Maximum | Mean | Minimum | Maximum | Mean |
| Manual | 5.87 | 6.22 | 6.05 | 8.53 | 9.09 | 8.91 | 14.71 | 15.15 | 14.96 |
| DeepCN [3] | 2.77 | 10.24 | 6.82 | 3.51 | 15.02 | 10.27 | 7.28 | 25.87 | 17.11 |
| DeepCalib | 3.44 | 9.18 | 6.56 | 4.77 | 14.94 | 9.96 | 9.05 | 23.43 | 16.68 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).