Submitted: 02 June 2023
Posted: 05 June 2023
Abstract
Keywords:
1. Introduction
2. Related Work
3. Proposed Method
3.1. Data Acquisition
3.2. Data Pre-Processing
3.2.1. Frames Extraction
3.2.2. Facial Landmark Detection
3.2.3. ROI Selection
Algorithm 1: ROI Correction
3.2.4. ROI Extraction
3.3. Dataset Creation
3.4. CNN Training - Experiments
3.4.1. Dataset Processing
3.4.2. Model Architecture
3.5. CNN Test and Evaluation Metrics
3.5.1. CNN Test
3.5.2. Evaluation Metrics
3.6. Driver Drowsiness Detection
4. Experimental Results
4.1. CNN Training
4.2. CNN Testing Evaluation
4.3. CNN Visual Result
4.4. CNN Processing Results
4.5. Driver Drowsiness Detection Results
4.6. Results Comparison
5. Conclusion and Future Works
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- PAHO. Road safety. 2022. Available online: https://www.paho.org/en/topics/road-safety (accessed on 9 February 2023).
- Gestión. Some 265 people died each month of 2022 in traffic accidents in Peru (Spanish). 2022. Available online: https://gestion.pe/peru/unas-265-personas-murieron-cada-mes-del-2022-en-accidentes-de-transito-en-peru-noticia/ (accessed on 9 February 2023).
- ONSV. Road accident report and actions to promote road safety (Spanish). 2022. Available online: https://www.onsv.gob.pe/post/informe-de-siniestralidad-vial-y-las-acciones-para-promover-la-seguridad-vial/ (accessed on 9 February 2023).
- Albadawi, Y.; Takruri, M.; Awad, M. A Review of Recent Developments in Driver Drowsiness Detection Systems. Sensors 2022, 22.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition; 2016; pp. 2818–2826.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part IV 14; Springer, 11 October 2016; pp. 630–645.
- Torrey, L.; Shavlik, J. Transfer learning. In Handbook of research on machine learning applications and trends: algorithms, methods, and techniques; IGI Global, 2010; pp. 242–264.
- Lugaresi, C.; Tang, J.; Nash, H.; McClanahan, C.; Uboweja, E.; Hays, M.; Zhang, F.; Chang, C.L.; Yong, M.G.; Lee, J.; others. Mediapipe: A framework for building perception pipelines. arXiv 2019, arXiv:1906.08172.
- Kwon, K.A.; Shipley, R.J.; Edirisinghe, M.; Ezra, D.G.; Rose, G.; Best, S.M.; Cameron, R.E. High-speed camera characterization of voluntary eye blinking kinematics. Journal of the Royal Society Interface 2013, 10, 20130227.
- Park, S.; Pan, F.; Kang, S.; Yoo, C.D. Driver drowsiness detection system based on feature representation learning using various deep networks. In Computer Vision–ACCV 2016 Workshops: ACCV 2016 International Workshops, Taipei, Taiwan, November 20-24, 2016, Revised Selected Papers, Part III; Springer, 20 November 2017; pp. 154–164.
- Chirra, V.R.R.; Uyyala, S.R.; Kolli, V.K.K. Deep CNN: A Machine Learning Approach for Driver Drowsiness Detection Based on Eye State. Rev. d’Intelligence Artif. 2019, 33, 461–466.
- Zhao, Z.; Zhou, N.; Zhang, L.; Yan, H.; Xu, Y.; Zhang, Z. Driver fatigue detection based on convolutional neural networks using EM-CNN. Computational Intelligence and Neuroscience 2020, 2020.
- Phan, A.C.; Nguyen, N.H.Q.; Trieu, T.N.; Phan, T.C. An Efficient Approach for Detecting Driver Drowsiness Based on Deep Learning. Applied Sciences 2021, 11.
- Rajkar, A.; Kulkarni, N.; Raut, A. Driver drowsiness detection using deep learning. In Applied Information Processing Systems: Proceedings of ICCET 2021; Springer, 2022; pp. 73–82.
- Hashemi, M.; Mirrashid, A.; Beheshti Shirazi, A. Driver safety development: Real-time driver drowsiness detection system based on convolutional neural network. SN Computer Science 2020, 1, 1–10.
- Tibrewal, M.; Srivastava, A.; Kayalvizhi, R. A deep learning approach to detect driver drowsiness. Int. J. Eng. Res. Technol. 2021, 10, 183–189.
- Petrellis, N.; Zogas, S.; Christakos, P.; Mousouliotis, P.; Keramidas, G.; Voros, N.; Antonopoulos, C. Software Acceleration of the Deformable Shape Tracking Application: How to eliminate the Eigen Library Overhead. 2021 2nd European Symposium on Software Engineering; 2021; pp. 51–57.
- Grishchenko, I.; Ablavatski, A.; Kartynnik, Y.; Raveendran, K.; Grundmann, M. Attention mesh: High-fidelity face mesh prediction in real-time. arXiv 2020, arXiv:2006.10962.
- Liu, P.; Guo, J.M.; Tseng, S.H.; Wong, K.; Lee, J.D.; Yao, C.C.; Zhu, D. Ocular Recognition for Blinking Eyes. IEEE Transactions on Image Processing 2017, 26, 5070–5081.
- Kumari, P.; KR, S. An optimal feature enriched region of interest (ROI) extraction for periocular biometric system. Multimedia Tools and Applications 2021, 80, 1–19.
- Pandey, N.; Muppalaneni, N. A novel drowsiness detection model using composite features of head, eye, and facial expression. Neural Computing and Applications 2022, 34.
- Ahmed, M.; Laskar, R. Eye center localization using gradient and intensity information under uncontrolled environment. Multimedia Tools and Applications 2022, 81, 1–24.
- Caelen, O. A Bayesian interpretation of the confusion matrix. Annals of Mathematics and Artificial Intelligence 2017, 81, 429–450.
- Selvaraju, R.R.; Das, A.; Vedantam, R.; Cogswell, M.; Parikh, D.; Batra, D. Grad-CAM: Why did you say that? arXiv 2016, arXiv:1611.07450.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision; 2017; pp. 618–626.

| Data set | Class: Drowsy | Class: Not drowsy |
|---|---|---|
| Training set | 2380 | 2380 |
| Validation set | 510 | 510 |
| Test set | 510 | 510 |
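
The split sizes in the table above imply a per-class partition that can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code; the variable names are illustrative assumptions, and only the image counts come from the table.

```python
# Per-class image counts taken from the dataset table above.
split_sizes = {"training": 2380, "validation": 510, "test": 510}

# Images per class across all three splits.
total_per_class = sum(split_sizes.values())

# Fraction of each class assigned to every split.
split_fractions = {name: n / total_per_class for name, n in split_sizes.items()}

# The table lists two balanced classes (drowsy, not drowsy).
total_images = 2 * total_per_class

for name, fraction in split_fractions.items():
    print(f"{name}: {split_sizes[name]} images per class ({fraction:.0%})")
print(f"total images: {total_images}")
```

Under these counts each class is divided roughly 70/15/15 between training, validation, and test, and both classes are balanced in every split.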

| Hyper-parameter | Value |
|---|---|
| Optimizer | ADAM |
| Learning rate | 0.001 |
| β1 | 0.9 |
| β2 | 0.999 |
| Epochs | 30 |
| Batch size | 32 |
| Number of experiments | 10 for each CNN |
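
To make the optimizer settings concrete, the sketch below implements a single ADAM update for one scalar parameter. It assumes that the values 0.001, 0.9, and 0.999 in the table are the learning rate, β1, and β2 respectively (the usual ADAM defaults); the `eps` value and the function name `adam_step` are assumptions for illustration and do not come from the source.

```python
import math

# Values from the hyper-parameter table (mapping to lr/beta1/beta2 is assumed).
LEARNING_RATE = 0.001
BETA_1 = 0.9
BETA_2 = 0.999
EPOCHS = 30
BATCH_SIZE = 32


def adam_step(param, grad, m, v, t, lr=LEARNING_RATE,
              beta1=BETA_1, beta2=BETA_2, eps=1e-7):
    """Apply one ADAM update to a scalar parameter; t is the 1-based step index.

    eps is an assumed numerical-stability constant, not a value from the table.
    """
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy usage: one step on f(x) = x^2 starting from x = 1.0 (gradient 2x).
x, m, v = 1.0, 0.0, 0.0
x, m, v = adam_step(x, 2.0 * x, m, v, t=1)

# Number of weight updates per epoch for 2 x 2380 training images.
train_images = 2 * 2380
steps_per_epoch = math.ceil(train_images / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
```

With these settings the first update moves the parameter by approximately the learning rate, which is the expected behaviour of ADAM's bias-corrected first step.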

| CNN based on | Class name | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|
| InceptionV3 | Not drowsy | 0.9945±0.002 | 0.9967±0.001 | 0.9956±0.001 | 0.9956±0.001 |
| | Drowsy | 0.9967±0.001 | 0.9945±0.002 | 0.9956±0.001 | |
| VGG16 | Not drowsy | 0.9928±0.002 | 0.9951±0.003 | 0.9934±0.001 | 0.9934±0.001 |
| | Drowsy | 0.9951±0.003 | 0.9928±0.002 | 0.9934±0.001 | |
| ResNet50V2 | Not drowsy | 0.9982±0.002 | 0.9996±0.001 | 0.9989±0.001 | 0.9989±0.001 |
| | Drowsy | 0.9996±0.001 | 0.9982±0.002 | 0.9989±0.001 | |
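
The precision, recall, F1-score, and accuracy columns follow the standard confusion-matrix definitions. The sketch below shows how such values can be derived for one class; the confusion counts are hypothetical (chosen only to sum to 510 images per class) and are not results from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard metrics for the positive class from confusion counts.

    Assumes tp + fp > 0 and tp + fn > 0 so that no division by zero occurs.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical counts with "drowsy" as the positive class.
drowsy = binary_metrics(tp=507, fp=2, fn=3, tn=508)

# Swapping the roles of the classes gives the "not drowsy" row.
not_drowsy = binary_metrics(tp=508, fp=3, fn=2, tn=507)

for name, metrics in (("Drowsy", drowsy), ("Not drowsy", not_drowsy)):
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

Accuracy is identical for both rows because it does not depend on which class is treated as positive, which is why the tables report it once per network.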

| CNN based on | Class name | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|
| InceptionV3 | Not drowsy | 0.9908±0.003 | 0.9957±0.002 | 0.9928±0.001 | 0.9927±0.001 |
| | Drowsy | 0.9957±0.002 | 0.9908±0.003 | 0.9927±0.001 | |
| VGG16 | Not drowsy | 0.9937±0.003 | 0.9941±0.005 | 0.9939±0.002 | 0.9939±0.002 |
| | Drowsy | 0.9941±0.005 | 0.9937±0.003 | 0.9939±0.002 | |
| ResNet50V2 | Not drowsy | 0.9948±0.002 | 0.9994±0.001 | 0.9971±0.001 | 0.9971±0.001 |
| | Drowsy | 0.9994±0.001 | 0.9947±0.002 | 0.9971±0.001 | |
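
Each entry in the table above is written as a mean followed by a ± spread, which is consistent with aggregating the 10 experiments run per CNN. A minimal sketch of that aggregation is shown below; the per-run accuracies are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not state which estimator was used.

```python
import statistics

# Hypothetical accuracies from 10 independent training runs of one network.
run_accuracies = [0.9961, 0.9971, 0.9980, 0.9971, 0.9961,
                  0.9980, 0.9971, 0.9971, 0.9961, 0.9980]

mean_accuracy = statistics.mean(run_accuracies)

# Sample standard deviation (n - 1 in the denominator); an assumed choice.
std_accuracy = statistics.stdev(run_accuracies)

# Same "mean±std" layout as the table entries.
summary = f"{mean_accuracy:.4f}±{std_accuracy:.3f}"
print(summary)
```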

| CNN based on | Training: training time | Training: file size (KB) | Training: total params | Test: response time |
|---|---|---|---|---|
| InceptionV3 | 6.2min±3s | 182,072 | 29,997,786 | 137.8ms |
| VGG16 | 6.1min±12s | 111,603 | 19,325,690 | 71.3ms |
| ResNet50V2 | 6.2min±5s | 476,612 | 56,335,802 | 106.5ms |
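
A response time like the ones in the table can be measured by timing repeated calls to the prediction function. The sketch below is one way to do this and is not the authors' measurement procedure; `median_response_ms` and `dummy_predict` are hypothetical names, and the throughput figures assume that the reported response time is a per-frame latency with frames processed one after another.

```python
import statistics
import time


def median_response_ms(predict, sample, repeats=20):
    """Return the median wall-clock time of predict(sample) in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def dummy_predict(frame):
    # Hypothetical stand-in for a CNN forward pass on one eye-region image.
    return sum(frame) > 0


measured_ms = median_response_ms(dummy_predict, [0.1, 0.5, 0.9])

# Response times reported in the table, in milliseconds.
reported_ms = {"InceptionV3": 137.8, "VGG16": 71.3, "ResNet50V2": 106.5}

# Upper bound on sequential throughput implied by each latency (assumption).
max_fps = {name: 1000.0 / ms for name, ms in reported_ms.items()}

for name, fps in max_fps.items():
    print(f"{name}: at most {fps:.1f} frames per second")
```

Using the median rather than the mean reduces the influence of occasional slow calls, such as the first call that warms up caches.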

| Author | Dataset | Facial Method | ROI | Delay | Accuracy |
|---|---|---|---|---|---|
| Park et al. [11] | NTHU-DDD | VGG-FaceNet | Face | - | 73.06% |
| Chirra et al. [12] | Own/collected | Haar Cascade | Eyes | - | 96.42% |
| Zhao et al. [13] | Company Biteda | MTCNN | Face | - | 93.623% |
| Phan et al. [14] | Own/collected | Dlib | Face | - | 97% |
| Rajkar et al. [15] | YawDD/CEITW | Haar Cascade | Eyes | - | 96.82% |
| Hashemi et al. [16] | ZJU Eyeblink | Haar Cascade/Dlib | Eyes | 1.4ms | 98.15% |
| Tibrewal et al. [17] | MRL Eye | Dlib | Eyes | 95ms | 94% |
| Based on InceptionV3 | NITYMED | MediaPipe | Eyes | 137.8ms | 99.31% |
| Based on VGG16 | NITYMED | MediaPipe | Eyes | 71.3ms | 99.41% |
| Based on ResNet50V2 | NITYMED | MediaPipe | Eyes | 106.5ms | 99.71% |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
