Submitted:
06 February 2025
Posted:
07 February 2025
Abstract
Keywords:
1. Introduction
2. Background
- Accuracy: Measures the proportion of correct results (both true positives and true negatives) among the total number of cases. Equation (1) defines the accuracy metric, which calculates how many instances are predicted correctly among all available instances, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives. The formula is $\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (1)
- Precision: Indicates the proportion of true positives among the total predicted positives. Equation (2) defines precision as the correctness of positive predictions. The formula is $\mathrm{Precision} = \frac{TP}{TP + FP}$ (2)
- Recall: Also known as sensitivity, it measures the proportion of true positives identified among the actual positives. Equation (3) defines recall, which emphasizes the retrieval of actual positives. The formula is $\mathrm{Recall} = \frac{TP}{TP + FN}$ (3)
- F1-Score: The harmonic mean of precision and recall, providing a single metric that balances both, as given in Equation (4). The formula is $\mathrm{F1} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (4)
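The four metrics above can be illustrated with a short sketch that computes them from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not results from this study.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (assumption): 90 TP, 85 TN, 10 FP, 15 FN.
metrics = classification_metrics(tp=90, tn=85, fp=10, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The zero-denominator guards return 0.0 instead of raising an error, which is one common convention; other conventions (for example, reporting the metric as undefined) are equally defensible.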
3. Proposed Framework
3.1. Dataset Description
3.1.1. Manipulation Methods
- (1) Deepfakes: Utilizes deep learning for face-swapping, replacing a target face with one from another video.
- (2) Face2Face: Alters facial expressions in the target video to match those of a source actor.
- (3) FaceSwap: A traditional face-swapping technique that doesn’t rely on deep learning.
- (4) Neural Textures: Uses GAN-based techniques to manipulate facial features, producing highly realistic details.
3.1.2. Dataset Scale
3.2. Data Preprocessing
3.3. Artifacts Landmark Detection
3.4. Correlation Between the Artifacts to Identify Correlated Pairs
- A strong positive correlation (features move together) is represented by deep red.
- Deep blue indicates a strong negative correlation (the first feature increases while the second decreases).
- Strong correlation, whether positive or negative, is indicated by darker shades, while lighter shades suggest weak or no correlation.
- Individual Feature Analysis:
  - Nose: Parameters such as width, height, tip location, and nostril symmetry are strongly correlated, which means that these facial dimensions often change together during movements and facial expressions.
  - Mouth: The height and width of the upper and lower jaw vary in a coordinated way during mouth movements (speaking or smiling), which suggests very strong correlations among these parameters.
  - Eyes: Eye-derived indicators such as eye aspect ratio (EAR), blink frequency, amplitude, and duration, as well as pupil size and movement, typically exhibit high correlations, implying that blinks and eye movements are closely related.
- Inter-Feature Correlations Analysis:
  - Nose and Eyes: Exploring the relationship between nose positions/dimensions and eye movements/closures can reveal coordination between these features during blinks or facial expressions.
  - Nose and Mouth: This analysis checks whether movements of the mouth correlate with changes in the nose area, which might occur during various expressions.
  - Eyes and Mouth: The focus here is on whether movements in the eyes (like blinking) are synchronized with mouth movements, which would be common during expressions or speech.
- Strength of Correlations:
  - Strong Correlation (>0.7): Indicates that features move in tandem. For example, a strong correlation between the position of the Nose Tip X and Nose Bridge X suggests synchronized movements in these features.
  - Moderate Correlation (0.3 to 0.7): Suggests a relationship but with less consistency. For instance, a moderate correlation between Mouth Aspect Ratio and average Eyelid Movement might indicate that certain expressions affecting the mouth could also impact eyelid movements.
  - Weak Correlation (<0.3): Shows little to no linear relationship. For example, a weak correlation between Left EAR and Nose Shape Y implies that eye closures do not consistently correlate with the nose’s vertical dimensions.
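The correlation analysis described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the feature names and values are hypothetical toy data, and applying the thresholds to the absolute value of the Pearson coefficient (so that strong negative correlations are also labelled "strong") is an assumption.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def strength(r):
    """Label a coefficient using the 0.3 / 0.7 thresholds on |r| (assumption)."""
    magnitude = abs(r)
    if magnitude > 0.7:
        return "strong"
    if magnitude >= 0.3:
        return "moderate"
    return "weak"


# Hypothetical per-frame feature values (toy data, not from the dataset).
features = {
    "nose_tip_x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "nose_bridge_x": [2.0, 4.0, 6.0, 8.0, 10.0],
    "left_ear": [0.3, 0.1, 0.2, 0.1, 0.3],
}

# Correlate every unordered pair of features and label the result.
pairs = {
    (a, b): pearson(features[a], features[b])
    for a, b in combinations(features, 2)
}
labels = {pair: strength(r) for pair, r in pairs.items()}

for pair, r in pairs.items():
    print(pair, f"r={r:+.2f}", labels[pair])
```

In this toy data the two nose coordinates move together and are labelled strong, while the eye aspect ratio is uncorrelated with the nose position and is labelled weak.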
3.5. Artifact Annotations
- Collect: List relevant artifacts and features, along with related information necessary for deepfake detection.
- Noise Remover: Remove noise and irrelevant data to improve the overall data quality.
- Transform: Format and standardize the data so that it is consistent across the dataset.
- Enrich: Augment the dataset to increase the robustness of the deep learning models.
3.6. Data Preparation in Deepfake Forensics
- Noise Removal
- Data Transformation
- Data Enrichment
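The three stages listed above can be sketched on a single feature column. The concrete choices here are assumptions made for illustration and are not confirmed by the text: an interquartile-range (IQR) rule for noise removal, min-max scaling for transformation, and small Gaussian jitter for enrichment. All function names and the sample EAR values are hypothetical.

```python
import random
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Noise removal: drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def min_max_scale(values):
    """Transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def enrich_with_jitter(values, copies=1, sigma=0.01, seed=0):
    """Enrichment: append jittered copies of every value."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    enriched = list(values)
    for _ in range(copies):
        enriched.extend(v + rng.gauss(0.0, sigma) for v in values)
    return enriched


# Hypothetical eye-aspect-ratio readings; 2.50 is an obvious tracking glitch.
ear = [0.28, 0.30, 0.31, 0.29, 0.27, 0.30, 2.50]

cleaned = remove_outliers_iqr(ear)
scaled = min_max_scale(cleaned)
enriched = enrich_with_jitter(scaled, copies=1)
print(len(ear), len(cleaned), len(enriched))
```

Running the stages in this order means the scaling range is not distorted by the outlier, which is one reason to remove noise before standardizing.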
3.6.1. Noise Removal
3.6.2. Data Transformation
3.6.3. Data Enrichment
3.7. Artifact Sample Augmentation
3.8. Artifacts Balancing
- Handling Imbalanced Features with SMOTE: To further improve model performance, artifact tag balancing is carried out. This step removes class imbalances in the dataset, so that the model receives an equal number of authentic and manipulated artifacts. To improve model generalizability, we apply the Synthetic Minority Over-sampling Technique (SMOTE) to the underrepresented class, generating synthetic samples.
- Artifact Distributions: For analytical or modeling tasks, the data is split into training and testing sets. The training set is used to build the model, while the testing set evaluates its performance. Table 3 shows that data augmentation techniques have been used to balance the dataset, providing a significantly larger training sample size.
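The idea behind SMOTE can be sketched in a few lines: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. The sketch below is a simplified, framework-free illustration under stated assumptions; the function name `smote_like_oversample`, the neighbour count, and the two-feature toy data are hypothetical, and a production pipeline would more likely rely on an established library implementation. Oversampling is applied to training data only, in line with the unchanged test-set size reported in Table 3.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature training artifacts (toy values, not from the dataset).
authentic = [[0.30, 0.50], [0.31, 0.52], [0.29, 0.49],
             [0.32, 0.51], [0.30, 0.48], [0.28, 0.50]]
manipulated = [[0.20, 0.70], [0.22, 0.68], [0.21, 0.72]]

# Oversample only the underrepresented class until both classes match in size.
new_samples = smote_like_oversample(
    manipulated, len(authentic) - len(manipulated), k=2
)
balanced_manipulated = manipulated + new_samples
print(len(authentic), len(balanced_manipulated))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region already occupied by that class rather than duplicating existing rows.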
3.8.1. K-Fold Cross-Validation
3.8.2. Artifacts Transformation
3.9. Workflow Using Models
3.10. Model Training
3.10.1. Model Training Pseudocode
Algorithm 1 Detailed Pseudocode for GAN and Deep Learning Model Operations
Algorithm 2 Detailed Pseudocode for GAN and Deep Learning Model Operations - Part 2
4. Related Work - Comparative Analysis
5. Results and Discussion
- Experiment 1: Eye Landmarks
- Experiment 2: Fusion of eye and nose landmark facial region
- Experiment 3: Fusion of eye, nose, and mouth landmark facial region
5.0.2. Experiment 1: Eye Landmarks
- The training and validation curves of CNN-LSTM show the highest accuracy and the smallest gap between training and validation, indicating that less overfitting occurs.
- TCN follows, performing nearly as well and showing stable, reliable learning for the temporal analysis of artifacts.
5.0.3. Experiment 2: Fusion of Eye and Nose Landmark Facial Region
5.0.4. Experiment 3: Fusion of Eye, Nose, and Mouth Landmark Facial Region
5.1. State of Art Table
6. Conclusion
Abbreviations
| CNN | Convolutional Neural Networks |
| GRU | Gated Recurrent Unit |
| GANs | Generative Adversarial Networks |
| TCN | Temporal Convolutional Network |
| AUC | Area Under the Curve |
| RNN | Recurrent Neural Networks |
| LSTM | Long Short-Term Memory |
| VAE | Variational Autoencoder |
| MLP | Multi-Layer Perceptron |
| SMOTE | Synthetic Minority Oversampling Technique |
| FF++ | FaceForensics++ |
| MAR | Mouth Aspect Ratio |
| EAR | Eye Aspect Ratio |
| DNN | Deep Neural Network |
| ROC | Receiver Operating Characteristic |
| MSE | Mean Squared Error |
| FF-LPBH | Fisher-Face Local Binary Pattern Histogram |
| YOLO | You Only Look Once |
| FCC-GAN | Fully Connected Convolutional Generative Adversarial Network |
| PGGAN | Progressive Growing of GANs |
| CRNN | Convolutional Recurrent Neural Network |
| DBN | Deep Belief Network |
| OC-FakeDect | One-Class Fake Detection |
| C-GAN | Conditional Generative Adversarial Network |
| AddNets | Attention-based Deepfake Detection Networks |
| KL-Divergence | Kullback-Leibler Divergence |
| IQR | Interquartile Range |
| CSV | Comma-Separated Values |
| ReLU | Rectified Linear Unit |
| SVM | Support Vector Machine |
References
- Koopman, M.; Rodriguez, A.M.; Geradts, Z. Detection of deepfake video manipulation. In Proceedings of the 20th Irish Machine Vision and Image Processing Conference (IMVIP), Belfast, Northern Ireland, 10–12 September 2018; pp. 133–136.
- Chesney, B.; Citron, D. Deepfakes: A looming challenge for privacy, democracy, and national security. Calif. L. Rev. 2019, 107, 1753.
- Harris, D. Deepfakes: False pornography is here and the law cannot protect you. Duke L. & Tech. Rev. 2018, 17, 99.
- Guarnera, L.; Giudice, O.; Battiato, S.; et al. The Face Deepfake Detection Challenge. Journal of Imaging 2022, 8, 263.
- Patel, M.; Gupta, A.; Tanwar, S.; Obaidat, M. Trans-DF: A transfer learning-based end-to-end deepfake detector. In Proceedings of the 2020 IEEE 5th International Conference on Computing Communication and Automation (ICCCA), 2020; IEEE; pp. 796–801.
- Bracken, B. Deepfake Attacks Are About to Surge, Experts Warn. 2021.
- FakeApp. Available online: https://www.fakeapp.org/ (accessed on 30 November 2022).
- FaceApp. Available online: https://www.faceapp.com/ (accessed on 30 November 2022).
- Korshunov, P.; Marcel, S. Deepfakes: A new threat to face recognition? Assessment and detection. arXiv 2018, arXiv:1812.08685.
- Masood, M.; Nawaz, M.; Malik, K.M.; et al. Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward. Applied Intelligence 2022, 1–53.
- Li, L.; Bao, J.; Zhang, T.; et al. Face X-ray for more general face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2020; pp. 5001–5010.
- Dang, H.; Liu, F.; Stehouwer, J.; Liu, X.; Jain, A.K. On the detection of digital face manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2020; pp. 5781–5790.
- Li, Y.; Chang, M.C.; Lyu, S. In ictu oculi: Exposing AI-generated fake face videos by detecting eye blinking. arXiv 2018, arXiv:1806.02877.
- Chintha, A.; Sharma, P.; et al. Recurrent convolutional structures for audio spoof and video deepfake detection. IEEE Journal of Selected Topics in Signal Processing 2020, 14, 1024–1037.
- Afchar, D.; Nozick, V.; Yamagishi, J.; Echizen, I. Mesonet: A compact facial video forgery detection network. In Proceedings of the 2018 IEEE International Workshop on Information Forensics and Security (WIFS), 2018; IEEE; pp. 1–7.
- Sabir, E.; Cheng, J.; Jaiswal, A.; AbdAlmageed, W.; Masi, I.; Natarajan, P. Recurrent convolutional strategies for face manipulation detection in videos. Interfaces (GUI) 2019, 3, 80–87.
- Yang, X.; Li, Y.; Lyu, S. Exposing deepfakes using inconsistent head poses. In Proceedings of the 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019; IEEE; pp. 8261–8265.
- Suganthi, S.; Ayoobkhan, M.U.A.; Bacanin, N.; et al. Deep learning model for deepfake face recognition and detection. PeerJ Computer Science 2022, 8, e881.
- Ismail, A.; Elpeltagy, M.; Zaki, M.; ElDahshan, K.A. Deepfake video detection: YOLO-Face convolution recurrent approach. PeerJ Computer Science 2021, 7, e730.
- Kshirsagar, M.; Suratkar, S.; Kazi, F. Deepfake Video Detection Methods using Deep Neural Networks. In Proceedings of the 2022 Third International Conference on Intelligent Computing Instrumentation and Control Technologies (ICICICT), 2022; IEEE; pp. 27–34.
- Rana, M.S.; Nobi, M.N.; Murali, B.; Sung, A.H. Deepfake detection: A systematic literature review. IEEE Access 2022.
- Chauhan, S.S.; Jain, N.; Pandey, S.C.; Chabaque, A. Deepfake Detection in Videos and Pictures: Analysis of Deep Learning Models and Dataset. In Proceedings of the 2022 IEEE International Conference on Data Science and Information System (ICDSIS), 2022; IEEE; pp. 1–5.
- Koçak, A.; Alkan, M. Deepfake Generation, Detection and Datasets: A Rapid-review. In Proceedings of the 2022 15th International Conference on Information Security and Cryptography (ISCTURKEY), 2022; IEEE; pp. 86–91.
- Kiran, B.R.; Thomas, D.M.; Parakkal, R. An overview of deep learning based methods for unsupervised and semi-supervised anomaly detection in videos. Journal of Imaging 2018, 4, 36.
- Groh, M.; Epstein, Z.; Firestone, C.; Picard, R. Deepfake detection by human crowds, machines, and machine-informed crowds. Proceedings of the National Academy of Sciences 2022, 119, e2110013119.
- Zi, B.; Chang, M.; Chen, J.; Ma, X.; Jiang, Y.G. Wilddeepfake: A challenging real-world dataset for deepfake detection. In Proceedings of the 28th ACM International Conference on Multimedia; 2020; pp. 2382–2390.
- Raza, A.; Munir, K.; Almutairi, M. A Novel Deep Learning Approach for Deepfake Image Detection. Applied Sciences 2022, 12, 9820.
- Shahzad, H.F.; Rustam, F.; Flores, E.S.; Mazón, J.L.V.; Diez, I.T.; Ashraf, I. A Review of Image Processing Techniques for Deepfakes. Sensors 2022, 22, 4556.
- Khochare, J.; Joshi, C.; Yenarkar, B.; Suratkar, S.; Kazi, F. A deep learning framework for audio deepfake detection. Arabian Journal for Science and Engineering 2022, 47, 3447–3458.
- Verdoliva, L. Media forensics and deepfakes: an overview. IEEE Journal of Selected Topics in Signal Processing 2020, 14, 910–932.
- Khalid, H.F.; Woo, S.S. OC-FakeDect: Classifying deepfakes using one-class variational autoencoder. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops; 2020; pp. 656–657.
- Malik, Y.S.; Sabahat, N.; Moazzam, M.O. Image Animations on Driving Videos with DeepFakes and Detecting DeepFakes Generated Animations. In Proceedings of the 2020 IEEE 23rd International Multitopic Conference (INMIC), 2020; IEEE; pp. 1–6.
- Hashmi, M.F.; Ashish, B.K.K.; Keskar, A.G.; Bokde, N.D.; Yoon, J.H.; Geem, Z.W. An exploratory analysis on visual counterfeits using conv-lstm hybrid architecture. IEEE Access 2020, 8, 101293–101308.
- Rana, M.S.; Sung, A.H. Deepfakestack: A deep ensemble-based learning technique for deepfake detection. In Proceedings of the 2020 7th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud)/2020 6th IEEE International Conference on Edge Computing and Scalable Cloud (EdgeCom), 2020; IEEE; pp. 70–75.
- Matern, F.; Riess, C.; Stamminger, M. Exploiting visual artifacts to expose deepfakes and face manipulations. In Proceedings of the 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), 2019; IEEE; pp. 83–92.
- Guarnera, L.; Giudice, O.; Guarnera, F.; Ortis, A.; Puglisi, G.; Paratore, A.; Bui, L.M.Q.; Fontani, M.; Coccomini, D.A.; Caldelli, R. The Face Deepfake Detection Challenge. Journal of Imaging 2022, 8, 263.
- CB Insights. The Future of Information Warfare. 2024. Available online: https://www.cbinsights.com/research/future-of-information-warfare/
- Karras, T.; Laine, S.; Aila, T. Analyzing and Improving the Image Quality of StyleGAN. arXiv 2020, arXiv:1912.04958.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; Uszkoreit, J.; Houlsby, N. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2021, arXiv:2010.11929.
- Grill, J.B.; Strub, F.; Altché, F.; Tallec, C.; Richemond, P.H.; Buchatskaya, E.; Doersch, C.; Pires, B.A.; Guo, Z.D.; Azar, M.G.; Piot, B.; Kavukcuoglu, K.; Munos, R.; Valko, M. Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning. arXiv 2020, arXiv:2006.07733.
- Ho, J.; Salimans, T.; Chan, W.; Chen, B.; Schulman, J.; Sutskever, I.; Abbeel, P. Cascaded Diffusion Models for High Fidelity Image Generation. arXiv 2021, arXiv:2102.00732.
- Rössler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. FaceForensics++: Learning to detect manipulated facial images. 2020. Available online: http://github.com/ondyari/FaceForensics (accessed on 3 November 2024).







| Training Parameter | Values (CNN-LSTM, GRU, TCN) | GAN-Autoencoded |
|---|---|---|
| Epochs | 10 | 10 |
| Batch Size | 32 | 32 |
| Optimizer | Adam | Adam |
| Loss Function | Binary Crossentropy | MSE |
| Metrics | Accuracy | Accuracy |
| Eyes | Nose | Mouth |
|---|---|---|
| Eye Aspect Ratio (EAR) | Nose Tip | Mouth Aspect Ratio (MAR) |
| Blink Frequency and Amplitude | Nostril Symmetry | Mouth Symmetry |
| Pupil Dilation | Nasal Base | Mouth Position (X, Y) |
| Eyelid Creases and Movement | Nasal Sides | Lip Spacing |
| Iris Texture and Diameter | Nasal Septum | Lip Boundary |
| Eye Position and Aspect Ratio | Nasal Shape | Mouth Shape Dynamics |
| Sclera-to-Iris Ratio | Nostrils Position (X, Y) | Mouth-to-Face Proportion |
| Pupil to Iris Ratio | Nose Bridge | Corner of Mouth (Left X, Y; Right X, Y) |
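As an illustration of how one of the listed features can be derived from landmarks, the sketch below computes the eye aspect ratio (EAR) from six eye landmarks. It uses the formulation that is widely used in blink-detection work, EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|); the text does not state which formula was used here, so this is an assumption, and the landmark coordinates are hypothetical.

```python
import math


def eye_aspect_ratio(eye):
    """EAR from six (x, y) landmarks ordered p1..p6 around the eye.

    p1 and p4 are the horizontal eye corners; (p2, p6) and (p3, p5)
    are the vertically opposed upper/lower eyelid points.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


# Hypothetical landmark coordinates for an open and a nearly closed eye.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]

ear_open = eye_aspect_ratio(open_eye)
ear_closed = eye_aspect_ratio(closed_eye)
print(f"open: {ear_open:.3f}, closed: {ear_closed:.3f}")
```

The ratio drops sharply when the eyelids close, which is why a per-frame EAR series can be used to derive blink frequency and duration. A mouth aspect ratio (MAR) can be defined analogously from mouth landmarks.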
| Dataset | Ratio | Samples (Before Augmentation) | Samples (After Augmentation) |
|---|---|---|---|
| Training Set | 80% | 33246 | 66492 |
| Testing Set | 20% | 7379 | 7379 |
| Layer Type | Parameters |
|---|---|
| Input | shape=(input_dim,) |
| Dense | units=64, activation='relu' |
| Dense | units=32, activation='relu' |
| Dense | units=64, activation='relu' |
| Dense | units=input_dim, activation='sigmoid' |
| Training Parameters | Values |
|---|---|
| Epochs | 50 |
| Batch Size | 256 |
| Optimizer | Adam |
| Loss Function | Mean Squared Error (MSE) |
| Metrics | Accuracy |
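The layer and loss settings tabulated above can be made concrete with a small, framework-free sketch of the autoencoder's forward pass and its MSE reconstruction loss. This is an untrained illustration only: the weights are randomly initialised, the helper names (`build_autoencoder`, `reconstruct`, `mse`) are hypothetical, and `input_dim = 24` is an assumed value, since the text does not state the input dimension or the framework used.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Random weights (Glorot-style uniform range) and zero biases."""
    limit = math.sqrt(6.0 / (n_in + n_out))
    weights = [[rng.uniform(-limit, limit) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_autoencoder(input_dim, seed=0):
    """Layer sizes follow the table: input_dim -> 64 -> 32 -> 64 -> input_dim."""
    rng = random.Random(seed)
    sizes = [input_dim, 64, 32, 64, input_dim]
    activations = [relu, relu, relu, sigmoid]
    return [(*init_layer(n_in, n_out, rng), act)
            for n_in, n_out, act in zip(sizes, sizes[1:], activations)]


def reconstruct(model, x):
    """Forward pass: apply each dense layer in turn."""
    for weights, biases, activation in model:
        x = [activation(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def mse(target, output):
    """Mean squared error between the input and its reconstruction."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


input_dim = 24  # assumed feature count, for illustration only
model = build_autoencoder(input_dim)
sample = [0.5] * input_dim  # features are assumed to be scaled to [0, 1]
output = reconstruct(model, sample)
loss = mse(sample, output)
print(f"reconstruction MSE (untrained): {loss:.4f}")
```

The sigmoid output layer constrains reconstructions to (0, 1), which is why inputs would need to be scaled to that range before an MSE loss against them is meaningful.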
| Ref. | Feature-Based Methodology | Classifier | Best Performance | Datasets |
|---|---|---|---|---|
| [35] | Combined Visual Features of eyes and teeth | Logistic Regression, MLP | AUC = 0.851, Accuracy = 0.854, Precision = 0.807, Recall = 0.849, F1 Score = 0.828 | FaceForensics++ |
| [39] | Deep learning features | Capsule Network | AUC = 0.91, Accuracy = 0.91, F1 Score = 0.91, Precision = 0.92, Recall = 0.08 | FaceForensics++ |
| [16] | Image + Temporal features | CNN + RNN | AUC = 0.93, Accuracy = 0.939, Precision = 0.92, Recall = 0.08, F1 Score = 0.91 | FF++ (FaceSwap, DeepFakes, LQ) |
| [40] | Image + Temporal features | Dynamic Prototype Network | AUC = 0.718, Accuracy = 0.72, Precision = 0.73, Recall = 0.26, F1 Score = 0.73 | FF++ (Face2Face, FaceSwap, HQ) |
| [13] | Eye blinking features | LRCN | AUC = 0.78, Accuracy = 0.76, Precision = 0.77, Recall = 0.22 | FaceForensics++ (Face Synthesis) |
| [41] | Eye blinking features | Distance | AUC = 0.875, Precision = 0.875, Recall = 0.778, F1 Score = 0.824, Accuracy = 0.85 | FaceForensics++ (Face Synthesis with the unnatural movement of the eye) |
| Model | Precision (Without GAN) | Recall (Without GAN) | F1 Score (Without GAN) | Precision (With GAN) | Recall (With GAN) | F1 Score (With GAN) |
|---|---|---|---|---|---|---|
| CNN | 0.896 | 0.884 | 0.890 | 0.915 | 0.902 | 0.908 |
| CNN-GRU | 0.902 | 0.890 | 0.896 | 0.920 | 0.910 | 0.915 |
| CNN-LSTM | 0.910 | 0.902 | 0.906 | 0.928 | 0.916 | 0.922 |
| TCN | 0.917 | 0.910 | 0.913 | 0.935 | 0.920 | 0.927 |
| Model | Precision (Without GAN) | Recall (Without GAN) | F1 Score (Without GAN) | Precision (With GAN) | Recall (With GAN) | F1 Score (With GAN) |
|---|---|---|---|---|---|---|
| CNN | 0.875 | 0.860 | 0.867 | 0.895 | 0.880 | 0.887 |
| CNN-GRU | 0.890 | 0.875 | 0.882 | 0.910 | 0.895 | 0.902 |
| CNN-LSTM | 0.898 | 0.882 | 0.890 | 0.918 | 0.902 | 0.910 |
| TCN | 0.905 | 0.890 | 0.897 | 0.925 | 0.910 | 0.917 |
| Model | Precision (Without GAN) | Recall (Without GAN) | F1 Score (Without GAN) | Precision (With GAN) | Recall (With GAN) | F1 Score (With GAN) |
|---|---|---|---|---|---|---|
| CNN | 0.865 | 0.850 | 0.857 | 0.885 | 0.870 | 0.877 |
| CNN-GRU | 0.880 | 0.865 | 0.872 | 0.900 | 0.885 | 0.892 |
| CNN-LSTM | 0.890 | 0.875 | 0.882 | 0.910 | 0.895 | 0.902 |
| TCN | 0.900 | 0.885 | 0.892 | 0.920 | 0.905 | 0.912 |
| Ref. | Feature-Based Methodology | Classifier | Best Performance | Datasets |
|---|---|---|---|---|
| [40] | Image + Temporal features | Dynamic Prototype Network | AUC = 0.718, Accuracy = 0.72, Precision = 0.73, Recall = 0.26, F1-score = 0.73 | FF++ (Face2Face, FaceSwap, HQ) |
| [13] | Eye blinking features | LRCN | AUC = 0.78, Accuracy = 0.76, Precision = 0.77, Recall = 0.22 | FaceForensics++ (Face Synthesis) |
| [35] | Combined Visual Features of eyes and teeth | Logistic Regression, MLP | AUC = 0.851, Accuracy = 0.854, Precision = 0.807, Recall = 0.849, F1 Score = 0.828 | FaceForensics++ |
| [41] | Eye blinking features | Distance | AUC = 0.875, Precision = 0.875, Recall = 0.778, F1 Score = 0.824, Accuracy = 0.85 | FaceForensics++ (Face Synthesis with unnatural movement of the eye) |
| [39] | Deep learning features | Capsule Network | AUC = 0.91, Accuracy = 0.91, F1 Score = 0.91, Precision = 0.92, Recall = 0.08 | FaceForensics++ |
| [16] | Image + Temporal features | CNN + RNN | AUC = 0.93, Accuracy = 0.939, Precision = 0.92, Recall = 0.08, F1-score = 0.91 | FF++ (FaceSwap, DeepFakes, LQ) |
| Proposed | Spatiotemporal features + augmented facial landmarks with GAN model | TCN model for spatiotemporal analysis with augmentation + GAN | AUC = 0.93, Accuracy = 0.96, Precision = 0.98, F1-score = 0.98 | FF++ |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).