Preprint
Article

This version is not peer-reviewed.

Frame Selection Strategies for Video Deepfake Detection: Benchmarking Accuracy and Runtime Trade-Offs

Submitted: 20 April 2026

Posted: 21 April 2026


Abstract
Deepfake detection from images and videos has evolved from artifact-specific convolutional baselines toward more generalizable, cross-dataset, and foundation-model-based approaches. The present work instead focuses on the efficiency and informativeness of frame selection itself, keeping the downstream detectors fixed. The study compares twelve frame-selection heuristics, ranging from simple baselines to landmark-aware strategies, across four pre-trained detectors: Self-Blended Images (SBI), Frequency-Enhanced Self-Blended Images (FSBI), Generative Convolutional Vision Transformer (GenConViT), and GenD. GenD achieved the strongest average detector-level performance, with a mean frame-mean AUC of 0.9464; its best single validated configuration reached an AUC of 0.9607 and a balanced accuracy of 0.9133. FSBI and SBI reached mean AUC values of 0.8953 and 0.8935, respectively. For SBI, the best validation configuration is landmark clustering with 32 selected frames, while GenD achieves the best AUC at the level of the selection strategy. Overall, the results demonstrate that inference-time frame selection is an important component of video-only deepfake detection under constrained inference budgets.
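To make the evaluated pipeline concrete, the sketch below shows the simplest heuristic in the family the abstract describes (uniform temporal sampling under a fixed frame budget) together with the mean aggregation implied by the "frame-mean AUC" metric, where per-frame detector scores are averaged into one video-level score. Function names and the mean-aggregation choice are illustrative assumptions, not taken from the paper.

```python
def uniform_frame_indices(n_frames: int, budget: int) -> list[int]:
    """Pick `budget` frame indices spread evenly across a video of `n_frames`.

    This is an assumed baseline strategy; the paper benchmarks twelve
    heuristics, including landmark-aware ones not sketched here.
    """
    if budget >= n_frames:          # budget covers the whole video
        return list(range(n_frames))
    if budget == 1:                 # degenerate case: take the middle frame
        return [n_frames // 2]
    step = (n_frames - 1) / (budget - 1)
    return [round(i * step) for i in range(budget)]


def video_score(frame_scores: list[float]) -> float:
    """Aggregate per-frame fake probabilities into one video-level score
    by taking the mean (the aggregation implied by "frame-mean AUC")."""
    return sum(frame_scores) / len(frame_scores)
```

For example, a 32-frame budget on a 300-frame clip selects indices roughly every 9.6 frames; the selected frames are scored by the fixed detector and averaged before computing AUC over videos.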
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

