Automated Acoustic Side-Channel Attack on Keyboard Inputs via Combined Video–Audio Analysis

Dario Vranješ; Ivo Stančić; Marin Bugarić; Toni Perković

doi:10.20944/preprints202606.1478.v1

Submitted: 18 June 2026

Posted: 19 June 2026


Abstract

Acoustic side-channel attacks (ASCAs) exploit unintended sound emitted by keyboards to infer typed input, but existing methods generally assume manually labelled training data and controlled environments, limiting their applicability to realistic scenarios such as online lectures. We develop a pipeline that automatically labels keystroke-sound samples captured from online coding tutorials: video frames are processed with optical character recognition (OCR) to extract the ground-truth character sequence, audio is segmented into clips centred on detected click events, and the two streams are aligned. A convolutional neural network (CNN) is trained on mel-spectrogram features, with transfer learning used to adapt the pretrained model to a target user with minimal samples. Our dataset contains 50 unique keys from standard QWERTZ keyboards recorded during real programming lectures. On a held-out test set the CNN achieves 98.1% top-1, 99.4% top-2 and 100% top-3 accuracy. Transfer learning retains strong performance with as few as 13 samples per key. Pairing OCR-derived ground truth with acoustic CNN classification removes the labelling bottleneck that has limited previous ASCAs, and the transfer-learning stage makes the attack viable with minimal per-victim data. All code, trained models, and labelled datasets are released to support reproducible research.
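The segmentation and labelling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names (`detect_clicks`, `centred_clip`, `label_clips`), the energy-threshold detector, and all parameter values are assumptions chosen for illustration, and the one-to-one pairing of clicks with OCR characters stands in for whatever alignment the actual pipeline uses.

```python
def detect_clicks(samples, rate, frame_ms=10, threshold_ratio=0.5, min_gap_ms=100):
    """Return sample indices of click events found by short-time energy thresholding.

    All parameters are illustrative assumptions, not values from the paper.
    """
    frame = max(1, int(rate * frame_ms / 1000))
    # Energy of each non-overlapping frame.
    energies = [
        sum(x * x for x in samples[i:i + frame])
        for i in range(0, len(samples) - frame + 1, frame)
    ]
    if not energies:
        return []
    threshold = threshold_ratio * max(energies)
    min_gap = int(rate * min_gap_ms / 1000)
    clicks = []
    for k, energy in enumerate(energies):
        centre = k * frame + frame // 2
        # Keep a frame if it is loud enough and far enough from the previous click.
        if energy >= threshold and (not clicks or centre - clicks[-1] >= min_gap):
            clicks.append(centre)
    return clicks


def centred_clip(samples, centre, length):
    """Cut `length` samples centred on `centre`, zero-padding past the signal edges."""
    start = centre - length // 2
    return [
        samples[i] if 0 <= i < len(samples) else 0.0
        for i in range(start, start + length)
    ]


def label_clips(samples, rate, ocr_chars, clip_ms=200):
    """Pair each detected click with the OCR-derived character at the same position."""
    clicks = detect_clicks(samples, rate)
    if len(clicks) != len(ocr_chars):
        # Counts disagree; a real pipeline would need a more robust alignment.
        return []
    length = int(rate * clip_ms / 1000)
    return [(ch, centred_clip(samples, c, length)) for ch, c in zip(ocr_chars, clicks)]


# Synthetic one-second signal at 1 kHz with two short bursts standing in for keystrokes.
rate = 1000
samples = [0.0] * 1000
for i in list(range(200, 210)) + list(range(600, 610)):
    samples[i] = 1.0

labelled = label_clips(samples, rate, "ab")
print([(ch, len(clip)) for ch, clip in labelled])
```

Each labelled clip would then be converted to a mel-spectrogram before being passed to the classifier; that step is omitted here because it depends on the audio library used.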

Keywords: acoustic side-channel; keyboard; convolutional neural network; video analysis; mel-spectrogram; password security

Subject: Computer Science and Mathematics - Security Systems

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
