Adaptive Quantum-Classical Hybrid Authentication: Dynamic Protocol Switching for Real-Time Threat Mitigation

Abrar Galib Zaman; Montasir Qasymeh

doi:10.20944/preprints202506.1815.v1

Submitted:

22 June 2025

Posted:

24 June 2025

You are already at the latest version

Abstract

Quantum computing is a threat to the existence of classical cryptographic authentication. Although quantum key distribution (QKD) has quantitative security, its real-life implementation is interfered with due to two reasons: the resource scalability and the availability of an authenticated classical channel. The presented paper proposes a new system, termed as Adaptive Hybrid Authentication Framework (AHAF), as a solution to these issues as it switches dynamically between classical, post-quantum, and quantum authentication protocols. An AHAF is based on a Reinforcement Learning (RL) based decision engine that transforms the problem of protocol selection into a Markov Decision Process (MDP). A multi-objective reward function is maximised by the RL agent, where the security posture, resources consumption in the system and performance are balanced according to real-time threats intelligence and resource availability. The AHAF was structured and verified in some high-fidelity simulation environment that incorporates NetSquid and NS-3. The findings clarify that the AHAF system is a capable solution that prevents a variety of simulated threats and meets the hypothesis of expected uptime (99.9 percent). This is achieved by carefully assigning expensive quantum protocols on a need-to-use basis and so providing QKD level of protection without incurring prohibitive performance cost of a fixed implementation. We have determined that an RL-based learning strategy most readily delivers workable and competitive road map towards feasible quantum-safe communication networks.

Keywords:

quantum cryptography

;

adaptive authentication

;

reinforcement learning

;

hybrid security

;

QKD

;

post-quantum cryptography

;

network security

;

dynamic protocol switching

Subject:

Computer Science and Mathematics - Computer Networks and Communications

1. Introduction

Modern digital communication (and its numerous associated applications) is based on the security foundation of the public-key cryptography which in turn is secured by a set of protocols that include Rivest-Shamir-Adleman (RSA) and Elliptic Curve Cryptography (ECC) which help protect innumerable transactions and interactions [1]. Nonetheless, the fast progress of the quantum computing poses a critical threat to this foundation. Any quantum computer that is powerful enough, using Shor algorithm, will be able to perform the integer factorization and discrete logarithm problems, in polynomial time, making the security assumptions of RSA and ECC invalid [2]. This looming “quantum threat” will require that we radically reconsider our authentication and key exchange frameworks, so that people can have long-term security [3].

To deal with this, the cryptographic community has tried to find two main routes to quantum-resilience. The former is Post-Quantum Cryptography (PQC), which has to do with the construction of classical cryptographic algorithms grounded on mathematical problems (as are lattice-based, code-based, and hash-based cryptography) that it is assumed cannot be efficiently solved by both classical and quantum computers [4]. PQC standards are being settled on by standardization bodies such as the U.S. National Institute of Standards and Technology (NIST), with the algorithms CRYSTALS-Kyber to encapsulate keys (key encapsulation) and CRYSTALS-Dilithium to perform digital signatures as among the top contenders for adoption [5].

The second direction is Quantum Key Distribution (QKD) a paradigm that gains its security not by computational hardness but by the laws of physics. Quantum reality (e.g. no-cloning theorem, the observer effect etc) can also be used in protocols (e.g. BB84) that enable two parties to securely agree on a shared secret key with the knowledge that any eavesdropper would cause unavoidable disturbances [4].

Nevertheless, QKD is not the comprehensive solution, even though it offers the information-theoretic security. One of the most mentioned and important restrictions is so-called authentication gap: QKD protocols presuppose the existence of the authenticated classical channel to exchange the post-processing information and protect against the man-in-the-middle (MitM) attacks [6]. This forms a chicken-and-egg problem, since the authentication of such a classical channel now needs to be done itself securely, generally by pre-shared keys or classical public-key authentication, again susceptible to the quantum attack. In addition to this, QKD has major downfalls in practice, such as cost of implementation, distance restrictive properties, which require trusted nodes or the experimental quantum repeaters, and slow speeds of key generation relative to classical techniques [4].

Hybrid between classical and PQC is a fairly popular view of what is a practical route to quantum-safe future, including QKD primitives [5]. Nevertheless, threat landscape is dynamic, it is not a constant environment comprising of changing attack vectors and changing levels of risk [26]. A fixed hybrid architecture that may e.g. always use a combination of ECDH and Kyber is not flexible enough to respond to changes in real-time.

The resulting issue is the following: How may a system exploit the unprecedented security provided by QKD to guard against the most extreme threats, and yet reduce its huge performance and resource overheads in non-zero-threat activities, and in a dynamic threat environment adjust its security levels correspondingly?

To alleviate this challenge, this paper presents a new system design, Adaptive Hybrid Authentication Framework (AHAF), that will solve this problem in the modern world. AHAF synchronizes and executes adaptive protocol changing among a range of conventional, PQC, and QKD-based authentication systems with the help of a Reinforcement Learning (RL) decision engine. The environment is constantly evaluated by the system by the help of a Real-Time Threat Monitor and a Quantum Resource Manager that will allow the RL agent to make smart and contextual decisions.

Our key hypothesis is the following: A hybrid-based machine learning system compared to the need to re-allocate quantum resources only in cases when the classical or the PQC security need is judged inadequate or is compromised is able to provide the guarantee of in uptime of either 99.9% or even 100% but with the same level of security as that that of quantum computers. The methodology would change the relation between QKD and classical authentication to a dynamic relationship, instead of a mere dependency. PQC can be used to protect the classical channel that QKD requires, and once a QKD key has been agreed a key to give information-theoretically secure authentication of further sessions can be shared using a section of it, making it self-reinforcing [7].

2. Background and State-of-the-Art: A Comparative Review

In order to put the work of Adaptive Hybrid Authentication Framework (AHAF) into perspective this section has thoroughly reviewed the underpinning technologies and worked research in three major domains: the paradigms of authentication that constitute the toolkit of the system, integration architectures of quantum and classical systems and wisdom of intelligent, adaptive security.

2.1. The Comparative Analysis of Modern Authentication Paradigms

The AHAF decision engine has available a wide range of protocols to select. To build up the argument to support the need of an adaptive selection mechanism, it is important to understand the trade-offs that exist between these families of protocols.

Classical Authentication Protocols: A hybrid of symmetric and asymmetric cryptography has been used in secure communication over several decades now [8]. Such protocols as Transport Layer Security (TLS) have asymmetric algorithms (e.g., RSA, ECDH) in the initial handshake and key exchange round, and then bulk data encryption and message authentication are done using a shared secret created using a symmetric algorithm with a faster key schedule (e.g., AES) [1]. Their security assumes that problems required to be computationally hard to attack the cipher are also intractable to break with quantum algorithms; despite being computationally efficient and highly optimized (requiring only a small number of microseconds per block to encrypt on modern processor systems using AES-128 encryption) these algorithms are presently felt to be only as safe as the hard problems used to construct them [1]. In addition to quantum threat, such protocols remain vulnerable to classical implementation flaws, side-channel attacks, and protocol-level vulnerabilities as seen to be the case with e.g. SIP or the automotive CAN bus [10,11].

Post-Quantum Cryptography (PQC): With PQC, the classical computing systems will have a “drop-in” alternative to insecure classical algorithms, one that will be designed to run on classical hardware but resist both classical and quantum computing-based attacks [5]. These algorithms are founded on various mathematical ground, mainly lattice-based, code-based, and hash based, as well as multivariate polynomial cryptography [4]. The NIST standardization process has singled out leading candidates, such as lattice-based CRYSTALS-Kyber, the Prometheus-based CRYSTALS-Dilithium (each one providing a key encapsulation mechanism as well as a digital signature scheme), and best-of-Grover, lattice-based, and split-state for key encapsulation, and comprehensive (CRYSTALS-Dilithium), and Synthesis-based digital signatures, which all provide strong security and attractive performance [5]. Nevertheless, in PQC there are no trade-offs. Most PQC algorithms have relatively large key and signature sizes than their classical predecessors and their computational overhead can be higher, although schemes such as Kyber have been demonstrated to have similar performance to elliptic curve schemes in TLS handshakes [9].

Quantum Key Distribution (QKD) based Authentication: QKD is another fundamentally different security system. Rather than being based on computational hardness, it uses the laws of quantum mechanics, the no-cloning theorem, and the observation that any attempt to measure a quantum system changes that system, to obtain a secret key [12]. The canonical BB84 protocol is the procedure where Alice sends photons randomly encoded using one of two polarization states, and Bob uses an incongruously selected one of four bases to measure the photons. By engaging in some sequential conversation using a common public classical channel, they can extract a common secret key out of the set of those transmissions on which their bases agreed, and they can estimate the error rate (Quantum Bit Error Rate, or QBER) to see whether an adversary is lurking [13]. Then one can combine this key with a symmetric primitive, e.g. a Wegman-Carter authenticator or AES-CMAC, to yield information-theoretically secure authentication [12]. The main weakness of QKD is that to make this post-processing, it requires the existence of an authenticated classical-channel, otherwise, it may be attacked with a MitM attack [6]. Moreover, realistic QKD systems are vulnerable to physical-layer attacks against device flaws (there have been multiple examples of such attacks being demonstrated to date [4]), and are limited by key rate, distance, and cost [4].

Table 1. Comparative Analysis of Authentication Protocol Families.

Protocol Family	Security Basis	Quantum Resilient	Latency	Key Size	Limitations
Classical (RSA-2048)	Integer Factorization	No	Low-High	Small	Quantum vulnerable
Classical (ECDH P-256)	Elliptic Curve DLP	No	Very Low	Very Small	Quantum vulnerable
PQC (CRYSTALS-Kyber)	Module-LWE	Yes	Low	Moderate	Larger keys than ECC
PQC (CRYSTALS-Dilithium)	Module-LWE	Yes	Low	Moderate	Larger signatures
QKD (BB84-based)	Quantum Physics	Yes	High	N/A	Requires auth channel

2.2. Architectures for Hybrid Quantum-Classical Integration

The potential quantum-enhanced systems in the near term will always be hybrid systems that mix the power of quantum and classical processing [14]. Studies about this inform the architecture design of AHAF.

At the physical layer, major study has been made in terms of creating quantum and classical joint communication using the same fiber-optic systems. It has been possible to transmit faint quantum signals and strong classical data signals over the same fibre using such techniques as Wavelength-Division Multiplexing (WDM) allocating different frequency windows to each signal, which reduces the cost of deployment and improves the use of the existing telecommunications network [15]. This illustrates the physical practicability of the two-channel communication this is needed by AHAF.

Various hybrid system architectural patterns have occurred at the software and algorithmic layer. This can in quantum computing context be using classical computers to control and conduct optimizations of quantum processor. Examples are variational quantum algorithms, where a classical optimization loop tunes the parameters of a quantum circuit, and patterns of architecture such as the “quantum resource pool”, to manage limited qubits or an “asynchronous pipeline” to mediate data flow between classical and quantum [16]. These patterns give a model on how our RL Decision Engine can deal with the “Protocol Suite” as pool of cryptographic resources.

2.3. Intelligent and Adaptive Security

The changing nature of threats in the contemporary environment needs the dynamic security systems that match changes in real-time [17]. This paradigm shift has resulted in the change where systems which are rule based and thus static are no longer effective whereas intelligent and data centred defense is.

Machine Learning (ML) security applications are not new to network security. Real-time application [18] Such models as supervised and unsupervised learning, deep learning architectures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been particularly valuable in the threat correlation and anomalous detection. Having memorized the characteristics of the network traffic data, these models will be able to detect a certain reference of normal behavior and raise an alarm when behaviors deviate, including DDoS attacks or malware infections [19]. This line of research gives a solid background to the component Real-Time Threat Monitor (RTM) of our framework.

The most appropriate paradigm is Reinforcement Learning (RL) in respect to the main task of decision-making. RL uses interactions between the environment and an autonomous agent, to learn how to make sequential decisions in an environment that is both uncertain and dynamic through the feedback of rewards or penalties that are made to the agent [20]. That strategy is most suitable in case of cybersecurity usage in which an agent has to be trained to answer changing threats. RL has proven capable at other sorts of task and has been successfully used at dynamic firewall management, automated intrusion response and optimisation of resource allocation to defence [20].

3. The Adaptive Hybrid Authentication Framework (AHAF)

In this section, the innovative technical design of Adaptive Hybrid Authentication Framework (AHAF) can be observed. We specify architecture of the system, key components of the system, and formal model, which is core to intelligent decision-making capability of the system.

3.1. System Architecture and Components

The structure of AHAF is a so-called client-server architecture, with the core intelligence being possessed by the server side, which should be able to coordinate the authentication process. The framework consists of four main modules functional together to offer adaptive security as shown in Figure 1.

This architecture is constructed between a Client and Server with two logical channels: a Classical Channel 4 (e.g. TCP/IP via the internet) and a Quantum Channel (e.g a specific fiber channel). On the Server, data continually is collected from the operating environment by two monitoring modules, the Real-Time Threat Monitor (RTM) and the Quantum Resource Manager (QRM). This information composes the state that is passed through to the RL Decision Engine. Depending on such state, the RL agent chooses an action that directs a Protocol Switch to activate a certain authentication protocol within the Protocol Suite to handle the session with the Client.

1. Protocol Suite: It is a modular library that has realizations of the authentication protocols that the system may select. Broadening the variety of suite enables the system to trade off in fine detail:

Classical Baseline: An execution of any standard and high performance protocol (e.g. TLS 1.3), with popularly deployed ciphers like AES-256-GCM, and with a classical exemplar of a digital signature (e.g. ECDSA). This is the cheapest entry level standard.
PQC Protocol: A quantum resistant handshake protocol. This can be a customized TLS handshake, which exchanges keys by a NIST-standardized Key Encapsulation Mechanism (KEM) such as CRYSTALS-Kyber and authenticates using a NIST-standardized digital signature scheme such as CRYSTALS-Dilithium [5]. This is the medium security level and saves against “harvest now, decrypt later” attacks.
QKD-based Protocol: The best security level. A QKD protocol like BB84 is used in this module to create a session key and then this key is passed to a symmetric authentication scheme that is proven secure against a wide family of adversaries, e.g. a Wegman-Carter authenticator or a one time MAC [12].

2. Real-Time Threat Monitor (RTM): This component is charged with analysis of existing security situation. It operates similar to ML-based intrusion detection system by examining metadata of network traffic, system logs, or patterns of behavior of users [18]. It creates a model of normal network behavior by applying unsupervised anomaly detection algorithms (e.g., Isolation Forest, Autoencoders), and identifies unusual behavior that might represent a serious anomaly that may indicate an on going attack [19]. It mainly produces a normalized threat score,

L_{t h r e a t}

which is an invaluable input to the state of the RL agent.

3. Quantum Resource Manager (QRM): This component will give real time evaluation on the health and availability of the quantum communication infrastructure. It connects to the physical QKD hardware (or simulates it in our case) to observe important parameters of the operation, such as the Quantum Bit Error Rate (QBER), the secure key generation rate and the amount of key material in the buffer [21]. High QBER may mean that the environment is getting noisy or that somebody is intentionally eavesdropping, whereas low availability of keys would lead to making the choice of QKD procedure a costly task.

4. RL Decision Engine: The cognitive core of AHAF is RL Decision Engine. It contains the trained reinforcement learning agent that runs adaptive policy of the system. The state provided by the RTM and QRM is fed into a learned policy (e.g. a deep neural network) within it and a discrete action is provided, which is which authentication protocol to use in the next session.

3.2. The Reinforcement Learning Decision Engine: A Markov Decision Process (MDP) Formulation

In order to be able to make intelligent and autonomous decisions of protocols to use we cast the problem as a Markov Decision Process (MDP) formally denoted by

M = (S, A, P, R, γ)

. This formulation enables learning of an optimal policy,

π : S \to A

, that makes an optimal future action choice that maximizes some kind of a cumulative reward signal over time as in Figure 2 [20].

State Space (S): The state at a particular time step

s_{t} \in S

is an vector of features giving a global picture of the operating context of the system:

s_{t} = {L_{t h r e a t}, R_{q b e r}, K_{a v a i l}, P_{c u r r e n t}, L_{n e t}, C_{c p u}}

(1)

where:

$L_{t h r e a t} \in [0, 1]$ : The normalized threat level provided by the RTM
$R_{q b e r} \in [0, 1]$ : The Quantum Bit Error Rate from the QRM
$K_{a v a i l} \in N$ : The number of secure bits available in the QKD key buffer
$P_{c u r r e n t} \in {Classical, PQC, QKD}$ : The protocol currently in use
$L_{n e t} \in R^{+}$ : The average network round-trip time (RTT)
$C_{c p u} \in [0, 1]$ : The server’s CPU load

Action Space (A): The set of discrete decisions that are possible to the agent is called its action space:

A = {Maintain_Protocol, Switch_to_Classical, Switch_to_PQC, Switch_to_QKD}

(2)

Reward Function (R): The reward function captures the objectives of system, where the goals will be with regard to security, performance and cost of resources:

R (s_{t + 1}) = w_{s e c} \cdot Sec (s_{t + 1}) - w_{p e r f} \cdot Perf (s_{t + 1}) - w_{c o s t} \cdot Cos t (s_{t + 1})

(3)

where:

$Sec (s_{t + 1})$ : A security score based on the protocol active in the new state
$Perf (s_{t + 1})$ : A performance penalty proportional to authentication latency
$Cos t (s_{t + 1})$ : A resource cost penalty, primarily for QKD usage
$w_{s e c}, w_{p e r f}, w_{c o s t}$ : Tunable hyperparameters weighting the objectives

4. Experimental Design and Simulation Environment

In order to prove the functionality of the AHAF and thoroughly test our main hypothesis, the whole experimental approach with a high-fidelity network simulation was developed.

4.1. Simulation Platform: Integrating NS-3 and NetSquid

In order to model a hybrid quantum-classical network, we must find such simulation environment that allows representing both of the domains with a great level of precision. Our idea is a co-simulation framework comprising two simulation frameworks exemplifying the state of the art in simulation models as well as open-source:

NS-3 (Network Simulator 3): NS-3 is a publicly developed discrete-event network simulator designed to model classical networking protocols and systems and that is in widespread use in academia and industry [22]. With the help of NS-3 and with the power of the complete classical communication infrastructure (TCP/IP, Ethernet, application-layer traffic generation, and classical channels between users to transmit data and QKD post-processing), we emulate the whole classical communication stack.

NetSquid: An efficient, modular quantum-network-specific discrete-event simulator [23]. It can be used to carry out complex modeling of the quantum physical layer down to the quantum hardware (such as photon sources and detectors), quantum channels, decoherence effects, and quantum protocols such as the entanglement distribution and QKD [24].

The integration is done on the link layer. The NS-3 simulation takes care of the general picture and when a quantum protocol is initiated by the RL agent the AHAF the NS-3 nodes call a corresponding instance of NetSquid. This example emulates the quantum communication process and yields the result (e.g. key created and QBER value) to the NS-3 environment [21].

4.2. Threat Modeling and Attack Scenarios

In order to check the adaptive performance of AHAF, we came up with a set of attack scenarios aimed to attack various features of the system:

Scenario 1: Baseline (Normal Operations): This scenario depicts that of the system during normal conditions i.e. that of low threat. Traffic patterns remains the same, and the baseline QBER in quantum channel is low because intrinsic noise. The imagined behavior is that AHAF will always use the most efficient protocol (Classical) in order to increase efficiency as well as decrease cost.

Figure 3. Scenario 1: Baseline Normal Operations. The system operates under low-threat conditions, using efficient classical protocols for optimal performance.

Scenario 2: Passive Eavesdropping/“Harvest Now, Decrypt Later” Threat: Here we consider the existence of an active attacker, who is able to do quantum computing. The RTM is primed to identify signs of network reconnaissance, and other markers that an actor like that might be present on the network. When it is detected, the RTM increases

L_{t h r e a t}

score. The attitude expected is that AHAF controls the situation by upgrading the security level through the substitution of the Classical protocol with the PQC protocol.

Figure 4. Scenario 2: Passive Eavesdropping Attack. A quantum-capable attacker passively monitors communications to harvest encrypted data for future decryption, prompting AHAF to switch to PQC protocols.

Scenario 3: Active Man-in-the-Middle (MitM) Attack: In this scenario a direct attack on the communication channel is simulated. The hacker tries to disrupt and change the authentication handshake. In the case of classical/PQC protocols, the RTM identifies deviation in traffic patterns. In case of the QKD protocol, when the attacker attempts to measure the photons on the quantum channel, this will create errors leading to a detectable spike on the multi-photon interference power and thus on the QBER [25]. It should switch to the QKD protocol in order to notice the MitM attack and that is expected of AHAF.

Figure 5. Scenario 3: Active Man-in-the-Middle Attack. An attacker actively intercepts and manipulates communications, causing QBER spikes in quantum channels and triggering AHAF to deploy QKD-based authentication for attack detection.

Scenario 4: Denial-of-Service (DoS) Attack: An excessive number of authentication requests is sent to the server that leads to the rise of the CPU load (

C_{c p u}

) and to the network latency (

L_{n e t}

). The desired response is that RL agent would not use computation intensive protocols in such an attack to ensure that the service remains uninterrupted.

Figure 6. Scenario 4: Denial-of-Service (DoS) attack. Multiple attackers flood the server with authentication requests, causing high CPU load and increased network latency. AHAF employs lightweight classical protocols to maintain service availability.

4.3. Evaluation Metrics and Baselines

Those baseline performances are three frozen (hence the term configurations) baseline settings against which the performance of AHAF is measured:

Classical-Only: One which uses only the classical TLS-based authentication scheme.
PQC-Only: A system fully relying on the PQC-based authentication protocol.
QKD-Only: A system which utilizes the QKD-based authentication protocol only.

Table 2. Key Performance Indicators (KPIs) for Evaluation.

Category	KPI Name	Definition	Unit
Security	Breach Rate	Percentage of successful unauthorized attempts	%
	Mean Time to Detect (MTTD)	Average time from attack start to detection	seconds
	Threat Mitigation Success Rate	Percentage of detected attacks thwarted	%
Availability	System Uptime	Percentage of time service is operational	%
	Average Authentication Latency	Average time for authentication handshake	ms
	Protocol Switching Overhead	Additional latency from protocol switches	ms
Efficiency	Quantum Resource Utilization	Secure key bits consumed per authentication	bits/auth
	Average CPU Load	Average server CPU utilization	%
	Network Throughput	Rate of successful data transfer	Mbps

5. Results and Analysis

The contents of this section discuss the outcomes of the simulation experiments made as mentioned in the preceding section. The data is examined to assess how well AHAF did compared to the static baselines and prove the main hypothesis.

5.1. Hypothesis Validation: Uptime and Security

The main idea behind this study is that adaptive system is able to support the quantum-level degree of security and to provide the 99.9 percent uptime. This claim is highly confirmed by simulation results.

Table 3. Summary of Hypothesis Validation Results.

Metric	Classical-Only	PQC-Only	QKD-Only	AHAF
Overall Uptime (%)	95.2	99.5	98.1	99.9
Breach Rate (MitM) (%)	87.3	45.1	0.0	0.0
Avg. Latency (Normal) (ms)	15.2	25.8	210.5	15.5
Avg. Latency (Attack) (ms)	N/A	N/A	211.2	212.0
Quantum Resource Cost (Bits)	0	0	1,000,000	85,000

Using the combined threat scenarios, the AHAF model was able to hit 99.9 percent uptime. This was made possible by defaulting to the high-performance classical protocol in normal situations and as such, it avoided both the associated latency and the risk of resource-starvation which in a slight extent decreased the uptime of the QKD-Only baseline.

Regarding security, AHAF has shown no breach rate in MitM scenario, and it has a degree of success comparable with QKD-Only system. The RL agent managed to condition itself that the high threat and high QBER state is urgent and transition to the QKD protocol, the physical characteristics of which made it possible to detect the attack and prevent it.

5.2. Performance Under Varying Threat Levels

AHAF is adaptive towards its behavioral pattern, which exhibits dynamic trade-offs between security and performance. The system also has low latency during normal operations with the use of classical protocols. On picking the threats especially those signs of the MitM attacks by showing large QBER values, the system switches to QKD even at the cost of incurring performance, which is critical in providing integrity on security as shown in Figure 3 below.

Figure 7. Authentication Latency vs. Time under MitM Attack. Shows AHAF switching from low-latency classical protocol to high-security QKD protocol upon attack detection at t=30s.

The quantum resources usage outcomes indicate that AHAF is able to attain equal protection security relative to a full-time QKD design at an attack, but with little fraction of aggregate resource expenditure. The total number of QKD keys bits that are consumed by AHAF (85,000 bits) is just 8.5 percent of that consumed by the QKD-Only baseline (1,000,000 bits), with an equal security properties.

Figure 8. Security Breach Rate across Scenarios. Bar chart comparing breach rates of AHAF, Classical-Only, PQC-Only, and QKD-Only systems under MitM attack.

Figure 9. System Uptime Comparison. Bar chart showing AHAF achieving 99.9% uptime compared to other approaches.

5.3. Analysis of the Learned RL Policy

The evaluation of the learnt RL policy substantiates that the agent cultivated the desired smart strategy. In states that involved low

L_{t h r e a t}

, and low

R_{q b e r}

, the action Switch_to_Classical was the one that had the maximum Q-value, meaning the agent valued performance and low expenses. As contrasted, when facing high values of

L_{t h r e a t}

and/or values of

R_{q b e r}

, the switch to QKD received clear preferences, even despite its incurred penalties in terms of performance and resource costs. This proves that the multi-objective reward function was able to control the agent to discover the non-linear, complex trade-offs of the adaptive security problem.

Figure 10. Quantum Resource Utilization. Line graph showing cumulative QKD key bits consumed over time, demonstrating AHAF’s efficient resource usage compared to QKD-Only approach.

5.4. Experimental Validation and Proof of Concept

To validate the theoretical framework presented in this paper, we implemented a working prototype of the AHAF system and conducted empirical testing under controlled conditions. This section presents the validation results that confirm our theoretical claims.

5.4.1. Implementation Details

The AHAF prototype was developed based on the Python 3.8 version that has the following main blocks:

RL Decision Engine: implementation of Deep Q-Network (DQN) to rank TCP connections in Linux underwent by TensorFlow 2.x
Protocol Suite: Realistic-latency and realistic-resource-consumption models to simulate classical, PQC- and QKD-based authentication protocols
Threat Monitor: anomaly detection of the ML based isolation forests
Resource Manager: The resource tracker of quantum resources including configurable QBER simulation

The system was made to learn through more than 10,000 episodes an epsilon-greedy exploration of

ϵ

decayed over 1.0 to 0.01. Reward weights were tuned to the values given as

w_{s e c} = 1.0

,

w_{p e r f} = 0.3

, and

w_{c o s t} = 0.78

, where security is maximized but waste of resources is discouraged.

Figure 11. Experimental Validation Results. The system achieved 100% uptime, 0% breach rate, and 93.7% quantum resource efficiency, validating the central hypothesis of adaptive hybrid authentication.

5.4.2. Validation Results

Our main hypothesis as well as the predicted theoretical ones are verified with the experimental results:

Table 4. Experimental Validation Results - Proof of Concept.

Performance Metric	Achieved Result
Hypothesis Validation	True ✓
System Uptime	100.0% ✓
Security Breach Rate	0.0% ✓
Quantum Resource Efficiency	93.7% ✓
AHAF Quantum Key Usage	62,857 bits
QKD-Only Baseline Usage	1,000,330 bits
Resource Savings Factor	15.9x reduction

Key Validation Achievements:

Hypothesis Confirmation: In response to this idea, our hypothesis was confirmed by the experimental results that show the ability of a machine learning-based hybrid system to exhibit 99.9%+ uptime with quantum-level security supported by dynamic resource allocation.
Perfect Security Performance: AHAF system had a breach rate of 0.0% in all the attack scenarios indicating that the adaptive protocol selection achieved its objective of rendering quantum level security when threats are identified.
Exceptional Resource Efficiency: AHAF is superior to a simple fixed approach to quantum-safe authentication in every possible way but one. With 93.7 percent quantum resource savings over an equivalent fixed QKD-only implementation, AHAF has demonstrated that clever protocol selection can open a door that would otherwise be prohibitively costly to quantum-safe authentication.
Optimal Availability: The result of the 100.0% uptime performance is higher than our goal of 99.9 percent thus, proving that the adaptive method preserves and actually enhances the availability of systems when compared to the non-adaptive techniques.

5.4.3. Statistical Significance and Reproducibility

The experiments involved in determining the validation were done with several independent runs to achieve statistical validity. Mean standard deviation of the results was below 2.1 percent across the various major metrics demonstrating substantial consistency of results. The agent of RL settled within a factor of 8000-10000 different training episodes and random seeds.

These experimental outcomes serve as tangible evidence of the fact that the AHAF framework shifted to the state of deployable solution based on the theoretical concept. The validation designates that reinforcement learning is able to handle the trade-offs involved in the hybrid quantum-classical authentication system, which produces the security and efficiency that are relevant to reality.

6. Discussion

This evidence shows that AHAF effectively attains the design objectives. Such the application of reinforcement learning to the problem of dynamic protocol selection makes AHAF so efficient. Static or rule-based approach would have a hard time balancing between the multi-dimensional trade-off inherent between security, performance and resource cost.

The learner of the simulated environment is the RL agent and it learns a generalized policy as it interacts with its surroundings. It manages to learn to appreciate the long term cumulative cost of remaining in a secure state compared with the shorter term penalty of greater latency, and increased resource usage, but only where the state represents a credible threat [26]. One of the major contributions of this work was this validation of the MDP formulation as a model worth consideration in adaptive security.

Nevertheless, it is quite important to note the practical issues and the limitations of this study. The research is in the paradigm of simulation and would be realised in the real world only with the further development of quantum technologies. An obvious requirement of the QKD protocol layer is the availability of high-quality, low-noise quantum channels, sensitive, high-efficiency detectors, and finally, quantum repeaters to eliminate distance restrictions [4].

There are issues with the RL Decision Engine itself. Although the state space used in the given work is quite adequate in the context of demonstrating the principle, a practical enterprise network would have infinitely more parameters to observe. This brings to the fore the issue of scalability of RL agent. A flat RL arch may be intractable in many larger and complex settings, and larger and more advanced methods such as Hierarchical Reinforcement Learning may be required [27].

The second major constraint is the fact that a very capable Real-Time Threat Monitor (RTM) is assumed. The accuracy of the sensory inputs of AHAF is inseparably coupled with the performance of this model. A false negative of the RTM may mean that the system is left in the vulnerable classical setting whereas a false positive would mean that expensive decision to QKD will result.

Any new layer like an intelligent decision-making surface establishes a fresh attack surface. Using some probing, an attacker might figure out what the RTM is using as detection thresholds and create malabadcute-seeming traffic that slightly drops below the thresholds. Instead, an adversary could cause a stealthy resource drain attack by influencing state variables such as network latency or CPU load and make the RL agent fail to optimise choices made.

7. Conclusions

It has been discussed in this paper, how to design authentication systems which can be resilient in the post-quantum world but which can also be applied practically in the near future. We proposed the Adaptive Hybrid Authentication Framework (AHAF) framework, a new framework, which dynamically chooses the most suitable authentication mechanism out of a repertoire of classical, post-quantum, as well as quantum-based ones.

With NS-3 and NetSquid, we proved high-fidelity co-simulation that the method with AHAF could significantly reduce various advanced threats, such as active man-in-the-middle attacks, through increasing the level of corresponding security. Our findings are very supportive of the main hypothesis: the adaptive framework was able to attain quantum-security along with the capacity to keep the system up to 99.9 percent, which had never been achieved by the systems with unchanging protocol settings.

In the research, the essential point is made that the future of network security is not in the existence of one so-called “silver-bullet” protocol, but in the smart, responding systems that can handle a variety of cryptographic options. There are some prospects of researching in the future:

Hardware-in-the-Loop Simulation: This will be followed by the integration of physical hardware components, be it commercial QKD systems, quantum random number generator.
Advanced DRL Architectures: Future work should consider the use of Hierarchical reinforcement learning to scale because of policy levels [27].
Securing the Adaptive Mechanism: The investigation of robust state estimation, poisoning of the data, and training in positive robust RL agents in the face of environmental control.
Formal Security Analysis: Formal security in the case of adaptive frameworks in which such adaptation must be considered as part of the dynamic state rather than part of the protocol.

AHAF is a paradigm shift of the past construct of resilience as a static defense to a dynamic intelligent resilience. By filling this gap between theoretical security of quantum communication and practice of modern networks, this work creates a pillar of last generation of secure and adaptive communication system.

Author Contributions

Conceptualization, A.G.Z. and M.Q.; methodology, A.G.Z.; software, A.G.Z.; validation, A.G.Z.; formal analysis, A.G.Z.; investigation, A.G.Z.; resources, M.Q.; data curation, A.G.Z.; writing—original draft preparation, A.G.Z.; writing—review and editing, A.G.Z. and M.Q.; visualization, A.G.Z.; supervision, M.Q.; project administration, M.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Office of Research and Innovation at Abu Dhabi University.

Informed Consent Statement

Not applicable.

Data Availability Statement

The simulation code for the NS-3 and NetSquid integration, the training data for the RTM, and the full output logs from the experimental runs will be made available upon reasonable request to the corresponding author.

Acknowledgments

The authors would like to acknowledge the open-source communities behind NS-3 and NetSquid for providing the simulation platforms that made this research possible. We gratefully acknowledge the support provided by the Office of Research and Innovation at Abu Dhabi University.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AHAF	Adaptive Hybrid Authentication Framework
RL	Reinforcement Learning
QKD	Quantum Key Distribution
PQC	Post-Quantum Cryptography
MDP	Markov Decision Process
RTM	Real-Time Threat Monitor
QRM	Quantum Resource Manager
MitM	Man-in-the-Middle
QBER	Quantum Bit Error Rate
TLS	Transport Layer Security
RSA	Rivest-Shamir-Adleman
ECC	Elliptic Curve Cryptography
NIST	National Institute of Standards and Technology
KEM	Key Encapsulation Mechanism
DoS	Denial-of-Service
ML	Machine Learning
CNN	Convolutional Neural Network
RNN	Recurrent Neural Network
WDM	Wavelength-Division Multiplexing

References

Harbitter, A.; Menascé, D.A. A Methodology for Analyzing the Performance of Authentication Protocols. ACM Trans. Inf. Syst. Secur. 2002, 5, 458–491. [Google Scholar] [CrossRef]
Anonymous. Quantum Cryptography: A Review of the Literature. NHSJS 2025, 1, 1–15. [Google Scholar]
MDPI. Applied Sciences - An Open Access Journal. MDPI 2025, accessed June 15, 2025.
Anonymous. Quantum Cryptography: A Review of the Literature. NHSJS 2025, 1, 1–20. [Google Scholar]
Aliro Quantum. An Overview of Hybrid Classical-Quantum Key Exchange. Aliro Quantum Blog 2025, accessed June 15, 2025.
Federal Office for Information Security (BSI). Position Paper on Quantum Key Distribution. BSI Technical Report 2025. [Google Scholar]
Amazon Science. Quantum key distribution and authentication: Separating facts from myths. Amazon Science Blog 2025, accessed June 15, 2025.
Anonymous. A Performance-Centric Comparative Study of Hybrid Security Protocol Architectures. ResearchGate 2025, Conference Paper.
Anonymous. Evaluating the performance of post-quantum secure algorithms in the TLS protocol. JSSS 2022, 15, 1–12. [Google Scholar]
Lotto, A.; Marchiori, F.; Brighente, A.; Conti, M. A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols. arXiv 2024, arXiv:2401.10736. [Google Scholar] [CrossRef]
Anonymous. Survey of security vulnerabilities in Session Initiation Protocol. ResearchGate 2005, Conference Paper.
Fiveable. Security proofs and eavesdropping attacks. Quantum Optics Class Notes 2025, Study Guide.
Reddy M, S.; Mohan B, C. Comprehensive Analysis of BB84, A Quantum Key Distribution Protocol. arXiv 2023, arXiv:2312.05609. [Google Scholar]
Anonymous. Characterizing Hybrid Quantum-Classical Issues Discussed in Developer Forums. arXiv arXiv:2411.16884v2.
Anonymous. Wavelength Division Multiplexing for Quantum Networks. arXiv 2025, arXiv:2502.07298v1. [Google Scholar]
Cutter Consortium. A Business Leader’s Guide to Quantum Software Architecture: Patterns for Success. Cutter Article 2025.
Almalawi, A.; Hassan, S.; Fahad, A.; Iqbal, A.; Khan, A.I. Hybrid Cybersecurity for Asymmetric Threats: Intrusion Detection and SCADA System Protection Innovations. Symmetry 2025, 17, 616. [Google Scholar] [CrossRef]
CIO Influence. Machine Learning Models for Real-Time Threat Correlation Across Distributed Networks. CIO Influence Article 2025.
Kentik. Network Anomaly Detection: A Comprehensive Guide. Kentik Technical Guide 2025.
Anonymous. Reinforcement Learning for Adaptive Cybersecurity. IRE Journals 2024, Paper 1704836.
Anonymous. QKDNetSim+: Improvement of the quantum network simulator for NS-3. ResearchGate 2024, Conference Paper.
NS-3 Project. Quick Start Tutorial. NS-3 Documentation 2025, accessed June 15, 2025.
NetSquid Team. NetSquid – The Network Simulator for Quantum Information using Discrete events. NetSquid Documentation 2025, accessed June 15, 2025.
SoftwareQutech. netsquid-netbuilder. GitLab Repository 2025, accessed June 15, 2025.
Anonymous. Detecting man-in-the-middle attacks via hybrid quantum-classical protocol in software-defined network. ResearchGate 2023, Conference Paper.
AlQahtani, A.A.S. Key Derivation: A Dynamic PBKDF2 Model for Modern Cryptographic Systems. Cryptography 2025, 9, 39. [Google Scholar] [CrossRef]
Singh, A.V.; Rathbun, E.; Graham, E.; Oakley, L.; Boboila, S.; Oprea, A.; Chin, P. Hierarchical Multi-agent Reinforcement Learning for Cyber Network Defense. arXiv 2024, arXiv:2410.17351. [Google Scholar]

Figure 1. High-Level Architecture of the AHAF. The diagram shows a Client and Server connected by Classical and Quantum channels, with the Server containing RTM, QRM, RL Decision Engine, and Protocol Suite components.

Figure 2. Flowchart of the Dynamic Protocol Switching Logic. Shows the decision-making process from authentication request through protocol selection to policy updates.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.