1. Introduction
Modern digital communication (and its numerous associated applications) is based on the security foundation of the public-key cryptography which in turn is secured by a set of protocols that include Rivest-Shamir-Adleman (RSA) and Elliptic Curve Cryptography (ECC) which help protect innumerable transactions and interactions [
1]. Nonetheless, the fast progress of the quantum computing poses a critical threat to this foundation. Any quantum computer that is powerful enough, using Shor algorithm, will be able to perform the integer factorization and discrete logarithm problems, in polynomial time, making the security assumptions of RSA and ECC invalid [
2]. This looming “quantum threat” will require that we radically reconsider our authentication and key exchange frameworks, so that people can have long-term security [
3].
To deal with this, the cryptographic community has tried to find two main routes to quantum-resilience. The former is Post-Quantum Cryptography (PQC), which has to do with the construction of classical cryptographic algorithms grounded on mathematical problems (as are lattice-based, code-based, and hash-based cryptography) that it is assumed cannot be efficiently solved by both classical and quantum computers [
4]. PQC standards are being settled on by standardization bodies such as the U.S. National Institute of Standards and Technology (NIST), with the algorithms CRYSTALS-Kyber to encapsulate keys (key encapsulation) and CRYSTALS-Dilithium to perform digital signatures as among the top contenders for adoption [
5].
The second direction is Quantum Key Distribution (QKD) a paradigm that gains its security not by computational hardness but by the laws of physics. Quantum reality (e.g. no-cloning theorem, the observer effect etc) can also be used in protocols (e.g. BB84) that enable two parties to securely agree on a shared secret key with the knowledge that any eavesdropper would cause unavoidable disturbances [
4].
Nevertheless, QKD is not the comprehensive solution, even though it offers the information-theoretic security. One of the most mentioned and important restrictions is so-called authentication gap: QKD protocols presuppose the existence of the authenticated classical channel to exchange the post-processing information and protect against the man-in-the-middle (MitM) attacks [
6]. This forms a chicken-and-egg problem, since the authentication of such a classical channel now needs to be done itself securely, generally by pre-shared keys or classical public-key authentication, again susceptible to the quantum attack. In addition to this, QKD has major downfalls in practice, such as cost of implementation, distance restrictive properties, which require trusted nodes or the experimental quantum repeaters, and slow speeds of key generation relative to classical techniques [
4].
Hybrid between classical and PQC is a fairly popular view of what is a practical route to quantum-safe future, including QKD primitives [
5]. Nevertheless, threat landscape is dynamic, it is not a constant environment comprising of changing attack vectors and changing levels of risk [
26]. A fixed hybrid architecture that may e.g. always use a combination of ECDH and Kyber is not flexible enough to respond to changes in real-time.
The resulting issue is the following: How may a system exploit the unprecedented security provided by QKD to guard against the most extreme threats, and yet reduce its huge performance and resource overheads in non-zero-threat activities, and in a dynamic threat environment adjust its security levels correspondingly?
To alleviate this challenge, this paper presents a new system design, Adaptive Hybrid Authentication Framework (AHAF), that will solve this problem in the modern world. AHAF synchronizes and executes adaptive protocol changing among a range of conventional, PQC, and QKD-based authentication systems with the help of a Reinforcement Learning (RL) decision engine. The environment is constantly evaluated by the system by the help of a Real-Time Threat Monitor and a Quantum Resource Manager that will allow the RL agent to make smart and contextual decisions.
Our key hypothesis is the following:
A hybrid-based machine learning system compared to the need to re-allocate quantum resources only in cases when the classical or the PQC security need is judged inadequate or is compromised is able to provide the guarantee of in uptime of either 99.9% or even 100% but with the same level of security as that that of quantum computers. The methodology would change the relation between QKD and classical authentication to a dynamic relationship, instead of a mere dependency. PQC can be used to protect the classical channel that QKD requires, and once a QKD key has been agreed a key to give information-theoretically secure authentication of further sessions can be shared using a section of it, making it self-reinforcing [
7].
2. Background and State-of-the-Art: A Comparative Review
In order to put the work of Adaptive Hybrid Authentication Framework (AHAF) into perspective this section has thoroughly reviewed the underpinning technologies and worked research in three major domains: the paradigms of authentication that constitute the toolkit of the system, integration architectures of quantum and classical systems and wisdom of intelligent, adaptive security.
2.1. The Comparative Analysis of Modern Authentication Paradigms
The AHAF decision engine has available a wide range of protocols to select. To build up the argument to support the need of an adaptive selection mechanism, it is important to understand the trade-offs that exist between these families of protocols.
Classical Authentication Protocols: A hybrid of symmetric and asymmetric cryptography has been used in secure communication over several decades now [
8]. Such protocols as Transport Layer Security (TLS) have asymmetric algorithms (e.g., RSA, ECDH) in the initial handshake and key exchange round, and then bulk data encryption and message authentication are done using a shared secret created using a symmetric algorithm with a faster key schedule (e.g., AES) [
1]. Their security assumes that problems required to be computationally hard to attack the cipher are also intractable to break with quantum algorithms; despite being computationally efficient and highly optimized (requiring only a small number of microseconds per block to encrypt on modern processor systems using AES-128 encryption) these algorithms are presently felt to be only as safe as the hard problems used to construct them [
1]. In addition to quantum threat, such protocols remain vulnerable to classical implementation flaws, side-channel attacks, and protocol-level vulnerabilities as seen to be the case with e.g. SIP or the automotive CAN bus [
10,
11].
Post-Quantum Cryptography (PQC): With PQC, the classical computing systems will have a “drop-in” alternative to insecure classical algorithms, one that will be designed to run on classical hardware but resist both classical and quantum computing-based attacks [
5]. These algorithms are founded on various mathematical ground, mainly lattice-based, code-based, and hash based, as well as multivariate polynomial cryptography [
4]. The NIST standardization process has singled out leading candidates, such as lattice-based CRYSTALS-Kyber, the Prometheus-based CRYSTALS-Dilithium (each one providing a key encapsulation mechanism as well as a digital signature scheme), and best-of-Grover, lattice-based, and split-state for key encapsulation, and comprehensive (CRYSTALS-Dilithium), and Synthesis-based digital signatures, which all provide strong security and attractive performance [
5]. Nevertheless, in PQC there are no trade-offs. Most PQC algorithms have relatively large key and signature sizes than their classical predecessors and their computational overhead can be higher, although schemes such as Kyber have been demonstrated to have similar performance to elliptic curve schemes in TLS handshakes [
9].
Quantum Key Distribution (QKD) based Authentication: QKD is another fundamentally different security system. Rather than being based on computational hardness, it uses the laws of quantum mechanics, the no-cloning theorem, and the observation that any attempt to measure a quantum system changes that system, to obtain a secret key [
12]. The canonical BB84 protocol is the procedure where Alice sends photons randomly encoded using one of two polarization states, and Bob uses an incongruously selected one of four bases to measure the photons. By engaging in some sequential conversation using a common public classical channel, they can extract a common secret key out of the set of those transmissions on which their bases agreed, and they can estimate the error rate (Quantum Bit Error Rate, or QBER) to see whether an adversary is lurking [
13]. Then one can combine this key with a symmetric primitive, e.g. a Wegman-Carter authenticator or AES-CMAC, to yield information-theoretically secure authentication [
12]. The main weakness of QKD is that to make this post-processing, it requires the existence of an authenticated classical-channel, otherwise, it may be attacked with a MitM attack [
6]. Moreover, realistic QKD systems are vulnerable to physical-layer attacks against device flaws (there have been multiple examples of such attacks being demonstrated to date [
4]), and are limited by key rate, distance, and cost [
4].
Table 1.
Comparative Analysis of Authentication Protocol Families.
Table 1.
Comparative Analysis of Authentication Protocol Families.
| Protocol Family |
Security Basis |
Quantum Resilient |
Latency |
Key Size |
Limitations |
| Classical (RSA-2048) |
Integer Factorization |
No |
Low-High |
Small |
Quantum vulnerable |
| Classical (ECDH P-256) |
Elliptic Curve DLP |
No |
Very Low |
Very Small |
Quantum vulnerable |
| PQC (CRYSTALS-Kyber) |
Module-LWE |
Yes |
Low |
Moderate |
Larger keys than ECC |
| PQC (CRYSTALS-Dilithium) |
Module-LWE |
Yes |
Low |
Moderate |
Larger signatures |
| QKD (BB84-based) |
Quantum Physics |
Yes |
High |
N/A |
Requires auth channel |
2.2. Architectures for Hybrid Quantum-Classical Integration
The potential quantum-enhanced systems in the near term will always be hybrid systems that mix the power of quantum and classical processing [
14]. Studies about this inform the architecture design of AHAF.
At the physical layer, major study has been made in terms of creating quantum and classical joint communication using the same fiber-optic systems. It has been possible to transmit faint quantum signals and strong classical data signals over the same fibre using such techniques as Wavelength-Division Multiplexing (WDM) allocating different frequency windows to each signal, which reduces the cost of deployment and improves the use of the existing telecommunications network [
15]. This illustrates the physical practicability of the two-channel communication this is needed by AHAF.
Various hybrid system architectural patterns have occurred at the software and algorithmic layer. This can in quantum computing context be using classical computers to control and conduct optimizations of quantum processor. Examples are variational quantum algorithms, where a classical optimization loop tunes the parameters of a quantum circuit, and patterns of architecture such as the “quantum resource pool”, to manage limited qubits or an “asynchronous pipeline” to mediate data flow between classical and quantum [
16]. These patterns give a model on how our RL Decision Engine can deal with the “Protocol Suite” as pool of cryptographic resources.
2.3. Intelligent and Adaptive Security
The changing nature of threats in the contemporary environment needs the dynamic security systems that match changes in real-time [
17]. This paradigm shift has resulted in the change where systems which are rule based and thus static are no longer effective whereas intelligent and data centred defense is.
Machine Learning (ML) security applications are not new to network security. Real-time application [
18] Such models as supervised and unsupervised learning, deep learning architectures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been particularly valuable in the threat correlation and anomalous detection. Having memorized the characteristics of the network traffic data, these models will be able to detect a certain reference of normal behavior and raise an alarm when behaviors deviate, including DDoS attacks or malware infections [
19]. This line of research gives a solid background to the component Real-Time Threat Monitor (RTM) of our framework.
The most appropriate paradigm is Reinforcement Learning (RL) in respect to the main task of
decision-making. RL uses interactions between the environment and an autonomous agent, to learn how to make sequential decisions in an environment that is both uncertain and dynamic through the feedback of rewards or penalties that are made to the agent [
20]. That strategy is most suitable in case of cybersecurity usage in which an agent has to be trained to answer changing threats. RL has proven capable at other sorts of task and has been successfully used at dynamic firewall management, automated intrusion response and optimisation of resource allocation to defence [
20].
3. The Adaptive Hybrid Authentication Framework (AHAF)
In this section, the innovative technical design of Adaptive Hybrid Authentication Framework (AHAF) can be observed. We specify architecture of the system, key components of the system, and formal model, which is core to intelligent decision-making capability of the system.
3.1. System Architecture and Components
The structure of AHAF is a so-called client-server architecture, with the core intelligence being possessed by the server side, which should be able to coordinate the authentication process. The framework consists of four main modules functional together to offer adaptive security as shown in
Figure 1.
This architecture is constructed between a Client and Server with two logical channels: a Classical Channel 4 (e.g. TCP/IP via the internet) and a Quantum Channel (e.g a specific fiber channel). On the Server, data continually is collected from the operating environment by two monitoring modules, the Real-Time Threat Monitor (RTM) and the Quantum Resource Manager (QRM). This information composes the state that is passed through to the RL Decision Engine. Depending on such state, the RL agent chooses an action that directs a Protocol Switch to activate a certain authentication protocol within the Protocol Suite to handle the session with the Client.
1. Protocol Suite: It is a modular library that has realizations of the authentication protocols that the system may select. Broadening the variety of suite enables the system to trade off in fine detail:
Classical Baseline: An execution of any standard and high performance protocol (e.g. TLS 1.3), with popularly deployed ciphers like AES-256-GCM, and with a classical exemplar of a digital signature (e.g. ECDSA). This is the cheapest entry level standard.
PQC Protocol: A quantum resistant handshake protocol. This can be a customized TLS handshake, which exchanges keys by a NIST-standardized Key Encapsulation Mechanism (KEM) such as CRYSTALS-Kyber and authenticates using a NIST-standardized digital signature scheme such as CRYSTALS-Dilithium [
5]. This is the medium security level and saves against “harvest now, decrypt later” attacks.
QKD-based Protocol: The best security level. A QKD protocol like BB84 is used in this module to create a session key and then this key is passed to a symmetric authentication scheme that is proven secure against a wide family of adversaries, e.g. a Wegman-Carter authenticator or a one time MAC [
12].
2. Real-Time Threat Monitor (RTM): This component is charged with analysis of existing security situation. It operates similar to ML-based intrusion detection system by examining metadata of network traffic, system logs, or patterns of behavior of users [
18]. It creates a model of normal network behavior by applying unsupervised anomaly detection algorithms (e.g., Isolation Forest, Autoencoders), and identifies unusual behavior that might represent a serious anomaly that may indicate an on going attack [
19]. It mainly produces a normalized threat score,
which is an invaluable input to the state of the RL agent.
3. Quantum Resource Manager (QRM): This component will give real time evaluation on the health and availability of the quantum communication infrastructure. It connects to the physical QKD hardware (or simulates it in our case) to observe important parameters of the operation, such as the Quantum Bit Error Rate (QBER), the secure key generation rate and the amount of key material in the buffer [
21]. High QBER may mean that the environment is getting noisy or that somebody is intentionally eavesdropping, whereas low availability of keys would lead to making the choice of QKD procedure a costly task.
4. RL Decision Engine: The cognitive core of AHAF is RL Decision Engine. It contains the trained reinforcement learning agent that runs adaptive policy of the system. The state provided by the RTM and QRM is fed into a learned policy (e.g. a deep neural network) within it and a discrete action is provided, which is which authentication protocol to use in the next session.
3.2. The Reinforcement Learning Decision Engine: A Markov Decision Process (MDP) Formulation
In order to be able to make intelligent and autonomous decisions of protocols to use we cast the problem as a Markov Decision Process (MDP) formally denoted by
. This formulation enables learning of an optimal policy,
, that makes an optimal future action choice that maximizes some kind of a cumulative reward signal over time as in
Figure 2 [
20].
State Space (S): The state at a particular time step
is an vector of features giving a global picture of the operating context of the system:
where:
: The normalized threat level provided by the RTM
: The Quantum Bit Error Rate from the QRM
: The number of secure bits available in the QKD key buffer
: The protocol currently in use
: The average network round-trip time (RTT)
: The server’s CPU load
Action Space (A): The set of discrete decisions that are possible to the agent is called its action space:
Reward Function (R): The reward function captures the objectives of system, where the goals will be with regard to security, performance and cost of resources:
where:
: A security score based on the protocol active in the new state
: A performance penalty proportional to authentication latency
: A resource cost penalty, primarily for QKD usage
: Tunable hyperparameters weighting the objectives
4. Experimental Design and Simulation Environment
In order to prove the functionality of the AHAF and thoroughly test our main hypothesis, the whole experimental approach with a high-fidelity network simulation was developed.
4.1. Simulation Platform: Integrating NS-3 and NetSquid
In order to model a hybrid quantum-classical network, we must find such simulation environment that allows representing both of the domains with a great level of precision. Our idea is a co-simulation framework comprising two simulation frameworks exemplifying the state of the art in simulation models as well as open-source:
NS-3 (Network Simulator 3): NS-3 is a publicly developed discrete-event network simulator designed to model classical networking protocols and systems and that is in widespread use in academia and industry [
22]. With the help of NS-3 and with the power of the complete classical communication infrastructure (TCP/IP, Ethernet, application-layer traffic generation, and classical channels between users to transmit data and QKD post-processing), we emulate the whole classical communication stack.
NetSquid: An efficient, modular quantum-network-specific discrete-event simulator [
23]. It can be used to carry out complex modeling of the quantum physical layer down to the quantum hardware (such as photon sources and detectors), quantum channels, decoherence effects, and quantum protocols such as the entanglement distribution and QKD [
24].
The integration is done on the link layer. The NS-3 simulation takes care of the general picture and when a quantum protocol is initiated by the RL agent the AHAF the NS-3 nodes call a corresponding instance of NetSquid. This example emulates the quantum communication process and yields the result (e.g. key created and QBER value) to the NS-3 environment [
21].
4.2. Threat Modeling and Attack Scenarios
In order to check the adaptive performance of AHAF, we came up with a set of attack scenarios aimed to attack various features of the system:
Scenario 1: Baseline (Normal Operations): This scenario depicts that of the system during normal conditions i.e. that of low threat. Traffic patterns remains the same, and the baseline QBER in quantum channel is low because intrinsic noise. The imagined behavior is that AHAF will always use the most efficient protocol (Classical) in order to increase efficiency as well as decrease cost.
Figure 3.
Scenario 1: Baseline Normal Operations. The system operates under low-threat conditions, using efficient classical protocols for optimal performance.
Figure 3.
Scenario 1: Baseline Normal Operations. The system operates under low-threat conditions, using efficient classical protocols for optimal performance.
Scenario 2: Passive Eavesdropping/“Harvest Now, Decrypt Later” Threat: Here we consider the existence of an active attacker, who is able to do quantum computing. The RTM is primed to identify signs of network reconnaissance, and other markers that an actor like that might be present on the network. When it is detected, the RTM increases score. The attitude expected is that AHAF controls the situation by upgrading the security level through the substitution of the Classical protocol with the PQC protocol.
Figure 4.
Scenario 2: Passive Eavesdropping Attack. A quantum-capable attacker passively monitors communications to harvest encrypted data for future decryption, prompting AHAF to switch to PQC protocols.
Figure 4.
Scenario 2: Passive Eavesdropping Attack. A quantum-capable attacker passively monitors communications to harvest encrypted data for future decryption, prompting AHAF to switch to PQC protocols.
Scenario 3: Active Man-in-the-Middle (MitM) Attack: In this scenario a direct attack on the communication channel is simulated. The hacker tries to disrupt and change the authentication handshake. In the case of classical/PQC protocols, the RTM identifies deviation in traffic patterns. In case of the QKD protocol, when the attacker attempts to measure the photons on the quantum channel, this will create errors leading to a detectable spike on the multi-photon interference power and thus on the QBER [
25]. It should switch to the QKD protocol in order to notice the MitM attack and that is expected of AHAF.
Figure 5.
Scenario 3: Active Man-in-the-Middle Attack. An attacker actively intercepts and manipulates communications, causing QBER spikes in quantum channels and triggering AHAF to deploy QKD-based authentication for attack detection.
Figure 5.
Scenario 3: Active Man-in-the-Middle Attack. An attacker actively intercepts and manipulates communications, causing QBER spikes in quantum channels and triggering AHAF to deploy QKD-based authentication for attack detection.
Scenario 4: Denial-of-Service (DoS) Attack: An excessive number of authentication requests is sent to the server that leads to the rise of the CPU load () and to the network latency (). The desired response is that RL agent would not use computation intensive protocols in such an attack to ensure that the service remains uninterrupted.
Figure 6.
Scenario 4: Denial-of-Service (DoS) attack. Multiple attackers flood the server with authentication requests, causing high CPU load and increased network latency. AHAF employs lightweight classical protocols to maintain service availability.
Figure 6.
Scenario 4: Denial-of-Service (DoS) attack. Multiple attackers flood the server with authentication requests, causing high CPU load and increased network latency. AHAF employs lightweight classical protocols to maintain service availability.
4.3. Evaluation Metrics and Baselines
Those baseline performances are three frozen (hence the term configurations) baseline settings against which the performance of AHAF is measured:
Classical-Only: One which uses only the classical TLS-based authentication scheme.
PQC-Only: A system fully relying on the PQC-based authentication protocol.
QKD-Only: A system which utilizes the QKD-based authentication protocol only.
Table 2.
Key Performance Indicators (KPIs) for Evaluation.
Table 2.
Key Performance Indicators (KPIs) for Evaluation.
| Category |
KPI Name |
Definition |
Unit |
| Security |
Breach Rate |
Percentage of successful unauthorized attempts |
% |
| |
Mean Time to Detect (MTTD) |
Average time from attack start to detection |
seconds |
| |
Threat Mitigation Success Rate |
Percentage of detected attacks thwarted |
% |
| Availability |
System Uptime |
Percentage of time service is operational |
% |
| |
Average Authentication Latency |
Average time for authentication handshake |
ms |
| |
Protocol Switching Overhead |
Additional latency from protocol switches |
ms |
| Efficiency |
Quantum Resource Utilization |
Secure key bits consumed per authentication |
bits/auth |
| |
Average CPU Load |
Average server CPU utilization |
% |
| |
Network Throughput |
Rate of successful data transfer |
Mbps |
5. Results and Analysis
The contents of this section discuss the outcomes of the simulation experiments made as mentioned in the preceding section. The data is examined to assess how well AHAF did compared to the static baselines and prove the main hypothesis.
5.1. Hypothesis Validation: Uptime and Security
The main idea behind this study is that adaptive system is able to support the quantum-level degree of security and to provide the 99.9 percent uptime. This claim is highly confirmed by simulation results.
Table 3.
Summary of Hypothesis Validation Results.
Table 3.
Summary of Hypothesis Validation Results.
| Metric |
Classical-Only |
PQC-Only |
QKD-Only |
AHAF |
| Overall Uptime (%) |
95.2 |
99.5 |
98.1 |
99.9 |
| Breach Rate (MitM) (%) |
87.3 |
45.1 |
0.0 |
0.0 |
| Avg. Latency (Normal) (ms) |
15.2 |
25.8 |
210.5 |
15.5 |
| Avg. Latency (Attack) (ms) |
N/A |
N/A |
211.2 |
212.0 |
| Quantum Resource Cost (Bits) |
0 |
0 |
1,000,000 |
85,000 |
Using the combined threat scenarios, the AHAF model was able to hit 99.9 percent uptime. This was made possible by defaulting to the high-performance classical protocol in normal situations and as such, it avoided both the associated latency and the risk of resource-starvation which in a slight extent decreased the uptime of the QKD-Only baseline.
Regarding security, AHAF has shown no breach rate in MitM scenario, and it has a degree of success comparable with QKD-Only system. The RL agent managed to condition itself that the high threat and high QBER state is urgent and transition to the QKD protocol, the physical characteristics of which made it possible to detect the attack and prevent it.
5.2. Performance Under Varying Threat Levels
AHAF is adaptive towards its behavioral pattern, which exhibits dynamic trade-offs between security and performance. The system also has low latency during normal operations with the use of classical protocols. On picking the threats especially those signs of the MitM attacks by showing large QBER values, the system switches to QKD even at the cost of incurring performance, which is critical in providing integrity on security as shown in
Figure 3 below.
Figure 7.
Authentication Latency vs. Time under MitM Attack. Shows AHAF switching from low-latency classical protocol to high-security QKD protocol upon attack detection at t=30s.
Figure 7.
Authentication Latency vs. Time under MitM Attack. Shows AHAF switching from low-latency classical protocol to high-security QKD protocol upon attack detection at t=30s.
The quantum resources usage outcomes indicate that AHAF is able to attain equal protection security relative to a full-time QKD design at an attack, but with little fraction of aggregate resource expenditure. The total number of QKD keys bits that are consumed by AHAF (85,000 bits) is just 8.5 percent of that consumed by the QKD-Only baseline (1,000,000 bits), with an equal security properties.
Figure 8.
Security Breach Rate across Scenarios. Bar chart comparing breach rates of AHAF, Classical-Only, PQC-Only, and QKD-Only systems under MitM attack.
Figure 8.
Security Breach Rate across Scenarios. Bar chart comparing breach rates of AHAF, Classical-Only, PQC-Only, and QKD-Only systems under MitM attack.
Figure 9.
System Uptime Comparison. Bar chart showing AHAF achieving 99.9% uptime compared to other approaches.
Figure 9.
System Uptime Comparison. Bar chart showing AHAF achieving 99.9% uptime compared to other approaches.
5.3. Analysis of the Learned RL Policy
The evaluation of the learnt RL policy substantiates that the agent cultivated the desired smart strategy. In states that involved low , and low , the action Switch_to_Classical was the one that had the maximum Q-value, meaning the agent valued performance and low expenses. As contrasted, when facing high values of and/or values of , the switch to QKD received clear preferences, even despite its incurred penalties in terms of performance and resource costs. This proves that the multi-objective reward function was able to control the agent to discover the non-linear, complex trade-offs of the adaptive security problem.
Figure 10.
Quantum Resource Utilization. Line graph showing cumulative QKD key bits consumed over time, demonstrating AHAF’s efficient resource usage compared to QKD-Only approach.
Figure 10.
Quantum Resource Utilization. Line graph showing cumulative QKD key bits consumed over time, demonstrating AHAF’s efficient resource usage compared to QKD-Only approach.
5.4. Experimental Validation and Proof of Concept
To validate the theoretical framework presented in this paper, we implemented a working prototype of the AHAF system and conducted empirical testing under controlled conditions. This section presents the validation results that confirm our theoretical claims.
5.4.1. Implementation Details
The AHAF prototype was developed based on the Python 3.8 version that has the following main blocks:
RL Decision Engine: implementation of Deep Q-Network (DQN) to rank TCP connections in Linux underwent by TensorFlow 2.x
Protocol Suite: Realistic-latency and realistic-resource-consumption models to simulate classical, PQC- and QKD-based authentication protocols
Threat Monitor: anomaly detection of the ML based isolation forests
Resource Manager: The resource tracker of quantum resources including configurable QBER simulation
The system was made to learn through more than 10,000 episodes an epsilon-greedy exploration of decayed over 1.0 to 0.01. Reward weights were tuned to the values given as , , and , where security is maximized but waste of resources is discouraged.
Figure 11.
Experimental Validation Results. The system achieved 100% uptime, 0% breach rate, and 93.7% quantum resource efficiency, validating the central hypothesis of adaptive hybrid authentication.
Figure 11.
Experimental Validation Results. The system achieved 100% uptime, 0% breach rate, and 93.7% quantum resource efficiency, validating the central hypothesis of adaptive hybrid authentication.
5.4.2. Validation Results
Our main hypothesis as well as the predicted theoretical ones are verified with the experimental results:
Table 4.
Experimental Validation Results - Proof of Concept.
Table 4.
Experimental Validation Results - Proof of Concept.
| Performance Metric |
Achieved Result |
| Hypothesis Validation |
True ✓ |
| System Uptime |
100.0% ✓ |
| Security Breach Rate |
0.0% ✓ |
| Quantum Resource Efficiency |
93.7% ✓ |
| AHAF Quantum Key Usage |
62,857 bits |
| QKD-Only Baseline Usage |
1,000,330 bits |
| Resource Savings Factor |
15.9x reduction |
Key Validation Achievements:
Hypothesis Confirmation: In response to this idea, our hypothesis was confirmed by the experimental results that show the ability of a machine learning-based hybrid system to exhibit 99.9%+ uptime with quantum-level security supported by dynamic resource allocation.
Perfect Security Performance: AHAF system had a breach rate of 0.0% in all the attack scenarios indicating that the adaptive protocol selection achieved its objective of rendering quantum level security when threats are identified.
Exceptional Resource Efficiency: AHAF is superior to a simple fixed approach to quantum-safe authentication in every possible way but one. With 93.7 percent quantum resource savings over an equivalent fixed QKD-only implementation, AHAF has demonstrated that clever protocol selection can open a door that would otherwise be prohibitively costly to quantum-safe authentication.
Optimal Availability: The result of the 100.0% uptime performance is higher than our goal of 99.9 percent thus, proving that the adaptive method preserves and actually enhances the availability of systems when compared to the non-adaptive techniques.
5.4.3. Statistical Significance and Reproducibility
The experiments involved in determining the validation were done with several independent runs to achieve statistical validity. Mean standard deviation of the results was below 2.1 percent across the various major metrics demonstrating substantial consistency of results. The agent of RL settled within a factor of 8000-10000 different training episodes and random seeds.
These experimental outcomes serve as tangible evidence of the fact that the AHAF framework shifted to the state of deployable solution based on the theoretical concept. The validation designates that reinforcement learning is able to handle the trade-offs involved in the hybrid quantum-classical authentication system, which produces the security and efficiency that are relevant to reality.
6. Discussion
This evidence shows that AHAF effectively attains the design objectives. Such the application of reinforcement learning to the problem of dynamic protocol selection makes AHAF so efficient. Static or rule-based approach would have a hard time balancing between the multi-dimensional trade-off inherent between security, performance and resource cost.
The learner of the simulated environment is the RL agent and it learns a generalized policy as it interacts with its surroundings. It manages to learn to appreciate the long term cumulative cost of remaining in a secure state compared with the shorter term penalty of greater latency, and increased resource usage, but only where the state represents a credible threat [
26]. One of the major contributions of this work was this validation of the MDP formulation as a model worth consideration in adaptive security.
Nevertheless, it is quite important to note the practical issues and the limitations of this study. The research is in the paradigm of simulation and would be realised in the real world only with the further development of quantum technologies. An obvious requirement of the QKD protocol layer is the availability of high-quality, low-noise quantum channels, sensitive, high-efficiency detectors, and finally, quantum repeaters to eliminate distance restrictions [
4].
There are issues with the RL Decision Engine itself. Although the state space used in the given work is quite adequate in the context of demonstrating the principle, a practical enterprise network would have infinitely more parameters to observe. This brings to the fore the issue of scalability of RL agent. A flat RL arch may be intractable in many larger and complex settings, and larger and more advanced methods such as Hierarchical Reinforcement Learning may be required [
27].
The second major constraint is the fact that a very capable Real-Time Threat Monitor (RTM) is assumed. The accuracy of the sensory inputs of AHAF is inseparably coupled with the performance of this model. A false negative of the RTM may mean that the system is left in the vulnerable classical setting whereas a false positive would mean that expensive decision to QKD will result.
Any new layer like an intelligent decision-making surface establishes a fresh attack surface. Using some probing, an attacker might figure out what the RTM is using as detection thresholds and create malabadcute-seeming traffic that slightly drops below the thresholds. Instead, an adversary could cause a stealthy resource drain attack by influencing state variables such as network latency or CPU load and make the RL agent fail to optimise choices made.
7. Conclusions
It has been discussed in this paper, how to design authentication systems which can be resilient in the post-quantum world but which can also be applied practically in the near future. We proposed the Adaptive Hybrid Authentication Framework (AHAF) framework, a new framework, which dynamically chooses the most suitable authentication mechanism out of a repertoire of classical, post-quantum, as well as quantum-based ones.
With NS-3 and NetSquid, we proved high-fidelity co-simulation that the method with AHAF could significantly reduce various advanced threats, such as active man-in-the-middle attacks, through increasing the level of corresponding security. Our findings are very supportive of the main hypothesis: the adaptive framework was able to attain quantum-security along with the capacity to keep the system up to 99.9 percent, which had never been achieved by the systems with unchanging protocol settings.
In the research, the essential point is made that the future of network security is not in the existence of one so-called “silver-bullet” protocol, but in the smart, responding systems that can handle a variety of cryptographic options. There are some prospects of researching in the future:
Hardware-in-the-Loop Simulation: This will be followed by the integration of physical hardware components, be it commercial QKD systems, quantum random number generator.
Advanced DRL Architectures: Future work should consider the use of Hierarchical reinforcement learning to scale because of policy levels [
27].
Securing the Adaptive Mechanism: The investigation of robust state estimation, poisoning of the data, and training in positive robust RL agents in the face of environmental control.
Formal Security Analysis: Formal security in the case of adaptive frameworks in which such adaptation must be considered as part of the dynamic state rather than part of the protocol.
AHAF is a paradigm shift of the past construct of resilience as a static defense to a dynamic intelligent resilience. By filling this gap between theoretical security of quantum communication and practice of modern networks, this work creates a pillar of last generation of secure and adaptive communication system.
Author Contributions
Conceptualization, A.G.Z. and M.Q.; methodology, A.G.Z.; software, A.G.Z.; validation, A.G.Z.; formal analysis, A.G.Z.; investigation, A.G.Z.; resources, M.Q.; data curation, A.G.Z.; writing—original draft preparation, A.G.Z.; writing—review and editing, A.G.Z. and M.Q.; visualization, A.G.Z.; supervision, M.Q.; project administration, M.Q. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Office of Research and Innovation at Abu Dhabi University.
Informed Consent Statement
Not applicable.
Data Availability Statement
The simulation code for the NS-3 and NetSquid integration, the training data for the RTM, and the full output logs from the experimental runs will be made available upon reasonable request to the corresponding author.
Acknowledgments
The authors would like to acknowledge the open-source communities behind NS-3 and NetSquid for providing the simulation platforms that made this research possible. We gratefully acknowledge the support provided by the Office of Research and Innovation at Abu Dhabi University.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| AHAF |
Adaptive Hybrid Authentication Framework |
| RL |
Reinforcement Learning |
| QKD |
Quantum Key Distribution |
| PQC |
Post-Quantum Cryptography |
| MDP |
Markov Decision Process |
| RTM |
Real-Time Threat Monitor |
| QRM |
Quantum Resource Manager |
| MitM |
Man-in-the-Middle |
| QBER |
Quantum Bit Error Rate |
| TLS |
Transport Layer Security |
| RSA |
Rivest-Shamir-Adleman |
| ECC |
Elliptic Curve Cryptography |
| NIST |
National Institute of Standards and Technology |
| KEM |
Key Encapsulation Mechanism |
| DoS |
Denial-of-Service |
| ML |
Machine Learning |
| CNN |
Convolutional Neural Network |
| RNN |
Recurrent Neural Network |
| WDM |
Wavelength-Division Multiplexing |
References
- Harbitter, A.; Menascé, D.A. A Methodology for Analyzing the Performance of Authentication Protocols. ACM Trans. Inf. Syst. Secur. 2002, 5, 458–491. [Google Scholar] [CrossRef]
- Anonymous. Quantum Cryptography: A Review of the Literature. NHSJS 2025, 1, 1–15. [Google Scholar]
- MDPI. Applied Sciences - An Open Access Journal. MDPI 2025, accessed June 15, 2025.
- Anonymous. Quantum Cryptography: A Review of the Literature. NHSJS 2025, 1, 1–20. [Google Scholar]
- Aliro Quantum. An Overview of Hybrid Classical-Quantum Key Exchange. Aliro Quantum Blog 2025, accessed June 15, 2025.
- Federal Office for Information Security (BSI). Position Paper on Quantum Key Distribution. BSI Technical Report 2025. [Google Scholar]
- Amazon Science. Quantum key distribution and authentication: Separating facts from myths. Amazon Science Blog 2025, accessed June 15, 2025.
- Anonymous. A Performance-Centric Comparative Study of Hybrid Security Protocol Architectures. ResearchGate 2025, Conference Paper.
- Anonymous. Evaluating the performance of post-quantum secure algorithms in the TLS protocol. JSSS 2022, 15, 1–12. [Google Scholar]
- Lotto, A.; Marchiori, F.; Brighente, A.; Conti, M. A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols. arXiv 2024, arXiv:2401.10736. [Google Scholar] [CrossRef]
- Anonymous. Survey of security vulnerabilities in Session Initiation Protocol. ResearchGate 2005, Conference Paper.
- Fiveable. Security proofs and eavesdropping attacks. Quantum Optics Class Notes 2025, Study Guide.
- Reddy M, S.; Mohan B, C. Comprehensive Analysis of BB84, A Quantum Key Distribution Protocol. arXiv 2023, arXiv:2312.05609. [Google Scholar]
- Anonymous. Characterizing Hybrid Quantum-Classical Issues Discussed in Developer Forums. arXiv arXiv:2411.16884v2.
- Anonymous. Wavelength Division Multiplexing for Quantum Networks. arXiv 2025, arXiv:2502.07298v1. [Google Scholar]
- Cutter Consortium. A Business Leader’s Guide to Quantum Software Architecture: Patterns for Success. Cutter Article 2025.
- Almalawi, A.; Hassan, S.; Fahad, A.; Iqbal, A.; Khan, A.I. Hybrid Cybersecurity for Asymmetric Threats: Intrusion Detection and SCADA System Protection Innovations. Symmetry 2025, 17, 616. [Google Scholar] [CrossRef]
- CIO Influence. Machine Learning Models for Real-Time Threat Correlation Across Distributed Networks. CIO Influence Article 2025.
- Kentik. Network Anomaly Detection: A Comprehensive Guide. Kentik Technical Guide 2025.
- Anonymous. Reinforcement Learning for Adaptive Cybersecurity. IRE Journals 2024, Paper 1704836.
- Anonymous. QKDNetSim+: Improvement of the quantum network simulator for NS-3. ResearchGate 2024, Conference Paper.
- NS-3 Project. Quick Start Tutorial. NS-3 Documentation 2025, accessed June 15, 2025.
- NetSquid Team. NetSquid – The Network Simulator for Quantum Information using Discrete events. NetSquid Documentation 2025, accessed June 15, 2025.
- SoftwareQutech. netsquid-netbuilder. GitLab Repository 2025, accessed June 15, 2025.
- Anonymous. Detecting man-in-the-middle attacks via hybrid quantum-classical protocol in software-defined network. ResearchGate 2023, Conference Paper.
- AlQahtani, A.A.S. Key Derivation: A Dynamic PBKDF2 Model for Modern Cryptographic Systems. Cryptography 2025, 9, 39. [Google Scholar] [CrossRef]
- Singh, A.V.; Rathbun, E.; Graham, E.; Oakley, L.; Boboila, S.; Oprea, A.; Chin, P. Hierarchical Multi-agent Reinforcement Learning for Cyber Network Defense. arXiv 2024, arXiv:2410.17351. [Google Scholar]
|
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).