ARTICLE | doi:10.20944/preprints202004.0426.v1
Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Bandit Algorithm; Upper Confidence Bounds; Kullback-Leibler divergence
Online: 24 April 2020 (04:24:31 CEST)
Upper confidence bound (UCB) multi-armed bandit algorithms typically rely on concentration inequalities (such as Hoeffding’s inequality) to construct the upper confidence bound. Intuitively, the tighter the bound is, the more likely it is that each arm is correctly judged as suitable or unsuitable for selection. Hence we derive and utilise an optimal inequality. Usually the sample mean (and sometimes the sample variance) of previous rewards is the information used in the bounds that drive the algorithm, but intuitively, the more information that is taken from the previous rewards, the tighter the bound could be. Hence our inequality explicitly incorporates the value of each and every past reward into the upper bound expression that drives the method. We show how this UCB method fits into the broader scope of other information-theoretic UCB algorithms but, unlike them, is free from assumptions about the distribution of the data. We conclude by reporting some already established regret information and giving some numerical simulations to demonstrate the method’s effectiveness.
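For orientation, the Hoeffding-based baseline that the abstract contrasts itself with can be sketched as follows. This is the classical UCB1 index (sample mean plus a square-root exploration bonus), not the optimal inequality derived in the paper, whose exact form is not given in the abstract. The function names, the Bernoulli reward model, and the arm probabilities are illustrative assumptions.

```python
import math
import random


def ucb1_index(mean, pulls, t):
    """Classical Hoeffding-style UCB1 index: sample mean plus exploration bonus.

    Only the sample mean and the pull count enter the bound; the individual
    past rewards are not used.
    """
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_ucb1(arm_probs, horizon, seed=0):
    """Run UCB1 on Bernoulli arms (assumed reward model, for illustration)."""
    rng = random.Random(seed)
    k = len(arm_probs)
    pulls = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialisation: play every arm once
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(range(k), key=lambda a: ucb1_index(means[a], pulls[a], t))
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the sample mean.
        means[arm] += (reward - means[arm]) / pulls[arm]
    return pulls, means


if __name__ == "__main__":
    pulls, means = run_ucb1([0.2, 0.8], horizon=2000)
    print("pulls per arm:", pulls)
    print("estimated means:", means)
```

A tighter bound would replace `ucb1_index` while leaving the selection loop unchanged, which is the part of the algorithm the abstract proposes to improve.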
ARTICLE | doi:10.20944/preprints202306.1288.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Traffic shaping; Anomaly detection; Intrusion detection; Network security; Internet of Things; Network traffic analysis; Machine learning; SDN (Software-defined networking); GNN (Graph neural network); MAB (Multi-armed bandit)
Online: 19 June 2023 (04:55:33 CEST)
Traffic shaping is a critical task in software-defined IoT networks (SDN-IoTs) for managing network resources efficiently and ensuring Quality of Service (QoS) for end-users. However, traditional traffic shaping approaches based on queuing theory or static policies may not be effective due to the dynamic and unpredictable nature of network traffic. In this paper, we propose a novel approach that leverages Graph Neural Networks (GNNs) and Multi-arm Bandit algorithms to dynamically optimize traffic shaping policies based on real-time network traffic patterns. Specifically, our approach uses a GNN model to learn and predict network traffic patterns and a Multi-arm Bandit algorithm to optimize traffic shaping policies based on these predictions. We evaluate the proposed approach on three different datasets, including a simulated corporate network (KDD Cup 1999), a collection of network traffic traces (CAIDA), and a simulated network environment with both normal and malicious traffic (NSL-KDD). The results demonstrate that our approach outperforms other state-of-the-art traffic shaping methods, achieving higher throughput, lower packet loss, and lower delay, while effectively detecting anomalous traffic patterns. The proposed approach offers a promising solution to traffic shaping in SDNs, enabling efficient resource management and QoS assurance.
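The predict-then-select structure described above can be illustrated with a minimal sketch. The abstract does not say which bandit algorithm is used or what the GNN outputs, so everything here is an assumption: an epsilon-greedy rule stands in for the bandit, a discrete predicted-load label ("low" or "high") stands in for the GNN prediction, and the policy names are hypothetical.

```python
import random


class ContextualEpsilonGreedy:
    """One epsilon-greedy bandit per context (an assumed stand-in for the
    paper's Multi-arm Bandit; the context plays the role of a GNN prediction)."""

    def __init__(self, contexts, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        n = len(self.actions)
        self.counts = {c: [0] * n for c in contexts}
        self.values = {c: [0.0] * n for c in contexts}

    def select(self, context):
        """Return the index of the shaping policy to apply for this context."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.actions))  # explore
        vals = self.values[context]
        return max(range(len(vals)), key=vals.__getitem__)  # exploit

    def update(self, context, action, reward):
        """Fold an observed reward into the running mean for (context, action)."""
        self.counts[context][action] += 1
        n = self.counts[context][action]
        self.values[context][action] += (reward - self.values[context][action]) / n


if __name__ == "__main__":
    # Hypothetical policy names and predicted-load labels.
    agent = ContextualEpsilonGreedy(
        contexts=["low", "high"],
        actions=["strict", "moderate", "relaxed"],
    )
    predicted_load = "high"          # would come from the traffic predictor
    choice = agent.select(predicted_load)
    agent.update(predicted_load, choice, reward=0.7)  # e.g. a QoS-based score
    print(agent.actions[choice], agent.values[predicted_load])
```

In a setup like this the reward would be some scalar summary of throughput, packet loss, and delay; how the paper combines these is not stated in the abstract.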