ARTICLE | doi:10.20944/preprints202007.0187.v1
Subject: Keywords: Intrusion Detection System; NSL-KDD Dataset; One Hot Encoding; Information Gain; Convolution Neural Network
Online: 9 July 2020 (12:14:10 CEST)
Cyber security plays an important role in protecting computers, networks, programs and data from unauthorized access. Intrusion detection systems (IDS) and intrusion prevention systems (IPS) are two main categories of cyber security tools, designed to identify suspicious activity in inbound and outbound network packets and to restrict the suspicious incident. Deep neural networks play a significant role in the construction of IDS and IPS. This paper presents a novel IDS using an optimized convolution neural network (CNN-IDS). The optimized CNN-IDS model is an improvement over a plain CNN: it selects the best-weighted model by considering the loss in every epoch. All experiments were conducted on the well-known NSL-KDD dataset. Information gain is used for dimensionality reduction. The accuracy of the proposed model is evaluated for both binary and multiclass classification. Finally, a critical comparison is performed with other general classifiers such as J48, Naive Bayes, NB tree, Random Forest, Multilayer Perceptron (MLP), Support Vector Machine (SVM), Recurrent Neural Network (RNN) and Convolution Neural Network (CNN). The experimental results show that the optimized CNN-IDS model records the best recognition rate with the minimum model construction time.
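The idea of keeping the best-weighted model by looking at the loss in every epoch can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BestWeightsTracker`, the loss values, and the placeholder weight dictionaries are all hypothetical, and a real training loop would supply the loss and weights from the CNN being trained.

```python
import copy


class BestWeightsTracker:
    """Remember the weights from the epoch with the lowest loss seen so far."""

    def __init__(self):
        self.best_loss = float("inf")
        self.best_epoch = None
        self.best_weights = None

    def update(self, epoch, loss, weights):
        """Store a copy of `weights` if `loss` improves on the best so far."""
        if loss < self.best_loss:
            self.best_loss = loss
            self.best_epoch = epoch
            # Deep copy so later training steps cannot mutate the snapshot.
            self.best_weights = copy.deepcopy(weights)
            return True
        return False


# Hypothetical per-epoch losses; in practice these come from training.
losses = [0.9, 0.6, 0.7, 0.5, 0.55]

tracker = BestWeightsTracker()
for epoch, loss in enumerate(losses):
    # Placeholder standing in for the model's real weights at this epoch.
    weights = {"epoch_tag": epoch}
    tracker.update(epoch, loss, weights)

print(tracker.best_epoch, tracker.best_loss)
```

Whether the loss should be measured on the training set or on a held-out validation set is not stated in the abstract; a validation loss is the more common choice because it guards against overfitting.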
ARTICLE | doi:10.20944/preprints202007.0648.v1
Subject: Keywords: Twitter; Social Media; NLP; Tweet; User Categorizations and Mathematical Frame Work
Online: 26 July 2020 (17:23:33 CEST)
Social networking applications such as Twitter have gained increasing significance in the socio-economic, political, religious and entertainment sectors. This, in turn, has produced an explosion of information in the social networking realm that can be both useful and misleading at the same time. Spam detection addresses this problem by identifying irrelevant users and their data. However, existing research has so far focused primarily on user profile information, using activity detection and related techniques that may underperform when these profiles exhibit characteristics such as temporal dependency or poor reflection of the user's generated content. This is the primary motivation for this paper, which addresses the aforementioned problem by focusing on both profile information and content-based spam detection. To this end, this work delivers three significant contributions. Firstly, Natural Language Processing (NLP) techniques are used extensively to create a new comprehensive dataset with a wide range of content-based features. Secondly, this dataset is fed into a customized state-of-the-art hybrid model built from a combination of machine learning and deep learning techniques. Extensive simulation-based analysis not only records over 98% accuracy but also establishes the practical applicability of this proposal by showing that modeling based on mixed profile and content-generated data is more capable of spam detection than either standalone approach. Finally, a novel methodology based on logistic regression is proposed and supported by analytical formulations. This allows the custom-built dataset to be analyzed and the corresponding probabilities that differentiate legitimate users from spammers to be obtained. The resulting mathematical outcome can then be used for future prediction of user categories through appropriate parameter tuning for any given dataset. This makes the method a generic one, capable of identifying and classifying different user categories.
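As a rough illustration of how a logistic regression model turns profile and content features into a probability that a user is a spammer, the following sketch evaluates the standard logistic function on a weighted sum of features. The function name `spam_probability`, the two feature slots, and the weight and bias values are assumptions made for the example; the abstract does not give the paper's actual features or fitted parameters.

```python
import math


def spam_probability(features, weights, bias):
    """Logistic regression score: sigmoid(bias + sum(w_i * x_i))."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable sigmoid: avoids overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical parameters: one profile-based and one content-based feature.
weights = [1.5, -0.7]
bias = 0.0

# With all features at zero and zero bias, the model is undecided.
print(spam_probability([0.0, 0.0], weights, bias))  # 0.5
```

A user would then be assigned to the spammer category when the probability exceeds a chosen threshold, which is one of the parameters that can be tuned per dataset.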
ARTICLE | doi:10.20944/preprints202007.0191.v1
Subject: Keywords: Intrusion Detection System; NSL-KDD Dataset; One Hot Encoding; Information Gain; Decision Tree
Online: 9 July 2020 (12:23:29 CEST)
In today’s world, cyber attacks are one of the major issues concerning organizations that deal with technologies such as cloud computing, big data and IoT. In the area of cyber security, an intrusion detection system (IDS) plays a crucial role in identifying suspicious activities in network traffic. Over the past few years, a lot of research has been done in this area, but network attacks continue to diversify in both volume and variety. In this regard, this research article proposes a novel IDS in which a combination of information gain and a decision tree algorithm is used for dimension reduction and classification. The NSL-KDD dataset is used for the experiments. Initially, out of the 41 features present in the dataset, only the 5 features with the highest information gain are selected for classification. The applicability of the selected features is evaluated through various machine learning algorithms. The experimental results show that the decision tree based algorithm records the highest recognition accuracy among all the classifiers. Based on the initial classification result, a novel decision tree based methodology has been further developed which is capable of identifying multiple attacks by analyzing the packets of various transactions in real time.
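The feature selection step described above, ranking features by information gain and keeping the top few, can be sketched as follows. This is an illustrative sketch rather than the paper's code: the helper names and the four toy records are made up, only two categorical features are shown, and continuous NSL-KDD features would need to be discretized before this calculation applies.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(
        (len(group) / total) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - conditional


def top_k_features(rows, labels, k):
    """Return the names of the k features with the highest information gain."""
    gains = {
        name: information_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(gains, key=gains.get, reverse=True)[:k]


# Toy records (made up): `flag` separates the classes, `protocol_type` does not.
rows = [
    {"flag": "SF", "protocol_type": "tcp"},
    {"flag": "SF", "protocol_type": "udp"},
    {"flag": "S0", "protocol_type": "tcp"},
    {"flag": "S0", "protocol_type": "udp"},
]
labels = ["normal", "normal", "attack", "attack"]

print(top_k_features(rows, labels, 1))  # ['flag']
```

In the setting of the abstract, `k` would be 5 and the selected columns would then be passed to the decision tree classifier.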