Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Byte-Pair and N-Gram Convolutional Methods of Analysing Automatically Disseminated Content on Social Platforms
Received: 7 April 2020 / Approved: 13 April 2020 / Online: 13 April 2020 (13:08:26 CEST)
How to cite: Liu, H. Byte-Pair and N-Gram Convolutional Methods of Analysing Automatically Disseminated Content on Social Platforms. Preprints 2020, 2020040214 (doi: 10.20944/preprints202004.0214.v1).
Abstract
In this experiment, an efficient and accurate network for detecting automatically disseminated (bot) content on social platforms is devised. A parallel convolutional neural network (CNN) processes variable n-grams of text, 15, 20, and 25 tokens in length, encoded by Byte Pair Encoding (BPE), which allows the complexities of linguistic content on social platforms to be captured and analysed effectively. When validated on two sets of previously unexposed data, the model achieved accuracies of around 96.6% and 97.4% respectively, meeting or exceeding the performance of other comparable supervised ML solutions to this problem. Testing indicates that this method of text processing and analysis is an effective way of classifying potentially artificially synthesized user data, aiding the security and integrity of social platforms.
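The preprocessing described in the abstract can be illustrated with a short Python sketch: a toy BPE learner, followed by the sliding token windows that three parallel branches of width 15, 20, and 25 would see. This is a minimal sketch under stated assumptions, not the author's implementation. The function names (`learn_bpe_merges`, `merge_pair`, `encode`, `ngram_windows`, `branch_window_counts`), the toy corpus, and the reading of "n-grams" as sliding windows over the token sequence (equivalently, convolution kernel widths) are all hypothetical, and the CNN itself is omitted.

```python
from collections import Counter

# Window widths taken from the abstract; treating them as sliding-window
# (kernel) widths is an assumption of this sketch.
BRANCH_WIDTHS = (15, 20, 25)


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    # Every word starts as a tuple of single characters.
    corpus = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break  # nothing left to merge
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        merged_corpus = Counter()
        for symbols, freq in corpus.items():
            merged_corpus[merge_pair(symbols, best)] += freq
        corpus = merged_corpus
    return merges


def encode(word, merges):
    """Apply the learned merges, in order, to a single word."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


def ngram_windows(tokens, width):
    """All contiguous windows of `width` tokens (empty if the text is too short)."""
    return [tokens[i:i + width] for i in range(len(tokens) - width + 1)]


def branch_window_counts(tokens):
    """Number of windows each hypothetical parallel branch would process."""
    return {w: len(ngram_windows(tokens, w)) for w in BRANCH_WIDTHS}


if __name__ == "__main__":
    merges = learn_bpe_merges(["low", "low", "lower"], num_merges=2)
    print(merges)
    print(encode("lowest", merges))
    print(branch_window_counts(list(range(30))))
```

In a full model, each branch would typically apply a one-dimensional convolution of the corresponding width to the embedded BPE tokens, with the branch outputs pooled and concatenated before classification. That arrangement is a common design for parallel text CNNs and is assumed here, not taken from the abstract.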
Keywords
bot detection; machine learning; natural language processing; computational linguistics
Subject
MATHEMATICS & COMPUTER SCIENCE, Artificial Intelligence & Robotics
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

