Version 1: Received: 6 June 2021 / Approved: 7 June 2021 / Online: 7 June 2021 (16:16:18 CEST)
Version 2: Received: 23 November 2021 / Approved: 23 November 2021 / Online: 23 November 2021 (14:45:31 CET)
Version 3: Received: 9 February 2022 / Approved: 17 February 2022 / Online: 17 February 2022 (13:15:23 CET)
How to cite:
Shoeibi, N.; Shoeibi, N.; Chamoso, P.; Alizadehsani, Z.; Corchado, J.M. Similarity approximation of Twitter Profiles. Preprints.org 2021, 2021060196. https://doi.org/10.20944/preprints202106.0196.v1.
Social media platforms have become an undeniable part of everyday life over the past decade. Analyzing the information shared on them is a crucial step toward understanding human behavior. Social media analysis aims to provide a better user experience and higher user satisfaction. But first, it is necessary to know how, and on which aspects, to compare users with each other. This paper proposes an intelligent system for measuring the similarity of Twitter profiles. First, the timeline of each profile is extracted using the official Twitter API and passed to the proposed system. Then three aspects of a profile are derived in parallel. Behavioral ratios are time-series information that captures the consistency and habits of the user; dynamic time warping is used to compare the behavioral ratios of two profiles. Graph network analysis is used to monitor the interactions between the user and their audience; Jaccard similarity is used to estimate the similarity of the graphs. Finally, for content similarity, natural language processing techniques are used for preprocessing and TF-IDF for feature extraction, and the resulting features are compared using cosine similarity. The results present the similarity level of different profiles. In the case study, people with the same interests show higher similarity. This method of comparison is useful in many other areas. It also makes it possible to find duplicate profiles, that is, profiles with almost the same behavior and content.
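The three comparison measures named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names, the whitespace tokenizer, the smoothed IDF formula, and the representation of an interaction graph as a set of (user, other user) edges are all assumptions made for illustration; the abstract does not specify these details or how the three scores are combined.

```python
import math
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric series.

    Uses the absolute difference as the local cost (an assumption).
    Lower values mean more similar series; 0 means a perfect alignment.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance in a only
                                    cost[i][j - 1],      # advance in b only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def jaccard_similarity(edges_a, edges_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two edge collections."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) for a list of already-cleaned texts.

    Tokenization is a plain lowercase whitespace split, and the IDF is
    smoothed as log((1 + N) / (1 + df)) + 1; both are assumptions.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Note that dynamic time warping yields a distance rather than a similarity, so it would have to be converted or normalized before being reported alongside the Jaccard and cosine scores. How that is done is not stated in the abstract.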
Keywords
Twitter; Social Media; Social Networking; Social Network Analytics; Graph Analytics; Text Similarity; Natural Language Processing; User Engagement
Subject
Computer Science and Mathematics, Computer Science
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.