Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Similarity approximation of Twitter Profiles

Version 1 : Received: 6 June 2021 / Approved: 7 June 2021 / Online: 7 June 2021 (16:16:18 CEST)
Version 2 : Received: 23 November 2021 / Approved: 23 November 2021 / Online: 23 November 2021 (14:45:31 CET)
Version 3 : Received: 9 February 2022 / Approved: 17 February 2022 / Online: 17 February 2022 (13:15:23 CET)

How to cite: Shoeibi, N.; Shoeibi, N.; Chamoso, P.; Alizadehsani, Z.; Corchado, J.M. Similarity approximation of Twitter Profiles. 2021, 2021060196.


Social media platforms have become an undeniable part of everyday life over the past decade. Analyzing the information shared on them is a crucial step toward understanding human behavior. Social media analysis aims to provide a better user experience and to raise user satisfaction. To do so, it is first necessary to decide how, and along which aspects, users should be compared with one another. This paper proposes an intelligent system for measuring the similarity of Twitter profiles. First, the timeline of each profile is extracted using the official Twitter API, and all of the extracted information is passed to the proposed system. Three aspects of a profile are then derived in parallel. Behavioral ratios are time-series information that reflects the consistency and habits of the user; dynamic time warping is used to compare the behavioral ratios of two profiles. Graph network analysis is used to monitor the interactions between the user and their audience, and Jaccard similarity is used to estimate the similarity of the resulting graphs. Finally, for content similarity, natural language processing techniques are used for preprocessing and TF-IDF for feature extraction, and the resulting feature vectors are compared using cosine similarity. The results present the similarity level of different profiles. In the case study, people with the same interests show higher similarity. This way of comparing profiles is useful in many other areas; it also makes it possible to find duplicate profiles, that is, profiles with almost the same behavior and content.
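
The three comparison measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the absolute-difference cost in the dynamic time warping step, the choice to apply Jaccard similarity to sets of interaction edges, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for illustration, and the sample inputs are invented.

```python
import math
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference of two values.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def jaccard_similarity(x, y):
    """Jaccard similarity |X & Y| / |X | Y| of two collections.

    Assumption: a profile's graph is represented by its set of edges.
    """
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(x & y) / len(x | y)


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) for a list of documents.

    Assumptions: lowercase whitespace tokenization and a smoothed IDF,
    log((1 + N) / (1 + df)) + 1, so that weights are never zero.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical behavioral ratios (e.g. share of retweets per week)
    print(dtw_distance([0.1, 0.4, 0.3], [0.1, 0.4, 0.4, 0.3]))
    # Hypothetical interaction edges (user, mentioned account)
    print(jaccard_similarity({("u", "a"), ("u", "b")},
                             {("u", "b"), ("u", "c")}))
    # Hypothetical timeline texts
    vec_a, vec_b = tfidf_vectors(["graph analysis of tweets",
                                  "text analysis of tweets"])
    print(cosine_similarity(vec_a, vec_b))
```

Dynamic time warping returns a distance, where lower means more alike, whereas the other two measures return similarities in the range 0 to 1, so the three scores would need to be brought onto a common scale before being compared or combined.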


Twitter; Social Media; Social Networking; Social Network Analytic; Graph Analytic; Text Similarity; Natural Language Processing; User Engagement.


Computer Science and Mathematics, Computer Science
