Submitted:
12 February 2025
Posted:
13 February 2025
Abstract
This study examines the security implications of generative artificial intelligence (GAI), focusing on models such as ChatGPT. As GAI technologies are increasingly integrated into industries like healthcare, education, and media, concerns are growing regarding security vulnerabilities, ethical challenges, and the potential for misuse. To address these concerns, this research analyzes 1,047 peer-reviewed academic articles from the Scopus database using scientometric methods, including term frequency-inverse document frequency (TF-IDF) analysis, keyword centrality analysis, and latent Dirichlet allocation (LDA) topic modeling. The results highlight significant contributions from countries such as the United States, China, and India, with leading institutions like the Chinese Academy of Sciences and the National University of Singapore driving research on GAI security. In the keyword centrality analysis, "ChatGPT" emerged as a highly central term, ranking first in both degree and betweenness centrality, which reflects its prominence in the research discourse. However, despite its frequent mention, "ChatGPT" showed lower proximity (closeness) centrality than terms like "model" and "language." This suggests that while ChatGPT is broadly associated with other key themes, it has a less direct connection to specific research subfields. Topic modeling identified six major themes, including AI and security in education, language models, data processing, and risk management. The analysis emphasizes the need for robust security frameworks to address technical vulnerabilities, ensure ethical responsibility, and manage risks in the safe deployment of AI systems. These frameworks must incorporate not only technical solutions but also ethical accountability, regulatory compliance, and continuous risk management. This study underscores the importance of interdisciplinary research that integrates technical, legal, and ethical perspectives to ensure the responsible and secure deployment of GAI technologies.
Keywords:
1. Introduction
1.1. Background
1.2. Previous Research on Generative AI
1.3. Previous Research on Bibliometrics
2. Materials and Methods
2.1. Research Subjects
2.1.1. Collecting Research Data
2.1.2. Refining Research Data
- Case unification: All keywords were converted to lowercase to eliminate duplicates caused by case variations. For example, "ChatGPT" and "chatgpt" were standardized to a single keyword.
- Removal of special characters: Special characters such as hyphens (-), slashes (/), and spaces were removed to unify keywords with the same meaning. For example, "AI-based" and "AI based" were combined into a single term.
- Elimination of non-essential terms: Irrelevant terms that did not contribute to the analysis objectives were removed to retain only keywords with significant analytical value.
- Consolidating synonymous terms: Terms with similar meanings were standardized to ensure consistency across the dataset.
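The four refinement steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' actual pipeline: the `SYNONYMS` map, the `NON_ESSENTIAL` set, and the function names are hypothetical placeholders, since the paper does not list the specific mappings it used.

```python
import re

# Hypothetical synonym map (assumption): variant -> canonical keyword.
SYNONYMS = {"llms": "llm", "largelanguagemodel": "llm"}

# Hypothetical set of non-essential terms (assumption).
NON_ESSENTIAL = {"study", "paper"}


def normalize_keyword(raw):
    """Apply the refinement steps to one keyword; return None if it is dropped."""
    kw = raw.lower()                   # case unification
    kw = re.sub(r"[-/\s]+", "", kw)    # remove hyphens, slashes, and spaces
    kw = SYNONYMS.get(kw, kw)          # consolidate synonymous terms
    if kw in NON_ESSENTIAL:            # eliminate non-essential terms
        return None
    return kw


def refine(keywords):
    """Normalize a list of keywords and drop duplicates, keeping first-seen order."""
    refined = []
    for raw in keywords:
        kw = normalize_keyword(raw)
        if kw is not None and kw not in refined:
            refined.append(kw)
    return refined


if __name__ == "__main__":
    print(refine(["ChatGPT", "chatgpt", "AI-based", "AI based", "LLMs", "paper"]))
```

With these placeholder mappings, "ChatGPT" and "chatgpt" collapse into one keyword, as do "AI-based" and "AI based", mirroring the examples given in the list.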
2.2. Research Method
2.2.1. Analysis Tool
2.2.2. Term Frequency and Term Frequency-Inverse Document Frequency Analyses
2.2.3. Keyword Centrality Analysis
- Degree centrality measures the number of direct connections a keyword has within the network, indicating how many other keywords it co-occurs with. Keywords with high degree centrality often co-occur with other significant terms, underscoring their prominence in the research network [38].
- Closeness centrality (also known as proximity centrality) assesses how close a keyword is to all other keywords in the network. It reflects a keyword’s ability to connect disparate concepts within the broader research landscape, thereby demonstrating its accessibility and importance [38].
- Betweenness centrality evaluates the extent to which a keyword acts as a bridge between different clusters of keywords. Keywords with high betweenness centrality facilitate the flow of information across disconnected groups, maintaining the coherence of the research network [38].
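As a rough illustration of the three measures, the sketch below computes them on a tiny keyword co-occurrence network using only the Python standard library. It assumes an unweighted, undirected, connected graph and the usual normalizations (degree divided by n - 1, closeness as (n - 1) over the sum of shortest-path distances, betweenness via Brandes' algorithm scaled by (n - 1)(n - 2)). The toy documents and function names are hypothetical, and the analysis tool used in the study may apply different weighting or normalization.

```python
from collections import deque
from itertools import combinations


def build_graph(docs):
    """Undirected co-occurrence graph: two keywords are linked if they share a document."""
    graph = {}
    for keywords in docs:
        unique = sorted(set(keywords))
        for kw in unique:
            graph.setdefault(kw, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def _bfs(graph, source):
    """Breadth-first search returning visit order, distances, path counts, predecessors."""
    dist = {source: 0}
    sigma = {v: 0 for v in graph}      # number of shortest paths from source
    sigma[source] = 1
    preds = {v: [] for v in graph}     # predecessors on shortest paths
    order = []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    return order, dist, sigma, preds


def degree_centrality(graph):
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """(n - 1) / sum of distances; assumes the graph is connected."""
    result = {}
    for v in graph:
        _, dist, _, _ = _bfs(graph, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Brandes' algorithm for unweighted graphs, normalized to [0, 1]."""
    n = len(graph)
    score = {v: 0.0 for v in graph}
    for s in graph:
        order, _, sigma, preds = _bfs(graph, s)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair is counted twice across sources, so dividing by
    # (n - 1)(n - 2) normalizes by the (n - 1)(n - 2) / 2 possible pairs.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: b * scale for v, b in score.items()}


if __name__ == "__main__":
    # Hypothetical toy corpus: each inner list is one article's keyword set.
    docs = [["chatgpt", "security"], ["chatgpt", "education"], ["chatgpt", "privacy"]]
    g = build_graph(docs)
    print(degree_centrality(g))
    print(closeness_centrality(g))
    print(betweenness_centrality(g))
```

In this toy star-shaped network, "chatgpt" is the only bridge between the other three keywords, so it scores highest on all three measures. In a larger network the rankings can diverge, as they do for "ChatGPT" in the study's results.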
2.2.4. LDA Topic Modeling
3. Results
3.1. Literature Analysis Results
3.1.1. Status by Country
3.1.2. Status by Institution
3.2. Results of Keyword Frequency (TF-IDF) Analysis
3.3. Results of Keyword Centrality Analysis
3.4. LDA Topic Modeling
3.4.1. Coherence Score Measurement Results
3.4.2. Topic Classification Results
3.4.3. Weight by Topics
4. Discussion
5. Conclusion
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Lund, B.D.; Wang, T. Chatting about ChatGPT: How May AI and GPT Impact Academia and Libraries? Library Hi Tech News 2023, 40, 26–29.
2. Zhang, S.; Liau, Z.Q.G.; Tan, K.L.M.; Chua, W.L. Evaluating the Accuracy and Relevance of ChatGPT Responses to Frequently Asked Questions Regarding Total Knee Replacement. Knee Surg. Relat. Res. 2024, 36, 15.
3. FirstPageSage. Top Generative AI Chatbots by Market Share – December 2024. Available online: https://www.firstpagesage.com (accessed on August 2024).
4. Aljanabi, M. ChatGPT: Future Directions and Open Possibilities. Mesopotamian J. Cybersecur. 2023, 2023, 16–17.
5. Gill, S.S.; Kaur, R. ChatGPT: Vision and Challenges. Internet Things Cyber-Phys. Syst. 2023, 3, 262–271.
6. Javaid, M.; Haleem, A.; Singh, R.P. ChatGPT for Healthcare Services: An Emerging Stage for an Innovative Perspective. BenchCouncil Trans. Benchmarks Stand. Eval. 2023, 3, 100105.
7. George, A.S.; George, A.H. A Review of ChatGPT AI’s Impact on Several Business Sectors. Partn. Univ. Int. Innov. J. 2023, 1, 9–23.
8. Li, L.; Ma, Z.; Fan, L.; Lee, S.; Yu, H.; Hemphill, L. ChatGPT in Education: A Discourse Analysis of Worries and Concerns on Social Media. arXiv 2023, arXiv:2305.02201.
9. Lo, C.K. What Is the Impact of ChatGPT on Education? A Rapid Review of the Literature. Educ. Sci. 2023, 13, 410.
10. Rathore, B. Future of AI & Generation Alpha: ChatGPT beyond Boundaries. Eduzone Int. Peer Rev./Ref. Multidiscip. J. 2023, 12, 63–68.
11. Kaddour, J.; Harris, J.; Mozes, M.; Bradley, H.; Raileanu, R.; McHardy, R. Challenges and Applications of Large Language Models. arXiv 2023, arXiv:2307.10169.
12. Buchholz, K. Threads Shoots Past One Million User Mark at Lightning Speed. Statista, January 24, 2023. Available online: https://www.statista.com/chart/29174/time-to-one-million-users (accessed on 5 January 2023).
13. Ahmad, N.; Murugesan, S.; Kshetri, N. Generative Artificial Intelligence and the Education Sector. Computer 2023, 56, 72–76.
14. Javaid, M.; Haleem, A.; Singh, R.P. A Study on ChatGPT for Industry 4.0: Background, Potentials, Challenges, and Eventualities. J. Econ. Technol. 2023, 1, 127–143.
15. Alawida, M.; Abu Shawar, B.; Abiodun, O.I.; Mehmood, A.; Omolara, A.E.; Al Hwaitat, A.K. Unveiling the Dark Side of ChatGPT: Exploring Cyberattacks and Enhancing User Awareness. Information 2024, 15, 27.
16. Rudolph, J.; Tan, S.; Tan, S. ChatGPT: Bullshit Spewer or the End of Traditional Assessments in Higher Education? J. Appl. Learn. Teach. 2023, 6, 342–363.
17. Baidoo-Anu, D.; Owusu Ansah, L. Education in the Era of Generative Artificial Intelligence (AI): Understanding the Potential Benefits of ChatGPT in Promoting Teaching and Learning. J. AI 2023, 7, 52–62.
18. Wang, J.; et al. Is ChatGPT a Good NLG Evaluator? A Preliminary Study. arXiv 2023, arXiv:2303.04048.
19. Srivastava, M. A Day in the Life of ChatGPT as an Academic Reviewer: Investigating the Potential of Large Language Model for Scientific Literature Review. Preprint 2023.
20. Kim, S.-D. Trends and Perspectives of mHealth in Obesity Control. Appl. Sci. 2025, 15, 74.
21. Wu, T.; et al. A Brief Overview of ChatGPT: The History, Status Quo and Potential Future Development. IEEE/CAA J. Autom. Sin. 2023, 10, 1122–1136.
22. Hosseini, M.; Horbach, S.P.J.M. Fighting Reviewer Fatigue or Amplifying Bias? Considerations and Recommendations for Use of ChatGPT and Other Large Language Models in Scholarly Peer Review. Res. Integr. Peer Rev. 2023, 8, 4.
23. Cotton, D.R.E.; Cotton, P.A.; Shipway, J.R. Chatting and Cheating: Ensuring Academic Integrity in the Era of ChatGPT. Innov. Educ. Teach. Int. 2024, 61, 228–239.
24. Younis, H.A.; et al. A Systematic Review and Meta-Analysis of Artificial Intelligence Tools in Medicine and Healthcare: Applications, Considerations, Limitations, Motivation and Challenges. Diagnostics 2024, 14, 109.
25. Zhou, K.; Wang, J.; Ashuri, B.; Chen, J. Discovering the Research Topics on Construction Safety and Health Using Semi-Supervised Topic Modeling. Buildings 2023, 13, 1169.
26. Sun, L.; Yin, Y. Discovering Themes and Trends in Transportation Research Using Topic Modeling. Transp. Res. Part C Emerg. Technol. 2017, 77, 49–66.
27. Nie, B.; Sun, S. Using Text Mining Techniques to Identify Research Trends: A Case Study of Design Research. Appl. Sci. 2017, 7, 401.
28. Hall, B. Text Mining and Data Visualization: Exploring Cultural Formations and Structural Changes in Fifty Years of Eighteenth-Century Poetry Criticism (1967–2018). Data Vis. Enlight. Lit. Cult. 2021, 153–195.
29. Yau, S.C.; et al. Detection of Topic on Health News in Twitter Data. Emerg. Adv. Integr. Technol. 2021, 2, 23–29.
30. Ma, B.; Liu, S.; Pei, F.; Su, Z.; Yu, J.; Hao, C.; Li, Q.; Jiang, L.; Zhang, J.; Gan, Z. Development of Hydrogen Energy Storage Industry and Research Progress of Hydrogen Production Technology. In Proceedings of the 2021 IEEE 4th International Electrical and Energy Conference, CIEEC 2021, Wuhan, China, 28–30 May 2021.
31. Choi, C.; Lee, J.; Machado, J.; Kim, G. Big-Data-Based Text Mining and Social Network Analysis of Landscape Response to Future Environmental Change. Land 2022, 11, 2183.
32. Kim, J.; Han, S.; Lee, H.; Koo, B.; Nam, M.; Jang, K.; Lee, J.; Chung, M. Trend Research on Maritime Autonomous Surface Ships (MASSs) Based on Shipboard Electronics: Focusing on Text Mining and Network Analysis. Electronics 2024, 13, 1902.
33. Baas, J.; Schotten, M.; Plume, A.; Côté, G.; Karimi, R. Scopus as a Curated, High-Quality Bibliometric Data Source for Academic Research in Quantitative Science Studies. Quant. Sci. Stud. 2020, 1, 377–386.
34. Park, S.; Park, J. Identifying the Knowledge Structure and Trends of Outreach in Public Health Care: A Text Network Analysis and Topic Modeling. Int. J. Environ. Res. Public Health 2021, 18, 9309.
35. Aizawa, A. An Information-Theoretic Perspective of TF-IDF Measures. Inf. Process. Manag. 2003, 39, 45–65.
36. Xiang, L. Application of an Improved TF-IDF Method in Literary Text Classification. Adv. Multimed. 2022, 2022, 9285324.
37. Park, C.S. Using Text Network Analysis for Analyzing Academic Papers in Nursing. Perspect. Nurs. Sci. 2019, 16, 12–24.
38. Zhang, J.; Luo, Y. Degree Centrality, Betweenness Centrality, and Closeness Centrality in Social Network. In Proceedings of the 2017 2nd International Conference on Modelling, Simulation and Applied Mathematics (MSAM2017), Bangkok, Thailand, 26–27 March 2017; pp. 300–303.
39. Jelodar, H.; Wang, Y.; Yuan, C.; Feng, X. Latent Dirichlet Allocation (LDA) and Topic Modeling: Models, Applications, a Survey. Multimed. Tools Appl. 2019, 78, 15169–15211.
40. Vorontsov, K.; Potapenko, A.; Plavin, A. Additive Regularization of Topic Models for Topic Selection and Sparse Factorization. In Proceedings of the Statistical Learning and Data Sciences: Third International Symposium, SLDS 2015, Egham, UK, 20–23 April 2015; pp. 193–202.
41. O’Callaghan, D.; Greene, D.; Carthy, J.; Cunningham, P. An Analysis of the Coherence of Descriptors in Topic Modeling. Expert Syst. Appl. 2015, 42, 5645–5657.
42. Durbin, J.; Watson, G.S. Testing for Serial Correlation in Least Squares Regression. III. Biometrika 1971, 58, 1–9.
43. Seo, Y.; Kim, K.; Kim, J.-S. Trends of Nursing Research on Accidental Falls: A Topic Modeling Analysis. Int. J. Environ. Res. Public Health 2021, 18, 3963.
44. Röder, M.; Both, A.; Hinneburg, A. Exploring the Space of Topic Coherence Measures. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, Shanghai, China, 2–6 February 2015; pp. 399–408.
45. Gallent Torres, C.; Zapata-González, A.; Ortego-Hernando, J.L. The Impact of Generative Artificial Intelligence in Higher Education: A Focus on Ethics and Academic Integrity. RELIEVE 2023, 29, 1–19.
46. Alawida, M.; et al. A Comprehensive Study of ChatGPT: Advancements, Limitations, and Ethical Considerations in Natural Language Processing and Cybersecurity. Information 2023, 14, 462.
47. Horne, D. PwnPilot: Reflections on Trusting Trust in the Age of Large Language Models and AI Code Assistants. In Proceedings of the 2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE), IEEE, 2023.
48. Charfeddine, M.; et al. ChatGPT’s Security Risks and Benefits: Offensive and Defensive Use-Cases, Mitigation Measures, and Future Implications. IEEE Access 2024.
49. Tang, L.; Bashir, M. A Comprehensive Analysis of Public Sentiment Towards ChatGPT’s Privacy Implications. In Proceedings of the International Conference on Human-Computer Interaction; Springer Nature Switzerland: Cham, 2024.
50. Chen, C.W.; Walter, P.; Wei, J.C.-C. Using ChatGPT-Like Solutions to Bridge the Communication Gap Between Patients with Rheumatoid Arthritis and Health Care Professionals. JMIR Med. Educ. 2024, 10, e48989.

| Preprocessing Type | Purpose | Example |
|---|---|---|
| Synonym Consolidation | Standardizes multiple words with the same meaning for consistency | "chatGPT," "chatgpt," "Chatgpt," "CHATGPT" → "ChatGPT" |
| Case Conversion | Prevents duplication caused by letter case variations | "Ai" and "ai" → "AI" |
| Stop Words Removal | Removes words that do not add analytical value | "for," "with," "on" → removed |

| Rank | Word | Frequency | Word | TF-IDF | Rank | Word | Frequency | Word | TF-IDF |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ChatGPT | 2,799 | model | 602 | 11 | education | 499 | analysis | 297 |
| 2 | model | 1,503 | language | 530 | 12 | intelligence | 472 | use | 294 |
| 3 | AI | 1,271 | AI | 433 | 13 | system | 466 | security | 265 |
| 4 | language | 958 | datum | 408 | 14 | result | 464 | development | 256 |
| 5 | datum | 768 | intelligence | 367 | 15 | use | 454 | information | 250 |
| 6 | LLM | 735 | technology | 360 | 16 | challenge | 453 | system | 239 |
| 7 | technology | 684 | result | 348 | 17 | analysis | 449 | finding | 237 |
| 8 | tool | 611 | application | 334 | 18 | security | 432 | method | 236 |
| 9 | application | 552 | tool | 331 | 19 | user | 420 | task | 232 |
| 10 | student | 544 | challenge | 307 | 20 | information | 403 | education | 231 |

| Rank | Word | Degree Centrality | Word | Closeness Centrality | Word | Betweenness Centrality |
|---|---|---|---|---|---|---|
| 1 | ChatGPT | 0.93878 | model | 0.67606 | ChatGPT | 0.02865 |
| 2 | AI | 0.87755 | language | 0.5931 | AI | 0.02539 |
| 3 | model | 0.81633 | LLM | 0.28357 | result | 0.02225 |
| 4 | result | 0.81633 | ChatGPT | 0.25657 | datum | 0.01983 |
| 5 | datum | 0.79592 | AI | 0.1293 | model | 0.01907 |
| 6 | LLM | 0.77551 | use | 0.07502 | technology | 0.01786 |
| 7 | technology | 0.77551 | tool | 0.06478 | LLM | 0.01683 |
| 8 | analysis | 0.7551 | application | 0.05542 | analysis | 0.01675 |
| 9 | development | 0.71429 | technology | 0.0525 | development | 0.015 |
| 10 | application | 0.69388 | learning | 0.03999 | challenge | 0.01393 |
| 11 | tool | 0.67347 | intelligence | 0.03711 | application | 0.01301 |
| 12 | education | 0.67347 | capability | 0.035 | student | 0.01177 |
| 13 | security | 0.67347 | performance | 0.03384 | education | 0.01171 |
| 14 | challenge | 0.67347 | education | 0.02849 | security | 0.01165 |
| 15 | capability | 0.63265 | impact | 0.02775 | capability | 0.01117 |
| 16 | student | 0.63265 | generation | 0.0238 | user | 0.01016 |
| 17 | system | 0.63265 | response | 0.02245 | tool | 0.01007 |
| 18 | performance | 0.61224 | student | 0.01998 | information | 0.00916 |
| 19 | field | 0.61224 | datum | 0.0199 | response | 0.00914 |
| 20 | method | 0.61224 | system | 0.01972 | performance | 0.00883 |

| Topic | Label | Papers | Share |
|---|---|---|---|
| Topic 1 | AI and Security in Education | 422 | 40% |
| Topic 2 | Security in Language Models and Data Processing | 170 | 16% |
| Topic 3 | Secure Software Development with AI | 161 | 16% |
| Topic 4 | AI Systems Security and Risk Management | 146 | 14% |
| Topic 5 | User Privacy and AI Security | 98 | 9% |
| Topic 6 | Healthcare Security with AI | 49 | 5% |

| Rank | Topic 1 Keyword | Prob | Topic 2 Keyword | Prob | Topic 3 Keyword | Prob |
|---|---|---|---|---|---|---|
| 1 | ChatGPT | 0.062 | model | 0.073 | ChatGPT | 0.044 |
| 2 | AI | 0.042 | LLM | 0.054 | code | 0.034 |
| 3 | education | 0.02 | language | 0.046 | model | 0.021 |
| 4 | student | 0.02 | datum | 0.022 | result | 0.018 |
| 5 | technology | 0.017 | ChatGPT | 0.019 | question | 0.017 |
| 6 | tool | 0.017 | text | 0.016 | tool | 0.016 |
| 7 | use | 0.014 | task | 0.014 | language | 0.015 |
| 8 | intelligence | 0.013 | dataset | 0.012 | software | 0.015 |
| 9 | challenge | 0.012 | method | 0.012 | task | 0.013 |
| 10 | concern | 0.011 | performance | 0.011 | detection | 0.013 |

| Rank | Topic 4 Keyword | Prob | Topic 5 Keyword | Prob | Topic 6 Keyword | Prob |
|---|---|---|---|---|---|---|
| 1 | security | 0.025 | ChatGPT | 0.051 | healthcare | 0.03 |
| 2 | system | 0.02 | user | 0.028 | patient | 0.02 |
| 3 | datum | 0.018 | factor | 0.022 | health | 0.019 |
| 4 | AI | 0.016 | analysis | 0.012 | care | 0.016 |
| 5 | technology | 0.016 | attitude | 0.012 | treatment | 0.012 |
| 6 | service | 0.015 | medium | 0.012 | response | 0.012 |
| 7 | development | 0.015 | intention | 0.011 | medicine | 0.011 |
| 8 | application | 0.014 | perception | 0.01 | datum | 0.008 |
| 9 | ChatGPT | 0.013 | result | 0.01 | ChatGPT | 0.008 |
| 10 | model | 0.013 | model | 0.01 | accuracy | 0.008 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).