Submitted: 17 July 2024
Posted: 18 July 2024
Abstract
Keywords:
1. Introduction
1.1. Publications Exploring the Topic of Visualization of Bibliometric Data
1.2. Relevance of the Scilit Abstract Database for Bibliometric Analysis
- Scilit aggregates data from over 40,000 publishers.
- Scilit covers 166 million scholarly publications.
- Scilit provides citation scores.
- Scilit Rankings: rankings of top publishers, journals, and countries by number of journal articles published.
- Related articles widget: an engine that recommends papers from Scilit based on keywords.
- Scilit provides export of bibliometric data in a form convenient for analysis.
1.3. The Topic of Visualization in Bibliometric Analysis According to Scilit Abstract Database Data
| Term | Count | Term | Count |
|---|---|---|---|
| bibliometric analysis | 797 | literature review | 36 |
| bibliometric | 657 | citation analysis | 35 |
| citespace | 430 | data visualization | 34 |
| vosviewer | 421 | biblioshiny | 33 |
| visualization | 163 | sustainability | 33 |
| web of science | 99 | cancer | 28 |
| visual analysis | 84 | knowledge graph | 27 |
| visualization analysis | 78 | knowledge map | 25 |
| research trends | 67 | machine learning | 25 |
| covid-19 | 52 | bibliometrix | 24 |
| scopus | 52 | inflammation | 24 |
| research hotspots | 51 | bibliometric study | 23 |
| artificial intelligence | 42 | knowledge mapping | 23 |
| trends | 40 | gut microbiota | 22 |
| hotspots | 38 | deep learning | 21 |
1.4. Justification of the Novelty of the Ongoing Research
1.5. Some Advantages of Scimago Graphica for Bibliometric Analysis
- Scimago Graphica provides users with the possibility to create a wide variety of complex and interactive data visualizations without coding knowledge.
- Scimago Graphica is an efficient tool for data analysis on bibliometric datasets, in addition to its capabilities in visualization.
- Scimago Graphica is an application that democratizes data visualization, enabling researchers and institutions with limited resources to create professional-quality bibliometric data visualizations.
2. Materials and Methods
2.1. Data Source
- Subject: AI & Machine Learning
- Year: 2021–2023
- Language: English
- Sort by Times cited
2.2. Text Preprocessing
- removal of unneeded substrings, e.g., abbreviations in brackets, hieroglyphs, Cyrillic characters, mathematical formulas (usually in LaTeX), markup tags including SVG markup, and substrings such as “Published by Elsevier B.V. All rights reserved”
- lemmatization: a lemmatization dictionary, collected mostly from GitHub and augmented with new entries such as blockchains → blockchain, was used. The dictionary included 260,530 substitutions
- removal of stop words: the stop-word lists from GATE (General Architecture for Text Engineering) and spaCy were used
- conversion of the text to lower case; in some cases, spaces within compound keywords were replaced with underscores so that each compound keyword is treated as a single term
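The preprocessing steps listed above can be sketched in Python as follows. This is a minimal illustration, not the pipeline actually used in the study. The names `LEMMA_DICT`, `STOP_WORDS`, `COMPOUND_TERMS`, `BOILERPLATE`, and `preprocess` are hypothetical. The tiny sample dictionaries stand in for the 260,530-entry lemmatization dictionary and the GATE and spaCy stop-word lists. The regular expressions, the treatment of "hieroglyphs" as CJK ideographs, and the order of the steps are all assumptions.

```python
import re

# Hypothetical, tiny stand-ins for the resources described in the text.
LEMMA_DICT = {"blockchains": "blockchain", "learning": "learn",
              "networks": "network", "methods": "method"}
STOP_WORDS = {"a", "an", "and", "by", "for", "in", "is", "of", "on", "the"}
# Compound keywords are given in their lemmatized form (assumption).
COMPOUND_TERMS = ["machine learn", "neural network"]
BOILERPLATE = ["Published by Elsevier B.V. All rights reserved"]


def preprocess(text: str) -> list[str]:
    """Return a list of cleaned, lemmatized terms for one title or abstract."""
    # 1. Remove unneeded substrings.
    for phrase in BOILERPLATE:
        text = text.replace(phrase, " ")
    text = re.sub(r"<[^>]+>", " ", text)                  # markup tags (HTML, SVG)
    text = re.sub(r"\$[^$]*\$", " ", text)                # inline LaTeX formulas
    text = re.sub(r"\([A-Z][A-Za-z0-9-]*\)", " ", text)   # abbreviations such as (CNN)
    # Cyrillic block and CJK unified ideographs (assumed meaning of "hieroglyphs").
    text = re.sub(r"[\u0400-\u04FF\u4E00-\u9FFF]+", " ", text)

    # 2. Lower-case first so that dictionary lookups are case-insensitive.
    tokens = re.findall(r"[a-z0-9_-]+", text.lower())

    # 3. Dictionary lemmatization: unknown tokens are left unchanged.
    tokens = [LEMMA_DICT.get(tok, tok) for tok in tokens]

    # 4. Join compound keywords with underscores so they act as one term.
    joined = " ".join(tokens)
    for term in COMPOUND_TERMS:
        pattern = r"\b" + re.escape(term) + r"\b"
        joined = re.sub(pattern, term.replace(" ", "_"), joined)

    # 5. Remove stop words.
    return [tok for tok in joined.split() if tok not in STOP_WORDS]
```

Under these assumptions, a title such as "Blockchains for Machine Learning (ML) and neural networks" reduces to the terms `blockchain`, `machine_learn`, and `neural_network`. Forms of this kind, such as `machine_learn`, also appear in the term table in Section 3.1.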
2.3. Programs and Utilities
3. Results and Discussion
3.1. Visualization of Title and Abstract Text Clustering Performed by the GSDMM Algorithm
| Term name | Term name |
|---|---|
| machine_learn | convolutional_neural_network |
| deep_learn | computational_model |
| feature_selection | cluster_algorithm |
| anoma_detection | multi-criterion_decision-make |
| feature_extraction | rough_sett |
| intrusion_detection | big_datum |
| decision_make | intuitionistic_fuzzy_sett |
| datum_model | multi-criterion_decision_make |
| internet_of_thing | particle_swarm_optimization |
| support_vector_machine | fuzzy_logic |
| neural_network | analytic_hierarchy_process |
| intrusion_detection_system | genetic_algorithm |
| datum_mine | classification_algorithm |
| artificial_intelligence | three-way_decision |
| fuzzy_sett | decision_tree |
| predictive_model | time_series_analysis |
| task_analysis | aggregation_operator |
| ensemble_learn | time_series |
| artificial_neural_network | machine_learn_algorithm |
| random_forest | recurrent_neural_network |
3.2. Scientific Landscape Visualization with VOSviewer
3.3. Keywords Clustering Visualization with Scimago Graphica
3.4. Visual Selection of Multiple Terms to Build Queries Using the Alluvial Diagram
4. Conclusion
List of Files in the Attached Archive
Funding
References
- Li, J.; et al. Bibliometric Analysis for Intelligent Assessment of Data Visualization // Computer Science and Education / ed. Hong W., Weng Y. Singapore: Springer Nature Singapore, 2023. Vol. 1811. P. 363–373. [CrossRef]
- Szomszor, M.; et al. Interpreting Bibliometric Data // Front. Res. Metr. Anal. 2021. Vol. 5. P. 628703. [CrossRef]
- Xu, Y.; et al. Bibliometrics and Visualization Analysis of Knowledge Map in Metallurgical Field // Advances in Intelligent Systems and Interactive Applications / ed. Xhafa F., Patnaik S., Zomaya A.Y. Cham: Springer International Publishing, 2018. Vol. 686. P. 361–366. [CrossRef]
- Liao, H.; et al. A Bibliometric Analysis and Visualization of Medical Big Data Research // Sustainability. 2018. Vol. 10, № 1. P. 166. [CrossRef]
- Vílchez-Román, C., Sanguinetti S., Mauricio-Salas M. Applied bibliometrics and information visualization for decision-making processes in higher education institutions // LHT. 2020. Vol. 39, № 1. P. 263–283. [CrossRef]
- Gu, N., Hahnloser R.H.R. SciLit: A Platform for Joint Scientific Literature Discovery, Summarization and Citation Generation. 2023. [CrossRef]
- Delgado-Quirós, L., Ortega J.L. Completeness degree of publication metadata in eight free-access scholarly databases // Quantitative Science Studies. 2024. Vol. 5, № 1. P. 31–49. [CrossRef]
- Venkatesan, A.; et al. SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data // Wellcome Open Res. 2016. Vol. 1. P. 25. [CrossRef]
- Chigarev, B. Analyzing the Possibilities of Using the Scilit Platform to Identify Current Energy Efficiency and Conservation Issues. 2024. [CrossRef]
- Hassan-Montero, Y., De-Moya-Anegón F., Guerrero-Bote V.P. SCImago Graphica: a new tool for exploring and visually communicating data // EPI. 2022. P. e310502. [CrossRef]
- Li, L. The Study on Food Safety of 15 ‘RCEP’ Countries: Based on VOSviewer and Scimago Graphica // Science & Technology Libraries. 2024. Vol. 43, № 2. P. 147–154. [CrossRef]
- Chigarev, B. Identification of Actual Bibliometric/Scientometric Issues Based on 2018-2022 Data from the Lens Platform by Building Key Term Co-occurrence Network. 2022. [CrossRef]
- Van Eck, N.J., Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping // Scientometrics. 2010. Vol. 84, № 2. P. 523–538. [CrossRef]
- Borgelt, C. Frequent item set mining // WIREs Data Min & Knowl. 2012. Vol. 2, № 6. P. 437–456. [CrossRef]
- Yin, J., Wang J. A Dirichlet multinomial mixture model-based approach for short text clustering // Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM, 2014. P. 233–242. [CrossRef]
- Tang, Y.; et al. A new basic probability assignment generation and combination method for conflict data fusion in the evidence theory // Sci Rep. 2023. Vol. 13, № 1. P. 8443. [CrossRef]
- Dubois, D.; et al. The basic principles of uncertain information fusion. An organised review of merging rules in different representation frameworks // Information Fusion. 2016. Vol. 32. P. 12–39. [CrossRef]
- Liu, P., Teng F. Multiple criteria decision-making method based on normal interval-valued intuitionistic fuzzy generalized aggregation operator // Complexity. 2016. Vol. 21, № 5. P. 277–290. [CrossRef]
- Luo, H.; et al. Agent oriented intelligent fault diagnosis system using evidence theory // Expert Systems with Applications. 2012. Vol. 39, № 3. P. 2524–2531. [CrossRef]
- Clauset, A., Newman M.E.J., Moore C. Finding community structure in very large networks // Phys. Rev. E. 2004. Vol. 70, № 6. P. 066111. [CrossRef]
- Xue, Y., Deng Y. A decomposable Deng entropy // Chaos, Solitons & Fractals. 2022. Vol. 156. P. 111835. [CrossRef]
- Deng, Y. Deng entropy // Chaos, Solitons & Fractals. 2016. Vol. 91. P. 549–553. [CrossRef]
- Aydemir, S.B., Yilmaz Gunduz S. Fermatean fuzzy TOPSIS method with Dombi aggregation operators and its application in multi-criteria decision making // IFS. 2020. Vol. 39, № 1. P. 851–869. [CrossRef]
- Wang, W., Tong M., Yu M. Blood Glucose Prediction with VMD and LSTM Optimized by Improved Particle Swarm Optimization // IEEE Access. 2020. Vol. 8. P. 217908–217916. [CrossRef]
- Jin, F.; et al. Consistency and trust relationship-driven social network group decision-making method with probabilistic linguistic information // Applied Soft Computing. 2021. Vol. 103. P. 107170. [CrossRef]
- Boutsidis, C.; et al. Randomized Dimensionality Reduction for k-Means Clustering // IEEE Trans. Inform. Theory. 2015. Vol. 61, № 2. P. 1045–1062. [CrossRef]
- Chen, Y., Yu J., Khan S. Spatial sensitivity analysis of multi-criteria weights in GIS-based land suitability evaluation // Environmental Modelling & Software. 2010. Vol. 25, № 12. P. 1582–1591. [CrossRef]
- Nizam, H.; et al. Real-Time Deep Anomaly Detection Framework for Multivariate Time-Series Data in Industrial IoT // IEEE Sensors J. 2022. Vol. 22, № 23. P. 22836–22849. [CrossRef]
- Sahin, Y., Bulkan S., Duman E. A cost-sensitive decision tree approach for fraud detection // Expert Systems with Applications. 2013. Vol. 40, № 15. P. 5916–5923. [CrossRef]
- Salekshahrezaee, Z., Leevy J.L., Khoshgoftaar T.M. The effect of feature extraction and data sampling on credit card fraud detection // J Big Data. 2023. Vol. 10, № 1. P. 6. [CrossRef]
- Buczak, A.L., Guven E. A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection // IEEE Commun. Surv. Tutorials. 2016. Vol. 18, № 2. P. 1153–1176. [CrossRef]
- Vilone, G., Longo L. Explainable Artificial Intelligence: a Systematic Review. 2020. [CrossRef]
- Xiaorong, Z., Dianchun W., Changguo Y. A New Feature Extraction Method of Intrusion Detection // Proceedings of the 2009 First International Workshop on Education Technology and Computer Science - Volume 02. USA: IEEE Computer Society, 2009. P. 504–507. [CrossRef]
- Siddiqi, M.A., Pak W. An Agile Approach to Identify Single and Hybrid Normalization for Enhancing Machine Learning-Based Network Intrusion Detection // IEEE Access. 2021. Vol. 9. P. 137494–137513. [CrossRef]

| DOI | Author Keywords | IEEE Terms |
|---|---|---|
| 10.1109/ETCS.2009.373 (Ref. 33) | RSVM;KPCA;intrusion detection;PSVM | Feature extraction;Intrusion detection;Principal component analysis;Data mining;Educational technology;Paper technology;Kernel;Support vector machines;Educational institutions;Support vector machine classification |
| 10.1109/ACCESS.2021.3118361 | Anomaly detection;Bot-IoT;CIC-IDS 2017;intrusion detection;IoT;ISCX-IDS 2012;normalization;NSL KDD;skewness;scaling;transformation;UNSW-NB15 | Intrusion detection;Mathematical models;Feature extraction;Training;Standards;Statistical analysis;Numerical models |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).