Submitted: 25 August 2024
Posted: 26 August 2024
Abstract
Keywords:
1. Introduction
2. Background
3. State of the Art
4. Materials and Methods
4.1. Data Analysis
4.2. Model Analysis
5. Experimental Analysis
5.1. BERT Optimization
| Batch Size | Learning Rate | Precision (NOT) | Recall (NOT) | F1 Score (NOT) | Precision (OFF) | Recall (OFF) | F1 Score (OFF) | Macro Average Precision | Macro Average Recall | Macro Average F1 Score |
|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 5.00e-05 | 0.88 | 0.88 | 0.88 | 0.69 | 0.68 | 0.69 | 0.78 | 0.78 | 0.78 |
| 32 | 3.00e-05 | 0.88 | 0.90 | 0.89 | 0.72 | 0.68 | 0.70 | 0.80 | 0.79 | 0.79 |
| 32 | 2.00e-05 | 0.88 | 0.90 | 0.89 | 0.72 | 0.68 | 0.70 | 0.80 | 0.79 | 0.79 |
| 16 | 5.00e-05 | 0.87 | 0.89 | 0.88 | 0.69 | 0.66 | 0.68 | 0.78 | 0.77 | 0.78 |
| 16 | 3.00e-05 | 0.88 | 0.89 | 0.89 | 0.71 | 0.70 | 0.70 | 0.80 | 0.79 | 0.80 |
| 16 | 2.00e-05 | 0.89 | 0.90 | 0.89 | 0.73 | 0.70 | 0.72 | 0.81 | 0.80 | 0.81 |

| Batch Size | Learning Rate | Precision (TIN) | Recall (TIN) | F1 Score (TIN) | Precision (UNT) | Recall (UNT) | F1 Score (UNT) | Macro Average Precision | Macro Average Recall | Macro Average F1 Score |
|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 5.00e-05 | 0.91 | 0.97 | 0.94 | 0.58 | 0.25 | 0.35 | 0.74 | 0.61 | 0.65 |
| 32 | 3.00e-05 | 0.92 | 0.95 | 0.94 | 0.52 | 0.40 | 0.45 | 0.72 | 0.68 | 0.69 |
| 32 | 2.00e-05 | 0.91 | 0.97 | 0.94 | 0.57 | 0.29 | 0.39 | 0.74 | 0.63 | 0.66 |
| 16 | 5.00e-05 | 0.93 | 0.97 | 0.95 | 0.70 | 0.44 | 0.54 | 0.81 | 0.71 | 0.75 |
| 16 | 3.00e-05 | 0.91 | 0.96 | 0.93 | 0.50 | 0.29 | 0.37 | 0.70 | 0.62 | 0.65 |
| 16 | 2.00e-05 | 0.93 | 0.96 | 0.94 | 0.63 | 0.44 | 0.52 | 0.78 | 0.70 | 0.73 |
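
The two tables above cover a grid of batch sizes (16, 32) and learning rates (2e-5, 3e-5, 5e-5). As a minimal sketch, assuming the configuration is picked by the highest macro-average F1 score (this selection criterion is an assumption, not something stated here), the rows can be ranked as follows. The dictionary names and the `best_config` helper are hypothetical; the scores are copied from the tables.

```python
# Macro-average F1 scores copied from the two tables above,
# keyed by (batch size, learning rate).
NOT_OFF_MACRO_F1 = {
    (32, 5e-5): 0.78, (32, 3e-5): 0.79, (32, 2e-5): 0.79,
    (16, 5e-5): 0.78, (16, 3e-5): 0.80, (16, 2e-5): 0.81,
}
TIN_UNT_MACRO_F1 = {
    (32, 5e-5): 0.65, (32, 3e-5): 0.69, (32, 2e-5): 0.66,
    (16, 5e-5): 0.75, (16, 3e-5): 0.65, (16, 2e-5): 0.73,
}


def best_config(scores):
    """Return the (batch size, learning rate) pair with the highest score.

    Selecting on macro F1 is an assumption made for this sketch.
    """
    return max(scores, key=scores.get)


print(best_config(NOT_OFF_MACRO_F1))
print(best_config(TIN_UNT_MACRO_F1))
```

Under this criterion, batch size 16 gives the best macro F1 in both tables, but at different learning rates: 2e-5 for NOT/OFF and 5e-5 for TIN/UNT.
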

5.2. Modelling Toxicity

| Model | Precision (NOT) | Recall (NOT) | F1 Score (NOT) | Precision (OFF) | Recall (OFF) | F1 Score (OFF) | Macro Average Precision | Macro Average Recall | Macro Average F1 Score |
|---|---|---|---|---|---|---|---|---|---|
| OLID | 0.971 | 0.915 | 0.942 | 0.451 | 0.718 | 0.554 | 0.711 | 0.817 | 0.748 |
| OLID + Reddit 1 | 0.967 | 0.978 | 0.973 | 0.746 | 0.661 | 0.701 | 0.857 | 0.819 | 0.837 |
| OLID + Reddit 2 | 0.968 | 0.985 | 0.977 | 0.819 | 0.665 | 0.734 | 0.893 | 0.825 | 0.855 |
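
The macro-average columns in the table above are consistent with the usual definitions: per-class F1 as the harmonic mean of precision and recall, and the macro average as the unweighted mean over the two classes. The sketch below assumes those definitions and checks them against the OLID row; the function names are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(per_class_values):
    """Unweighted mean over classes, so each class counts equally."""
    return sum(per_class_values) / len(per_class_values)


# Precision and recall for the OLID row of the table above.
f1_not = f1_score(0.971, 0.915)
f1_off = f1_score(0.451, 0.718)
macro_f1 = macro_average([f1_not, f1_off])

print(round(f1_not, 3), round(f1_off, 3), round(macro_f1, 3))
```

Because the macro average weights both classes equally, the weaker OFF score pulls the macro F1 well below the NOT score even though NOT examples dominate.
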

| Model | Precision (TIN) | Recall (TIN) | F1 Score (TIN) | Precision (UNT) | Recall (UNT) | F1 Score (UNT) | Macro Average Precision | Macro Average Recall | Macro Average F1 Score |
|---|---|---|---|---|---|---|---|---|---|
| OLID | 0.385 | 0.907 | 0.541 | 0.893 | 0.349 | 0.502 | 0.639 | 0.628 | 0.521 |
| OLID + Reddit 1 | 0.536 | 0.684 | 0.601 | 0.837 | 0.737 | 0.782 | 0.687 | 0.709 | 0.691 |
| OLID + Reddit 2 | 0.577 | 0.539 | 0.557 | 0.798 | 0.822 | 0.810 | 0.688 | 0.681 | 0.684 |

| Model | Precision (Non-Toxic: NOT or OFF+UNT) | Recall (Non-Toxic: NOT or OFF+UNT) | F1 Score (Non-Toxic: NOT or OFF+UNT) | Precision (Toxic: OFF+TIN) | Recall (Toxic: OFF+TIN) | F1 Score (Toxic: OFF+TIN) | Macro Average Precision | Macro Average Recall | Macro Average F1 Score |
|---|---|---|---|---|---|---|---|---|---|
| OLID | 0.988 | 0.908 | 0.946 | 0.158 | 0.618 | 0.252 | 0.573 | 0.763 | 0.599 |
| OLID + Reddit 1 | 0.983 | 0.978 | 0.980 | 0.337 | 0.394 | 0.363 | 0.660 | 0.686 | 0.672 |
| OLID + Reddit 2 | 0.981 | 0.987 | 0.984 | 0.433 | 0.342 | 0.382 | 0.707 | 0.664 | 0.683 |
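
The column headers of the preceding table define the combined labels: a message is Toxic only when it is both offensive (OFF) and targeted (TIN), and Non-Toxic when it is either not offensive (NOT) or offensive but untargeted (OFF+UNT). A minimal sketch of that mapping follows, assuming a two-stage setup in which the targeting label is only consulted for messages labelled OFF. The function name `combine_labels` is hypothetical.

```python
from typing import Optional


def combine_labels(offense_label: str, target_label: Optional[str] = None) -> str:
    """Map an offense label (NOT/OFF) and a targeting label (TIN/UNT)
    to the combined Toxic / Non-Toxic label used in the table headers.

    The two-stage structure is an assumption made for this sketch.
    """
    if offense_label == "OFF" and target_label == "TIN":
        return "Toxic"
    # NOT (any targeting label) and OFF+UNT both count as Non-Toxic.
    return "Non-Toxic"


print(combine_labels("OFF", "TIN"))
print(combine_labels("OFF", "UNT"))
print(combine_labels("NOT"))
```

With this mapping, an error in either stage can flip the combined label, which is one plausible reason the Toxic-class scores are lower than the per-stage scores.
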

| Model | Precision (NOT) | Recall (NOT) | F1 Score (NOT) | Precision (OFF) | Recall (OFF) | F1 Score (OFF) | Macro Average Precision | Macro Average Recall | Macro Average F1 Score |
|---|---|---|---|---|---|---|---|---|---|
| BERT OLID | 0.971 | 0.915 | 0.942 | 0.451 | 0.718 | 0.554 | 0.711 | 0.817 | 0.748 |
| HateBERT OLID | 0.859 | 0.885 | 0.872 | 0.679 | 0.625 | 0.651 | 0.769 | 0.755 | 0.761 |

| Model | Precision (TIN) | Recall (TIN) | F1 Score (TIN) | Precision (UNT) | Recall (UNT) | F1 Score (UNT) | Macro Average Precision | Macro Average Recall | Macro Average F1 Score |
|---|---|---|---|---|---|---|---|---|---|
| BERT OLID | 0.385 | 0.907 | 0.541 | 0.893 | 0.349 | 0.502 | 0.639 | 0.628 | 0.521 |
| HateBERT OLID | 0.354 | 0.908 | 0.509 | 0.860 | 0.254 | 0.393 | 0.607 | 0.581 | 0.451 |

6. Conclusions and Future Works

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
