1. Introduction
In tropical regions and low- and middle-income countries (LMICs), febrile diseases, characterized by fever and frequently accompanied by other systemic symptoms, present significant diagnostic and treatment challenges. Malaria, typhoid fever, urinary tract infections, human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS), respiratory tract infections, and tuberculosis are among the illnesses that greatly increase morbidity and mortality, particularly in LMICs. These febrile diseases often present with overlapping symptoms, which leads to misdiagnosis and suboptimal treatment, prolonging illness, raising healthcare costs, and, in extreme cases, causing death [
1,
2,
3]. This emphasizes the need for trustworthy and interpretable diagnostic tools to aid healthcare professionals in making timely and efficient decisions.
Advances in artificial intelligence (AI) and machine learning (ML) have transformed the healthcare sector by providing data-driven insights into disease management, diagnosis, and treatment [
4,
5]. These technologies use massive datasets to find relationships and patterns that conventional diagnostic techniques might miss [
6]. In healthcare diagnostics, data-driven approaches are becoming increasingly significant because they offer improved efficiency and precision in areas such as medical diagnosis [
7,
8,
9], personalized medicine, and healthcare analytics [
10,
11,
12]. They rely on medical data collection, processing, and analysis to identify trends, aid clinical decision-making, and enhance patient outcomes [
13]. ML techniques such as unsupervised, supervised, and reinforcement learning can be applied to develop medical diagnostic systems. However, while ML models like Random Forest (RF), Extreme Gradient Boost (XGBoost), and Multi-Layer Perceptron (MLP) algorithms have demonstrated impressive predictive capabilities, they are often criticized for their “black-box” nature [
14,
15,
16]. This “black-box” nature hinders their acceptance in crucial healthcare applications where transparency is essential by making it difficult for clinicians to comprehend the reasoning behind their predictions [
17]. Explainable AI (XAI) addresses this gap by enhancing the interpretability of ML models without sacrificing their predictive capabilities [
18]. Frameworks like Large Language Models (LLMs) and Local Interpretable Model-agnostic Explanations (LIME) offer visual representations and textual outputs that make model decisions easy to understand especially by non-experts [
19,
20]. Combining these frameworks allows diagnostic models to provide valuable insight, clarifying complex algorithm decisions and fostering professional trust. This strategy is important for febrile diseases, where early diagnosis and medical attention are essential for favorable patient outcomes.
The main aim of this study is to develop a data-driven, intelligent methodology for building explainable and accurate diagnostic models for febrile diseases, integrating ML algorithms with interpretability frameworks to enhance clinical decision-making and promote transparency in AI-driven healthcare systems. The study employs a data-driven approach, leveraging a rich dataset of patient records and symptoms to develop an explainable disease diagnostic model tailored for febrile diseases. ML algorithms such as Random Forest, XGBoost, and Multi-Layer Perceptron provide the basis for precise disease prediction, with their performance optimized through extensive hyperparameter tuning and validation, while XAI techniques such as LIME ensure that the predictions are transparent and give clinicians the confidence they need to comprehend and trust the system. Large Language Models, such as ChatGPT, can offer explanations in natural language, improving interpretability and making the diagnostic process easier for patients and medical professionals to understand. This study’s significant contributions are:
The remainder of this study is organized as follows: Section 2 presents the methodology, which includes the dataset description, preprocessing, model development, integration of XAI, system implementation, and model performance metrics.
Section 3 discusses the results, which include an analysis of the models' performance and the explainability of ChatGPT and LIME. Section 4 concludes the study, presenting the innovative aspect of the system as well as limitations and areas for further research.
2. Methodology
2.1. Enhanced Diagnostic Framework
The enhanced diagnostic framework comprises medical experts (from whom patient data were collected), data preprocessing, the diagnostic system, model evaluation, the healthcare provider, the patient, the mobile device, and cloud storage, as illustrated in
Figure 1. The Android-based mobile device serves as the interface between healthcare providers and the diagnostic system. Through the app's user-friendly interface, the healthcare provider interacts directly with the system to enter personal data, patient history, and examination results, including temperature, blood pressure, respiratory rate, height, weight, and patient symptoms. The decision filter correctly groups patient vitals and symptoms for diagnostic decisions, simulating the reasoning of a knowledgeable doctor. The diagnosis and recommended treatment component gives medical professionals the patient's diagnosis based on the diagnostic system, which includes the RF diagnosis, LIME interpretation, additional GPT engine explanations, and suggested treatment. Diagnostic results, patient data, model data, and other system records are stored in the cloud. The medical experts are skilled doctors with experience in tropical fevers from secondary and tertiary hospitals who gathered information from patients with fevers during clinic days.
2.2. Dataset Description and Preprocessing
The dataset used in the work was obtained from a study funded by the New Frontiers in Research Fund (NFRF) to develop a system to help frontline health workers make early differential diagnoses of tropical diseases. The dataset contains 4870 patient records comprising patient symptoms, risk factors, demographic data, suspected diagnoses, further investigation, and confirmed diagnoses [
21].
Following the data collection, data exploration was required to examine the size, features, types of data, and structure of the dataset. According to the dataset, 225 patient records were obtained during the dry season, 40 during harmattan, and 4605 during the rainy season. There were 2175 male and 2695 female patients in the dataset, according to the descriptive statistics in
Table 1. The data exploration also displayed the number of patients in the dataset who were nursing mothers and those who were pregnant in the first, second, and third trimesters, along with the corresponding months.
Table 2 presents the number of suspected and confirmed diagnoses as well as the symptoms in the dataset.
A five-point rating system was used to describe the patient's symptoms (1=absent; 2=mild; 3=moderate; 4=severe; 5=very severe), and a six-point rating system to describe the diagnoses (1=absent; 2=very low; 3=low; 4=moderate; 5=high; 6=very high) and the sample patient dataset is shown in
Figure 2. Because patients under the age of five were unable to adequately express certain symptoms, and the data collection tool did not account for some symptoms in this age group, their records were removed during the preprocessing stage. The doctors' suspected-diagnosis columns were also dropped, leaving only the symptoms and the diagnoses confirmed after further investigation. Finally, the symptoms of the haemorrhagic fevers (dengue, yellow, and Lassa fever) were removed because these illnesses were not considered in this study. As shown in
Figure 3, the dataset was cleaned up and reduced to 3914 records with 32 symptoms and 8 confirmed diagnoses while
Table 3 lists the symptoms and diseases along with their abbreviations.
To further reduce the confirmed diseases to the six within the scope of this study, upper respiratory tract infections (URTI) and lower respiratory tract infections (LRTI), as well as upper urinary tract infections (UPUTI) and lower urinary tract infections (LWUTI), were combined into respiratory tract infections (RTI) and urinary tract infections (UTI), respectively. The max operation (max function) was applied to combine each pair of severity levels into a single value; it emphasizes the highest severity recorded across related measurements by taking the maximum of two or more related columns. Given that x₁ and x₂ are the severity levels of the upper and lower urinary tract infections respectively, the max operation M can be expressed as:

M(x₁, x₂) = max(x₁, x₂)

where x₁ is the first input to the max operation and x₂ the second; the function returns the larger of the two values. This guarantees that the worst-case scenario across the two columns is represented by the combined severity level, making the max operation a dependable method of combining severity scales when the objective is to identify the most severe medical condition.
Figure 4 displays the dataset following the application of the Max Operation.
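The merge can be sketched in a few lines of Python. The column names below (UPUTI, LWUTI, UTI) follow the abbreviations used in the text, but the helper itself is an illustrative sketch, not the study's actual preprocessing code.

```python
def merge_severity(records, upper_key, lower_key, merged_key):
    """Combine two related severity columns into one by taking the maximum."""
    merged = []
    for row in records:
        row = dict(row)  # copy so the input records are left untouched
        row[merged_key] = max(row.pop(upper_key), row.pop(lower_key))
        merged.append(row)
    return merged

# Toy records: two patients with upper/lower UTI severity scores (1-6 scale)
patients = [
    {"UPUTI": 1, "LWUTI": 4},  # lower UTI severity dominates
    {"UPUTI": 3, "LWUTI": 2},  # upper UTI severity dominates
]
combined = merge_severity(patients, "UPUTI", "LWUTI", "UTI")
# combined == [{"UTI": 4}, {"UTI": 3}]
```

The same helper applies unchanged to the URTI/LRTI pair, since the max operation is agnostic to which pair of severity columns it merges.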
Figure 5 illustrates the result of the disease-severity mapping: Absent (1) was mapped to binary 0 using a custom mapping, and very low to very high (2 to 6) to binary 1. A lambda function mapping the disease severity to 0 and 1 was employed, with f(x) = 0 if x = 1 and f(x) = 1 otherwise. By mapping the diseases, the dataset was prepared with disease severity represented in a straightforward binary format for efficient training of the machine learning models.
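As a minimal sketch of the binary mapping just described (the exact lambda used in the study is an assumption consistent with the mapping rule):

```python
# Severity-to-binary mapping: absent (1) -> 0, any severity 2..6 -> 1.
to_binary = lambda severity: 0 if severity == 1 else 1

severities = [1, 2, 4, 6, 1, 3]
labels = [to_binary(s) for s in severities]
# labels == [0, 1, 1, 1, 0, 1]
```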
2.3. Model Development and Hyperparameter Tuning
Three machine learning algorithms were considered in this study: MLP, RF, and XGBoost; RF, the model with the highest performance, was used to develop the diagnostic system. Using multiple decision trees, the RF model leveraged the power of ensemble learning to provide reliable diagnoses. By combining the diagnoses from all trees and using multiple symptoms as input features, the RF model enhances the diagnostic outcomes of febrile diseases. RF uses patient data, including symptoms and diseases, to construct numerous decision trees and then aggregates their individual decisions into a final result that averages across all trees [
22]. XGBoost was chosen due to its advanced gradient boosting implementation and ensemble technique, which makes it a portable, flexible, and efficient option for disease diagnosis. XGBoost builds classification trees sequentially, training the subsequent tree with the residuals from the previous tree. As its basis, XGBoost uses gradient-boosted decision trees and regularization techniques to enhance model generalization. In a stepwise fashion, weak learners are progressively added to the group, with each member concentrating on fixing the mistakes of the others. During training, it minimizes a predetermined loss function using the gradient descent optimization technique [
22,
23]. MLP was also chosen due to its capacity to model intricate relationships, learn from high-dimensional datasets, and handle a variety of data types, making it a useful tool for disease diagnosis [
24]. When paired with the right interpretability and training strategies, MLPs can provide accurate and useful information for medical diagnosis. This feedforward artificial neural network consists of three layers: an input layer, one or more hidden layers, and an output layer.
To ensure robust evaluation, the febrile-disease dataset was divided into training and testing subsets at an 80/20 ratio: the models were trained on the 80% training set, and the remaining 20% was used to assess generalization. To optimize model performance, hyperparameter tuning was carried out using grid-search cross-validation (GridSearchCV) with 5-fold cross-validation (CV=5). For cross-validation, the training data is split into five equal subsets or folds; the model is trained on four folds and tested on the remaining fold, and this procedure is repeated five times, each time with a different fold as the test set, so that every observation in the training data is used for validation. Cross-validation is a helpful method for assessing the model's resilience and reducing the possibility of overfitting, and averaging the performance across all folds also aids in finding the best hyperparameters. The hyperparameters for each model were adjusted using GridSearchCV to determine which combination produced the best results; hyperparameter tuning controls how a model learns from the data and has a significant impact on its accuracy and efficiency. For Random Forest (RF), max_depth, which specifies the maximum depth of each tree, was set to [None, 10, 20], and n_estimators, which determines the number of trees in the forest, was tested with values [100, 200, 300].
For the Multi-layer Perceptron (MLP), three hyperparameters were adjusted: the activation function (which transforms input data in the neural network) with ['relu', 'tanh', 'logistic'], the regularization term alpha with [0.0001, 0.001, 0.01], and hidden_layer_sizes (the number of neurons in each layer) with [(100,), (50, 50), (50, 25, 10)]. Finally, for XGBoost, n_estimators and max_depth were adjusted with [100, 200, 30] and [3, 5, 7], respectively.
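The tuning procedure for the RF grid can be sketched as follows, assuming scikit-learn. The data here are synthetic stand-ins (binary symptom vectors with a toy label), not the patient dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: 120 patients, 32 binary symptom features
rng = np.random.default_rng(42)
X = rng.integers(0, 2, size=(120, 32))
y = (X[:, 0] | X[:, 1]).astype(int)  # toy label driven by two "symptoms"

# 80/20 train/test split, as in the study
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# The RF grid from the text: number of trees and maximum depth
param_grid = {"n_estimators": [100, 200, 300], "max_depth": [None, 10, 20]}
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5)  # 5-fold cross-validation
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)
```

`search.best_params_` then holds the winning combination, and the held-out 20% gives an unbiased estimate of generalization.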
2.4. Integration of Explainable AI
Interpretability was provided locally through the use of LIME, which approximated the model's behaviour around a specific diagnosis using a simpler model. LIME helps healthcare professionals understand why a model diagnoses a disease for a specific patient based on their symptoms, which is very useful when diagnosing diseases where patient cases may differ significantly from one another. This localized explanation aids in identifying any irregularities or errors in the diagnosis, thereby increasing the diagnostic model's accuracy and dependability. LIME offers versatility and broad applicability in a range of diagnostic scenarios because it is model-agnostic and works with a variety of machine-learning models [
25,
26]. GPT uses its powerful natural language processing capabilities to analyze and understand complex medical data, significantly improving the diagnosis of febrile diseases [
27,
28]. By offering comprehensive explanations for its recommendations, it enhances the effectiveness of disease diagnosis when paired with an ML model, ultimately improving patient outcomes. It makes use of the transformer architecture, which is the fundamental model in natural language processing because of its effectiveness in handling sequential data. With the help of the ML models' predicted labels and probabilities, GPT produces explanations that are comprehensible to humans. The GPT receives a structured prompt with the patient's symptoms and the likelihood of each disease based on the results of an ML model. GPT produces an explanation for every predicted disease by combining medical knowledge with its probability score and related symptoms.
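The local-surrogate idea behind LIME can be illustrated with a self-contained sketch. This is a simplified stand-in, not the lime library: it perturbs a patient's binary symptom vector, queries the black-box model, and fits a distance-weighted linear model whose coefficients rank each symptom's local contribution.

```python
import numpy as np

def lime_style_weights(predict_proba, instance, n_samples=500, seed=0):
    """Fit a locally weighted linear surrogate around one instance."""
    rng = np.random.default_rng(seed)
    d = instance.shape[0]
    # Perturb by randomly switching symptoms off/on
    Z = rng.integers(0, 2, size=(n_samples, d))
    Z[0] = instance                       # keep the original instance
    preds = predict_proba(Z)              # black-box probabilities
    # Exponential kernel on Hamming distance -> locality weights
    dist = np.abs(Z - instance).sum(axis=1)
    w = np.exp(-(dist ** 2) / (d * 0.75) ** 2)
    # Weighted ridge regression via the normal equations
    Zb = np.hstack([Z, np.ones((n_samples, 1))])  # add intercept column
    A = Zb.T @ (w[:, None] * Zb) + 1e-3 * np.eye(d + 1)
    b = Zb.T @ (w * preds)
    coef = np.linalg.solve(A, b)
    return coef[:-1]                      # per-symptom contributions

# Toy black box: "disease probability" driven mostly by symptom 0
black_box = lambda Z: 0.8 * Z[:, 0] + 0.1 * Z[:, 1]
patient = np.array([1, 1, 0, 0])
contributions = lime_style_weights(black_box, patient)
```

Positive coefficients push toward the positive (disease-present) class and negative ones away from it, mirroring the bar charts LIME produces.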
2.5. System Implementation
Visual Studio Code (VS Code), PythonAnywhere, and Google Colaboratory (Google Colab) were used in the system's implementation. VS Code is a robust, lightweight, and open-source code editor that supports numerous extensions and speeds up the coding process. PythonAnywhere is an integrated development environment (IDE) and online web-hosting service based on the Python programming language. Google Colab is an online tool that offers free access to a Graphics Processing Unit (GPU) and Tensor Processing Unit (TPU) for creating ML and deep learning models; this cloud-based infrastructure provides powerful processing resources for training and testing complex machine learning models, including Random Forest, XGBoost, and MLP, at no cost. Figma was used for the system's UI/UX design because it offers a robust, team-based platform for creating logical and visually appealing user interfaces; its real-time collaboration features let developers and designers work together easily to make the app's design user-friendly for patients and healthcare professionals. Flet was chosen for mobile app development because it provides an easy-to-use framework for creating interactive desktop, mobile, and web applications with Python; its simplicity and its seamless integration with other Python libraries, such as Random Forest for machine learning, LIME for explainability, and GPT for natural language processing, made it the best choice. MySQL, a popular relational database management system known for its scalability and open-source licensing, was used for data storage; its access controls make it suitable for handling sensitive health data subject to medical-data privacy laws, and it integrates with the Python backend, enabling smooth data transfer between the GPT, LIME, and RF components and the database.
2.6. Model Performance Metrics
The performance of the ML models was assessed using Recall, Precision, and F1 Score because they ensure that the model's diagnoses are accurate and reliable.
Recall: Recall quantifies the ratio of correctly predicted positive observations to all actual positive observations. It shows how well the model identifies positive samples. Recall is crucial when the cost of false negatives is high, for example in medical screening, where missing a positive case (a false negative) can be critical.
Precision: This metric quantifies the ratio of correctly predicted positive observations to all predicted positive observations. It shows how accurate the positive predictions are. Precision is essential when the cost of false positives is high; for instance, a false positive in a medical diagnosis could result in needless treatment.
F1 Score: The F1 Score is the harmonic mean of recall and precision. It offers a balance between the two, which is especially helpful for imbalanced datasets, where there is an unequal distribution of classes.
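These three metrics reduce to simple counts of true positives, false positives, and false negatives. A minimal reference implementation for a single binary disease label:

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: 4 true positives, 1 false positive, 1 false negative
y_true = [1, 1, 1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0]
p, r, f = precision_recall_f1(y_true, y_pred)
# p == 0.8, r == 0.8, f == 0.8
```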
3. Results and Discussion
Three algorithms were used in this study to model the complex relationships in the dataset. While MLP displayed limitations in recall and F1-scores for specific diseases, suggesting difficulty handling class imbalance within the dataset, and XGBoost showed promising results but required intensive tuning, Random Forest emerged as the best-performing model, balancing high predictive results with computational efficiency. RF achieved the highest diagnostic performance across most diseases, with an F1-score of 88% for malaria, 60% for enteric fever, 51% for HIV/AIDS, 72% for urinary tract infection, 72% for respiratory tract infection, and 60% for tuberculosis, followed by XGBoost (87%, 60%, 48%, 70%, 72%, and 65%) and MLP (85%, 51%, 46%, 70%, 69%, and 64%), as presented in
Table 4 and
Figure 6.
A Model Interpretability Framework (MIF) was incorporated into the diagnostic system to address the vital need for explainability and transparency in disease diagnosis. The LIME framework and an LLM (ChatGPT) were selected for their ability to provide intuitive visual and textual explanations of model decisions. By applying LIME to the test subset and locally approximating each model with an interpretable substitute, this study identified the symptoms that most influenced the models' diagnoses for particular instances, making clear how particular symptoms shaped the diagnostic predictions, as illustrated in
Figure 7,
Figure 8 and
Figure 9 for the three models. The length of each bar shows how much the corresponding symptom contributed to the final diagnosis: symptoms on the right push the prediction toward the positive class (presence of disease), while symptoms on the left push it toward the negative class (absence of disease). By enabling medical professionals to comprehend the logic behind the predictions, the system promotes adoption, builds trust, and facilitates well-informed decision-making.
Complex diagnostic outputs were translated into natural language using ChatGPT. Since the development environment was based on the Python platform, this was readily available. As seen in
Table 5, the sample prompt uses the LIME model's list of diagnosed illnesses and patient symptoms along with how each contributed to the diagnosis. Prior to being sent as a request to the ChatGPT API, the generated prompt was assigned to a variable and transformed into JSON format.
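The prompt assembly can be sketched as follows. The field names, wording, and model name here are illustrative assumptions, not the system's exact template (the actual prompt is shown in Table 5).

```python
import json

def build_explanation_prompt(symptoms, predictions, lime_contributions):
    """Assemble a chat-style request body asking GPT to explain diagnoses."""
    prompt = (
        "A patient presents with: " + ", ".join(symptoms) + ". "
        "The model's predicted disease probabilities are "
        + json.dumps(predictions) + ". "
        "Per-symptom LIME contributions: "
        + json.dumps(lime_contributions) + ". "
        "Explain each predicted disease in plain language for a clinician."
    )
    # Chat-completion style request body, serialized to JSON before sending
    return json.dumps({
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
    })

request_body = build_explanation_prompt(
    ["fever", "headache", "chills"],
    {"Malaria": 0.82, "Typhoid": 0.11},
    {"fever": 0.31, "chills": 0.22},
)
```

The serialized body is then what gets sent as the request payload to the ChatGPT API, matching the variable-assignment and JSON-conversion steps described above.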
Figure 10 shows the results produced by the ChatGPT platform, where the diagnoses, important symptom contributions, and detrimental contributors are explained.
The minimum requirements for this basic app are an Android OS version 4.0 or higher, 4 GB of RAM (2 GB at minimum), 8 GB of ROM, a portrait display layout, and an Internet connection. The healthcare worker can use the form in
Figure 11 to create an account following a successful installation. Once an account has been created, the system administrator must confirm the healthcare worker's information before sending the password to the healthcare worker's email address so they can log in. Through
Figure 12, the healthcare professional logs into the system using their email address and password.
The healthcare worker is shown a user-friendly dashboard in
Figure 13 following a successful system login. The dashboard allows the healthcare professional to register new patients (
Figure 14) and view the list of registered patients (
Figure 15). The healthcare professional can either automatically navigate to the patient's dashboard (
Figure 16) following a successful patient registration or search for and click on the patient's name from the patient list. The patient's dashboard allows them to view their past medical history as well as take and examine their history (
Figure 17). As seen in
Figure 18, the mobile app provides provisional diagnoses following a successful history taking and examination. It lists all probable diseases the patient may have along with a LIME chart and a ChatGPT explanation of the diagnoses.
Figure 11. Healthcare Worker Signup.
Figure 13. Healthcare Dashboard.
Figure 14. Patient Registration Page.
Figure 16. Patient Dashboard.
Figure 17. History Taking and Examination.
Figure 18. Explainable Diagnosis Page.
This is a dependable tool to aid healthcare workers in diagnosing febrile diseases while addressing the critical need for transparency in AI-driven healthcare solutions. With a good balance between interpretability and diagnostic performance, Random Forest was reliable and easily interpreted, showing a strong performance in diagnosing most of the febrile diseases, its moderate complexity makes it easier to integrate into mobile apps for real-time diagnoses. RF's interpretability through LIME makes it easier for healthcare professionals to understand the diagnoses and validate the system, crucial for real-world application in healthcare settings. The GPT model explanation is suitable for use in our system because of its context-based explanations of complex results. GPT, as a large language model, can generate relevant and contextually appropriate diagnostic information based on patient symptoms. In this case, the diagnoses for diseases like Typhoid Fever, HIV/AIDS, Urinary Tract Infection, Respiratory Tract Infection, and Tuberculosis are aligned with known medical presentations. The combination of symptom-specific input and advanced language processing allows the GPT model to interpret complex medical data, making it valuable for diagnosis in resource-scarce settings.
4. Conclusions
This study successfully developed a data-driven and explainable diagnostic model for febrile diseases, combining ML algorithms with XAI frameworks and LLMs. The model demonstrated strong predictive performance and addressed the critical issue of transparency in AI-driven healthcare. The RF, XGBoost, and MLP algorithms exhibited robust capabilities, with RF achieving the highest performance and interpretability metrics. Integration with XAI frameworks such as LIME and LLMs like ChatGPT provided textual and visual explanations, increasing trust and usability for healthcare providers. The system proved adaptable to multiple febrile diseases and showcased potential for broader application in diverse healthcare environments. The findings of this research highlight the clinical relevance of the diagnostic framework. Rigorous data preprocessing techniques ensured a clean and well-structured dataset, allowing for optimal ML model training. The system enhanced clinical decision-making by simulating expert reasoning, enabling timely and efficient diagnosis and treatment. The framework’s scalability and its potential for mobile deployment make it particularly suited for resource-limited settings. Additionally, the study emphasized the importance of transparency in AI tools, as the interpretability provided by XAI frameworks significantly improved user trust and system adoption. To maximize the system’s real-world impact, several recommendations are proposed. Healthcare providers in LMICs should consider piloting the model in clinical settings, supported by collaborations with public health agencies. Expanding the dataset to include diverse populations, especially pediatric patients under five years old, and broader disease categories, such as hemorrhagic fevers, will enhance the system’s applicability. Continuous training and updates using new patient data and advancements in AI are also crucial. 
Moreover, deploying the model on mobile platforms can increase accessibility for frontline healthcare workers, particularly in remote and underserved areas.
Despite its strengths, the study faced certain limitations. The exclusion of pediatric populations under five years of age and of diseases like hemorrhagic fevers restricted the dataset’s scope, reducing the model’s generalizability. While LIME and ChatGPT provided valuable explanations, further refinement is needed to mitigate potential biases or oversimplifications in their outputs. The resource-intensive nature of the model’s development and the need for extensive training of healthcare providers could also pose challenges to its adoption in LMICs. Additionally, long-term validation is necessary to confirm the model’s effectiveness and reliability in real-world scenarios.
In conclusion, this study demonstrates the transformative potential of integrating explainable AI and ML methodologies to address diagnostic challenges in LMICs. By promoting transparency, scalability, and clinical relevance, the proposed system represents a significant step forward in improving healthcare delivery for febrile diseases. Addressing the identified limitations and expanding the system’s applicability will further enhance its impact on global health outcomes.
Author Contributions
Conceptualization, F.-M.U., C.A., and K.A.; methodology, K.A., C.A., and F.-M.U.; validation, F.-M.U., K.A., and C.A.; formal analysis, K.A.; data curation, K.A.; writing—original draft preparation, K.A., C.A., and F.-M.U.; writing—review and editing, K.A., C.A., and F.-M.U.; supervision, F.-M.U. and C.A.; project administration, F.-M.U.; funding acquisition, F.-M.U. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the New Frontiers in Research Fund (NFRF), grant number NFRFE-2019-01365, between April 2020 and March 2024.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The raw data supporting the conclusions of this article will be made available by the authors on request.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Premaratna, R. Dealing with acute febrile illness in the resource-poor tropics. Trop. Med. Surg. 2013, 1, 1.
- Butcher, L. Prognosis? Misdiagnosis! The High Price of Getting It Wrong. Manag. Care (Langhorne, Pa.) 2019, 28(3), 32–36.
- Attai, K.; Amannejad, Y.; Vahdat Pour, M.; Obot, O.; Uzoka, F.-M. A systematic review of applications of machine learning and other soft computing techniques for the diagnosis of tropical diseases. Trop. Med. Infect. Dis. 2022, 7(12), 398.
- Bagam, N. Applications of Machine Learning in Healthcare Data Analysis. Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol. 2020.
- Naveed, M. A. Transforming healthcare through artificial intelligence and machine learning. Pak. J. Health Sci. 2023, 1(1).
- Kupusinac, A.; Doroslovački, R. An Overview of the Algorithmic Diagnostics Methodology: A Big Data Approach. 2018 Zooming Innovation in Consumer Technologies Conference (ZINC) 2018, 104–105.
- Wu, W.; Zhou, H. Data-driven diagnosis of cervical cancer with support vector machine-based approaches. IEEE Access 2017, 5, 25189–25195.
- Gupta, D.; Kose, U.; Le Nguyen, B.; Bhattacharyya, S. Artificial Intelligence for Data-Driven Medical Diagnosis; De Gruyter: Berlin, Boston, 2021.
- Jiang, S.; Wang, T.; Zhang, K. H. Data-driven decision-making for precision diagnosis of digestive diseases. Biomed. Eng. Online 2023, 22(1), 87.
- Hu, J.; Perer, A.; Wang, F. Data-driven analytics for personalized healthcare. In Healthcare Information Management Systems: Cases, Strategies, and Solutions; Springer, 2016; pp. 529–554.
- Melnykova, N.; Shakhovska, N.; Gregus, M.; Melnykov, V.; Zakharchuk, M.; Vovk, O. Data-driven analytics for personalized medical decision-making. Mathematics 2020, 8(8), 1211.
- Mendhe, D.; Dogra, A.; Nair, D. S.; Punitha, S.; Preetha, D. S.; Babu, G. T. AI-Enabled Data-Driven Approaches for Personalized Medicine and Healthcare Analytics. 2024 Ninth International Conference on Science Technology Engineering and Mathematics (ICONSTEM) 2024, 1–5.
- Ivanović, M.; Autexier, S.; Kokkonidis, M. AI Approaches in Processing and Using Data in Personalized Medicine. Symposium on Advances in Databases and Information Systems 2022.
- Ekanayake, I. U.; Meddage, D. P.; Rathnayake, U. S. A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP). Case Stud. Constr. Mater. 2022.
- Kulaklıoğlu, D. Explainable AI: Enhancing Interpretability of Machine Learning Models. Hum.-Comput. Interact. 2024.
- Alblooshi, M.; Alhajeri, H.; Almatrooshi, M.; Alaraj, M. Unlocking Transparency in Credit Scoring: Leveraging XGBoost with XAI for Informed Business Decision-Making. 2024 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA) 2024, 1–6.
- Quinn, T. P.; Jacobs, S.; Senadeera, M.; Le, V.; Coghlan, S. The Three Ghosts of Medical AI: Can the Black Box Present Deliver? Artif. Intell. Med. 2020.
- Inukonda, J.; Rajasekhara Reddy Tetala, V.; Hallur, J. Explainable Artificial Intelligence (XAI) in Healthcare: Enhancing Transparency and Trust. Int. J. Multidiscip. Res. 2024.
- Huang, S.; Mamidanna, S.; Jangam, S.; Zhou, Y.; Gilpin, L. Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations. arXiv 2023, abs/2310.11207.
- Hsu, C.; Wu, I.; Liu, S. Decoding AI Complexity: SHAP Textual Explanations via LLM for Improved Model Transparency. 2024 International Conference on Consumer Electronics – Taiwan (ICCE-Taiwan) 2024, 197–198.
- University of Uyo Teaching Hospital; Mount Royal University. NFRF Project Patient Dataset with Febrile Diseases [Data Set]. Zenodo, 2024.
- Murphy, A.; Moore, C. Random Forest (Machine Learning). Radiopaedia.org, 2019.
- Yadav, D. C.; Pal, S. Analysis of Heart Disease Using Parallel and Sequential Ensemble Methods with Feature Selection Techniques. Int. J. Big Data Anal. Healthc. 2021.
- Yu, H.; Samuels, D. C.; Zhao, Y.; Guo, Y. Architectures and Accuracy of Artificial Neural Network for Disease Classification from Omics Data. BMC Genomics 2019, 20.
- Ribeiro, M.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016.
- Thakkar, P. Drug Classification Using Black-Box Models and Interpretability. Int. J. Res. Appl. Sci. Eng. Technol. 2021.
- Sirriani, J.; Sezgin, E.; Claman, D. M.; Linwood, S. Medical Text Prediction and Suggestion Using Generative Pretrained Transformer Models with Dental Medical Notes. Methods Inf. Med. 2022, 61, 195–200.
- Kumar, T.; Kait, R.; Ankita; Rani, S. Possibilities and Pitfalls of Generative Pre-Trained Transformers in Healthcare. 2023 International Conference on Advanced Computing & Communication Technologies (ICACCTech) 2023, 37–44.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).