Preprint
Article

This version is not peer-reviewed.

Enhancing Adverse Event Monitoring and Management in Phase IV Chronic Disease Drug Trials: Applications of Machine Learning

Submitted:

07 February 2025

Posted:

10 February 2025


Abstract
Phase IV studies currently rely on real-world data (RWD) and real-world evidence (RWE) to determine the long-term safety, effectiveness, and cost benefits of drugs. Their findings apply to public health and medical practice and inform drug and health decision-making at the clinical level. In this paper, we present a Transformer-based method intended to improve the assessment of the long-term benefits of chronic disease drugs, such as metreleptin in lipodystrophy. Through the self-attention mechanism, the model captures temporal patterns and correlations in clinical data, processes multimodal real-world data (e.g., patient history and long-term follow-up data), and supports real-time monitoring of drug safety and efficacy. With data augmentation and a self-supervised learning strategy, the model maintains high prediction performance in low-sample settings and quickly and accurately identifies potential adverse events, including acute pancreatitis and liver adverse events, and their effect on glycosylated hemoglobin (HbA1c), triglycerides, and other indicators. Experimental results show that the proposed Transformer model outperforms traditional methods in adverse event prediction and risk assessment, making it a useful tool for long-term drug monitoring.

1. Introduction

Phase IV clinical trials, the final phase of drug testing, play a key role in post-marketing surveillance. Whereas early-phase clinical trials are designed to assess a drug’s initial safety, pharmacokinetics, and efficacy in small populations, the main objective of Phase IV is to assess a drug’s long-term performance in real-world settings, including safety, efficacy, and economic benefits. Phase IV research is not conducted in a controlled clinical setting; it studies how the drug performs across a variety of real-world use cases. To enable informed clinical decisions and public health policies, Phase IV studies determine how drugs perform after reaching the market, providing patients, physicians, and regulatory agencies with pertinent data [1]. Phase IV studies are therefore vital in assessing the real clinical value of a drug, which affects market sustainability, health insurance coverage policies, and the formulation of public health strategies.
Unlike conventional randomized controlled trials (RCTs), Phase IV studies often enroll a patient population that is more representative of the “real world,” including patients of all ages, genders, and ethnic backgrounds with various comorbidities. Because of strict enrollment criteria, RCT samples are unlikely to reflect real-world effectiveness; in contrast, Phase IV subjects reflect drug performance across different populations and clinical settings. Patients’ disease states are often more complex, and multiple diseases may coexist at different treatment stages. As a result, Phase IV results not only represent the general efficacy of the drug but also measure treatment efficacy and safety for specific patient populations [2]. These studies typically have high external validity (i.e., generalizability to real-world clinical situations) and can therefore provide more substantive evidence for routine use of the drug.
Analysis based on real-world data (RWD) and real-world evidence (RWE) is the most crucial part of Phase IV studies. These data are generated during routine care and are primarily collected through electronic health records, health insurance claims data, patient registries, and medical cost data. Because they cover long-term use of drugs in clinical settings, they include information about drug effectiveness, safety, patient compliance, and the socio-economic advantages of various treatment choices [3]. With RWD and RWE, Phase IV studies can go beyond the tightly controlled conditions of clinical trials and obtain more representative clinical information. By analyzing these data, researchers can assess the effectiveness of drugs in particular groups of people and also examine the economics of drugs in different treatment settings, offering additional evidence for public health policies, health coverage policies, and drug pricing. As personalized medicine is gradually adopted, the results of Phase IV studies will help adjust medical policies in different regions of the world and support personalized and precise treatment plans [4].
Nonetheless, as drugs become more widely used in clinical practice and patient populations expand, monitoring and managing adverse events (AEs) has become increasingly complicated. In particular, drugs for chronic diseases often require long-term or even lifelong use, so their adverse events can have cumulative effects and may show different characteristics at different points during drug use. For instance, the side effects of certain chronic disease drugs may at first be subtle, but the frequency of adverse events may increase sharply with extended use [5]. Moreover, patients with chronic diseases frequently have several comorbidities, so drug effects can be influenced by multiple factors, which complicates monitoring. Because of patient individuality and heterogeneity in drug administration, Phase IV studies need to address real-time monitoring and effective management of adverse events during long-term use.
Conventional safety monitoring depends on clinician reporting, patient self-reporting, and manual screening, all of which have major limitations. First, patients and physicians may not report adverse events in a timely manner because of information asymmetry, time pressure, or overlooked symptoms. Second, manual screening of large amounts of data is time-consuming, labor-intensive, and vulnerable to patient-reporting bias and clinical judgment bias, which can lead to underestimation of adverse drug events [6]. As the number of drug users and the volume of adverse event data grow, efficiently mining clinical data and identifying potential drug risk factors becomes increasingly important. Monitoring can no longer rely on traditional manual methods and must instead use modern techniques, especially machine learning and artificial intelligence, to improve the efficiency and accuracy of analysis.

2. Related Work

Medeiros-Ribeiro et al. [7] assessed the immunogenicity and safety of the inactivated CoronaVac vaccine manufactured by Sinovac in patients with autoimmune rheumatic diseases. The study showed that CoronaVac induced an effective immune response in these patients and that the vast majority experienced no serious adverse events. A small number of participants reported mild side effects such as injection-site pain and fever. This multicenter, open-label, Phase 4 study provides helpful clinical data on COVID-19 vaccination in patients with autoimmune diseases.
Humby et al. [8] investigated the effectiveness of biopsy-guided precision treatment strategies in patients with rheumatoid arthritis (RA). This multicenter, randomized controlled study examined the response of different patient subtypes after 16 weeks of treatment. The results showed that individualized treatment strategies can significantly improve symptoms in patients with specific subtypes, providing new evidence for the treatment of RA. The study indicates that stratifying patients for individualized treatment not only improves efficacy but can also guide clinical practice. Lagoumtzi et al. [9] reviewed the use of senolytics and senomorphics in aging and chronic diseases. They note that, despite the considerable potential of anti-aging treatments, their efficacy and safety require further validation through clinical trials. They also discuss potential adverse effects of treatments targeting aging processes, particularly in chronic conditions such as cardiovascular, metabolic, and neurodegenerative diseases.
Crosby et al. [10] discussed the effects of the ketogenic diet on chronic disease management. By reviewing the available literature, they assessed the impact of the ketogenic diet on chronic diseases, including diabetes, obesity, and cardiovascular disease. In some patients, the ketogenic diet has a substantial effect on chronic disease management; however, it may come with side effects such as kidney problems and electrolyte imbalance. For Dravet syndrome, for example, there is no cure, and available medications only control seizures. The ketogenic diet remains a potential treatment that is still under clinical evaluation, and it should be used carefully and monitored closely. Lazarus et al. [11] performed a Phase 4 UK feasibility trial (ComFluCOV) to assess the interaction between COVID-19 vaccination (ChAdOx1 or BNT162b2) and seasonal influenza vaccination given at the same time. The study found that giving the COVID-19 vaccine together with the flu vaccine resulted in a robust immune response in most adults without significant side effects. Such empirical data will help inform public health policies on vaccine combinations and support vaccination strategies worldwide.
Clemens et al. [12] reported a Phase 4 non-inferiority study assessing the immune response to a booster (third) dose in adults who had previously received two doses of the CoronaVac vaccine. The results showed notable increases in antibody levels following booster vaccination, with no serious adverse events. The study provides important clinical evidence for booster vaccination, showing that a third dose further stimulates the immune system.

3. Methodologies

3.1. Spatiotemporal Graph Convolutional Network and Self-Attention Mechanism

In analyzing the long-term effects of chronic disease drugs, it is crucial to capture changes in time-series data and the spatial dependence between patients. First, we introduce a dynamic adaptive time-adjustment function $\gamma(t)$, which weights the effects of different time points in the self-attention mechanism. Modifying the standard computation over queries, keys, and values gives the self-attention formula in Equation 1:
$$A(Q, K, V, t) = \mathrm{Softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}} + \gamma(t)\right) \cdot V, \tag{1}$$
where $Q$ is the query vector, $K$ is the key vector, $V$ is the value vector, and $d_k$ is the dimension of the key vector. To make the model more flexible with respect to changes in drug efficacy, we designed the time-weighted function $\gamma(t)$ shown in Equation 2:
$$\gamma(t) = \lambda_1 \cdot \left(1 + \exp(\lambda_2 t)\right). \tag{2}$$
This function adjusts the influence of each time step so that the model is less responsive to short-term drug effects and, as data accumulate, pays more attention to long-term effects.
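As an illustration of Equations 1 and 2, the following dependency-free Python sketch computes attention with a time bias. It rests on assumptions that are not stated in the text: $\gamma$ is evaluated at the time stamp of each key and added to that key's score, and the default values of `lam1` and `lam2` are arbitrary. The per-key reading is chosen because a single scalar added to every score would cancel inside the softmax, which is invariant to a constant shift.

```python
import math


def gamma(t, lam1=0.1, lam2=0.05):
    """Time weight of Equation 2: lam1 * (1 + exp(lam2 * t))."""
    return lam1 * (1.0 + math.exp(lam2 * t))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def time_weighted_attention(Q, K, V, times, lam1=0.1, lam2=0.05):
    """Equation 1, with gamma(t_j) added to the score of key j (assumption)."""
    d_k = len(K[0])
    outputs = []
    for q in Q:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) + gamma(t, lam1, lam2)
            for k, t in zip(K, times)
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))]
        )
    return outputs


# Three visits at months 0, 12 and 24 with identical keys, so only the
# time bias separates them; the values are made-up lab readings.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
V = [[1.0], [2.0], [3.0]]
result = time_weighted_attention(Q, K, V, times=[0, 12, 24])
print(result)
```

With positive `lam1` and `lam2`, later visits receive larger weights, so the output is pulled toward the most recent value; setting `lam2=0.0` removes the time preference and returns the plain average of the values.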
The model then obtains the time-weighted output $T_t$ by weighting the self-attention outputs across time steps, as in Equation 3:
$$T_t = \sum_{i=1}^{N} \varphi(t) \cdot A(Q_i, K_i, V_i, t), \tag{3}$$
where $\varphi(t)$ is the time gain function, defined in Equation 4:
$$\varphi(t) = \lambda_3 \cdot \left(1 + \exp(\lambda_4 t)\right). \tag{4}$$
This gain function ensures that, over time, the model places increasing emphasis on learning long-term drug effects. The $\lambda$ coefficients are hyperparameters, and tuning them is a critical step in building the model; those that act as regularization strengths control model complexity and help avoid overfitting. To process multimodal data such as patient history, laboratory data, and imaging data, we introduce a weighted cross-modal fusion mechanism. The features of each modality are weighted and fused according to Equation 5:
$$F(X_1, X_2, \ldots, X_M) = \sum_{m=1}^{M} \omega_m \cdot \mathrm{MLP}_m(X_m) + \sum_{m=1}^{M} \eta_m \cdot \mathrm{Attention}(X_m), \tag{5}$$
where $\omega_m$ and $\eta_m$ are the weighting coefficients for modality $m$, learned by backpropagation. In this way, the model weights and fuses information from different data sources to improve the prediction of drug efficacy and adverse events.
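The weighted fusion in Equation 5 can be sketched as follows. This is a toy illustration under assumptions: each modality is a plain feature vector, `linear` stands in for a trained per-modality MLP, `attention_branch` is a hypothetical feature-gating stand-in for the attention term, and the weights `omega` and `eta` are fixed by hand here rather than learned by backpropagation.

```python
import math


def linear(W):
    """Return x -> W x, a stand-in for a trained per-modality MLP_m."""
    return lambda x: [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attention_branch(W):
    """Return x -> W (softmax(x) * x), a toy feature-attention branch."""
    def apply(x):
        peak = max(x)
        exps = [math.exp(xi - peak) for xi in x]
        total = sum(exps)
        gated = [(e / total) * xi for e, xi in zip(exps, x)]
        return linear(W)(gated)
    return apply


def fuse(features, mlps, attentions, omega, eta):
    """Equation 5: sum_m omega_m * MLP_m(X_m) + sum_m eta_m * Attention(X_m)."""
    fused = None
    for x, mlp, att, w, e in zip(features, mlps, attentions, omega, eta):
        term = [w * a + e * b for a, b in zip(mlp(x), att(x))]
        fused = term if fused is None else [f + t for f, t in zip(fused, term)]
    return fused


# Two modalities with different input sizes, both projected to 2 dimensions.
history = [1.0, 2.0]        # hypothetical encoded patient history
labs = [0.5, 0.5, 0.5]      # hypothetical normalized laboratory values
W_history = [[1.0, 0.0], [0.0, 1.0]]
W_labs = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
mlps = [linear(W_history), linear(W_labs)]
attentions = [attention_branch(W_history), attention_branch(W_labs)]
fused = fuse([history, labs], mlps, attentions, omega=[0.5, 0.5], eta=[0.1, 0.1])
print(fused)
```

Projecting every modality to a common output dimension before summation is what allows inputs of different sizes to be combined in a single weighted sum.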
To account for relationships between patients, especially the similarity of drug responses within the same patient population, we introduce a Spatiotemporal Graph Convolutional Network (ST-GCN). Through its graph convolutional layers, the network captures dependencies between patients and infers the overall effect in the patient population while learning individual patient effects. The spatiotemporal graph convolution is given by Equation 6:
$$H^{(l+1)} = \sigma\left(\hat{A} H^{(l)} W^{(l)}\right), \tag{6}$$
where $\hat{A}$ is the normalized adjacency matrix between patients, $H^{(l)}$ is the node feature matrix of the $l$-th layer, $W^{(l)}$ is the weight matrix of the graph convolutional layer, and $\sigma$ is the activation function. Through graph convolution operations, the model shares information about drug efficacy among different patient populations to improve overall prediction ability.
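A single layer of Equation 6 can be sketched in plain Python as below. The text does not say how the adjacency matrix is normalized, so the sketch assumes the common symmetric normalization with self-loops, $D^{-1/2}(A + I)D^{-1/2}$, and uses ReLU for $\sigma$; the patient graph and features are made up.

```python
import math


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (assumed form)."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_tilde]
    return [
        [A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(A_hat, H, W):
    """Equation 6 with ReLU as the activation: ReLU(A_hat H W)."""
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]


# Patients 0 and 1 are linked (similar drug response); patient 2 is isolated.
A = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
H0 = [[1.0], [3.0], [5.0]]   # one made-up feature per patient
W0 = [[1.0]]                 # identity weight, to expose the neighbor averaging
H1 = gcn_layer(normalize_adjacency(A), H0, W0)
print(H1)
```

With the identity weight, the two linked patients end up with the average of their features while the isolated patient keeps its own value, which is the information sharing the layer is meant to provide.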

3.2. Self-Supervised Learning with Graph Embedding

Data scarcity is often an issue in adverse event prediction, especially in long-term studies of chronic disease drugs. To address it, we introduce a self-supervised learning strategy that improves the robustness of the model through a self-supervised loss function. This loss combines drug efficacy prediction with a negative sampling strategy, as in Equation 7:
$$L_{\mathrm{SSL}} = \sum_{i=1}^{n} \left\| \hat{X}_i - X_i \right\|^2 + \lambda_5 \cdot \sum_{j=1}^{k} \mathrm{Loss}(X_j, Y_j) + \lambda_6 \cdot R(\theta), \tag{7}$$
where $\hat{X}_i$ is the predicted value, $X_i$ is the actual observed value, $\mathrm{Loss}(X_j, Y_j)$ is the prediction error of each pair of samples, and $R(\theta)$ is the regularization term on the model parameters to prevent overfitting. Through self-supervised learning, the model can extract the relationship between potential drug effects and adverse events from scarce data, thereby improving prediction accuracy.
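The three terms of Equation 7 can be made concrete with a short sketch. Two choices here are assumptions: the pairwise losses $\mathrm{Loss}(X_j, Y_j)$ are passed in as precomputed numbers because their form is not specified, and $R(\theta)$ is taken to be an L2 penalty. The default values of `lam5` and `lam6` are arbitrary.

```python
def ssl_loss(X_hat, X, pair_losses, theta, lam5=1.0, lam6=0.01):
    """Equation 7: reconstruction + lam5 * pairwise losses + lam6 * R(theta)."""
    # Squared Euclidean distance between each prediction and its observation.
    reconstruction = sum(
        sum((a - b) ** 2 for a, b in zip(x_hat, x)) for x_hat, x in zip(X_hat, X)
    )
    # Sum of the precomputed prediction errors over the k sample pairs.
    pairwise = sum(pair_losses)
    # L2 penalty on the parameters (assumed form of R).
    regularizer = sum(p * p for p in theta)
    return reconstruction + lam5 * pairwise + lam6 * regularizer


# One reconstructed sample, two sample pairs, two parameters (all made up).
loss = ssl_loss(
    X_hat=[[1.0, 2.0]],
    X=[[0.0, 2.0]],
    pair_losses=[0.5, 0.5],
    theta=[1.0, -1.0],
    lam5=2.0,
    lam6=0.5,
)
print(loss)
```

In this example the reconstruction term is 1.0, the weighted pairwise term is 2.0, and the weighted regularizer is 1.0, so each coefficient directly sets how much its term contributes to the total.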
For drug effect prediction, graph embedding provides an effective way to combine the individual characteristics of patients with drug effects. Graph embedding derives the effects of drugs in different patient populations by learning the relationship between drugs and patients. The graph embedding is computed by Equation 8:
$$z = \mathrm{GraphEmbed}(G, \theta) = \mathrm{Softmax}\left(\frac{A Z + B}{C}\right), \tag{8}$$
where $A$ and $B$ are matrices describing the relationship between patients and drugs in the graph, $C$ is the normalization coefficient, and $Z$ is the embedded representation of the drug effect. The term $AZ + B$ means that $Z$ is first linearly transformed by $A$, after which the bias term $B$ is added; the resulting weighted score can be used as a predictor of the drug’s efficacy in different patients.
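Equation 8 can be sketched as a small function. The shapes and the direction of the softmax are assumptions made for illustration: $A$ is patients by drugs, $Z$ is drugs by embedding dimensions, $B$ has the shape of $AZ$, $C$ is a scalar, and the softmax is applied to each patient's row.

```python
import math


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def row_softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def graph_embed(A, Z, B, C):
    """Equation 8: row-wise Softmax((A Z + B) / C)."""
    scores = matmul(A, Z)
    return [
        row_softmax([(s + b) / C for s, b in zip(s_row, b_row)])
        for s_row, b_row in zip(scores, B)
    ]


# Two patients, two drugs, two embedding dimensions (all values made up).
A = [[1.0, 0.0], [0.0, 1.0]]   # patient 0 takes drug 0, patient 1 takes drug 1
Z = [[2.0, 0.0], [1.0, 1.0]]   # drug effect embeddings
B = [[0.0, 0.0], [0.0, 0.0]]   # zero bias
z = graph_embed(A, Z, B, C=2.0)
print(z)
```

Each output row sums to one, and a larger $C$ flattens the distribution in the same way a temperature does in an ordinary softmax.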
To further improve the accuracy and stability of the model, especially in monitoring adverse drug events, we optimize it by combining the negative sampling strategy with a regularization mechanism. When dealing with adverse events, learning from negative samples becomes particularly important. Prediction errors and regularization terms are minimized jointly through the loss function in Equation 9:
$$L_{\mathrm{final}} = L_{\mathrm{SSL}} + \lambda_7 \cdot L_{\mathrm{reg}} + \lambda_8 \cdot L_{\mathrm{neg}}, \tag{9}$$
where $L_{\mathrm{reg}}$ is the regularization term, $L_{\mathrm{neg}}$ is the learning loss on negative samples, and $\lambda_7$ and $\lambda_8$ are hyperparameters.
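The combination in Equation 9 can be illustrated with the sketch below. The form of the negative-sample loss is not given in the text, so the sketch assumes a common choice: the mean of $-\log(1 - p)$ over a random sample of patients without the adverse event, where $p$ is the predicted event probability. The function names, the seed, and all numbers are hypothetical.

```python
import math
import random


def negative_sampling_loss(probs, labels, k, seed=0):
    """Mean -log(1 - p) over up to k sampled negatives (assumed form of L_neg)."""
    rng = random.Random(seed)
    negatives = [p for p, y in zip(probs, labels) if y == 0]
    sample = rng.sample(negatives, min(k, len(negatives)))
    if not sample:
        return 0.0
    eps = 1e-12  # guards against log(0) when p is exactly 1
    return sum(-math.log(max(eps, 1.0 - p)) for p in sample) / len(sample)


def final_loss(l_ssl, l_reg, l_neg, lam7=0.1, lam8=1.0):
    """Equation 9: L_SSL + lam7 * L_reg + lam8 * L_neg."""
    return l_ssl + lam7 * l_reg + lam8 * l_neg


# Predicted adverse-event probabilities for three patients; label 1 = event.
probs = [0.5, 0.9, 0.5]
labels = [0, 1, 0]
l_neg = negative_sampling_loss(probs, labels, k=2)
total = final_loss(l_ssl=4.0, l_reg=2.0, l_neg=l_neg, lam7=0.5, lam8=2.0)
print(l_neg, total)
```

Sampling only a subset of negatives keeps the cost of this term bounded when event-free patients greatly outnumber patients with an event.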
By combining self-supervised learning and negative sampling, the model reduces overfitting and improves the prediction accuracy of adverse events. Together, these formulations are designed to improve the accuracy of adverse event monitoring and drug efficacy evaluation in studies of long-term drug use for chronic diseases, while remaining robust when data are scarce.
When false positives and false negatives carry different misclassification costs, we use cost-based threshold selection: the decision threshold is chosen to minimize the overall misclassification cost, or to maximize a custom utility function, so as to optimize the model’s decisions.
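Cost-based threshold selection can be sketched as a grid search. The costs, the grid of candidate thresholds, and the example scores below are assumptions for illustration; the text does not report the values actually used.

```python
def expected_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total misclassification cost when predicting positive if score >= threshold."""
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return cost_fp * fp + cost_fn * fn


def best_threshold(y_true, scores, cost_fp=1.0, cost_fn=5.0, grid=None):
    """Return the lowest-cost threshold on the grid (first one on ties)."""
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]
    return min(grid, key=lambda t: expected_cost(y_true, scores, t, cost_fp, cost_fn))


# Four patients: label 1 = adverse event occurred; scores are model outputs.
y_true = [0, 0, 1, 1]
scores = [0.2, 0.6, 0.4, 0.9]

# Missing an adverse event is costlier than a false alarm: low threshold.
t_safety = best_threshold(y_true, scores, cost_fp=1.0, cost_fn=5.0)
# False alarms are costlier than misses: high threshold.
t_strict = best_threshold(y_true, scores, cost_fp=5.0, cost_fn=1.0)
print(t_safety, t_strict)
```

Raising the cost of false negatives moves the selected threshold down, which matches the safety-monitoring setting where a missed adverse event is usually the more expensive error.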

4. Experiments

4.1. Experimental Setup

First, we used the National Rheumatic Disease Patient Registry (NRDPR). This database integrates long-term clinical data from patients with rheumatic diseases in several countries and regions, particularly patients with rheumatoid arthritis (RA) and other autoimmune disorders. The dataset is multi-dimensional and includes personal data, medical history, medication, laboratory tests, disease activity indices, complications, and responses to treatment. It contains high-dimensional data of several types (time series, categorical, and continuous numerical data) and a large number of missing and inconsistent values, which makes training machine learning models difficult. The data cover patients of different ethnicities, genders, and ages, with a wide range of complication types and severity, over an extended time frame; some patients were followed for more than 5 years. Analyzing safety and efficacy during long-term drug use in patients with chronic diseases on the basis of these real-world data provides insight into clinical responses that cannot be obtained through clinical trials, such as adverse events that may occur during treatment and their influence on drug efficacy.

4.2. Experimental Analysis

To validate the effectiveness of the proposed model, we compared it with four common baseline methods: random forest (RF), support vector machine (SVM), gradient boosting decision tree (GBDT), and long short-term memory network (LSTM). As an ensemble learning method, random forest is well suited to high-dimensional data and missing values and has been commonly used for predicting diseases and drug efficacy. A support vector machine handles nonlinear data by constructing an optimal separating hyperplane (using kernel functions), which suits small-sample, high-dimensional tasks. Gradient boosting trees reduce errors by building many decision trees sequentially, which suits capturing complex nonlinear relationships and yields highly precise predictions. Long short-term memory networks are recurrent neural networks that process a sequence step by step, avoiding the quadratic time complexity in sequence length of self-attention, and they work well with time-series clinical data such as long-term drug effects or cumulative adverse events.
Using specificity as the primary evaluation metric, we assess the performance of each algorithm across a range of decision thresholds. Specificity is the true negative rate, which measures a classification model’s ability to correctly identify negative-class samples. For each algorithm (RF, SVM, GBDT, LSTM, and our method), we plotted specificity as a function of the threshold to compare performance visually under different thresholds.
Figure 1 shows that the specificity of all methods increases to varying degrees as the threshold increases. This is because a higher threshold makes the model more stringent in predicting positive samples, so false positives decrease and specificity increases. Our method consistently outperforms all other methods and achieves the highest specificity. This shows that our method is better at identifying negative samples across multiple thresholds, which supports more accurate judgments in applications such as drug safety monitoring. In summary, our method shows both a fast increase in specificity and stable final values, especially under high-threshold conditions. This visual analysis further supports the feasibility and practical relevance of our model.
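The relationship between threshold and specificity described above can be checked with a few lines of Python. The labels and scores are made up and are not taken from the registry data; the function names are hypothetical.

```python
def specificity(y_true, scores, threshold):
    """True negative rate when predicting positive if score >= threshold."""
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tn / (tn + fp) if (tn + fp) else float("nan")


def specificity_curve(y_true, scores, thresholds):
    """Specificity at each threshold, in the order given."""
    return [(t, specificity(y_true, scores, t)) for t in thresholds]


# Three patients without an adverse event (label 0) and one with (label 1).
y_true = [0, 0, 0, 1]
scores = [0.1, 0.4, 0.7, 0.8]
curve = specificity_curve(y_true, scores, thresholds=[0.3, 0.5, 0.9])
for threshold, value in curve:
    print(threshold, round(value, 3))
```

Because raising the threshold can only move negative samples from the false-positive count to the true-negative count, specificity never decreases along the curve for any model; what differs between models is how quickly it rises.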
Next, we used the Matthews correlation coefficient (MCC) as the evaluation metric. MCC is an important measure of binary classification performance and is especially valuable when the data are imbalanced. MCC ranges from -1 to 1, where 1 indicates perfect classification, -1 indicates completely wrong classification, and 0 indicates performance no better than random.
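For reference, MCC is computed from the four confusion-matrix counts as $(TP \cdot TN - FP \cdot FN) / \sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}$. The sketch below implements this standard definition; returning 0 when the denominator is zero is a common convention that is assumed here, and the counts in the example are made up.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Imbalanced example: 10 adverse events among 100 patients, 9 of them missed.
tp, tn, fp, fn = 1, 90, 0, 9
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy, mcc(tp, tn, fp, fn))
```

In the imbalanced example the accuracy is high because most patients have no event, while the MCC stays low because nine of the ten events are missed, which is why MCC is preferred when classes are imbalanced.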
Figure 2 shows that our method outperforms all other methods (RF, SVM, GBDT, and LSTM) in terms of MCC, and that its MCC values increase as the threshold increases, which indicates that it maintains good classification performance across threshold settings. The advantage is most pronounced at higher thresholds, where our method better balances true positives and false positives and produces more accurate classifications. Our method therefore offers greater stability and predictive ability on complex data, especially in the long-term efficacy analysis of chronic disease drugs, and provides a better basis for exploring therapeutic effects and potential risks.

5. Conclusion

In summary, for the analysis of chronic disease drug effects on real-world data, this study shows the advantages of the proposed method over traditional machine learning methods, as measured by specificity and the Matthews correlation coefficient (MCC). Under different threshold settings, the proposed method achieves better classification performance, which indicates that it is more robust for monitoring drug efficacy and adverse events. This is especially important for Phase IV clinical trials, in which the long-term efficacy and safety of a drug are analyzed to inform clinical decisions and public health policy. Future research may incorporate more diverse data inputs to enhance the model’s predictive capabilities. In addition, techniques such as semi-supervised learning or active learning could address data imbalance and missing data and further improve the practical performance of the model.

References

  1. Kanemitsu, Y.; et al. Primary tumor resection plus chemotherapy versus chemotherapy alone for colorectal cancer patients with asymptomatic, synchronous unresectable metastases (JCOG1007; iPACS): a randomized clinical trial. J. Clin. Oncol. 2021, 39, 1098–1107.
  2. Wu, S.; Huang, J.; Zhang, Z.; Wu, J.; Zhang, J.; Hu, H.; Zhu, T.; Zhang, J.; Luo, L.; Fan, P.; et al. Safety, tolerability, and immunogenicity of an aerosolised adenovirus type-5 vector-based COVID-19 vaccine (Ad5-nCoV) in adults: preliminary report of an open-label and randomised phase 1 clinical trial. Lancet Infect. Dis. 2021, 21, 1654–1664.
  3. Sanyal, A.J.; Bedossa, P.; Fraessdorf, M.; Neff, G.W.; Lawitz, E.; Bugianesi, E.; Anstee, Q.M.; Hussain, S.A.; Newsome, P.N.; Ratziu, V.; et al. A Phase 2 Randomized Trial of Survodutide in MASH and Fibrosis. N. Engl. J. Med. 2024, 391, 311–319.
  4. Kyriazopoulou, E.; et al. Early treatment of COVID-19 with anakinra guided by soluble urokinase plasminogen receptor plasma levels: a double-blind, randomized controlled phase 3 trial. Nat. Med. 2021, 27, 1752–1760.
  5. Monk, P.D.; et al. Safety and efficacy of inhaled nebulised interferon beta-1a (SNG001) for treatment of SARS-CoV-2 infection: a randomised, double-blind, placebo-controlled, phase 2 trial. Lancet Respir. Med. 2021, 9, 196–206.
  6. Herrera, D.; Sanz, M.; Kebschull, M.; Jepsen, S.; Sculean, A.; Berglundh, T.; Papapanou, P.N.; Chapple, I.; Tonetti, M.S.; EFP Workshop Participants and Methodological Consultant. Treatment of stage IV periodontitis: The EFP S3 level clinical practice guideline. J. Clin. Periodontol. 2022, 49, 4–71.
  7. Medeiros-Ribeiro, A.C.; Aikawa, N.E.; Saad, C.G.S.; Yuki, E.F.N.; Pedrosa, T.; Fusco, S.R.G.; Rojo, P.T.; Pereira, R.M.R.; Shinjo, S.K.; Andrade, D.C.O.; et al. Immunogenicity and safety of the CoronaVac inactivated vaccine in patients with autoimmune rheumatic diseases: a phase 4 trial. Nat. Med. 2021, 27, 1744–1751.
  8. Humby, F.; et al. Rituximab versus tocilizumab in anti-TNF inadequate responder patients with rheumatoid arthritis (R4RA): 16-week outcomes of a stratified, biopsy-driven, multicentre, open-label, phase 4 randomised controlled trial. Lancet 2021, 397, 305–317.
  9. Lagoumtzi, S.M.; Chondrogianni, N. Senolytics and senomorphics: Natural and synthetic therapeutics in the treatment of aging and chronic diseases. Free Radic. Biol. Med. 2021, 171, 169–190.
  10. Crosby, L.; Davis, B.; Joshi, S.; Jardine, M.; Paul, J.; Neola, M.; Barnard, N.D. Ketogenic Diets and Chronic Disease: Weighing the Benefits Against the Risks. Front. Nutr. 2021, 8, 702802.
  11. Lazarus, R.; et al. Safety and immunogenicity of concomitant administration of COVID-19 vaccines (ChAdOx1 or BNT162b2) with seasonal influenza vaccines in adults in the UK (ComFluCOV): a multi-centre, randomised, controlled, phase 4 trial. Lancet 2021, 398, 2277–2287.
  12. Clemens, S.A.C.; et al. Heterologous versus homologous COVID-19 booster vaccination in previous recipients of two doses of CoronaVac COVID-19 vaccine in Brazil (RHH-001): a phase 4, non-inferiority, single blind, randomised study. Lancet 2022, 399, 521–529.
Figure 1. Comparison of Specificity across Thresholds.
Figure 2. Matthews Correlation Coefficient Comparison.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
