3.4. Algorithm and Execution Time Comparison
To assess computational efficiency, execution times for both algorithms were compared before and after optimization. Initially, Apriori was faster because the data were sparse, with few overlapping item combinations. After noise was added to the Apriori input to increase its complexity, and optimizations such as removing infrequent items were applied to FP-Growth, FP-Growth became the faster of the two[25-26].
Initial Execution Time:
| Algorithm | Execution Time (s) |
| --- | --- |
| Apriori | ~2.3 |
| FP-Growth | ~3.6 |
After Optimization:
| Algorithm | Execution Time (s) |
| --- | --- |
| Apriori (with added noise) | ~6.2 |
| FP-Growth (optimized) | ~1.8 |
Insights:
Apriori is more efficient on smaller, sparser datasets, where fewer candidate itemsets are generated.
FP-Growth, although initially slower, handles larger datasets better once optimized, offering faster execution and deeper pattern exploration.
These findings suggest that the choice of algorithm should be driven by the structure of the data and the goals of the analysis, with FP-Growth better suited to scalability and performance on more complex datasets.
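A comparison of this kind depends on how execution time is measured. The following is a minimal sketch of one way to time an algorithm, assuming Python and the standard-library `time.perf_counter`. The helper `time_call`, the stand-in workload `count_pairs`, and the toy records are all hypothetical and are not taken from the study.

```python
import time
from statistics import median


def time_call(func, *args, repeats=5, **kwargs):
    """Run func several times; return (median seconds, result of the last run)."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return median(durations), result


def count_pairs(transactions):
    """Hypothetical stand-in workload: count co-occurring item pairs."""
    counts = {}
    for transaction in transactions:
        items = sorted(transaction)
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                pair = (items[i], items[j])
                counts[pair] = counts.get(pair, 0) + 1
    return counts


# Toy records for illustration only (not the study's data).
toy = [
    {"Fentanyl", "Cocaine"},
    {"Fentanyl", "Heroin"},
    {"Fentanyl", "Cocaine", "Ethanol"},
]
seconds, pair_counts = time_call(count_pairs, toy, repeats=3)
```

Taking the median over several runs reduces the influence of one-off delays, which matters when the reported differences are on the order of a few seconds.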
Implementation of Apriori Algorithm
The Apriori algorithm is a foundational technique in the field of data mining, particularly useful for uncovering associations among items within large datasets. Its primary strength lies in its ability to identify co-occurring elements—such as drugs—in complex records, and generate meaningful association rules. The algorithm works by performing multiple scans over the dataset to identify recurring combinations of drugs implicated in overdose fatalities. Through its systematic pattern discovery, Apriori allows analysts to identify potentially lethal substance combinations, guiding forensic investigations and public health interventions (GeeksforGeeks, 2018).
Steps for Apriori Algorithm Implementation
The Apriori implementation in this research follows an ordered set of procedures. First, the dataset is scanned to extract frequent itemsets, that is, drug pairs that occur together in overdose incidents. These are filtered against a minimum support threshold. For example, if Heroin and Fentanyl frequently co-occur, they form a valid candidate itemset. Next, the algorithm generates candidate itemsets by extending these frequent pairs into larger sets, such as Heroin + Cocaine + Fentanyl. Sets with less than the minimum support are discarded in a process known as pruning, which reduces computational overhead and focuses the analysis on high-impact interactions[27-29].
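The level-wise procedure described above can be sketched in plain Python. This is an illustrative version of the textbook Apriori search, not the implementation used in the study; the function name `apriori_frequent_itemsets` and the toy records are assumptions made for the example.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Level-wise search; returns {frozenset: support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Count step: keep candidates that meet the support threshold.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Toy records for illustration only (not the study's data).
toy = [
    {"Heroin", "Fentanyl"},
    {"Heroin", "Fentanyl", "Cocaine"},
    {"Fentanyl", "Cocaine"},
    {"Fentanyl"},
    {"Ethanol"},
]
frequent_sets = apriori_frequent_itemsets(toy, min_support=0.3)
```

In this toy run the triple Heroin + Cocaine + Fentanyl is pruned without being counted, because its subset Heroin + Cocaine is not frequent. That is the saving the pruning step is meant to provide.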
Following itemset generation, the algorithm produces association rules from the remaining frequent combinations. The rules are evaluated against three key measures:
Support: how frequently a combination appears
Confidence: the probability that one drug appears given the presence of the other
Lift: how much more frequently the drugs co-appear than expected by chance
Rule generation thus offers valuable information on the most lethal drug pairings that commonly lead to death.
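To make the three measures concrete, the sketch below computes them directly from a list of records using their standard definitions. The function name `rule_metrics` and the four toy records are hypothetical; the study's own data and code are not shown in the text.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= set(t))
    count_c = sum(1 for t in transactions if consequent <= set(t))
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    if count_a == 0 or count_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = count_both / n              # share of records containing both sides
    confidence = count_both / count_a     # P(consequent | antecedent)
    lift = confidence / (count_c / n)     # confidence relative to the baseline rate
    return support, confidence, lift


# Toy records for illustration only (not the study's data).
toy = [
    {"Xylazine", "Fentanyl"},
    {"Fentanyl"},
    {"Fentanyl", "Cocaine"},
    {"Cocaine"},
]
support, confidence, lift = rule_metrics(toy, {"Xylazine"}, {"Fentanyl"})
```

A lift of exactly 1 means the consequent is no more likely with the antecedent than without it, so values just above or below 1 should be read as weak evidence of association.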
Analysis of Apriori Algorithm
In an effort to generate meaningful association rules, support and confidence threshold values were set. A minimum support of 0.05 (5%) was set so that a set of drugs needs to occur in at least 5% of the overdose cases in order to be considered. A minimum confidence of 0.60 (60%) was also set so that any rule has to have at least 60% precision in predicting the presence of one drug when the other is present. These thresholds balance between discovering beneficial rules and avoiding spurious relations that may occur randomly.
Frequent Itemset Generation
The algorithm was then used to identify frequent drug pairs at the defined support threshold. For instance, combinations such as Fentanyl + Cocaine and Fentanyl + Heroin were identified with high frequency among overdoses. Such findings provide the foundation for rule generation and help determine which drugs are more likely to be ingested together in fatal scenarios.
Association Rule Extraction
Association rules were derived by examining frequent item sets with the association_rules() function. Rules show how the presence of some drugs signifies the presence of others. Rules were filtered according to the set confidence threshold to only retain those with 60% or higher confidence. The most significant measures displayed in the final output are:
Antecedents: The combinations of drugs that were seen
Consequents: The drugs that were predicted to be present
Support: Combination prevalence
Confidence: Predictive performance of the rule
Lift: Strength of association relative to random chance; a lift greater than 1 means the drugs co-occur more often than would be expected if they were independent.
The most confident rule found was Xylazine → Fentanyl, with 99% confidence and a lift of 1.48, indicating a strong, non-random association. These findings directly reveal high-risk drug combinations that may warrant policy and clinical attention.
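The text names only an association_rules() function, so the sketch below is not that function. It is a plain-Python stand-in showing one way rules can be derived from frequent itemsets and filtered by confidence. The function `extract_rules`, the item names DrugA and DrugB, and the support values are all hypothetical.

```python
from itertools import combinations


def extract_rules(itemset_support, min_confidence):
    """Derive rules from {frozenset: support}; keep those meeting min_confidence."""
    rules = []
    for itemset, supp in itemset_support.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                antecedent = frozenset(combo)
                consequent = itemset - antecedent
                # Both sides need a known support to compute the measures.
                if antecedent not in itemset_support or consequent not in itemset_support:
                    continue
                confidence = supp / itemset_support[antecedent]
                if confidence < min_confidence:
                    continue
                lift = confidence / itemset_support[consequent]
                rules.append({
                    "antecedents": antecedent,
                    "consequents": consequent,
                    "support": supp,
                    "confidence": confidence,
                    "lift": lift,
                })
    return sorted(rules, key=lambda rule: rule["confidence"], reverse=True)


# Hypothetical supports for illustration only (not the study's values).
supports = {
    frozenset({"DrugA"}): 0.50,
    frozenset({"DrugB"}): 0.80,
    frozenset({"DrugA", "DrugB"}): 0.45,
}
rules = extract_rules(supports, min_confidence=0.60)
```

With these inputs only DrugA → DrugB survives the 60% threshold; the reverse rule has a confidence of about 56% and is dropped, which shows that a rule and its reverse are evaluated separately.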
Results Interpretation:
The Apriori algorithm extracted key association rules that reveal strong associations among the drugs involved in overdose cases. The following table lists some of the top rules by lift and confidence:
| Antecedents | Consequents | Support | Confidence | Lift |
| --- | --- | --- | --- | --- |
| (Heroin/Morph/Codeine) | (Fentanyl) | 0.1149 | 63% | 0.95 |
| (Cocaine) | (Fentanyl) | 0.2772 | 72% | 1.08 |
| (Heroin/Cocaine) | (Fentanyl) | 0.0586 | 61% | 0.91 |
| (Ethanol) | (Fentanyl) | 0.1814 | 67% | 1.01 |
| (Xylazine) | (Fentanyl) | 0.0896 | 99% | 1.48 |
FP-Growth Algorithm Implementation
The FP-Growth algorithm serves as a scalable, efficient alternative to traditional association rule mining algorithms such as Apriori. Because FP-Growth avoids repeated scans of the database and does not generate large candidate sets, it eliminates the inefficiencies behind Apriori's high computational cost. Instead, FP-Growth compresses the dataset into a compact structure known as a Frequent Pattern Tree (FP-tree), which enables direct mining of frequent itemsets. This property makes FP-Growth well suited to large databases and to cases where frequent itemsets must be mined with minimal overhead.
FP-Growth Algorithm Fast Performance
The FP-Growth algorithm was created to overcome the major drawbacks of Apriori: repeated scans of the data and the generation of huge candidate itemsets. These can slow processing, especially on large and complicated datasets[30-32]. FP-Growth addresses both problems with a prefix-based tree structure, the FP-tree, which retains in compressed form all the information needed about frequent patterns. The algorithm can then mine itemsets directly from the tree, which reduces the required computation and makes it faster and more memory-efficient.
3.5. Steps for the Implementation of FP-Growth Algorithm
The FP-Growth algorithm follows a well-defined procedure that ensures its efficiency and scalability. The first step is data compression, in which the entire dataset is converted into an FP-tree. This tree retains only the essential frequency information of itemsets and removes the need for exhaustive candidate generation. The second step is mining the FP-tree, in which frequent itemsets are generated according to a set minimum support threshold; the tree is recursively divided into smaller conditional FP-trees, one per item, for localized mining. Finally, in the third phase, frequent itemsets are derived and association rules representing statistically significant relationships between items are generated.
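The compression and mining steps can be sketched in plain Python. This is an illustrative, unoptimized version of the textbook FP-Growth procedure, not the implementation used in the study; the names `FPNode`, `build_tree`, and `fp_growth` and the toy records are assumptions made for the example.

```python
from collections import defaultdict


class FPNode:
    """One FP-tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-tree from (items, count) pairs; return (root, header table)."""
    # First pass: count how often each item occurs.
    freq = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            freq[item] += count
    freq = {item: c for item, c in freq.items() if c >= min_count}

    root = FPNode(None, None)
    header = defaultdict(list)  # item -> every tree node holding that item
    # Second pass: insert each record with items in descending frequency order.
    for items, count in weighted_transactions:
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return root, header


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Recursively mine frequent itemsets; returns {frozenset: count}."""
    _, header = build_tree(weighted_transactions, min_count)
    results = {}
    for item, nodes in header.items():
        new_suffix = suffix | {item}
        results[new_suffix] = sum(node.count for node in nodes)
        # Conditional pattern base: the prefix path above each node for `item`.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Mine the conditional FP-tree built from those prefix paths.
        results.update(fp_growth(conditional, min_count, new_suffix))
    return results


# Toy records for illustration only (not the study's data).
toy = [
    {"Heroin", "Fentanyl"},
    {"Heroin", "Fentanyl", "Cocaine"},
    {"Fentanyl", "Cocaine"},
    {"Fentanyl"},
    {"Ethanol"},
]
mined = fp_growth([(record, 1) for record in toy], min_count=2)
```

The original records are read only twice, once to count items and once to build the tree. All later work happens on the tree and its conditional subtrees, and no candidate itemsets are generated.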
Analysis of the FP-Growth Algorithm:
Thresholds were established so that the FP-Growth algorithm would extract only meaningful associations from the data. Accordingly, a minimum support of 5% and a minimum confidence of 60% were set. Based on these parameters, the algorithm efficiently finds the most frequent and relevant drug combinations in the dataset.
The analysis showed that Fentanyl had the highest support at 67.16%, followed by Cocaine at 38.21% and Heroin at 29.86%. Among the combinations, the most important were Fentanyl plus Cocaine (27.72%) and Fentanyl plus Heroin (16.86%), whose high co-occurrence indicates a critical dependency between these drugs.
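As a worked check, the support values above can be combined using the standard definitions of confidence and lift. Only the three percentages are taken from the text; the definitions are the usual textbook ones and are assumed to match those used in the study.

```latex
\begin{align*}
\mathrm{confidence}(\text{Cocaine} \rightarrow \text{Fentanyl})
  &= \frac{\mathrm{support}(\text{Cocaine} \cup \text{Fentanyl})}{\mathrm{support}(\text{Cocaine})}
   = \frac{0.2772}{0.3821} \approx 0.7255 \\
\mathrm{lift}(\text{Cocaine} \rightarrow \text{Fentanyl})
  &= \frac{\mathrm{confidence}(\text{Cocaine} \rightarrow \text{Fentanyl})}{\mathrm{support}(\text{Fentanyl})}
   = \frac{0.7255}{0.6716} \approx 1.080
\end{align*}
```

These values agree with the 72% confidence and 1.08 lift reported for the Cocaine → Fentanyl rule in the Apriori results. For comparison, if the two drugs were independent, the expected joint support would be 0.3821 × 0.6716 ≈ 0.2566, slightly below the observed 0.2772.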
Association Rule Extraction
Once the frequent itemsets had been identified, the next stage was association rule extraction, using a minimum confidence threshold to retain only strong associations. The association_rules function was used to analyze drug combinations and derive interpretable rules, each comprising two parts: the antecedents (the drug combination already present) and the consequents (the drug predicted to appear alongside the antecedents). Three measures rated the strength and relevance of these rules. Support records how often the drug combination occurs in the database. Confidence is the probability of finding the consequent drug when the antecedents are present. Lift evaluates the strength of the association beyond random chance; a value greater than 1 suggests a meaningful, nonrandom relationship. These metrics were then used to determine the most important combinations of substances involved in overdose fatalities.
Results Interpretation
Applying the FP-Growth algorithm yielded a list of association rules that highlight key drug associations in overdose cases. The most relevant rules are given in the table below, with their support, confidence, and lift values:
| Antecedents | Consequents | Support | Confidence | Lift |
| --- | --- | --- | --- | --- |
| (Heroin/Morph/Codeine) | (Fentanyl) | 0.1149 | 63.84% | 0.95 |
| (Cocaine, Heroin) | (Fentanyl) | 0.0587 | 61.50% | 0.91 |
| (Ethanol) | (Fentanyl) | 0.1815 | 67.93% | 1.01 |
| (Xylazine) | (Fentanyl) | 0.0896 | 99.63% | 1.48 |
| (Heroin, Fentanyl) | (Heroin/Morph/Codeine) | 0.1132 | 67.17% | 3.73 |
Analysis of the Results
The rule (Heroin/Morph/Codeine) → (Fentanyl) showed a confidence level of 63.84%, indicating that in nearly two-thirds of the cases where these opioids were present, fentanyl was also detected. The lift value of 0.95 suggests that this association is just below the level of random chance, pointing to a weak predictive relationship. The combination (Cocaine, Heroin) → (Fentanyl) had a 61.50% confidence and a lift of 0.91. While the confidence is moderate, the lift being below 1 suggests that co-occurrence of cocaine and heroin does not significantly increase the likelihood of fentanyl being present, making it a weak association.
In the rule (Ethanol) → (Fentanyl), confidence reached 67.93% at a lift of 1.01, indicating an almost random relationship. Although ethanol is often found with fentanyl, its presence does not seem to work strongly as an indicator for fentanyl use, nor does it suggest a strong association. A highly significant rule was (Xylazine) → (Fentanyl), with confidence of 99.63% and lift of 1.48. Such a strong lift value is indicative of a meaningful relationship between the two substances and supports the concerns of public health agencies regarding their co-occurrence in overdose deaths. Finally, the rule (Heroin, Fentanyl) → (Heroin/Morph/Codeine) showed confidence at 67.17% with a particularly high lift of 3.73, thus suggesting that when heroin and fentanyl are both present, then there is a strong and non-random prospect that prescription opioids such as morphine or codeine are also involved. This rule emphasizes the complex interdependencies between synthetic, illicit, and prescription opioids in overdose situations.
Key Observations of Association Analysis
1. Identification of High-Risk Drug Pairs
Association rule mining identified typical drug combinations, with overdose mortality as the outcome, and the results have critical operational applications because they indicate high-risk interactions. The most striking pattern is the link between Xylazine and Fentanyl: this rule had a confidence of 99% and a lift of 1.48, indicating strong and nonrandom co-occurrence. Cocaine and Fentanyl similarly had 72% confidence and a lift of 1.08, another common and potentially lethal pair. These relationships suggest that certain drugs, in particular combinations, strongly increase the chances of dying.
Similarly, Nazyrova et al. (2023) discovered lethal combinations through association rule mining on electronic health records. According to the U.S. Drug Enforcement Administration (DEA, 2022), Xylazine contamination was found in 23% of fentanyl powder and 7% of fentanyl pills in 2022, further signifying its looming danger. Such evidence supports the applicability of these algorithmically inferred patterns to real-world public health data and emphasizes the need for more targeted surveillance of these high-risk drug mixtures.
| High-Risk Drug Pair | Confidence | Lift |
| --- | --- | --- |
| Xylazine → Fentanyl | 99% | 1.48 |
| Cocaine → Fentanyl | 72% | 1.08 |
Execution speed of Apriori and FP-Growth algorithms
A performance comparison of the Apriori and FP-Growth algorithms shows that both perform well, but FP-Growth outperforms Apriori as dataset size increases. The advantage of FP-Growth stems from its tree-based structure, which eliminates candidate itemset generation and testing, a major drawback in the design of Apriori. However, for small or sparse datasets like the one analyzed, Apriori at times ran faster because drug co-occurrences were less frequent, which limited the complexity of candidate generation.
Mythili and Shanavas (2013) suggest that FP-Growth is highly favorable for high-dimensional datasets because it saves memory and execution time. In the present analysis, the complexity introduced into Apriori and the optimizations applied to FP-Growth, which eliminated unnecessary iterations and rare items, accounted for the significant reduction in FP-Growth's execution time. Together, these findings suggest that Apriori is useful for small and simple datasets, whereas FP-Growth, once optimized, is a fast and scalable approach for complex and large-scale data.
| Algorithm | Initial Time (s) | After Optimization (s) |
| --- | --- | --- |
| Apriori | ~2.3 | ~6.2 |
| FP-Growth | ~3.6 | ~1.8 |
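The text does not detail how rare items were eliminated. One common reading is a pre-filtering pass that drops items below the support threshold before mining, sketched below under that assumption. The function `drop_infrequent_items` and the toy records are hypothetical.

```python
from collections import Counter


def drop_infrequent_items(transactions, min_support):
    """Remove items whose support is below min_support from every record."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in set(t))
    keep = {item for item, c in counts.items() if c / n >= min_support}
    # Records are kept even when they become empty, so the total record
    # count (the denominator of every support value) stays unchanged.
    return [set(t) & keep for t in transactions]


# Toy records for illustration only (not the study's data).
toy = [
    {"Heroin", "Fentanyl"},
    {"Heroin", "Fentanyl", "Cocaine"},
    {"Fentanyl", "Cocaine"},
    {"Fentanyl"},
    {"Ethanol"},
]
filtered = drop_infrequent_items(toy, min_support=0.3)
```

Because an itemset can only be frequent if every item in it is frequent, this filtering does not change which itemsets are found; it only shrinks the records the miner has to process.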
Recommendations Based on Findings
Based on the findings, several recommendations are made for public health, forensic science, and future research. One priority for public health agencies should be to monitor high-risk drug combinations, especially Xylazine and Fentanyl, because of their strong association with overdose fatalities. Informing people about the substances often seen with Fentanyl (Heroin, Cocaine, and Ethanol) could lower accident rates involving these combinations. Governments should also tighten policies on emerging substances entering the drug market.
For forensic and law enforcement agencies, association rule mining can support early detection of overdose patterns, allowing timely intervention and helping to disrupt illicit drug distribution networks. Further development of an AI-based early warning system that draws on inputs such as toxicology reports, social media posts, and dark web monitoring would help predict which combinations will become dangerous and so prevent large-scale harm.
For youth education, gamification approaches such as an interactive mobile application that simulates the consequences of drug misuse can be used to teach effectively about polysubstance abuse. Finally, future studies could integrate a hybrid modeling approach that combines the Apriori and FP-Growth algorithms with deep learning or graph-based neural networks to identify novel and counterintuitive drug interactions more accurately.
| Recommendation Domain | Actions Suggested |
| --- | --- |
| Public Health | Monitor drug mixtures, especially Xylazine-Fentanyl |
| Forensics & Law | Use rule mining for trend detection and interventions |
| AI & Drug Surveillance | Develop predictive models for emerging combinations |
| Youth Education | Create gamified learning tools for substance abuse awareness |
| Research & Modeling | Explore hybrid models and advanced pharmacokinetics integration |