Preprint
Article

This version is not peer-reviewed.

Scalability and Accuracy Assessment of Frequent Pattern Mining Algorithms Applied to Large-Scale Hospital Databases

Submitted: 23 November 2025

Posted: 01 December 2025

Abstract
Frequent pattern mining (FPM) has become an essential analytical technique in healthcare for discovering clinically relevant associations, predicting disease risks, and improving decision-making systems. As hospital databases continue to grow in size and complexity, evaluating the scalability and accuracy of FPM algorithms becomes increasingly important. This study provides a comparative assessment of three widely used FPM algorithms—Apriori, FP-Growth, and ECLAT—when applied to large-scale hospital datasets. Using simulated and real-world electronic health records (EHRs), the algorithms were compared based on runtime efficiency, memory consumption, scalability, and accuracy in identifying meaningful disease co-occurrences and risk factors. Results show that FP-Growth significantly outperforms Apriori and ECLAT in scalability and computational efficiency, while ECLAT demonstrates better performance in sparse datasets. Apriori, although accurate, struggles with large datasets due to exponential candidate generation. The study concludes with practical recommendations for algorithm selection in healthcare data mining environments.
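The contrast between Apriori's level-wise candidate generation and ECLAT's vertical, intersection-based search can be illustrated with a minimal sketch. This is not the study's implementation: the toy transactions, the function names `apriori` and `eclat`, and the absolute `min_support` count are all assumptions made here for illustration, and FP-Growth is omitted for brevity.

```python
from itertools import combinations

# Hypothetical toy data: each set holds the diagnoses recorded for one admission.
transactions = [
    {"diabetes", "hypertension", "ckd"},
    {"diabetes", "hypertension"},
    {"hypertension", "ckd"},
    {"diabetes", "hypertension", "ckd"},
    {"asthma"},
]

def apriori(transactions, min_support):
    """Level-wise search. Returns ({itemset: count}, candidates tested per level)."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([i]) for i in items]
    frequent, candidates_per_level = {}, []
    k = 1
    while candidates:
        candidates_per_level.append(len(candidates))
        # One full pass over the data per level to count support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent, candidates_per_level

def eclat(transactions, min_support):
    """Depth-first search over a vertical layout: item -> set of transaction ids."""
    tidsets = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets.setdefault(item, set()).add(tid)
    frequent = {}

    def extend(prefix, prefix_tids, tail):
        for i, (item, tids) in enumerate(tail):
            # Support comes from intersecting id sets, not from rescanning the data.
            new_tids = prefix_tids & tids if prefix else tids
            if len(new_tids) >= min_support:
                itemset = prefix | {item}
                frequent[itemset] = len(new_tids)
                extend(itemset, new_tids, tail[i + 1:])

    extend(frozenset(), set(), sorted(tidsets.items()))
    return frequent

freq_apriori, per_level = apriori(transactions, min_support=2)
freq_eclat = eclat(transactions, min_support=2)
print(per_level)                   # candidates tested at each level
print(freq_apriori == freq_eclat)  # both searches find the same itemsets
```

On this toy input both functions return the same frequent itemsets; they differ in how support is counted. The Apriori sketch rescans every transaction once per level and materializes a candidate list that can grow combinatorially with the number of frequent items, whereas the ECLAT sketch keeps per-item transaction-id sets and intersects them.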
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
