1. Introduction
Polycystic Ovary Syndrome (PCOS) is a prevalent endocrine and metabolic disorder among women of reproductive age. It is characterized by hyperandrogenism, ovulatory dysfunction, and polycystic ovarian morphology [1] and is a leading cause of infertility, affecting 5% to 20% of women in this age group globally [2]. Although ovulation induction and lifestyle modifications are effective for many patients, a significant proportion of women with PCOS still require assisted reproductive technology (ART), particularly in vitro fertilization (IVF), to achieve pregnancy. However, PCOS patients undergoing IVF face several challenges, including an increased risk of ovarian hyperstimulation syndrome (OHSS), reduced embryo quality, and lower clinical pregnancy rates [3].
In recent years, advancements in medical technology and a deeper understanding of the pathophysiology of PCOS have led to improvements in IVF success rates. Nevertheless, a considerable number of PCOS patients still fail to achieve pregnancy after IVF treatment. Studies have shown that various factors may influence the success of IVF in PCOS patients, including age, body mass index (BMI), duration of infertility, hormonal imbalances, tubal factors, and uterine anomalies, among others [4]. Moreover, it is increasingly recognized that the interplay between these factors can significantly impact IVF outcomes. For example, insulin resistance and hyperandrogenemia have been shown to synergistically impair oocyte competence and endometrial receptivity in PCOS patients undergoing IVF [5].
In clinical practice, accurately predicting the success of IVF treatment for PCOS patients is crucial for developing personalized treatment plans. However, because the influences on IVF outcomes are complex and multifactorial, traditional statistical methods often fall short of fully capturing these intricate relationships. Therefore, there is a need for more advanced data analysis techniques to uncover hidden patterns and associations within clinical data, thereby providing more robust support for clinical decision-making. Machine learning approaches such as random forest and gradient boosting have recently demonstrated promising performance in predicting IVF outcomes, particularly in heterogeneous populations such as those with PCOS [6].
Association rule mining (ARM) is a data mining technique that identifies hidden relationships in data by discovering “if–then” patterns. It has been successfully applied in the medical field to reveal dependencies between clinical factors and inform disease diagnosis and treatment [7,8]. In this study, we aimed to employ ARM to uncover the interactions between different factors associated with IVF failure in PCOS patients and determine their collective impact on IVF outcomes. By doing so, we hope to provide clinicians with more precise decision-making support to enhance the success rates of IVF treatments for PCOS patients.
2. Methods and Materials
2.1. Data Source
This study retrospectively analyzed de-identified electronic medical records (EMRs) from the Reproductive Hospital of Guangxi, covering all in vitro fertilization (IVF) treatment cycles between 2018 and 2023 for females diagnosed with Polycystic Ovary Syndrome (PCOS). Inclusion was restricted to PCOS patients, diagnosed according to standard clinical criteria, undergoing IVF with or without intracytoplasmic sperm injection (ICSI) during the study period. The dataset captured each patient’s baseline characteristics, relevant diagnoses, treatment details, and IVF outcomes. Clinical pregnancy (the outcome of interest) was defined as a positive fetal heartbeat on ultrasound, approximately 6–7 weeks post-embryo transfer, and this binary outcome—pregnant or not pregnant—was recorded for each IVF cycle. Only records with complete information on key variables and outcomes were retained for analysis. The study was approved by the hospital’s institutional ethics board, and due to its retrospective nature and the use of de-identified data, informed consent was waived.
2.2. Data Preprocessing
The raw EMR extracts were preprocessed to ensure data quality for the subsequent analysis. The following steps were implemented:
- (1)
Terminology standardization: Clinical descriptions and diagnoses were normalized to a consistent vocabulary. In the raw EMRs, the same concept could be documented with varying terms or abbreviations. For example, “Polycystic Ovary Syndrome” was recorded as “Polycystic ovarian syndrome”, “PCOS”, “Poly-ovary syndrome”, etc., depending on the doctor’s style. To address this, all symptom and diagnosis labels were mapped to a unified terminology, effectively harmonizing synonyms and abbreviations into one standard descriptor. This normalization ensured that each clinical concept, such as hyperprolactinaemia, was represented uniformly across all records, preventing duplicate features arising from naming variations.
- (2)
Record filtering and de-duplication: Duplicate entries and records with substantial missing or incomplete data were removed. It is common for secondary-use healthcare data to contain duplicate records or omissions, which can bias the results. Any repeated patient entries, as well as cases lacking critical fields, such as outcome or key diagnoses, were identified and removed. This step improved data integrity and reliability by focusing the analysis on unique, complete cases.
- (3)
One-hot encoding: One-hot encoding is a standard data transformation technique for converting categorical data into a numeric matrix for machine learning [9]. In practice, each distinct feature value, such as primary infertility, secondary infertility, or different stimulation protocols, was turned into its own column with a binary value, where the value “1” indicates the presence of that attribute in a given record and “0” indicates its absence. This encoding resulted in a Boolean feature matrix where each row corresponds to a patient’s IVF cycle, and each column corresponds to the presence of a specific condition or attribute. Continuous variables such as age, body mass index (BMI), and duration of infertility were discretized into clinically meaningful ranges before encoding, according to the standard of the World Health Organization [10,11].
All data cleaning and preprocessing steps were carried out in compliance with best practices for secondary analysis of clinical data. Overall, this process yielded a curated dataset of PCOS IVF cases that was consistent, free of major quality issues, and ready for pattern mining analyses.
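The three preprocessing steps can be sketched as follows. This is a minimal, dependency-free illustration rather than the study's pipeline (which used pandas and scikit-learn): the synonym table, the record fields (`cycle_id`, `pregnant`, `diagnoses`), and the bin cut-offs are all assumptions made for the example.

```python
# Hypothetical synonym table; the study's actual vocabulary is not given.
SYNONYMS = {
    "polycystic ovarian syndrome": "PCOS",
    "poly-ovary syndrome": "PCOS",
    "pcos": "PCOS",
    "hyperprolactinemia": "hyperprolactinaemia",
}

def normalize(term):
    """Map a free-text diagnosis to one standard label (case-insensitive)."""
    cleaned = term.strip()
    return SYNONYMS.get(cleaned.lower(), cleaned)

def to_items(record):
    """Turn one cleaned record into a set of categorical items."""
    items = {normalize(d) for d in record["diagnoses"]}
    # Illustrative cut-offs only (assumed, not taken from the paper's Table 1).
    items.add("BMI>24" if record["bmi"] > 24 else "BMI<=24")
    items.add("age>35" if record["age"] > 35 else "age<=35")
    items.add("clinical pregnancy" if record["pregnant"]
              else "clinical pregnancy failed")
    return items

def one_hot(records):
    """De-duplicate, drop incomplete records, and build a 0/1 matrix."""
    seen, item_sets = set(), []
    for r in records:
        if r["pregnant"] is None or r["cycle_id"] in seen:
            continue  # missing outcome or repeated cycle
        seen.add(r["cycle_id"])
        item_sets.append(to_items(r))
    columns = sorted(set().union(*item_sets))
    rows = [[int(c in s) for c in columns] for s in item_sets]
    return columns, rows

# Made-up toy records: one duplicate and one record with a missing outcome.
records = [
    {"cycle_id": 1, "age": 29, "bmi": 22.0, "pregnant": False,
     "diagnoses": ["Polycystic ovarian syndrome", "secondary infertility"]},
    {"cycle_id": 1, "age": 29, "bmi": 22.0, "pregnant": False,
     "diagnoses": ["Polycystic ovarian syndrome", "secondary infertility"]},
    {"cycle_id": 2, "age": 37, "bmi": 25.1, "pregnant": True,
     "diagnoses": ["PCOS", "Hyperprolactinemia"]},
    {"cycle_id": 3, "age": 31, "bmi": 23.0, "pregnant": None,
     "diagnoses": ["Poly-ovary syndrome"]},
]
columns, rows = one_hot(records)
```

Each row of `rows` is one IVF cycle and each entry of `columns` is one item, which is the Boolean matrix shape that association rule mining expects. Including the outcome as an item is what later allows rules to have clinical pregnancy failure as their consequent.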
2.3. Feature Selection
A comprehensive set of clinical and treatment features that previous studies suggest are relevant to IVF outcomes in PCOS patients was extracted from the cleaned EMRs. These variables included demographic factors, comorbid conditions, anatomical factors, and treatment parameters, as detailed in Table 1.
All features were encoded as 0/1 variables after preprocessing. Continuous measurements (age, BMI, infertility years, oocyte count) were categorized into discrete bins to allow their inclusion as categorical items in the association rule analysis. The selection of these features was guided by clinical domain knowledge and the literature on PCOS and infertility, ensuring that the dataset captured most known factors that might interact to influence IVF success in PCOS patients.
2.4. Association Rule Mining
Association rule mining is a data mining technique that identifies hidden relationships in data by finding if–then patterns [22], and it has been shown to be useful in the medical domain for discovering dependencies between clinical factors [23,24]. In this analysis, the classical Apriori algorithm [25,26] was used to identify frequently co-occurring feature sets in the dataset and then derive association rules relating patient features to the clinical pregnancy outcome, with a focus on rules where the consequent (the “then” part of the rule) was clinical pregnancy failure as the IVF outcome in PCOS patients.
The Apriori algorithm comprises two main steps: frequent itemset generation and association rule extraction.
- (1)
Frequent itemset generation: Apriori systematically explores combinations of items (features) and counts their occurrences in the one-hot-encoded dataset generated in the data preprocessing stage. A minimum support threshold of 0.05 was set, meaning that an itemset had to appear in at least 5% of patient records to be considered frequent. This threshold balances the need to find non-trivial patterns against the desire to ignore extremely rare combinations, and it ensures that any reported association involves a patient subset of meaningful size in the cohort.
- (2)
Association rule extraction: Once the frequent itemsets had been identified, the association rules function was used to calculate rule metrics. A minimum confidence threshold of 0.60 was applied to filter the rules. Confidence measures the conditional probability of the consequent given the antecedent; here, it was used to assess the likelihood of the outcome under specific conditions. A rule’s confidence was required to meet or exceed 60%, ensuring that, among patients exhibiting the antecedent feature set, at least 60% had the outcome stated in the consequent. Furthermore, a criterion of lift > 1.0 was imposed for all rules. Lift, defined as the ratio of the rule’s confidence to the baseline probability of the consequent, indicates the degree to which the antecedent and outcome co-occur more frequently than expected by chance. The lift > 1 requirement guaranteed that any reported rule represented a positive association, signifying an improvement over the baseline rate in predicting the consequent.
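The two Apriori steps can be sketched in plain Python as follows. This is an illustrative re-implementation, not the mlxtend code used in the study; the function names `apriori` and `extract_rules` and the ten toy transactions are assumptions, and only the thresholds (support 0.05, confidence 0.60, lift > 1) come from the text.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    n = len(transactions)
    level = {frozenset([i]) for t in transactions for i in t}
    frequent, k = {}, 1
    while level:
        # Count step: keep candidates whose support meets the threshold.
        current = {}
        for cand in level:
            support = sum(cand <= t for t in transactions) / n
            if support >= min_support:
                current[cand] = support
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = {c for c in joined
                 if all(frozenset(s) in current
                        for s in combinations(c, k - 1))}
    return frequent

def extract_rules(frequent, min_confidence, min_lift=1.0):
    """Split each frequent itemset into antecedent => consequent rules."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                cons = itemset - ante
                confidence = support / frequent[ante]
                lift = confidence / frequent[cons]
                if confidence >= min_confidence and lift > min_lift:
                    rules.append({"antecedent": ante, "consequent": cons,
                                  "support": support,
                                  "confidence": confidence, "lift": lift})
    return rules

# Made-up toy cohort of 10 cycles (not study data).
TUBAL, LUFS = "bilateral tubal obstruction", "LUFS"
FAIL, PREG = "clinical pregnancy failed", "clinical pregnancy"
transactions = [
    {TUBAL, LUFS, FAIL}, {TUBAL, LUFS, FAIL}, {TUBAL, FAIL}, {LUFS, FAIL},
    {LUFS, FAIL}, {LUFS, PREG}, {TUBAL, LUFS, FAIL}, {LUFS, FAIL},
    {PREG}, {LUFS, PREG},
]
frequent_itemsets = apriori(transactions, min_support=0.05)
found = extract_rules(frequent_itemsets, min_confidence=0.60)
```

In this toy cohort, the rule from bilateral tubal obstruction to failure has support 0.4, confidence 1.0, and lift 1.0 / 0.7, because 7 of the 10 cycles failed overall. Looking up `frequent[ante]` and `frequent[cons]` is safe because every subset of a frequent itemset is itself frequent.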
After generating the initial set of rules, we specifically filtered for those rules ending in ⇒ clinical pregnancy failed. This yielded rules of the form antecedents ⇒ clinical pregnancy failed, highlighting combinations of patient features that were associated with failed conception in this PCOS IVF cohort. Rule results were then examined for clinical plausibility and ranked by their metrics. In the end, only filtered rules meeting all the threshold criteria and reaching statistical significance according to Fisher’s exact test were retained for interpretation.
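The filtering step can be illustrated with a small stdlib-only sketch. The text does not state whether the test was one- or two-sided or which significance level was used, so the one-sided alternative and `alpha=0.05` below are assumptions, as are the helper names and the made-up toy counts.

```python
from math import comb

FAIL = "clinical pregnancy failed"

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a: antecedent and failure      b: antecedent, no failure
    c: no antecedent, failure      d: neither
    Tests whether failure is over-represented among antecedent carriers.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, upper + 1))
    return tail / comb(n, row1)

def rule_p_value(transactions, antecedent, target=FAIL):
    """Build the 2x2 table for one rule and return its p-value."""
    a = sum(1 for t in transactions if antecedent <= t and target in t)
    b = sum(1 for t in transactions if antecedent <= t and target not in t)
    c = sum(1 for t in transactions if not antecedent <= t and target in t)
    d = len(transactions) - a - b - c
    return fisher_exact_greater(a, b, c, d)

def keep_failure_rules(rules, transactions, alpha=0.05):
    """Keep rules whose consequent is failure and whose p-value is below alpha."""
    kept = []
    for rule in rules:
        if rule["consequent"] != frozenset([FAIL]):
            continue
        p = rule_p_value(transactions, rule["antecedent"])
        if p < alpha:
            kept.append({**rule, "p_value": p})
    return kept

# Made-up toy cohort: 12 failures with tubal obstruction, 8 failures
# without it, and 10 pregnancies without it.
TUBAL = "bilateral tubal obstruction"
transactions = ([{TUBAL, FAIL}] * 12 + [{FAIL}] * 8
                + [{"clinical pregnancy"}] * 10)
rules = [
    {"antecedent": frozenset({TUBAL}), "consequent": frozenset({FAIL})},
    {"antecedent": frozenset({TUBAL}),
     "consequent": frozenset({"clinical pregnancy"})},
]
kept = keep_failure_rules(rules, transactions)
```

The p-value is the hypergeometric tail probability of seeing at least `a` failures among the antecedent carriers if the antecedent and the outcome were independent. With many candidate rules, a multiple-comparison correction may also be worth considering; the text does not say whether one was applied.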
2.5. Computational Environment
All analyses were conducted using Python version 3.12 in a Jupyter Notebook environment. Data handling and preprocessing were performed with the pandas library and scikit-learn. Association rule mining was carried out with the mlxtend library. The tools and models employed in this study are shown in Table 2.
Throughout the analysis, we adhered to an academically rigorous approach: cleaning the clinical dataset to a high standard, encoding features in a manner suitable for the discovery of patterns, and using validated data mining techniques with justified parameter choices. The flow of methods and materials is shown in Figure 1.
3. Results
3.1. Descriptive Statistics
3.1.1. Clinical Pregnancy Outcomes
The overall clinical pregnancy rate among patients diagnosed with Polycystic Ovary Syndrome (PCOS) undergoing their first IVF treatment was approximately 18.1%, indicating that slightly fewer than one-fifth of IVF cycles resulted in clinical pregnancy. This finding aligns with previously published clinical pregnancy rates observed among PCOS populations undergoing IVF [27].
3.1.2. Demographic and Clinical Characteristics
The demographic characteristics indicated a relatively young patient population, consistent with typical PCOS demographics. The mean age was 31.39 ± 4.45 years (range: 20–46 years). The body mass index (BMI) averaged 23.53 ± 3.43 kg/m², with a range between 14.42 and 35.58 kg/m², suggesting that most patients were in the normal-weight to overweight category. The average duration of infertility was 4.45 ± 3.17 years, ranging widely from 0.1 to 22 years, reflecting diverse patient fertility histories (Figure 2).
Clinical conditions were analyzed according to their prevalence among successful and unsuccessful IVF cycles, as presented in Figure 3. Conditions such as bilateral tubal obstruction and habitual abortion appeared predominantly in unsuccessful pregnancy cases, suggesting a negative effect on IVF success. Pelvic adhesions and undiagnosed adnexal masses were relatively frequent diagnoses in both outcomes, indicating a high prevalence but limited specificity concerning pregnancy outcomes. In contrast, conditions such as hypertension, insulin resistance, and hyperprolactinemia were less prevalent overall, thus limiting their interpretative significance in isolation.
3.1.3. Comparison of Pregnancy Outcomes by Demographic and Clinical Groups
The comparative prevalence of each clinical feature between pregnancy success and failure groups is illustrated in Figure 4.
Conditions such as luteinized unruptured follicle syndrome (LUFS), secondary infertility, and pelvic adhesion were highly prevalent in both successful and unsuccessful outcomes, indicating their common occurrence among PCOS-IVF patients. Features that were more frequent in the pregnancy failure group included bilateral tubal obstruction (39.3% in failure vs. 31.2% in success) and, marginally, age greater than 35 years (18.4% in failure vs. 17.7% in success). Conversely, some conditions, such as adnexal mass (undiagnosed), hyperprolactinemia, and hypertension, demonstrated extremely low overall prevalence, limiting their discriminatory power. These results suggest that certain clinical factors, particularly tubal obstruction and, to a lesser extent, advanced age, may serve as indicators for clinical pregnancy outcomes, emphasizing their importance in clinical evaluation and patient counseling prior to IVF treatment.
3.1.4. Correlation of Features with Pregnancy Outcome
Pearson correlation analysis was conducted to quantify relationships between individual clinical features and IVF pregnancy outcomes, as shown in Figure 5:
- (1)
Bilateral tubal obstruction exhibited the strongest negative correlation (−0.064), aligning with the clinical understanding of impaired fertility associated with significant tubal pathology.
- (2)
Pelvic adhesion (0.048) and adnexal mass (undiagnosed) (0.045) demonstrated a small positive correlation with pregnancy success, an unexpected finding possibly influenced by confounding clinical management practices.
- (3)
Habitual abortion (−0.033) and secondary infertility (−0.024) correlated weakly and negatively with pregnancy success, consistent with their role as adverse prognostic factors.
- (4)
Other factors, including BMI, insulin resistance, hypertension, and age, exhibited negligible or minimal correlations, suggesting limited predictive value when assessed individually in this cohort.
In sum, the correlation analysis indicated that no single factor exerted a significant effect on the clinical outcome of an IVF cycle in patients with PCOS. Further data mining techniques need to be implemented to discover the interactive and combined effects of different factors.
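Because both the features and the outcome are 0/1 variables here, the Pearson coefficient reduces to the phi coefficient of the corresponding 2×2 table. The sketch below illustrates this with made-up vectors; the function names and the data are assumptions for illustration, not the study's code or results.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def phi(x, y):
    """Phi coefficient computed from the 2x2 table of two binary variables."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom

# Made-up toy vectors for 10 cycles: 1 = feature present / pregnant.
tubal_obstruction = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pregnant = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

r = pearson(tubal_obstruction, pregnant)
```

A negative value means the feature is less common among cycles that ended in pregnancy. With binary data, the attainable magnitude also depends on how unbalanced the two variables are, which is one reason small coefficients are expected when only about one-fifth of cycles succeed.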
3.2. Association Rule Mining Outcomes
Association rule mining (ARM) was employed to identify clinical diagnoses and patient characteristics significantly associated with an increased risk of IVF clinical pregnancy failure among patients with PCOS.
3.2.1. Overall Rule Summary
ARM analysis was specifically directed toward uncovering associations where the consequent was fixed as clinical pregnancy failure, thereby enabling the identification of clinical factors and conditions that elevate the risk of unsuccessful IVF outcomes. After applying stringent thresholds (minimum support ≥ 0.05, confidence ≥ 0.60, and lift > 1.0) and statistical validation via the chi-square (χ²) test, a total of 26 significant rules were generated, as shown in Figure 6. These rules frequently involved antecedents comprising ovarian dysfunction factors (e.g., luteinized unruptured follicle syndrome (LUFS)), structural uterine anomalies, tubal factors (e.g., bilateral tubal obstruction), and advanced maternal age.
The frequency and strength of associations of individual clinical features (single itemsets) with IVF clinical pregnancy failure in PCOS patients are summarized in Table 3:
- (1)
The most frequent clinical features associated with IVF failure were luteinized unruptured follicle syndrome (LUFS) and secondary infertility, each identified in 13 instances.
- (2)
Luteinized unruptured follicle syndrome (LUFS) had a high support rate (79.67%), indicating its high prevalence, although it demonstrated a relatively weak lift (1.0017), suggesting limited discriminative power when considered alone.
- (3)
Bilateral tubal obstruction exhibited the strongest association with clinical pregnancy failure, showing the highest lift value (1.0422) and a confidence of 85.25%, emphasizing its significance as a risk factor.
- (4)
Other important features, such as years of infertility >5 and BMI >24, demonstrated moderate frequencies and lifts (1.0104 and 1.0196, respectively), indicating their meaningful, though more limited, contributions as individual predictors.
- (5)
Pelvic adhesion lacked calculated metrics in this analysis, limiting the interpretation of its independent predictive strength.
In summary, the statistical results revealed that, when considered in isolation, most features exhibited limited predictive value for clinical pregnancy outcomes in PCOS patients undergoing IVF. This suggests that single factors alone may not sufficiently explain treatment success or failure, highlighting the importance of analyzing multidimensional risk combinations.
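A short calculation shows why the single-feature lifts are all close to 1. Writing F for clinical pregnancy failure and A for bilateral tubal obstruction, and assuming the reported confidence and lift refer to the same cohort as the overall pregnancy rate in Section 3.1.1, the definition of lift gives:

```latex
\begin{align*}
\operatorname{lift}(A \Rightarrow F)
  &= \frac{\operatorname{conf}(A \Rightarrow F)}{P(F)}
  \quad\Longrightarrow\quad
  P(F) = \frac{0.8525}{1.0422} \approx 0.818, \\
\operatorname{lift}_{\max}
  &= \frac{1}{P(F)} \approx \frac{1}{0.818} \approx 1.22 .
\end{align*}
```

The implied baseline failure rate of about 0.818 agrees with the 18.1% clinical pregnancy rate (1 − 0.181 = 0.819). Because confidence cannot exceed 1, no rule with failure as its consequent can reach a lift above roughly 1.22 in this cohort, so lift values only slightly above 1 are expected when the outcome is this common.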
3.2.2. Top-Ranked Association Rules
The top 10 association rules, ranked by lift and confidence, are summarized in Figure 7. All rules share a common consequent (clinical pregnancy failure) and were identified using the Apriori algorithm with support ≥ 0.05, confidence ≥ 0.60, and lift > 1.
These top-ranked rules reveal that combinatorial effects of anatomical, ovarian, and demographic risk factors substantially heighten the likelihood of IVF failure:
- (1)
Bilateral tubal obstruction appears in 7 of the top 10 rules, underscoring its centrality as a structural impediment to successful implantation or embryo transport.
- (2)
Luteinized unruptured follicle syndrome (LUFS) frequently co-occurs with both tubal factors and metabolic risks (e.g., BMI > 24), suggesting that ovarian dysfunction and metabolic dysregulation synergistically compromise reproductive outcomes.
- (3)
Secondary infertility and prolonged infertility (>5 years) repeatedly emerge in multifactorial rule sets, reflecting the compounded difficulty of achieving pregnancy in patients with a history of prior conception failure.
These insights highlight the clinical relevance of multi-feature pattern recognition, which offers a more nuanced risk stratification than single risk factors evaluated in isolation. They also provide a foundation for developing AI-assisted decision support tools to predict IVF failure and inform personalized treatment strategies.
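Ranking rules and counting how often each feature appears among the top-ranked antecedents can be sketched as below. The rule dictionaries and their metric values are made up for illustration and are not the study's results; the helper names `top_rules` and `feature_counts` are hypothetical.

```python
from collections import Counter

def top_rules(rules, k=10):
    """Sort rules by lift, then confidence (both descending), and keep k."""
    ranked = sorted(rules, key=lambda r: (r["lift"], r["confidence"]),
                    reverse=True)
    return ranked[:k]

def feature_counts(rules):
    """Count how many rules contain each feature in their antecedent."""
    return Counter(item for r in rules for item in r["antecedent"])

# Made-up toy rules (illustrative values only).
rules = [
    {"antecedent": {"bilateral tubal obstruction", "LUFS"},
     "confidence": 0.90, "lift": 1.10},
    {"antecedent": {"LUFS", "BMI>24"}, "confidence": 0.84, "lift": 1.03},
    {"antecedent": {"bilateral tubal obstruction"},
     "confidence": 0.86, "lift": 1.05},
    {"antecedent": {"secondary infertility"},
     "confidence": 0.82, "lift": 1.01},
]
counts = feature_counts(top_rules(rules, k=3))
```

A count of this kind is how a statement such as "a feature appears in 7 of the top 10 rules" can be derived from the ranked rule table.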
4. Discussion
4.1. Comparative Analysis
4.1.1. Comparison with Previous Literature or Known Clinical Evidence
The findings of this research provide a multidimensional perspective on IVF outcomes of PCOS patients that complements prior knowledge from more reductionist analyses. Several of our key associations, such as the detrimental effects of hydrosalpinx (bilateral tubal obstruction) and prolonged infertility, are strongly supported by the literature. For instance, a study by Ou et al. [28] established that the presence of a hydrosalpinx can diminish implantation rates and increase early pregnancy loss in IVF due to embryotoxic fluid and poor endometrial receptivity. This aligns with the rule in this study that PCOS patients with an uncorrected bilateral tubal blockage (likely hydrosalpinx) were more likely to fail to conceive via IVF (confidence 85.25%, lift 1.0422). The logical clinical action is to perform a salpingectomy or proximal tubal occlusion prior to IVF in such patients, a recommendation echoed in many studies that showed improved IVF success after hydrosalpinx treatment [29,30].
The strong influence of infertility duration on IVF outcome in this study reinforces a consistent theme: the sooner, the better. An early meta-analysis [31] found a negative association between the duration of infertility and IVF success, and more recent analyses concur that beyond roughly 3–5 years of trying, each additional year is associated with lower pregnancy odds [32,33]. This study’s PCOS-specific data suggest that even within a relatively young cohort, those who had been infertile for ≥5 years had significantly reduced success. This could be partly because a longer duration often correlates with older age and other factors; however, even after accounting for age in some of our combined rules, duration remained a factor. One interpretation is that long-term infertility may indicate underlying intractable issues (e.g., poor egg/embryo quality, endometrial dysfunction) that persist despite IVF. It might also reflect that these patients have undergone multiple prior treatments or IVF cycles without success, hinting at recurrent implantation failure scenarios. Clinically, this stresses that practitioners might consider escalating treatment or exploring adjunct therapies (immunological work-ups, use of donor gametes, etc.) when faced with a patient who has had many years of unexplained infertility.
This research reaffirms the critical impact of female age, which remains the single strongest determinant of IVF success across all populations. Advanced age (≥35, especially ≥40) was associated with elevated failure risk in our PCOS cohort, consonant with general IVF outcomes. Likewise, obesity and metabolic factors are well-documented to impair fertility treatment outcomes [34]. A 2024 systematic review noted that, in women with PCOS, high BMI independently lowers clinical pregnancy and live birth rates and raises miscarriage risk. In this article, obesity featured in some rules, though not the top rule, implying that while obesity is indeed harmful, other factors, such as tubal status or duration, were even more dominant in our dataset. It is possible that because most of our PCOS patients fell within a relatively narrow BMI range (23.53 ± 3.43 kg/m²), BMI did not differentiate outcomes as sharply, a type of range restriction effect. However, we did observe that lean PCOS patients had slightly better success rates than obese PCOS patients, aligning with the consensus that weight management can improve IVF outcomes in PCOS.
4.1.2. Unexpected Findings in the Current Analysis
The data-driven discovery of the LUFS + tubal factor combination as a high-failure profile appears to be a novel insight with limited direct precedent in the literature. LUFS is a subtle form of ovulatory dysfunction, and while it is known to cause infertility [35], it is not commonly discussed in the context of IVF outcomes because ovarian stimulation with an HCG trigger is expected to circumvent follicle rupture problems. However, the results suggest that some PCOS patients may still experience issues analogous to LUFS even in IVF (e.g., follicles that luteinize without yielding an egg). A recent study by Li et al. [36] noted that LUF cycles negatively affected pregnancy outcomes in natural-cycle FET, highlighting that luteinization without ovulation can disrupt timing and endometrial preparation. In stimulated IVF cycles, an argument can be made that if a patient has a tendency toward LUFS, careful monitoring and trigger timing are crucial, or alternatives such as a dual trigger (HCG + GnRH agonist) might be beneficial to ensure oocyte release. The combination with the tubal factor is likely a proxy for the overall severity of infertility: these patients effectively have two strikes against them, and indeed, our analysis shows that they fare poorly. While not previously reported as a combined risk in the literature, this finding is intuitive and underscores the importance of addressing all known factors in a multi-disciplinary management approach to give the best chance of success.
Another novel discovery is that our approach identified interactions that traditional multivariable models might miss or not emphasize. For instance, using logistic regression, Liu et al. [37] found that PCOS per se was not an independent predictor of live birth after adjusting for confounders, meaning that if age, BMI, etc., were controlled, PCOS patients performed as well as others. However, that same analysis showed that within the PCOS group, factors such as younger age, shorter infertility, and good embryo quality were associated with higher live birth rates. The results in this study complement this by explicitly highlighting combinations (e.g., older age + fewer good embryos) that lead to failure. Essentially, ARM provides a human-readable set of rules that align with what an experienced clinician might surmise through years of practice. The benefit is that ARM can systematically scan through dozens of features to flag combinations that merit attention, possibly revealing less obvious patterns.
In addition, no individual predictor exhibited a robust, statistically significant effect. However, a small subset of variables, such as bilateral tubal obstruction (lift = 1.042), displayed modest associations.
4.2. The Strengths and Innovation of This Study
A strength of this study is the demonstration of how data mining techniques such as Apriori can be applied in reproductive medicine. This approach can handle many variables and uncover associations without the need to pre-specify an outcome model. The rules generated are intuitively understandable (“IF X and Y, THEN Z”), which could aid clinical decision-making more directly than a complex predictive model. To our knowledge, this is the first study to report association rules in the context of IVF outcomes for PCOS. It thus opens the door for using similar methods on larger IVF databases to perhaps discover phenotype-specific patterns (for example, does a combination of certain hormone levels predict ovarian hyperstimulation syndrome risk in PCOS?). Additionally, by focusing on clinical pregnancy failure, this research highlighted an outcome (failure to conceive) that is often less reported than success rates, yet is critically important when counseling patients about their prognosis and when planning interventions.
4.3. Limitations of This Study
Despite its insights, this research has several limitations. First, the study is retrospective and observational; association does not imply causation. The rules we found do not prove that, say, LUFS causes IVF failure, only that they occur together frequently. There could be underlying confounders (for example, perhaps women with LUFS also had poor ovarian reserve, which was the real driver of failure). We attempted to mitigate spurious findings by requiring relatively high support and by conducting statistical tests, but some associations might still be coincidental or due to bias in the dataset. Second, the dataset size (N ≈ 300) is moderate; a larger sample would allow the detection of associations with lower support (rarer but potentially important scenarios). Our choice of a 5% support threshold means we likely missed rules involving very rare conditions (e.g., uncommon genetic factors or severe male factor cases); these might still be clinically significant for individual patients but were not detectable in our analysis. Third, our feature encoding, while comprehensive, was limited to what was recorded. We did not include some potentially relevant variables such as AMH levels, insulin resistance indices, or detailed embryo morphology scores. The inclusion of such data might yield additional rules (for instance, a combination of low AMH + PCOS could predict poor ovarian response). Fourth, the analysis was confined to PCOS patients at a single center; thus, the rules reflect that specific population and practice (e.g., the stimulation protocols used, the prevalence of certain issues in that clinic). Caution is needed in generalizing the results to all PCOS patients or other IVF centers. What holds true in our data (for example, the proportion of patients with hydrosalpinx) may differ elsewhere. Validation on external datasets would strengthen the confidence in these rules.
Some of our results were inconsistent with conclusions established by previous studies. For instance, the influence of BMI on the clinical outcomes of IVF, which is widely recognized by researchers [38,39,40], was not reflected in our findings. A likely reason is that over 80% of patients were below the BMI threshold of 24 kg/m² (the WHO standard for Asian women), which diluted its discriminatory power. In the future, multi-center datasets with broader BMI ranges may clarify obesity-specific gradients.
Finally, as with any data mining, there is a risk of overfitting or finding patterns that lack biological plausibility. We have tried to interpret only those rules that made clinical sense and matched some external evidence. It is reassuring that our top findings were aligned with known mechanisms (e.g., tubal fluid harming implantation, long infertility reflecting tougher cases). However, we remain careful not to overinterpret combinations that could be artifacts.
5. Conclusions
In conclusion, association rule mining was applied to identify key combinations of factors associated with IVF failure in PCOS patients. The results highlight that it is often the convergence of multiple adverse factors—such as ovulatory dysfunction, tubal pathology, and long-standing infertility—that dramatically lowers the chances of pregnancy in this high-risk group. These findings are largely in agreement with the existing literature on individual risk factors while also providing a novel integrated view of how these factors interact. For clinicians, the insights underscore the importance of comprehensive infertility work-ups in PCOS: a patient with both PCOS and another infertility factor (e.g., tubal obstruction) should be counseled about the lower success probability and the need to possibly correct the remediable factor before IVF. Similarly, aggressive management (or earlier transition to IVF) may be warranted for those with many years of infertility rather than prolonged attempts with lesser treatments.
This work demonstrates the utility of data-driven approaches in reproductive medicine. By uncovering patterns that might be overlooked by traditional analyses, association rule mining can generate hypotheses for further research (e.g., investigating the mechanistic link between LUFS and IVF outcomes in PCOS) and potentially inform clinical decision support systems. Future studies should validate these rules in larger, multi-center cohorts and assess their predictive value prospectively. It would also be worthwhile to extend this analysis to IVF success rules (profiles of patients who succeed) and to compare PCOS with non-PCOS infertile populations to see whether different rules apply. Ultimately, translating these findings into practice—for example, developing a risk score or checklist based on the presence of multiple factors—could help personalize IVF counseling and treatment for PCOS patients. In summary, women with PCOS are a heterogeneous group, and our study highlights that the sum of their reproductive challenges determines IVF outcomes. Recognizing and addressing each element of that sum holds promise for improving fertility success in this prevalent and challenging condition.
Author Contributions
Xuehong Zhu: data curation; formal analysis; investigation; writing – original draft. Guanghui Dong: project administration (equal); writing – original draft. Zhong Lin: supervision; funding provider. Lina Ge: investigation; formal analysis. Feng Han: conception; framework construction; writing – review and editing. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China, grant number 82460309, and the APC was funded by the Reproductive Hospital of Guangxi.
Data Availability Statement
All raw data and code are available upon request.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Azziz, R.; Carmina, E.; Chen, Z.; Dunaif, A.; Laven, J.S.; Legro, R.S.; Lizneva, D.; Natterson-Horowitz, B.; Teede, H.J.; Yildiz, B.O. Polycystic ovary syndrome. Nat. Rev. Dis. Primers 2016, 2, 1–18.
- Bozdag, G.; Mumusoglu, S.; Zengin, D.; Karabulut, E.; Yildiz, B.O. The prevalence and phenotypic features of polycystic ovary syndrome: a systematic review and meta-analysis. Hum. Reprod. 2016, 31, 2841–2855.
- Sunkara, S.K.; Rittenberg, V.; Raine-Fenning, N.; Bhattacharya, S.; Zamora, J.; Coomarasamy, A. Association between the number of eggs and live birth in IVF treatment: an analysis of 400 135 treatment cycles. Hum. Reprod. 2011, 26, 1768–1774.
- McGee, E.A.; Hsueh, A.J. Initial and cyclic recruitment of ovarian follicles. Endocr. Rev. 2000, 21, 200–214.
- Dumesic, D.A.; Oberfield, S.E.; Stener-Victorin, E.; Marshall, J.C.; Laven, J.S.; Legro, R.S. Scientific statement on the diagnostic criteria, epidemiology, pathophysiology, and molecular genetics of polycystic ovary syndrome. Endocr. Rev. 2015, 36, 487–525.
- Kavakiotis, I.; Tsave, O.; Salifoglou, A.; Maglaveras, N.; Vlahavas, I.; Chouvarda, I. Machine learning and data mining methods in diabetes research. Comput. Struct. Biotechnol. J. 2017, 15, 104–116.
- Agrawal, R.; Imieliński, T.; Swami, A. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data; ACM: Washington, DC, USA, 1993; pp. 207–216.
- Wu, W.-T.; Li, Y.-J.; Feng, A.-Z.; Li, L.; Huang, T.; Xu, A.-D.; Lyu, J. Data mining in clinical big data: the frequently used databases, steps, and methodological models. Mil. Med. Res. 2021, 8, 44.
- Samuels, J. One-Hot Encoding and Two-Hot Encoding: An Introduction; 2024.
- World Health Organization. Standards for Maternal and Neonatal Care; 2007.
- World Health Organization. WHO recommendations on antenatal care for a positive pregnancy experience; 2016.
- Ombelet, W. WHO fact sheet on infertility gives hope to millions of infertile couples worldwide. Facts Views Vis. ObGyn 2020, 12, 249.
- World Health Organization. Infertility prevalence estimates, 1990–2021; World Health Organization, 2023; ISBN 92-4-006831-7.
- Coussa, A.; Hasan, H.A.; Barber, T.M. Impact of contraception and IVF hormones on metabolic, endocrine, and inflammatory status. J. Assist. Reprod. Genet. 2020, 37, 1267–1272.
- Herman, T.; Csehely, S.; Orosz, M.; Bhattoa, H.P.; Deli, T.; Torok, P.; Lagana, A.S.; Chiantera, V.; Jakab, A. Impact of Endocrine Disorders on IVF Outcomes: Results from a Large, Single-Centre, Prospective Study. Reprod. Sci. 2023, 30, 1878–1890.
- Vannuccini, S.; Clifton, V.L.; Fraser, I.S.; Taylor, H.S.; Critchley, H.; Giudice, L.C.; Petraglia, F. Infertility and reproductive disorders: impact of hormonal and inflammatory mechanisms on pregnancy outcome. Hum. Reprod. Update 2016, 22, 104–115.
- Wang, L.; Yu, X.; Xiong, D.; Leng, M.; Liang, M.; Li, R.; He, L.; Yan, H.; Zhou, X.; Jike, E.; et al. Hormonal and metabolic influences on outcomes in PCOS undergoing assisted reproduction: the role of BMI in fresh embryo transfers. BMC Pregnancy Childbirth 2025, 25, 368.
- Harrison, R.F.; Bonnar, J.; Thompson, W. Diagnosis and Management of Tubo-Uterine Factors in Infertility; Springer Science & Business Media, 2012; Vol. 4.
- Ozgur, K.; Bulut, H.; Berkkanoglu, M.; Coetzee, K.; Kaya, G. ICSI pregnancy outcomes following hysteroscopic placement of Essure devices for hydrosalpinx in laparoscopic contraindicated patients. Reprod. Biomed. Online 2014, 29, 113–118.
- Qiu, J.; Du, T.; Chen, C.; Lyu, Q.; Mol, B.W.; Zhao, M.; Kuang, Y. Impact of uterine malformations on pregnancy and neonatal outcomes of IVF/ICSI–frozen embryo transfer. Hum. Reprod. 2022, 37, 428–446.
- Tournaye, H. Male factor infertility and ART. Asian J. Androl. 2011, 14, 103.
- Solanki, S.K.; Patel, J.T. A survey on association rule mining. In Proceedings of the 2015 Fifth International Conference on Advanced Computing & Communication Technologies; IEEE, 2015; pp. 212–216.
- Altaf, W.; Shahbaz, M.; Guergachi, A. Applications of association rule mining in health informatics: a survey. Artif. Intell. Rev. 2017, 47, 313–340.
- Pradhan, G.N.; Prabhakaran, B. Association Rule Mining in Multiple, Multidimensional Time Series Medical Data. J. Healthc. Inform. Res. 2017, 1, 92–118.
- Al-Maolegi, M.; Arkok, B. An improved Apriori algorithm for association rules. arXiv preprint arXiv:1403.3948, 2014.
- Hegland, M. The Apriori Algorithm – a Tutorial. In Lecture Notes Series, Institute for Mathematical Sciences, National University of Singapore; World Scientific, 2007; Vol. 11, pp. 209–262; ISBN 978-981-270-905-9.
- Melo, A.S.; Ferriani, R.A.; Navarro, P.A. Treatment of infertility in women with polycystic ovary syndrome: approach to clinical practice. Clinics 2015, 70, 765–769.
- Ou, H.; Sun, J.; Lin, L.; Ma, X. Ovarian Response, Pregnancy Outcomes, and Complications Between Salpingectomy and Proximal Tubal Occlusion in Hydrosalpinx Patients Before in vitro Fertilization: A Meta-Analysis. Front. Surg. 2022, 9, 830612.
- Capmas, P.; Suarthana, E.; Tulandi, T. Management of Hydrosalpinx in the Era of Assisted Reproductive Technology: A Systematic Review and Meta-analysis. J. Minim. Invasive Gynecol. 2021, 28, 418–441.
- Xu, B.; Zhang, Q.; Zhao, J.; Wang, Y.; Xu, D.; Li, Y. Pregnancy outcome of in vitro fertilization after Essure and laparoscopic management of hydrosalpinx: a systematic review and meta-analysis. Fertil. Steril. 2017, 108, 84–95.e5.
- Zhang, L.; Cai, H.; Li, W.; Tian, L.; Shi, J. Duration of infertility and assisted reproductive outcomes in non-male factor infertility: can use of ICSI turn the tide? BMC Womens Health 2022, 22, 480.
- Huang, C.; Shi, Q.; Xing, J.; Yan, Y.; Shen, X.; Shan, H.; Sun, H.; Mei, J. The relationship between duration of infertility and clinical outcomes of intrauterine insemination for younger women: a retrospective clinical study. BMC Pregnancy Childbirth 2024, 24, 199.
- Wang, X.; Tian, P.; Zhao, Y.; Lu, J.; Dong, C.; Zhang, C. The association between female age and pregnancy outcomes in patients receiving first elective single embryo transfer cycle: a retrospective cohort study. Sci. Rep. 2024, 14, 19216.
- Alenezi, S.A.; Khan, R.; Amer, S. The Impact of High BMI on Pregnancy Outcomes and Complications in Women with PCOS Undergoing IVF—A Systematic Review and Meta-Analysis. J. Clin. Med. 2024, 13, 1578.
- Azmoodeh, A.; Pejman Manesh, M.; Akbari Asbagh, F.; Ghaseminejad, A.; Hamzehgardeshi, Z. Effects of Letrozole-HMG and Clomiphene-HMG on Incidence of Luteinized Unruptured Follicle Syndrome in Infertile Women Undergoing Induction Ovulation and Intrauterine Insemination: A Randomised Trial. Glob. J. Health Sci. 2015, 8, 244.
- Li, S.; Liu, L.; Meng, T.; Miao, B.; Sun, M.; Zhou, C.; Xu, Y. Impact of luteinized unruptured follicles on clinical outcomes of natural cycles for frozen/thawed blastocyst transfer. Front. Endocrinol. 2021, 12, 738005.
- Liu, S.; Mo, M.; Xiao, S.; Li, L.; Hu, X.; Hong, L.; Wang, L.; Lian, R.; Huang, C.; Zeng, Y.; et al. Pregnancy Outcomes of Women With Polycystic Ovary Syndrome for the First In Vitro Fertilization Treatment: A Retrospective Cohort Study With 7678 Patients. Front. Endocrinol. 2020, 11, 575337.
- Dybciak, P.; Humeniuk, E.; Raczkiewicz, D.; Krakowiak, J.; Wdowiak, A.; Bojar, I. Anxiety and Depression in Women with Polycystic Ovary Syndrome. Medicina 2022, 58, 942.
- Rakic, D.; Joksimovic Jovic, J.; Jakovljevic, V.; Zivkovic, V.; Nikolic, M.; Sretenovic, J.; Nikolic, M.; Jovic, N.; Bicanin Ilic, M.; Arsenijevic, P.; et al. High Fat Diet Exaggerate Metabolic and Reproductive PCOS Features by Promoting Oxidative Stress: An Improved EV Model in Rats. Medicina 2023, 59, 1104.
- Kusuhara, S.; Kishimoto-Kishi, M.; Matsumiya, W.; Miki, A.; Imai, H.; Nakamura, M. Short-Term Outcomes of Intravitreal Faricimab Injection for Diabetic Macular Edema. Medicina 2023, 59, 665.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).