Introduction
Chronic ulcers are defined as skin lesions that fail to progress through an orderly and timely sequence of healing or that do not respond to standard therapies within three months [1]. These conditions represent a major challenge for health systems globally, generating substantial costs through the need for complex, long-term medical care and repeated hospitalizations, and they have a significant negative impact on patients' quality of life [2], [3]. Their impact translates not only into direct consumption of resources, but also into indirect economic losses caused by incapacity for work and the associated social burden. Epidemiologically, the prevalence of chronic ulcers is increasing, a phenomenon closely related to population aging and the rising incidence of predisposing conditions such as obesity, diabetes, peripheral arterial disease and chronic venous insufficiency [4], [5].
In this context, profiling patients with chronic ulcers becomes a pressing necessity. Detailed knowledge of the clinical-demographic characteristics of different patient subgroups is fundamental for developing personalized care strategies adapted to the specific needs of each case [6], [7]. Such an approach allows not only the optimization of therapeutic outcomes, but also a more efficient allocation of often limited medical resources, directing them to high-risk patients or to the most cost-effective interventions.
The main objectives of the present study are (1) to identify and describe the distinct clinical-demographic profiles of adult patients hospitalized with a main diagnosis of chronic ulcer in hospital units in Romania, and (2) to evaluate the consumption of medical resources, expressed as days of hospitalization and number of hospitalizations, for each identified patient profile, in order to estimate the associated economic burden. By achieving these objectives, the study aims to inform more effective and better targeted public health policies and clinical management strategies for chronic ulcer care.
Materials and Methods
Study Population and Data Sources
The study population consisted of patients hospitalized with a diagnosis of chronic ulcer over a six-year period, a time frame long enough to assess the impact and severity of these conditions in hospital practice. The available data allowed detailed characterization of the demographic, clinical and social aspects of patients, providing a solid basis for identifying differentiated risk profiles.
The data were collected from the administrative database of the National Institute of Public Health, which includes the reports of all state hospitals in Romania. The data set consists of hospitalizations of adult patients (≥18 years) diagnosed with chronic ulcers and registered nationwide between 1 January 2017 and 31 December 2022.
ICD-10 codes [8] were used to identify chronic ulcers as the primary diagnosis at discharge. These diagnoses were categorized into six principal disease groups:
- venous ulcer (I83.x - varicose veins of the lower extremities with ulceration),
- arterial ulcer (I70.23 - atherosclerosis of the arteries of the extremities with ulceration),
- diabetic ulcer (E1x.73 - diabetes mellitus with foot ulceration due to multiple causes),
- pressure ulcer (L89 - pressure ulcer),
- unclassified lower limb ulcer (L97 - ulceration of the lower limb not elsewhere classified),
- unclassified skin ulcer (L98.4 - chronic ulceration of the skin not elsewhere classified).
This classification allows both specific analysis by major etiology and a global assessment of the impact on the health system.
According to the standardized methodology in the international literature [9], patients were grouped by age ranges as follows: <45 years, 45–54 years, 55–64 years, 65–74 years, 75–84 years, ≥85 years. This division allows direct comparability with other epidemiological studies and facilitates the analysis of the distribution of burden by risk groups and life stages.
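As an illustration of the age grouping described above, the following minimal sketch bins ages into the six bands with pandas. The column name `age`, the helper `add_age_group` and the toy values are hypothetical; the study does not describe how the grouping was implemented.

```python
import pandas as pd

# Age bands as listed in the text. Labels and column names are illustrative assumptions.
AGE_BINS = [0, 45, 55, 65, 75, 85, float("inf")]
AGE_LABELS = ["<45", "45-54", "55-64", "65-74", "75-84", ">=85"]


def add_age_group(df: pd.DataFrame, age_col: str = "age") -> pd.DataFrame:
    """Return a copy of df with an ordered categorical 'age_group' column."""
    out = df.copy()
    # right=False makes every band closed on the left, e.g. [45, 55) -> "45-54".
    out["age_group"] = pd.cut(
        out[age_col], bins=AGE_BINS, labels=AGE_LABELS, right=False
    )
    return out


# Toy ages chosen to sit on the band boundaries.
demo = add_age_group(pd.DataFrame({"age": [18, 44, 45, 64, 65, 84, 85, 100]}))
print(demo)
```

Using left-closed intervals keeps boundary ages (45, 55, and so on) in the band that starts with them, which matches the way the bands are written in the text.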
To ensure accurate identification of chronic cases and reduce the risk of including acute or miscoded cases, only patients with a minimum of two hospital admissions during the study period were included in the profiling analysis [10].
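The inclusion rule above can be sketched as a simple filter on an admission-level table. This is only an illustration under assumed names (`patient_id`, `los_days`, `keep_recurrent_patients`); the structure of the actual administrative extract is not described in the text.

```python
import pandas as pd


def keep_recurrent_patients(
    admissions: pd.DataFrame, id_col: str = "patient_id", min_admissions: int = 2
) -> pd.DataFrame:
    """Keep only the admissions of patients with at least `min_admissions` records."""
    # Count admissions per patient and broadcast the count back to every row.
    n_adm = admissions.groupby(id_col)[id_col].transform("size")
    return admissions[n_adm >= min_admissions].copy()


# Toy admission-level data: patient B has a single admission and is dropped.
toy = pd.DataFrame(
    {
        "patient_id": ["A", "A", "B", "C", "C", "C"],
        "los_days": [7, 9, 5, 8, 6, 10],
    }
)
cohort = keep_recurrent_patients(toy)
print(cohort)
```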
For each patient, the following variables were extracted:
- demographic data: age at the time of admission, gender, place of residence (urban/rural), socio-economic status (employed, retired, unemployed, etc.);
- clinical data: type of ulcer and associated comorbidities, according to ICD-10 coding of primary and secondary diagnoses;
- consumption of medical resources: total duration of hospitalization (total number of hospitalization days per patient) and number of hospitalizations.
Data Preprocessing
To achieve robust profiling of patients with chronic ulcers, the selection of variables included in the clustering analysis is an essential step, based on both clinical relevance and evidence from the literature [11], [12], [13]. The purpose of this selection is to identify those characteristics that significantly differentiate subgroups of patients and that can influence both the individual prognosis and the consumption of resources within the health system.
Demographic variables: age and sex are fundamental factors in the epidemiology and evolution of chronic ulcers. Age, entered as a continuous variable or in groups, allows the identification of clusters with specific risks associated with each life stage. Gender also influences epidemiological indices and the evolution of chronic ulcers, thus being indispensable for relevant segmentation. Additional variables such as the environment of residence (urban/rural) and socio-economic status, relevant for the assessment of epidemiological and evolutionary characteristics, were added.
Type of ulcer: ulcer types were identified and coded according to ICD-10 and then grouped into categories (venous, arterial, diabetic, pressure ulcer, unclassified ulcer of the lower limb, chronic unclassified skin ulcer). This allows precise segmentation of the studied population according to the predominant etiology and facilitates differentiated analysis of the clinical characteristics and resources used for each category.
Comorbidities: the presence of associated diseases, especially obesity, cardiovascular diseases and infections, plays a crucial role in the etiopathogenesis, evolution and prognosis of chronic ulcers. For each patient, the main comorbidities were extracted based on the secondary diagnoses associated with each discharge, thus allowing the assessment of their cumulative impact on the complexity of the case and on the consumption of resources.
Indicators of resource consumption: the total length of hospitalization (expressed as the total number of days spent in hospital during the study period) reflects the degree of complexity and severity of cases, being a proxy for the intensity of medical care required. The total number of admissions during the six years of the study, per patient, provides additional information on the prolonged evolution of severe cases with complications and recurrences.
The selection of these variables was made in accordance with the recommendations of the literature on profiling studies of patients with chronic diseases and is supported by evidence on the impact of each characteristic on clinical evolution and on the associated costs [14], [15], [16]. The careful choice of these variables ensures not only the clinical relevance of the identified clusters, but also the practical utility for resource allocation policies and for the development of personalized strategies for the management of patients with chronic ulcers.
A critical step in the analysis of medical data is the evaluation and treatment of missing values, given their potential impact on the validity and robustness of statistical conclusions. In the present study, no techniques for handling missing values were needed, as the dataset had undergone rigorous validation and verification at institutional level. More specifically, the data come from a standardized national administrative database in which each admission record is complete and the variables essential to the analysis are mandatory for hospital reporting. As a result, there were no missing values for the variables included in the profiling analysis, which provides a solid basis for applying clustering methods without the risk of introducing bias through imputation or exclusion.
Subsequently, the additional validation carried out at the study level involved the inclusion of only patients with at least two hospitalizations for chronic ulcers, which ensures the consistency and integrity of the selected records.
To allow quantitative analysis and the application of clustering methods (K-means), the selected variables were processed and coded according to the following principles:
- Numerical variables: age (years), total length of hospitalization (days) and total number of admissions were kept as numerical variables. For some descriptive or cluster validation analyses, age was also divided into standardized groups, and the results were interpreted on both raw and categorized data.
- Categorical variables: sex was binary coded (0 = female, 1 = male), facilitating quick interpretation in statistical models. The ulcer category was coded based on the six previously defined groups, with each patient assigned a label (e.g., 1 = venous ulcer, 2 = arterial ulcer, 3 = diabetic ulcer, 4 = pressure ulcer, 5 = lower limb ulcer not elsewhere classified, 6 = chronic skin ulcer not elsewhere classified).
- Additional variables: where the dataset contained further ordinal or nominal variables (e.g., urban/rural residence environment), they were coded accordingly: nominal variables by one-hot encoding or numerical labels, and ordinal variables according to their natural order.
- Normalization of variables: to ensure comparability between variables with different scales and to improve the performance of the K-means algorithm, numerical variables were standardized with the z-score [17] or normalized to the range [0,1], so that no variable dominates the clustering process by numerical magnitude alone.
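The coding principles listed above can be sketched roughly as follows. The column names, category labels and toy values are hypothetical. The sketch uses z-score standardization (one of the two options mentioned) together with one-hot encoding, and is not a reproduction of the study's actual preprocessing script.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy patient-level table; all names and values are illustrative assumptions.
patients = pd.DataFrame(
    {
        "age": [72, 58, 81, 66],
        "total_los_days": [45, 18, 70, 22],
        "n_admissions": [3, 2, 5, 2],
        "sex": ["M", "F", "M", "F"],
        "residence": ["rural", "urban", "rural", "urban"],
        "ulcer_type": ["venous", "diabetic", "pressure", "venous"],
    }
)
numeric_cols = ["age", "total_los_days", "n_admissions"]

# Numerical variables: z-score standardization (mean 0, unit variance).
scaled_numeric = pd.DataFrame(
    StandardScaler().fit_transform(patients[numeric_cols]),
    columns=numeric_cols,
    index=patients.index,
)

# Sex: binary coding, 0 = female, 1 = male.
sex = (patients["sex"] == "M").astype(int).rename("sex")

# Nominal variables: one-hot encoding.
dummies = pd.get_dummies(patients[["residence", "ulcer_type"]], dtype=int)

features = pd.concat([scaled_numeric, sex, dummies], axis=1)
print(features.round(2))
```

One-hot columns are left unscaled here; whether to rescale them alongside the numerical variables is a design choice that affects how much weight categorical traits carry in the Euclidean distances used by K-means.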
Through this systematic approach of coding and preprocessing, each case benefited from detailed framing from a demographic, etiological, and comorbidity point of view, allowing the obtaining of clinically relevant, easily interpretable and reproducible profiles. This method ensures the robustness of clustering analysis and interpretation of differences between resource consumption profiles, facilitates the extrapolation of results in similar clinical contexts and supports the substantiation of recommendations for the integrated management of chronic ulcers.
Clustering Methodology
The Choice of the K-Means Method: Justification, Advantages and Limitations
In this research, we opted for the application of the K-Means unsupervised learning algorithm to identify distinct profiles of hospitalized patients with chronic ulcers. The choice of method is based on both the characteristics of the available data and the operational advantages of the algorithm in medical contexts, where population segmentation based on demographic, clinical and resource consumption traits can guide personalized decisions and the efficient allocation of health resources.
K-Means is a partitioning algorithm that divides observations into k non-overlapping clusters by minimizing the within-cluster variation (the sum of squared distances to the cluster centers), which in turn favors separation between groups. This method is well suited to situations in which the aim is to highlight latent patterns in numerical data, such as age, length of hospitalization, number of comorbidities, frequency of hospitalizations, etc.
Advantages of the K-Means method:
- High computational efficiency: The algorithm is scalable for large datasets with linear complexity relative to the number of observations.
- Intuitiveness and interpretability: the results (cluster centers) can be interpreted directly as prototypes of patient profiles.
- Flexibility: allows the integration of a variable number of clinically and logistically relevant numerical variables (e.g. age, days hospitalized, comorbidities).
- Popularity and robust implementation: it is available in a variety of validated libraries (e.g., scikit-learn) [18], with options for smart initialization (K-means++), which reduces the risk of convergence to suboptimal local solutions.
Limitations of the K-Means method:
- The need to specify the number of clusters k: its choice involves auxiliary methods (e.g., the "elbow" method, silhouette score) [19], which can introduce subjectivity.
- Sensitivity to scaling: the performance and shape of clusters can be influenced by unscaled variables or the presence of outliers.
- Implicit expectation of spherical and balanced clusters: in the medical context, distributions can be unbalanced (e.g., patients with multiple comorbidities vs. patients with a simple profile), which can limit the fidelity of the segmentation.
- Limited handling of categorical variables: these require conversion (e.g., one-hot encoding), which can complicate interpretability or lead to sparsity.
The choice of the K-Means method is therefore justified by an optimal compromise between analytical accuracy, computational complexity and clinical interpretability. Considering the main objective of the study, the identification of homogeneous groups of patients from the perspective of clinical risks and resource consumption, this method allows the exploration of latent models of population organization that can underpin stratification and targeted intervention policies in medical practice.
The steps mentioned in the data processing section led to the formation of a numerical, scaled and coherent dataset suitable for the application of unsupervised learning methods. This set allows the identification of homogeneous groups of patients, relevant from a clinical and operational perspective.
The application of the elbow method [19] indicated a relevant inflection at k = 2, suggesting that adding further clusters did not bring a significant improvement in the explained variance. Concomitantly, the analysis of the silhouette score showed high values for k = 2, indicating good intra-cluster cohesion and adequate inter-cluster separation. Based on these criteria and considering the superior clinical interpretability of a solution with two distinct profiles, k = 2 was chosen in the subsequent K-Means analysis.
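A minimal sketch of how the elbow and silhouette criteria can be computed with scikit-learn is given below. It runs on synthetic data generated purely for illustration (two made-up groups on three features); the numbers it prints say nothing about the study's dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: two invented groups described by age,
# total days in hospital and number of admissions.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=[73, 60, 3.3], scale=[6, 10, 0.8], size=(150, 3))
group_b = rng.normal(loc=[63, 20, 2.1], scale=[8, 6, 0.3], size=(150, 3))
X = StandardScaler().fit_transform(np.vstack([group_a, group_b]))

inertia, silhouette = {}, {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    # Elbow criterion: within-cluster sum of squares for each k.
    inertia[k] = model.inertia_
    # The silhouette score is only defined for at least two clusters.
    if k >= 2:
        silhouette[k] = silhouette_score(X, model.labels_)

best_k = max(silhouette, key=silhouette.get)
for k in inertia:
    print(k, round(inertia[k], 1), round(silhouette.get(k, float("nan")), 3))
print("k with highest silhouette:", best_k)
```

In practice the inertia values are plotted against k and inspected visually for the "elbow", while the silhouette score offers a complementary numerical criterion.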
For the implementation of patient cluster analysis, the Python programming language was used, with the scikit-learn library (version 1.4.2) as the main tool, which provides robust functionalities for machine learning and data processing. The clustering algorithm chosen was K-Means, due to its computational simplicity, interpretability and efficiency in identifying internal structures in large data sets.
Pre-processing included:
- Selection of relevant variables for segmentation: age, number of comorbidities, total length of hospitalization, frequency of hospitalizations, sex (binary coded) and environment of residence (urban/rural).
- Scaling of these variables using the StandardScaler class [18] in scikit-learn, to ensure comparability between variables expressed in different units.
The K-Means model has been initialized with the following parameters:
- n_clusters = 2: value determined empirically as optimal by the Elbow and Silhouette methods.
- random_state = 42: for reproducibility.
- n_init = 10: the number of runs of the algorithm with different initial centers, to reduce the risk of converging to a poor local minimum.
- max_iter = 300: The maximum number of iterations per run.
The model was trained by calling the fit_predict method [18] on the scaled dataset, resulting in a cluster membership vector that was then added to the original dataset for descriptive profile analysis.
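The configuration described above corresponds roughly to the following sketch. The KMeans parameters are those stated in the text; the patient table is randomly generated and its column names are hypothetical, so the resulting clusters carry no clinical meaning.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Randomly generated stand-in for the patient-level table (assumed column names).
rng = np.random.default_rng(0)
n = 200
patients = pd.DataFrame(
    {
        "age": rng.integers(18, 101, size=n),
        "n_comorbidities": rng.integers(0, 9, size=n),
        "total_los_days": rng.integers(4, 120, size=n),
        "n_admissions": rng.integers(2, 10, size=n),
        "sex_male": rng.integers(0, 2, size=n),
        "rural": rng.integers(0, 2, size=n),
    }
)

# Scale all segmentation variables to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(patients)

# Parameters as reported in the text; init defaults to "k-means++".
kmeans = KMeans(n_clusters=2, random_state=42, n_init=10, max_iter=300)

# fit_predict returns one cluster label per patient; attach it to the original table.
patients["cluster"] = kmeans.fit_predict(X_scaled)

# Descriptive profile: mean of each original (unscaled) variable per cluster.
profile = patients.groupby("cluster").mean().round(2)
print(profile)
```

Profiling on the unscaled variables, as in the last step, keeps the cluster descriptions in clinically readable units (years, days, counts).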
This approach has made it possible to achieve a coherent segmentation of the population, based on clinically significant and easily obtainable traits from administrative sources.
Statistical Considerations for Data Analysis
For quantitative data, the dispersion of variables was described by the interquartile range (IQR) [20], which provides a robust picture of variability for asymmetric distributions. This measure was preferred because of the deviations from normality, and it complements the median as the measure of central tendency.
The statistical significance of the differences between groups was evaluated by calculating the p-value [21], with a conventional significance threshold of 0.05. For categorical variables, the Chi-square test [22] was used to investigate associations between groups. For the comparison of distributions, the non-parametric Mann–Whitney U test [23] was applied, which is suitable for independent samples and continuous data with non-normal distributions.
The normality of the distributions was verified prior to the application of inferential tests, using the Shapiro-Wilk test [24]. This approach allowed the selection of the most appropriate statistical tests, avoiding the incorrect application of parametric methods to data that do not satisfy the normality assumption.
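The descriptive and inferential steps named in this section can be illustrated with SciPy as follows. The samples and the contingency table are invented for the example, and the helper `median_iqr` is hypothetical; the text does not state which software was used for the statistical tests.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # conventional significance threshold


def median_iqr(values):
    """Return the median and the (Q1, Q3) bounds of the interquartile range."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return float(med), (float(q1), float(q3))


# Invented right-skewed samples standing in for total days in hospital in two groups.
rng = np.random.default_rng(1)
los_group1 = rng.gamma(shape=2.0, scale=30.0, size=120)
los_group2 = rng.gamma(shape=2.0, scale=10.0, size=150)

# Shapiro-Wilk: a small p-value suggests a departure from normality.
_, p_normal = stats.shapiro(los_group1)

# Mann-Whitney U: non-parametric comparison of two independent samples.
_, p_mwu = stats.mannwhitneyu(los_group1, los_group2, alternative="two-sided")

# Chi-square test on an invented 2x2 table (rows: groups, columns: male/female).
table = np.array([[60, 40], [45, 55]])
chi2, p_chi2, dof, _ = stats.chi2_contingency(table)

print("median, IQR group 1:", median_iqr(los_group1))
print("Shapiro-Wilk p:", p_normal)
print("Mann-Whitney U p:", p_mwu, "significant:", p_mwu < ALPHA)
print("Chi-square p:", p_chi2, "dof:", dof)
```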
Results
General Characteristics of the Population
The full database for the six years of the study comprises 116,264 hospitalizations due to chronic ulcers, generated by 69,349 patients. Of these, 50,493 patients had only one hospitalization during the study period and were excluded from the analysis. The remaining dataset consists of 65,771 hospitalizations, generated by 18,856 patients who were hospitalized two or more times for chronic ulcers during this period. This dataset is considered valid and is the subject of our study.
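The cohort figures above can be cross-checked with simple arithmetic, since each excluded patient contributed exactly one hospitalization. The short sketch below uses only the numbers reported in the text; the variable names are illustrative.

```python
# Figures reported in the text.
total_admissions = 116_264
total_patients = 69_349
single_admission_patients = 50_493

# Each excluded patient had exactly one admission, so the same count is
# subtracted from both the patient total and the admission total.
cohort_patients = total_patients - single_admission_patients
cohort_admissions = total_admissions - single_admission_patients

mean_admissions = cohort_admissions / cohort_patients
retained_share = 100 * cohort_patients / total_patients

print(cohort_patients)             # 18856
print(cohort_admissions)           # 65771
print(round(mean_admissions, 1))   # 3.5
print(round(retained_share, 1))    # 27.2
```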
The median age of the patients was 68 years (range 18 to 100 years), which indicates that the elderly population, which is biologically more vulnerable to impaired healing, is predominantly affected.
In terms of gender distribution, 54.5% of hospitalizations involved male patients (35,864 hospitalizations of men versus 29,907 of women), which suggests a slight male preponderance among hospitalized cases.
As regards socio-professional status, there was a clear dominance of retired patients (79.5%), followed by salaried and unemployed categories (9.9% each), the rest being marginally represented: self-employed (0.4%), unemployed (0.3%), pupils or students (0.08%), employers (0.04%) and farmers (0.02%). These data reflect a population with a high degree of dependence on the public health system, especially among retirees.
The geographical distribution of patients showed a slight predominance of rural areas (56.9%) compared to urban areas (43.1%). This distribution may be associated with unequal access to specialized medical services and difficulties in the preventive and curative management of chronic injuries in rural areas, where medical resources are often more limited.
These general characteristics indicate that the population affected by chronic ulcers is a vulnerable one, with a significant burden of disease associated with old age, comorbidities and precarious socio-economic status, which justifies the application of differentiated management and resource allocation strategies.
Preliminary Descriptive Analysis
Descriptive analysis was the first essential step in characterizing the population of unique patients diagnosed with chronic ulcers and hospitalized during the study period. The main demographic, clinical and resource consumption characteristics were assessed, both overall and stratified by ulcer type, sex and age group (Figure 1).
In the entire cohort of 18,856 unique patients, the median age was 68 years (IQR: 59–76) [20], with a minimum of 18 and a maximum of 100 years. The age distribution is positively skewed (right-tailed), with a peak around the age of 70, reflecting the prevalence of chronic ulcers among elderly patients.
The gender distribution showed a male:female ratio of 53.6%:46.4%, indicating a slight predominance of male patients. This distribution is consistently maintained across most age groups and is also found in the distributions by clinical category.
As for the social status of unique patients, 74.9% were classified as pensioners, 12.6% as employees and 11.9% as without employment, and the rest were distributed among self-employed workers (0.4%), students (0.1%), employers (0.04%) and farmers (0.03%). This structure directly reflects the elderly and vulnerable profile of the population studied.
The distribution by environment of origin showed that 55.1% of the patients come from rural areas and 44.9% from urban areas. This rural predominance may point to contributing factors such as delayed access to medical care, poor socio-economic conditions and a low level of health education, which can contribute to the aggravation and chronicity of ulcerative lesions.
It is important to note that the 18,856 unique patients generated a total of 65,771 admissions, which corresponds to an average number of 3.5 admissions per patient, indicating the frequency of relapses and the chronic nature of these conditions.
The evaluation of the number of admissions by clinical category of ulcer shows that the most common types are venous ulcers and lower limb ulcers not elsewhere classified, which together account for over 78% of the total of 65,771 admissions (Figure 2).
The distribution of admissions is as follows:
- Venous ulcer (I83.x): 52.8% of the total (34,751 admissions)
- Ulcer of the lower limb not elsewhere classified (L97): 25.5% (16,777 admissions)
- Arterial ulcer (I70.23): 7.9% (5,173 admissions)
- Chronic skin ulcer not elsewhere classified (L98.4): 7.3% (4,816 admissions)
- Diabetic ulcer (E1x.73): 3.9% (2,559 admissions)
- Pressure ulcer (L89): 2.6% (1,695 admissions)
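As a quick consistency check, the percentages above can be recomputed from the admission counts reported in the list. The dictionary keys in this sketch are shorthand labels chosen for illustration.

```python
# Admission counts per ulcer category, as reported in the text.
admissions_by_type = {
    "venous (I83.x)": 34_751,
    "lower limb NEC (L97)": 16_777,
    "arterial (I70.23)": 5_173,
    "skin NEC (L98.4)": 4_816,
    "diabetic (E1x.73)": 2_559,
    "pressure (L89)": 1_695,
}

total = sum(admissions_by_type.values())

# Share of each category in percent, rounded to one decimal place.
shares = {
    name: round(100 * count / total, 1)
    for name, count in admissions_by_type.items()
}

# Combined share of the two most frequent categories.
top_two = 100 * (
    admissions_by_type["venous (I83.x)"] + admissions_by_type["lower limb NEC (L97)"]
) / total

print(total)              # 65771
print(shares)
print(round(top_two, 1))  # 78.3
```

The counts add up to the 65,771 admissions of the study cohort, and the two leading categories together account for about 78.3% of them.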
The number of admissions per patient is shown in Table 1, along with the overall distribution of the number of hospitalization episodes in the analyzed dataset. Each patient is assigned to the clinical category (type of ulcer) recorded at the first admission. The average number of admissions for a clinical category is then calculated using only admissions whose main diagnosis matches that initial ulcer type; any subsequent hospitalization with a different main diagnosis is excluded from this calculation. The indicator therefore reflects how frequently patients were hospitalized strictly for the type of ulcer initially identified and does not include later hospitalizations for other types of ulcers or other conditions.
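The indicator described above could be computed along the following lines. This is a sketch on toy data with hypothetical column names (`patient_id`, `admission_date`, `ulcer_type`) and a hypothetical helper name; it is not the authors' code.

```python
import pandas as pd


def mean_admissions_for_index_ulcer(adm: pd.DataFrame) -> pd.Series:
    """Mean admissions per patient, counting only admissions whose main
    diagnosis equals the ulcer type recorded at the patient's first admission."""
    ordered = adm.sort_values(["patient_id", "admission_date"]).copy()
    # Ulcer type at each patient's first admission, repeated on every row.
    ordered["index_type"] = ordered.groupby("patient_id")["ulcer_type"].transform("first")
    # Drop later admissions coded with a different ulcer type.
    same_type = ordered[ordered["ulcer_type"] == ordered["index_type"]]
    per_patient = same_type.groupby(["patient_id", "index_type"]).size()
    return per_patient.groupby(level="index_type").mean()


# Toy data: P1 switches to another category at the third admission.
adm = pd.DataFrame(
    {
        "patient_id": ["P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3"],
        "admission_date": pd.to_datetime(
            [
                "2019-01-10", "2019-06-02", "2020-03-15",
                "2018-02-01", "2018-09-20", "2019-04-11",
                "2021-01-05", "2021-07-19", "2022-02-28",
            ]
        ),
        "ulcer_type": [
            "venous", "venous", "lower_limb_nec",
            "diabetic", "diabetic", "diabetic",
            "venous", "venous", "venous",
        ],
    }
)
result = mean_admissions_for_index_ulcer(adm)
print(result)
```

In the toy data, P1 contributes two venous admissions and P3 three, so the venous mean is 2.5, while the single diabetic patient contributes 3.0.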
The total distribution of hospitalization days according to patient age and the clinical category of the ulcer is illustrated in Figure 3. Most days of hospitalization are associated with patients aged between 65 and 80 years, indicating a major consumption of resources among the elderly. The categories of ulcers with the greatest impact on the total duration of hospitalization are:
- Venous ulcer, which dominates the entire distribution and reaches a peak between 70 and 75 years.
- Ulcer of the lower limb not elsewhere classified, with a broad profile around the age of 65–80 years.
- Arterial ulcer and diabetic ulcer, which contribute especially in the age range 60–75 years.
- Pressure ulcers and chronic unclassified skin ulcers, which have a smaller but constant contribution in the advanced age categories.
The distribution of unique patients by age group and sex, stratified by clinical category of ulcer, is detailed in Figure 4. It is noted that:
- Venous ulcer and lower limb ulcer not elsewhere classified (L97) peak in the age groups 65–74 years and 75–84 years, in line with the advanced mean age of the population studied. Of all the clinical categories of chronic ulcers studied, only venous ulcers show a clear predominance of female patients.
- For arterial and diabetic ulcers, a clear male predominance is observed, with men accounting for approximately 74.2% and 77.8% of the cases analyzed, respectively.
- Pressure ulcers and chronic unclassified skin ulcers have a more balanced distribution between the sexes, with a slight male predominance in older age groups.
The analysis of the number of hospitalizations per patient showed an overall average of 3.5 admissions per patient, indicating a high number of recurrences. About 53% of patients had two admissions, and 12.3% were readmitted more than five times. This suggests that the database does not only reflect isolated hospitalizations, but consistently includes recurrent cases, relevant for assessing the chronic burden on the health system (Figure 5).
The median length of stay per admission was 8 days (IQR: 6–10), and the mean length of a hospital stay was 8.6 days (Figure 6).
The analysis of the distribution of consumed resources highlighted significant differences between the clinical categories of chronic ulcers, mainly reflected by the total duration of hospitalization and the average number of hospitalizations per patient, used as relevant indicators for assessing the burden on the health system.
The average duration of hospitalization per admission varied significantly by ulcer type. The longest mean stay was observed for pressure ulcers (L89), at 14.7 days, followed by diabetic ulcers (E1x.73) at 10.4 days. Lower limb ulcers not classified elsewhere (L97) averaged 9.0 days, arterial ulcers (I70.23) 8.4 days, venous ulcers (I83.x) 8.2 days, and chronic skin ulcers not elsewhere classified (L98.4) had the shortest average stay at 7.1 days.
This distribution suggests a correlation between the severity and complexity of cases and the duration of hospital management. The differences are further reflected in the age distribution, with the highest average age observed among patients with arterial ulcers (69.0 years), followed by venous ulcers (67.6 years), lower limb ulcers not classified elsewhere (67.1 years), pressure ulcers (67.0 years) and diabetic ulcers (64.7 years). In contrast, chronic unclassified skin ulcers (L98.4) occur at a lower average age of 62.6 years. This pattern may indicate that L98.4 is more common in younger patients with milder or differently managed forms of ulceration, whereas advanced age appears to be associated with greater severity and complexity of arterial pathology, as seen in cases coded as I70.23.
Also, recurrence (average number of admissions per patient) ranges from 2.6 admissions in patients with pressure ulcers, to 3.8 admissions in those with unclassified ulcers of the lower limb, indicating a frequent return to the hospital and possible deficiencies in outpatient management or lack of community support.
These differences reflect not only the pathophysiological particularities of each type of ulcer, but also the varying needs for multidisciplinary treatment, prolonged monitoring, surgery and functional recovery. Pressure ulcers and diabetic ulcers often involve patients with multiple comorbidities and poor functional status, which translates into an increased consumption of medical and social resources.
K-Means Clustering Results
The application of the K-means algorithm on the processed dataset led to the identification of two distinct clusters of patients, with significantly different clinical profiles and resource consumption. This segmentation allowed a more nuanced understanding of the population studied, highlighting the existence of subgroups with contrasting medical needs and economic impact. The analysis of the characteristics of each cluster was carried out based on standardized variables, revealing relevant patterns in terms of age, comorbidities, duration of hospitalization and frequency of hospitalizations.
The descriptive details of each identified cluster are presented below, together with the associated clinical interpretations, useful in outlining differentiated directions of intervention in medical practice.
It is essential to emphasize that these values are indicators calculated at the level of the individual patient, i.e. they result from aggregating all admissions and diagnoses of each individual, and not at the level of the hospitalization episode.
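The aggregation from admission level to patient level might look like the sketch below. All names and values are hypothetical, and two choices are assumptions made for the example rather than details stated in the text: age is taken at the first admission, and comorbidities are counted as distinct secondary ICD-10 codes per patient.

```python
import pandas as pd

# Toy admission-level table (one row per hospitalization).
admissions = pd.DataFrame(
    {
        "patient_id": ["A", "A", "A", "B", "B", "C", "C"],
        "age": [70, 70, 71, 55, 56, 80, 80],
        "los_days": [10, 8, 12, 6, 7, 9, 11],
    }
)

# Toy long-format table of secondary diagnoses (one row per recorded code).
secondary_dx = pd.DataFrame(
    {
        "patient_id": ["A", "A", "A", "A", "B"],
        "icd10": ["I10", "E66", "I10", "I50", "E11"],
    }
)


def to_patient_level(admissions: pd.DataFrame, secondary_dx: pd.DataFrame) -> pd.DataFrame:
    """Collapse admission-level records into one row per patient."""
    per_patient = admissions.groupby("patient_id").agg(
        age=("age", "min"),                 # assumption: age at first admission
        n_admissions=("los_days", "count"),
        total_los_days=("los_days", "sum"),
    )
    # Assumption: comorbidity count = number of distinct secondary codes.
    n_codes = secondary_dx.groupby("patient_id")["icd10"].nunique()
    per_patient["n_comorbidities"] = n_codes.reindex(per_patient.index, fill_value=0)
    return per_patient


patient_table = to_patient_level(admissions, secondary_dx)
print(patient_table)
```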
Cluster 1 – Patients with a Complex Profile, Intensive Consumption of Resources
This cluster is characterized by:
- higher average age, 73.1 years.
- significantly longer total hospitalization duration – these patients accumulated a high number of days in hospital (61.4 days/patient).
- average increased number of comorbidities (4.6).
- higher frequency of admissions (3.3 on average), indicating a profile with complicated medical evolution and recurrent risk.
- higher proportion of patients of rural origin (61.4%) and of male sex (55.8%), with retired as the dominant social status.
Medical interpretation: this cluster includes elderly patients, with marked biological frailty and multiple chronic pathologies, requiring long-term care, repeated hospitalizations and interdisciplinary interventions. They represent a category with a high risk of recurrence and complications, an ideal candidate for integrated chronic disease management, community care and post-discharge coordination programs.
Cluster 2 – Patients with a Simpler, Clinically Stable Profile
This cluster includes:
- more varied ages, including those under 60 years old, but generally younger than in Cluster 1, with an average age of 63.4 years.
- reduced total length of hospital stay associated with shorter or well-managed episodes (average total length of hospitalization: 20.9 days/patient).
- comorbidities in fewer numbers (average of 2.6).
- low frequency of admissions (2.1).
- more balanced distribution in terms of gender (male 47.5%), environment of origin (urban 51.1%) and dominant social status (employee).
Medical interpretation: these patients have a clinically stable profile, with a less severe form of the disease and a better ability to be managed on an outpatient basis or in community care. The consumption of resources is significantly lower, and interventions can be focused on therapeutic education, prevention and punctual social support.
In terms of practical and clinical management, patient segmentation allows a differentiated allocation of resources: Cluster 1 can benefit from multidisciplinary care and access to advanced specialized treatment centers with prolonged hospitalization, while Cluster 2 can be directed towards outpatient monitoring programs, medical education and recurrence prevention.
The clinical and resource utilization differences between the two clusters of patients obtained by the K-Means algorithm are summarized in Table 2. Each row corresponds to an evaluated parameter (age, total duration of hospitalization, number of comorbidities, frequency of hospitalizations, sex, environment of origin and social status), and the columns describe the characteristic features of each cluster. Cluster 1 highlights an older patient profile, with multiple chronic diseases and increased consumption of resources, while Cluster 2 includes younger patients, with less complex pathology and more favorable clinical evolution. This segmentation can support differentiated resource allocation decisions and personalized interventions.
Analysis of Resource Consumption
The segmentation of the population with chronic ulcers allowed the identification of two distinct groups of patients, significantly differentiated in terms of the consumption of medical resources. Cluster 1 is characterized by increased consumption: patients included in this profile had an average of 3.32 admissions, a mean total length of hospitalization of 61.42 days, and an estimated mean number of 4.62 comorbidities (
Figure 7). Comparatively, patients in Cluster 2 had an average of 2.07 admissions, a total duration of hospitalization of 20.91 days and 2.59 comorbidities.
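The per-cluster averages reported above are simple means over patient-level records. A minimal sketch of that summary follows; the records, field names and the `cluster_means` helper are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical patient-level records (illustrative values, not study data).
patients = [
    {"cluster": 1, "admissions": 3, "los_days": 60, "comorbidities": 5},
    {"cluster": 1, "admissions": 4, "los_days": 64, "comorbidities": 4},
    {"cluster": 2, "admissions": 2, "los_days": 20, "comorbidities": 3},
    {"cluster": 2, "admissions": 2, "los_days": 22, "comorbidities": 2},
]

INDICATORS = ("admissions", "los_days", "comorbidities")

def cluster_means(records):
    """Return {cluster: {indicator: mean value per patient}}."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["cluster"]].append(rec)
    return {
        cluster: {key: mean(r[key] for r in rows) for key in INDICATORS}
        for cluster, rows in sorted(grouped.items())
    }

summary = cluster_means(patients)
for cluster, stats in summary.items():
    print(cluster, stats)
```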
These differences highlight the high clinical complexity and frailty of patients in Cluster 1, indicating the need for multidisciplinary specialized interventions, continuous monitoring, and integrated long-term care strategies.
In contrast, patients in Cluster 2 had, on average, a single episode of readmission, a reduced total duration of hospitalization, and a simpler clinical profile with fewer comorbidities. They appear to benefit from a more favorable evolution or from efficient outpatient management.
The graphical visualizations (
Figure 7 and
Figure 8) confirm these differences, highlighting a clear segregation of resource consumption between the two clusters. In the absence of explicit data on costs and detailed medical interventions, these indirect indicators provide a conclusive picture of the health burden generated by each group. Cluster 1 clearly stands out as the most demanding for the health system, justifying priority multidisciplinary interventions and proactive prevention and monitoring strategies.
Figure 8 shows the comparative average consumption of medical resources per patient, calculated for the two clusters identified by the K-means algorithm. The analyzed indicators are the average number of admissions per patient, the average total number of hospitalization days per patient, and the estimated average number of comorbidities.
The graph illustrates the effectiveness of the clustering method in separating distinct clinical profiles among patients with chronic ulcers. The results can inform decisions about resource allocation, prioritization of medical interventions, and the development of differentiated patient management strategies, such as integration into community care or scheduled admissions for complex cases.
Statistical Significance Tests
For clinical validation of the segmentation of the patient population into distinct clusters, a rigorous analysis of the significance of the observed differences between these groups was required. Statistical tests appropriate to each type of variable were applied to determine whether the differences identified are unlikely to have occurred by chance.
For each comparison, the p-value was calculated, with the significance threshold set at 0.05. P-values below this threshold indicate that the observed differences are statistically significant and unlikely to be attributable to chance alone. Thus:
p < 0.05 indicates a statistically significant difference.
p < 0.01 reflects a strongly significant difference.
p ≥ 0.05 suggests the lack of a statistically significant difference.
For the clinical interpretation of cluster differences, 95% confidence intervals (CIs) were also calculated for each estimated mean. They provide a range within which the true value is likely to lie. For a difference between groups, an interval that does not include the value 0 supports the statistical significance of the observed difference.
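To make the interval logic concrete, the sketch below computes a large-sample 95% CI for the difference between two group means and checks whether it excludes 0. The text does not state how the intervals were computed, so the normal approximation, the sample values and the `diff_means_ci` helper are assumptions; for small samples a t-based interval would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def diff_means_ci(a, b, level=0.95):
    """Normal-approximation CI for mean(a) - mean(b), with unpooled variances."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff - z * se, diff + z * se

# Hypothetical total hospitalization days per patient (not study data).
cluster1 = [58, 61, 65, 70, 55, 62, 59, 66]
cluster2 = [19, 22, 21, 18, 24, 20, 23, 21]

low, high = diff_means_ci(cluster1, cluster2)
excludes_zero = low > 0 or high < 0
print(f"95% CI for the difference: ({low:.1f}, {high:.1f}); excludes 0: {excludes_zero}")
```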
To evaluate the clinical relevance of the clustering obtained, a comparative statistical analysis was performed between the two groups of patients resulting from the application of the K-means method. The analysis tracked differences in age, gender, background, length of hospitalization, number of comorbidities, and frequency of admissions.
Patients' age, sex and background: The comparison of age between the two clusters showed a statistically significant difference (p < 0.05), which suggests the existence of a predominant group made up of elderly patients, compared to a second younger group. This difference is consistent with the hypothesis that elderly patients have a higher severity of chronic ulcers and an increased likelihood of complications and recurrent hospitalizations. In terms of gender, a significant proportional difference was observed, with cluster 1 having a higher share of male patients (confirmed by the Chi-square test). This observation is in line with the literature that indicates a more frequent association of arterial and diabetic ulcers with the male sex. For the environment of origin, the difference between clusters was also significant, with cluster 1, the most vulnerable, including a higher proportion of patients from rural areas, which could reflect delays in accessing medical care and the lack of outpatient monitoring.
Total duration of hospitalization: The total duration of hospitalization was also significantly higher in cluster 1 (p < 0.01), supporting the hypothesis of a higher consumption of hospital resources. This observation is particularly relevant from the perspective of the efficient management of care resources, as patients in the cluster with prolonged hospitalizations are likely to require complex interventions and long-term treatments.
Number of comorbidities: The analysis showed a significant difference in the number of comorbidities between the two groups, suggesting that one of the clusters consists mainly of patients with multiple pathologies. This information is crucial in individualizing the care plan and understanding the risk of unfavorable evolution or readmission.
Frequency of hospitalizations: Cluster 1, characterized by older patients with more comorbidities, also had a higher frequency of hospitalizations, which suggests complex cases with a protracted course and frequent complications requiring readmission. The difference between clusters was statistically significant (p < 0.05), reinforcing the hypothesis of increased clinical vulnerability in this subgroup.
Analysis of the results of the statistical significance tests confirms the distinct clinical differences between the two identified clusters. The cluster of older patients with multiple associated comorbidities and long hospital stays is a group at high risk of resource consumption and requires targeted integrated management interventions. This statistical stratification provides solid premises for the development of policies for efficient allocation of resources and individualization of care for patients with chronic ulcers.
Continuous numerical variables (age, length of hospitalization, number of comorbidities, frequency of hospitalizations) were analyzed using the Mann–Whitney U test, as their distributions were not normal (Shapiro–Wilk test, p < 0.05). The results indicated statistically significant differences between clusters for all these variables (p < 0.001), confirming the existence of two distinct subpopulations:
A first cluster, characterized by older patients, with longer hospitalizations, multiple comorbidities and frequent hospitalizations.
A second cluster, consisting of relatively younger patients, with a low consumption of resources and a more stable clinical profile.
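The comparison of a continuous variable between the two clusters can be sketched as follows. This is a dependency-free illustration of the two-sided Mann–Whitney U test using the normal approximation with tie-corrected variance and no continuity correction; the sample ages and the `mann_whitney_u` helper are hypothetical, and the Shapiro–Wilk normality check is omitted. A statistical library routine such as `scipy.stats.mannwhitneyu` would normally be used instead.

```python
from math import erfc, sqrt

def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering group membership (0 = x, 1 = y).
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of this group of tied values
        avg_rank = (i + 1 + j) / 2   # average of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - mu) / sigma
    return u1, erfc(abs(z) / sqrt(2))

# Hypothetical ages in years (not study data).
ages_cluster1 = [72, 75, 68, 80, 77, 71]
ages_cluster2 = [60, 63, 58, 66, 61, 65]

u_stat, p_value = mann_whitney_u(ages_cluster1, ages_cluster2)
print(f"U = {u_stat:.0f}, p = {p_value:.4f}")
```

The normal approximation is reasonable for larger samples; for very small groups an exact test is preferable.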
The categorical variables (sex, environment of origin) were compared by the Chi-square test, which showed significant differences (p = 0.0027 for sex; p < 0.0001 for the environment). Cluster 1 thus included a higher proportion of male and rural patients, suggesting a possible relationship between socio-demographic factors and disease severity.
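For a categorical variable with two levels, the Chi-square test of independence reduces to a 2x2 contingency table. The sketch below computes the Pearson statistic without Yates' continuity correction and the corresponding p-value for one degree of freedom; the counts and the `chi_square_2x2` helper are hypothetical and do not reproduce the study's tables. A library routine such as `scipy.stats.chi2_contingency` would normally be used.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, r in enumerate(row_totals):
        for j, col in enumerate(col_totals):
            expected = r * col / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical counts: rows are clusters, columns are (male, female).
sex_by_cluster = ((60, 40), (45, 55))

chi2, p_value = chi_square_2x2(sex_by_cluster)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```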
In conclusion, the tests applied confirm the validity of the segmentation performed, demonstrating that the two clusters not only differ statistically significantly, but also that these differences have direct clinical relevance. The results support the use of unsupervised learning profiling as a tool to stratify risk and optimize the management of patients with chronic ulcers.
Discussion
The application of the K-means clustering method allowed the segmentation of the population of hospitalized patients with chronic ulcers into two clusters that are distinct in terms of demographic characteristics, clinical characteristics and resource consumption.
Cluster 1 is characterized by a significantly higher median age, an increased number of comorbidities, longer hospitalizations and a higher frequency of readmissions. This profile reflects elderly patients with multiple comorbidities, with advanced clinical forms, with an increased risk of complications and with a low capacity for self-management.
Cluster 2 includes younger patients with fewer comorbidities and a shorter length of hospitalization. These patients seem to be diagnosed earlier or benefit from a higher healing capacity, which suggests a more favorable profile in terms of prognosis and outpatient therapeutic management.
The comparative analysis of the two clusters highlighted a clear correlation between the clinical profile and the consumption of resources. Patients in cluster 1 generated a disproportionate consumption of hospitalization days and medical resources and were responsible for most repeated admissions. This group can be considered a significant burden on the healthcare system, requiring long-term interventions, complex treatments, and often reinterventions. In contrast, cluster 2 has a smaller footprint on hospital services, being associated with shorter episodes of care and lower clinical complexity.
This polarization between the two profiles underscores the need for differentiated stratification of care, tailored to the patient's individual risk.
The results obtained are consistent with the data published in the international literature, which highlight the existence of subgroups of patients with chronic ulcers with distinct clinical and risk profiles. Studies [
25], [
26] have shown that elderly patients with high blood pressure, obesity, peripheral vascular disease, and other comorbidities have a higher risk of frequent hospitalization, complications, and increased costs. There are also data [
27], [
28], [
29], [
30], [
31] confirming that a minority of patients generate a significant proportion of the expenses associated with ulcer care, confirming the usefulness of real-world data-based profiling for resource optimization. Despite the clinical importance of patient profiling in chronic skin ulcers, only a few studies on this topic have been conducted or published in Romania to date [
32], [
33], [
34], [
35], [
36].
Segmenting the population based on clinical and demographic characteristics has direct implications for the personalized management of chronic ulcers:
For patients in the high-risk cluster, intensive follow-up protocols, interdisciplinary care (including internists, diabetologists, vascular surgeons, specialist nurses) and early integration into community or palliative care programs should be implemented.
For the low-risk cluster, the focus should be on prevention, medical education, rapid interventions and avoiding the evolution to complicated stages.
This model allows for data-driven clinical triage, which can guide decisions about case routing and resource allocation.
Limitations of the Study
The present study faces several limitations specific to retrospective research using administrative databases. First, the secondary nature of the data analyzed, initially collected for administrative rather than research purposes, may influence its accuracy. This feature raises issues related to possible coding errors and variability in the quality of the information recorded, thus affecting the reliability of the conclusions reached. To mitigate this limitation, we included only patients with at least two hospitalizations for chronic ulcers in the study, which provides additional validation of the database.
Also, the dataset used does not include information from outpatient services. This omission affects the study's ability to paint a complete picture of the patient's journey, limiting the correct assessment of the total consumption of medical resources and increasing the risk of an overestimation of the contribution of hospitalization to overall care. The integration of this data would allow a more accurate understanding of the clinical and logistical needs of patients with chronic ulcers.
Another important limitation is the absence of essential clinical details, such as the size of the ulcer, the duration of the lesion evolution before admission or the type and intensity of the treatments applied. The lack of these variables prevents a rigorous correlation between the identified profiles and the real severity of the diseases, reducing the explanatory power of the segmentation models.
Future Research Directions
To consolidate the results obtained and extend their applicability in medical practice and health policy decision-making, it is necessary to continue the research through several complementary directions. The first direction aims to carry out prospective studies that include detailed clinical variables and validated severity scores, such as the Charlson comorbidity index [
37] or the WIfI (Wound, Ischemia, and foot Infection) score [
38] used to evaluate ischemic ulcers. This type of study would allow a more faithful characterization of the clinical condition and a more precise calibration of the identified profiles.
In parallel, the integration of data from outpatient and home care services is essential to outline a complete picture of the patient's journey through the health system. This integration would contribute not only to a more realistic assessment of the total costs associated with care, but also to the identification of key moments at which effective intervention can prevent complications.
At the same time, the development of predictive models based on machine learning algorithms would allow estimating the risk of recurrence, complications or excessive consumption of resources from the patient's first contact with the health system. Such models could become useful tools in clinical triage and personalized intervention planning.
Finally, external validation of profiling models on independent populations is indispensable to verify the robustness and transferability of the results. Only by confirming them in other clinical and demographic contexts can the conclusions be generalized and transposed into effective and equitable public health policies.
Conclusions
This study highlights the complexity and practical value of applying modern analytical methods to evaluate the population of patients with chronic skin ulcers. Using the K-Means algorithm, we identified two distinct patient profiles, differing in clinical and socio-demographic characteristics as well as healthcare resource utilization. One profile is characterized by a higher burden on the healthcare system, with advanced age, multiple comorbidities, prolonged hospital stays, and frequent readmissions. In contrast, the second profile comprises younger patients with more favorable clinical outcomes and lower resource consumption.
More broadly, patient profiling represents an essential component of contemporary medicine, aimed at identifying and characterizing subgroups of patients with shared clinical, demographic, behavioral, or genetic traits. By applying advanced statistical techniques such as clustering, multivariate analyses, or machine learning, hidden patterns that are not apparent through conventional descriptive analysis can be uncovered. In the context of chronic ulcers, this approach reveals that although patients share a common diagnosis, subgroups defined by diabetes, venous insufficiency, or immobility exhibit distinct risk profiles and therapeutic needs, which necessitate tailored management plans. Profiling supports treatment personalization, aligns interventions with subgroup-specific needs, and increases therapeutic effectiveness. Additionally, by identifying the most vulnerable patients, those at higher risk of complications, recurrence, or heavy resource use, it enables more efficient allocation of resources and the implementation of preventive measures that reduce costs and enhance care quality.
A key strength of this study lies in its patient-level analysis, aggregating all admissions and diagnoses per individual rather than treating each hospitalization as a separate episode. This methodology provides a more accurate assessment of the chronicity and complexity of each patient’s condition and avoids overestimating case numbers due to repeated admissions of the same individual. Furthermore, the study delivers a detailed profile of patients with chronic ulcers, highlighting differences in age and ulcer types that can inform personalized care strategies and resource planning. The use of a comprehensive database spanning several years and diverse ulcer etiologies further enhances the robustness and relevance of the findings, contributing to a more nuanced understanding of this heterogeneous patient population.
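The patient-level aggregation described above can be sketched as follows. The admission rows, the diagnosis codes and the `aggregate_by_patient` helper are hypothetical and only illustrate how repeated admissions collapse into one record per individual; they do not reproduce the study's data or its rule for counting comorbidities.

```python
from collections import defaultdict

# Hypothetical admission-level rows: (patient_id, length_of_stay_days, diagnosis_codes).
admissions = [
    ("P1", 12, {"L97", "E11"}),
    ("P1", 20, {"L97", "I10"}),
    ("P2", 7, {"I83.0"}),
    ("P1", 9, {"L97", "E11", "I70"}),
    ("P2", 11, {"I83.0", "E66"}),
]

def aggregate_by_patient(rows):
    """Collapse admission rows into one summary record per patient."""
    acc = defaultdict(lambda: {"admissions": 0, "total_los_days": 0, "diagnoses": set()})
    for patient_id, los_days, codes in rows:
        record = acc[patient_id]
        record["admissions"] += 1
        record["total_los_days"] += los_days
        record["diagnoses"] |= codes  # union keeps each distinct code once
    return dict(acc)

per_patient = aggregate_by_patient(admissions)
for patient_id, record in sorted(per_patient.items()):
    print(patient_id, record["admissions"], record["total_los_days"],
          sorted(record["diagnoses"]))
```

Counting patients rather than admissions in this way is what prevents a frequently readmitted individual from being counted as several cases.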
This segmentation has direct implications for public health policy and hospital management, offering a deeper understanding of how complex cases are distributed within the healthcare system and providing a solid foundation for targeted, differentiated interventions. Patient profiling enables the early identification of individuals who require intensified monitoring, timely interventions, and prioritized resource allocation. Such an approach promotes more efficient use of hospital capacity and budgets, ultimately improving both system performance and patient outcomes.
In summary, patient profiling is an indispensable tool for advancing personalized medicine, optimizing healthcare resources, and enhancing the efficiency and sustainability of health services. Its benefits are evident both at the patient level, through improved clinical outcomes and quality of life, and at the system level, through cost reduction and better strategic planning of interventions.
References
- Mongkornwong, A., Chiranantasanee, S., & Chaiyawat, P. (2024). Hard-to-heal wounds. Advanced Gut & Microbiome BioTech.
- Olsson, M., Järbrink, K., Divakar, U., Bajpai, R., Upton, Z., Schmidtchen, A., & Car, J. (2019). The humanistic and economic burden of chronic wounds: A systematic review. Wound Repair and Regeneration, 27(1), 114–125.
- Tzavella, V., Salvador, V. G., Karanikola, M., Chatzimichael, D., Papathanassoglou, E., & Koulouvaris, P. (2023). Quality of life of patients with pressure ulcers: A systematic review. Journal of Personalized Medicine, 13(5), 791.
- Augustin, M., Blome, C., Heyer, K., Herberger, K., Rustenbach, S. J., Zschocke, I., & Protz, K. (2024). Risk factors for non-healing wounds—a single-centre study. Journal of Clinical Medicine, 13(4), 1003.
- World Health Organization. (2022, October 1). Ageing and health. https://www.who.int/news-room/fact-sheets/detail/ageing-and-health.
- Ahmad, Z., & Tupsy, M. S. (2025). The role of artificial intelligence in personalized medicine and predictive diagnostics – a narrative review. Insights-Journal of Health and Rehabilitation, 5(1). https://insightsjhr.com/index.php/home/article/view/510.
- Ennis, W. J., & Lee, C. (2020). Chronic wounds: Evaluation and management. American Family Physician, 101(3), 159–168. https://www.aafp.org/pubs/afp/issues/2020/0201/p159.html.
- World Health Organization. (2016). International statistical classification of diseases and related health problems (10th rev.). https://icd.who.int/browse10/2016/en.
- Beard, J. R., Officer, A., de Carvalho, I. A., Sadana, R., Pot, A. M., Michel, J. P., Lloyd-Sherlock, P., Epping-Jordan, J. E., Peeters, G. M. E. E. G., Mahanani, W. R., Thiyagarajan, J. A., & Chatterji, S. (2016). The World report on ageing and health: A policy framework for healthy ageing. The Lancet, 387(10033), 2145–2154.
- Kreft, D., Keiler, J., Grambow, E., Kischkel, S., Wree, A., & Doblhammer, G. (2020). Prevalence and mortality of venous leg diseases of the deep veins: An observational cohort study based on German health claims data. Angiology, 71(5), 452–464.
- Nussbaum, S. R., Carter, M. J., Fife, C. E., DaVanzo, J. E., Haught, R., Nusgart, M., & Cartwright, D. (2018). An economic evaluation of the impact, cost, and Medicare policy implications of chronic nonhealing wounds. Value in Health, 21(1), 27–32.
- Heyer, K., Herberger, K., Protz, K., Glaeske, G., & Augustin, M. (2016). Effectiveness of advanced versus conventional wound dressings on healing of chronic wounds: Systematic review and meta-analysis. Dermatology, 232(2), 154–164.
- Probst, S., Seppänen, S., Gerber, V., & Hopkins, A. (2023). Patient profiling for chronic wounds: Key predictors of healing and healthcare burden. International Wound Journal, 20(3), 927–936.
- Epstein, D., Espin, J., Boyers, D., McInnes, E., & Cullum, N. (2018). Cost-effectiveness of compression treatments for venous leg ulcers. Pharmacoeconomics, 36(1), 95–105.
- Guest, J. F., Ayoub, N., McIlwraith, T., Uchegbu, I., Gerrish, A., Weidlich, D., Vowden, K., & Vowden, P. (2017). Health economic burden that different wound types impose on the UK's National Health Service. BMJ Open, 7(12), Article e016616.
- Posnett, J., & Franks, P. J. (2008). The burden of chronic wounds in the UK. Nursing Times, 104(3), 44–45.
- Andrade, C. (2021). Z scores, standard scores, and composite test scores explained. Indian Journal of Psychological Medicine, 43(6), 555–557.
- Scikit-learn developers. (2024). Clustering. Scikit-learn 1.4.2 documentation. https://scikit-learn.org/stable/modules/clustering.html.
- Umargono, E., Suseno, J. E., & Gunawan, S. K. V. (2020). K-means clustering optimization using the elbow method and early centroid determination based on mean and median formula. Advances in Social Science, Education and Humanities Research, 431, 98–104.
- Mohr, D. L., Wilson, W. J., & Freund, R. J. (2022). Data and statistics. In D. L. Mohr, W. J. Wilson, & R. J. Freund (Eds.), Statistical methods (4th ed., pp. 1–64). Academic Press.
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133.
- McHugh, M. L. (2013). The chi-square test of independence. Biochemia Medica (Zagreb), 23(2), 143–149.
- Nachar, N. (2008). The Mann-Whitney U: A test for assessing whether two independent samples come from the same distribution. Tutorials in Quantitative Methods for Psychology, 4(1), 13–20.
- Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2(1), 21–33.
- Jockenhöfer, F., Gollnick, H., Herberger, K., Isbary, G., Renner, R., Stücker, M., Valesky, E., Vanscheidt, W., Komar, M., Dissemond, J., Gerber, V., & Augustin, M. (2016). Aetiology, comorbidities and cofactors of chronic leg ulcers: Retrospective evaluation of 1000 patients from 10 specialised dermatological wound care centres in Germany. International Wound Journal, 13(5), 821–828.
- Finlayson, K., Mioton, L., Edwards, H., Parker, C., & Gibb, M. (2019). Impact of comorbidities on healing in patients with chronic leg ulcers: A prospective observational study. Wound Practice and Research, 27(3), 142–149.
- Black, J. (2025). The other costs of pressure ulcers. Wounds International. https://woundsinternational.com/journal-articles/the-other-costs-of-pressure-ulcers/.
- Santema, T. B., Stoekenbroek, R. M., Koelemay, M. J., Reekers, J. A., van Dortmont, L., Ubbink, D. T., & Hinchliffe, R. J. (2019). Wound healing and quality of life in patients with diabetic foot ulcers: A multicentre randomized controlled trial comparing standard care with standard care plus local subatmospheric pressure therapy. Diabetes, Obesity and Metabolism, 21(1), 196–203.
- Game, F. L., Hinchliffe, R. J., Apelqvist, J., Armstrong, D. G., Bakker, K., Hartemann, A., Löndahl, M., Price, P. E., van Houtum, W., & Jeffcoate, W. J. (2016). A systematic review of interventions to enhance the healing of chronic ulcers of the foot in diabetes. Diabetes/Metabolism Research and Reviews, 32(S1), 154–168.
- Guest, J. F., Fuller, G. W., & Vowden, P. (2020). Cohort study evaluating the burden of wounds to the UK's National Health Service in 2017/2018: Update from 2012/2013. BMJ Open, 10(12), Article e045253.
- National Wound Care Strategy Programme, & Skills for Health. (2021). National Wound Care Core Capabilities Framework for England. https://www.skillsforhealth.org.uk/wp-content/uploads/2021/05/Wound-Care-Framework-2021.pdf.
- Popa, R., Mihai, C., & Ionescu, D. (2024). Chronic wound management in Romania: A survey on practices, protocols, and PRP efficacy. International Wound Journal.
- Oprea, V., Grad, O., Gheorghescu, D., & Moga, D. (2019). Transinguinal preperitoneal mesh plasty—An alternative or a dispensable technique? A prospective analysis vs. Lichtenstein repair for complex unilateral groin hernias. Chirurgia (Bucur), 114(1), 48–56.
- Domnariu, C. D., Ilies, A., & Furtunescu, F. L. (2013). Influence of family modelling on children’s healthy eating behaviour. Revista de Cercetare și Intervenție Socială, (41), 120–131.
- Furtunescu, F., Minca, D. G., Vasile, A., & Domnariu, C. (2009). Alcohol consumption impact on premature mortality in Romania. Romanian Journal of Legal Medicine, 17(4), 265–272. Retrieved May 7, 2025, from http://www.rjlm.ro/index.php/arhiv/113.
- Mihaila, R. G., Nedelcu, L., Fratila, O., Retzler, L., Domnariu, C., Cipaian, R. C., et al. (2011). Effects of simvastatin in patients with viral chronic hepatitis C. Hepatogastroenterology, 58(109), 1296–1300.
- Charlson, M. E., Carrozzino, D., Guidi, J., & Patierno, C. (2022). Charlson Comorbidity Index: A critical review of clinimetric properties. Psychotherapy and Psychosomatics, 91(1), 8–35.
- Mills, J. L., Sr., Conte, M. S., Armstrong, D. G., Pomposelli, F. B., Schanzer, A., Sidawy, A. N., & Andros, G. (2014). The Society for Vascular Surgery Lower Extremity Threatened Limb Classification System: Risk stratification based on Wound, Ischemia, and foot Infection (WIfI). Journal of Vascular Surgery, 59(1), 220–234.e2.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).