Submitted: 31 January 2026
Posted: 03 February 2026
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Research Design
2.2. Data Collection and Preprocessing
- Data cleaning: Missing values were imputed using the mean for continuous variables and the mode for categorical ones.
- Encoding: Categorical variables (employment type, education level) were converted into numerical form using one-hot encoding.
- Normalization: Continuous variables were standardized using the z-score method to ensure comparability across scales.
- Data splitting: The dataset was divided into training (80%) and test (20%) subsets.
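The preprocessing steps above can be sketched with scikit-learn as follows. This is a minimal illustration rather than the authors' code: the toy data, the column names (`income_usd`, `transport_usd`, `employment_type`, `education_level`), the placeholder label, and the `random_state` values are all assumptions. The sketch fits the imputers, encoder, and scaler on the training subset only, which is one common way to avoid information leaking from the test set.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; column names are illustrative, not the survey's.
rng = np.random.default_rng(0)
n = 50
df = pd.DataFrame({
    "income_usd": rng.uniform(30, 1500, n),
    "transport_usd": rng.uniform(0, 200, n),
    "employment_type": rng.choice(
        ["unemployed", "entrepreneur", "private", "public"], n),
    "education_level": rng.choice(["primary", "secondary", "university"], n),
})
df.loc[[3, 7], "income_usd"] = np.nan     # simulate missing continuous values
df.loc[[5], "employment_type"] = np.nan   # simulate a missing category
y = rng.integers(0, 2, n)                 # placeholder binary vulnerability label

numeric = ["income_usd", "transport_usd"]
categorical = ["employment_type", "education_level"]

preprocess = ColumnTransformer(
    [
        # Continuous variables: mean imputation, then z-score standardization.
        ("num", Pipeline([("impute", SimpleImputer(strategy="mean")),
                          ("scale", StandardScaler())]), numeric),
        # Categorical variables: mode imputation, then one-hot encoding.
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("onehot", OneHotEncoder(handle_unknown="ignore"))]),
         categorical),
    ],
    sparse_threshold=0,  # always return a dense array
)

# 80% training / 20% test split.
X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.2, random_state=42)

X_train_t = preprocess.fit_transform(X_train)  # fit on training data only
X_test_t = preprocess.transform(X_test)        # reuse the fitted statistics
print(X_train_t.shape, X_test_t.shape)
```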
2.3. Analysis Methods
1. Principal Component Analysis (PCA): PCA was applied to reduce data dimensionality and identify the main components explaining the variance [28]. The transformation is defined as $Z = XW$, where $Z$ represents the principal components, $X$ the original data matrix, and $W$ the weight matrix of the components.
2. K-Means Clustering: K-means was used to group households according to their socio-economic characteristics and their vulnerability. This algorithm minimizes the within-cluster sum of squares, $J = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2$, where $C_i$ denotes the clusters, $\mu_i$ are the centroids, and $k$ is the number of clusters [29].
3. Machine Learning Models:
- Logistic Regression: Estimates the probability of vulnerability given the explanatory variables. The model is defined by $P(Y = 1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \dots + \beta_p X_p)}}$, where $P(Y = 1 \mid X)$ is the probability of vulnerability, $X$ are the predictors, and $\beta_j$ are the logistic coefficients indicating each variable’s effect [30].
- Random Forests: An ensemble of decision trees that captures complex non-linear relationships among variables. The prediction is given by $\hat{f}(x) = \frac{1}{B} \sum_{b=1}^{B} T_b(x)$, where $T_b$ represents the $b$-th decision tree and $B$ is the total number of trees [24]. Variable importance was measured using the mean decrease in impurity (Gini or entropy).
- Support Vector Machines (SVM): SVM maximizes the margin between classes, using a kernel function where the classes are not linearly separable. The objective function is $\min_{w,\, \xi} \; \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i$, where $w$ is the weight vector, $\xi_i$ are the slack variables, and $C$ is a regularization parameter controlling the trade-off between margin maximization and error minimization. Kernel functions (linear, polynomial, or Gaussian/RBF) were used to handle non-linear separations.
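The analysis methods listed above can be illustrated with a short scikit-learn sketch. It is not the authors' implementation: the data are synthetic (`make_classification` with 385 rows, matching the total of the three cluster sizes reported later), and the choices of two principal components, three clusters, 100 trees, an RBF kernel with `C=1.0`, and all `random_state` values are assumptions made for illustration. The sketch also checks numerically that the projection equals the centered data times the weight matrix, that the within-cluster sum of squares matches the value K-means reports, and that the forest prediction is the average over its trees.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the standardized household matrix (assumption).
X, y = make_classification(n_samples=385, n_features=10, n_informative=5,
                           random_state=0)
X = StandardScaler().fit_transform(X)

# PCA: project the data onto the weight matrix W (columns = components).
pca = PCA(n_components=2)
Z = pca.fit_transform(X)
W = pca.components_.T
Z_manual = (X - pca.mean_) @ W  # same projection written out explicitly

# K-means on the component scores; recompute the within-cluster sum of squares.
k = 3
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Z)
wcss = sum(np.sum((Z[km.labels_ == i] - km.cluster_centers_[i]) ** 2)
           for i in range(k))

# Supervised models on an 80/20 split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", C=1.0),
}
accuracy = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
            for name, m in models.items()}

# Random forest prediction as the average over its B trees.
rf = models["Random Forest"]
tree_avg = np.mean([t.predict_proba(X_te) for t in rf.estimators_], axis=0)

# Mean decrease in impurity, normalized to sum to 1.
importances = rf.feature_importances_

print("explained variance ratio:", pca.explained_variance_ratio_)
print("WCSS:", wcss, "| KMeans inertia:", km.inertia_)
print("test accuracy:", accuracy)
```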
2.4. Model Evaluation
3. Results
3.1. Description of Variables
3.2. Economic Vulnerability and Budget Effort Rate


3.3. Principal Component Analysis (PCA) and Clustering for Household Classification in Bukavu
- Cluster 0 (282 households) groups the most vulnerable households, with a high budget effort rate (27.49%) and low income (USD 199.51). These households, often dependent on public transport or motorcycle taxis, face major economic difficulties.
- Cluster 1 (29 households) represents affluent households, with a very low budget effort rate (4.86%) and high income (USD 1,500), enjoying more flexible mobility and better living conditions.
- Cluster 2 (74 households) embodies an intermediate class, with moderate economic characteristics, reflecting a reality often neglected in studies on African cities.
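As a rough arithmetic check on these figures, the sketch below assumes that the budget effort rate is transport expenditure expressed as a percentage of household income. That definition is an assumption here, although it reproduces the Cluster 1 rate from the average transport expenses (USD 72.93) and income (USD 1,500) in the cluster summary table. The function name `budget_effort_rate` is hypothetical.

```python
def budget_effort_rate(transport_usd: float, income_usd: float) -> float:
    """Transport spending as a percentage of income (assumed definition)."""
    return 100.0 * transport_usd / income_usd


# Cluster 1: every figure is a cluster average taken from the summary table.
cluster1_rate = budget_effort_rate(72.93, 1500.00)

# Cluster 0: ratio of the cluster averages. This need not equal the reported
# 27.49%, since a mean of household-level ratios generally differs from the
# ratio of the means.
cluster0_ratio_of_means = budget_effort_rate(46.70, 199.51)

print(round(cluster1_rate, 2))
print(round(cluster0_ratio_of_means, 2))
```

The Cluster 1 value matches the reported 4.86%, while the Cluster 0 ratio of averages comes out lower than the reported 27.49%, which is what one would expect if the published rate is averaged household by household.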
3.4. Comparative Analysis of the Three Machine Learning Models
4. Model Evaluation
4.1. Confusion Matrices, ROC Curve, and Precision-Recall Curve
4.2. Learning Curves
4.3. Cross-Validation Results
5. Discussion
5.1. Structural Vulnerability and Financial Burden of Urban Mobility in Bukavu
5.2. Socio-Economic Segmentation and Cluster Analysis: Towards a Typology of Vulnerability
5.3. Methodological Contribution: Superiority of Machine Learning Models for Contextual Analysis
5.4. Determinant Factors: Beyond Income, Intersectionality of Vulnerabilities
5.5. Policy Implications
5.6. Limitations and Perspectives
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Cling, J.P.; Lagrée, S.; Razafindrakoto, M.; Roubaud, F., Eds. L’économie informelle dans les pays en développement; Number 6 in Conférences & Séminaires, Agence Française de Développement (AFD): Paris, France, 2012; p. 363. Sélection de communications présentées lors de la Conférence internationale sur l’économie informelle, Hanoï (Vietnam), mai 2010.
- Porter, G.; Abane, A.; Lucas, K. User diversity and mobility practices in Sub-Saharan African cities: Understanding the needs of vulnerable populations. The state of knowledge and research. SSATP Working Paper No. 108, Sub-Saharan Africa Transport Policy Program (SSATP), The World Bank, 2020.
- Dávila, J.D., Ed. Urban Mobility and Poverty: Lessons from Medellin and Soacha, Colombia; University College London & Facultad de Arquitectura, Universidad Nacional de Colombia, Medellín: London, UK, 2013; p. 214.
- Nicolas, J.P.; Vanco, F.; Verry, D. Mobilité quotidienne et vulnérabilité des ménages. Revue d’Économie Régionale et Urbaine, 1, 5–30.
- Kyprianou, I.; Serghides, D.; Carlucci, S. Urban vulnerability in the EMME region and sustainable development goals: A new conceptual framework. Sustainable Cities and Society, 80, 103763. [CrossRef]
- Santos, T.; Fernandes, V.A.; Cardoso, M.; da Silva, M. Resilience and Vulnerability of Urban Mobility Systems in Developing Countries: A Case Study of Rio De Janeiro’s Transportation Fare Policy. [CrossRef]
- Banque mondiale. Urban Transport and Poverty in Sub-Saharan Africa: A Policy Brief. Technical report, Banque mondiale, 2015. Accessed: 2025-10-15.
- Reis, E.C.G.d.; Véras, M.P.B. Social inequalities, territories of vulnerability, and urban mobility. Cadernos Metrópole, 26, 537–560. [CrossRef]
- Phippard, T. Urban fractures: mobility, risk and the accidenté in Kikwit, Democratic Republic of Congo. Africa 2023, 93, 140–158. [CrossRef]
- Wilson Janssens, M.C. Spatial mobility and social becoming: The journeys of four Central African Students in Congo-Kinshasa. Geoforum, 116, 252–261. [CrossRef]
- Diop, D.; Timera, A.S. Diamniadio : naissance d’une nouvelle ville : enjeux et défis d’une gouvernance durable; Éditions L’Harmattan: Sénégal, 2018; p. 228. Publié le 29 mars 2018.
- Ansah, E.; Amoadu, M.; Obeng, P.; Sarfo, J.O. Climate change, urban vulnerabilities and adaptation in Africa: a scoping review. Climatic Change, 177. [CrossRef]
- Ngomba Yashele, K.; Nsombo Mosombo, B. Perception paysanne des impacts de la variabilité climatique autour de la station de l’INERA/Kipopo dans la province du Katanga en République Démocratique Congo. VertigO - la revue électronique en sciences de l’environnement 2017, 17(3). [CrossRef]
- Ndayiragije, R.; Alidou, S.; Ansoms, A., Eds. Conjonctures de l’Afrique centrale 2022; Vol. 98, Cahiers africains, L’Harmattan, 2022; p. 310. Sélection de textes en lien avec la situation politique, économique et sociale de l’année 2021 en Afrique centrale.
- Shamamba, D.B.; Ansoms, A.; Basengere, E.B.; Lebailly, P. L’agriculture familiale à l’épreuve de la concurrence foncière au Sud-Kivu. Conjonctures de l’Afrique centrale, Cahiers africains 2022, 97, 293–312. Étude de conjoncture / Cahiers africains.
- Muzalia, G.; Mukungilwa, B.; Bisimwa, S.; Hoffmann, K.; Nalunva, A.; Batumike, E.; Mapatano, J.; Dunia, O.; Cirhuza, E.; Muderhwa, V. Roadblocks ‘at the rhythm of the country’: Predation and beyond in South Kivu, Democratic Republic of Congo. Technical report, Governance in Conflict Network, 2021. Accessed: 2025-10-15.
- Munyaka, J.C.B.; Yadavalli, V.S.S. Using transportation problem in humanitarian supply chain to prepositioned facility locations: a case study in the Democratic Republic of the Congo. International Journal of System Assurance Engineering and Management, 12, 199–216. [CrossRef]
- Tristan, A. Sur le taux d’effort budgétaire et la vulnérabilité des ménages pour leur mobilité quotidienne : cas des ménages de la commune d’Ibanda dans la ville de Bukavu.
- Bucekuderhwa, C. Technology adoption in South Kivu province subsistence farming of DRC. Bukavu Journal of Economics and Social Sciences (BJESS), Numéro 3, 32–76.
- Büttner, B.; Wulfhorst, G.; Ji, C.; Crozet, Y.; Mercier, A.; Ovtracht, N. The Impact of Sharp Increases in Mobility Costs Analysed by Means of the Vulnerability Assessment. In Proceedings of the 13th World Conference on Transport Research (WCTR), Rio de Janeiro, Brazil, 2013; pp. 1–15. Accessed: 2025-10-15.
- Diaz Olvera, L.; Plat, D.; Pochet, P. The puzzle of mobility and access to the city in Sub-Saharan Africa. Journal of Transport Geography, 32, 56–64. [CrossRef]
- Athey, S. The Impact of Machine Learning on Economics. In The Economics of Artificial Intelligence: An Agenda; University of Chicago Press, 2018; pp. 507–547.
- Ludwig, J.; Mullainathan, S.; Spiess, J. Machine-Learning Tests for Effects on Multiple Outcomes, [1707.01473 [stat]]. [CrossRef]
- Breiman, L. Random Forests. Machine Learning, 45, 5–32. [CrossRef]
- Omar, E.D.; Mat, H.; Abd Karim, A.Z.; Sanaudi, R.; Ibrahim, F.H.; Omar, M.A.; Ismail, M.Z.H.; Jayaraj, V.J.; Goh, B.L. Comparative Analysis of Logistic Regression, Gradient Boosted Trees, SVM, and Random Forest Algorithms for Prediction of Acute Kidney Injury Requiring Dialysis After Cardiac Surgery. International Journal of Nephrology and Renovascular Disease, 17, 197–204. Available online: https://www.tandfonline.com/doi/pdf/10.2147/IJNRD.S461028. [CrossRef]
- Salazar, D.A.; Vélez, J.I.; Salazar, J.C. Comparison between SVM and Logistic Regression: Which One is Better to Discriminate? Revista Colombiana de Estadística.
- Kurakin, A.; Goodfellow, I.; Bengio, S. Adversarial Machine Learning at Scale, [1611.01236 [cs]]. [CrossRef]
- Jolliffe, I.T.; Cadima, J. Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374, 20150202. [CrossRef]
- MacQueen, J. Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, 1967, Vol. 1, pp. 281–297.
- Fagerland, M.W.; Hosmer, D.W. A Generalized Hosmer–Lemeshow Goodness-of-Fit Test for Multinomial Logistic Regression Models. The Stata Journal, 12, 447–453. [CrossRef]
- Fawcett, T. An introduction to ROC analysis. Pattern Recognition Letters, 27, 861–874. [CrossRef]
- Hawkins, D.M. The Problem of Overfitting. Journal of Chemical Information and Computer Sciences 2004, 44, 1–12.
- Arlot, S. Cross-validation, [1703.03167 [math]]. [CrossRef]
- Cervero, R. Linking urban transport and land use in developing countries. Journal of Transport and Land Use, 6, 7–24. Publisher: Journal of Transport and Land Use.
- Hornby, T.G.; Henderson, C.E.; Plawecki, A.; Lucas, E.; Lotter, J.; Holthus, M.; Brazg, G.; Fahey, M.; Woodward, J.; Ardestani, M.; et al. Contributions of Stepping Intensity and Variability to Mobility in Individuals Poststroke. Stroke, 50, 2492–2499. [CrossRef]
- Pavageau, C.; Locatelli, B.; Tiani, A.M.; Zida, M. Cartographier la vulnérabilité aux variations climatiques : une méta-analyse en Afrique. Technical report, Center for International Forestry Research (CIFOR), 2013.
- Hauslbauer, A.L.; Schade, J.; Petzoldt, T. The identification of mobility types on a national level. Transport Policy, 125, 289–298. [CrossRef]
- Alhassan, T.F.; Ansah, E.O.; Niyazbekova, S.U.; Blokhina, T.K. The impact of foreign investment in financing sustainable development in Sub-Saharan African countries. Russian Journal of Economics 2024, 10, 60–83. [CrossRef]
- Adelekan, I.O.; Asiyanbi, A.P. Flood risk perception in flood-affected communities in Lagos, Nigeria. Natural Hazards, 80, 445–469. [CrossRef]
- Nkoa, B.E.O.; Song, J.S. Urbanisation et inégalités en Afrique : une étude à partir des indices désagrégés. Revue d’Économie Régionale et Urbaine 2019, pp. 447–484. [CrossRef]
- Oviedo, D.C.; Pinzón, M.S.; Rodríguez-Araña, S.; Tratner, A.E.; Pauli-Quirós, E.; Chavarría, C.; Posada Rodríguez, C.; Britton, G.B. Psychosocial response to the COVID-19 pandemic in Panama. Frontiers in Public Health, 10, 919818. [CrossRef]
- Bernard, S.; Bissonnette, J.F. Les politiques agricoles de l’Indonésie et de la Malaisie face aux impératifs de la sécurité alimentaire. VertigO : la revue électronique en sciences de l’environnement, 14. Publisher: Université du Québec à Montréal.
- Soja, E.W.; Dufaux, F.; Gervais-Lambony, P.; Buire, C.; Desbois, H. La justice spatiale et le droit à la ville : un entretien avec Edward Soja. Justice spatiale = Spatial Justice 2011. Publisher: Université Paris Ouest Nanterre La Défense, UMR LAVUE 7218, Laboratoire Mosaïques.
- Mullainathan, S.; Spiess, J. Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, 31, 87–106. [CrossRef]
- Alam, A.; Khalil, M.B. Gender, (im)mobility and social relations shaping vulnerabilities in coastal Bangladesh. International Journal of Disaster Risk Reduction, 82, 103342. [CrossRef]
- Satterthwaite, D. The Under-estimation of Urban Poverty in Low and Middle-income Nations; IIED, 2004. Google Books ID: 7Gnnm_FTG_gC.





| Variable | Mean | Std. Dev. | Min | 25% | 50% | 75% | Max |
|---|---|---|---|---|---|---|---|
| Transport (USD) | 51.37 | 25.46 | 0.00 | 35.00 | 50.00 | 60.00 | 200.00 |
| Income (USD) | 402.08 | 371.83 | 30.00 | 150.00 | 260.00 | 500.00 | 1500.00 |
| Rent (USD) | 133.80 | 96.09 | 0.00 | 70.00 | 100.00 | 200.00 | 700.00 |
| Age (years) | 39.42 | 15.26 | 24.00 | 24.00 | 45.50 | 45.50 | 89.00 |
| Car Ownership | 0.436 | 1.28 | 0.00 | 0.00 | 0.00 | 0.00 | 4.00 |
| Mobility Rate | 11.42 | 8.56 | 0.00 | 7.00 | 8.00 | 14.00 | 47.00 |
| Budget Effort Rate | 23.02 | 12.27 | 0.00 | 10.00 | 16.67 | 23.52 | 50.00 |

| Cluster | Size (households) | Budget Effort Rate (%) | Avg. Income (USD) | Avg. Transport Expenses (USD) | Vehicles per Household |
|---|---|---|---|---|---|
| 0 | 282 | 27.49 | 199.51 | 46.70 | 0.66 |
| 1 | 29 | 4.86 | 1500.00 | 72.93 | 2.62 |
| 2 | 74 | 9.82 | 660.54 | 63.14 | 2.89 |

| Model | Precision | Recall | F1-Score | AUC-ROC | Accuracy | Time (s) |
|---|---|---|---|---|---|---|
| Logistic Regression | 0.96 | 1.00 | 0.98 | 0.999 | 0.987 | 0.021 |
| SVM | 0.96 | 1.00 | 0.98 | 1.00 | 0.987 | 0.030 |
| Random Forest | 1.00 | 0.90 | 0.95 | 0.997 | 0.974 | 0.185 |
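
The F1-scores in this table can be cross-checked against the reported precision and recall, since F1 is their harmonic mean, $F_1 = 2PR / (P + R)$. The short sketch below does only that arithmetic on the rounded table values. The helper name `f1_from_precision_recall` is hypothetical, and small discrepancies would be expected because the inputs are already rounded to two decimals.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Rounded (precision, recall, reported F1) values taken from the table.
reported = {
    "Logistic Regression": (0.96, 1.00, 0.98),
    "SVM": (0.96, 1.00, 0.98),
    "Random Forest": (1.00, 0.90, 0.95),
}

recomputed = {
    name: round(f1_from_precision_recall(p, r), 2)
    for name, (p, r, _) in reported.items()
}

for name, (_, _, f1_reported) in reported.items():
    print(f"{name}: reported {f1_reported:.2f}, recomputed {recomputed[name]:.2f}")
```
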

| Model | Mean Accuracy | Standard Deviation |
|---|---|---|
| Random Forest | 0.9455 | 0.0727 |
| SVM | 0.9247 | 0.0654 |
| Logistic Regression | 0.9636 | 0.0397 |
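
A mean accuracy and standard deviation of this kind are typically obtained by k-fold cross-validation. The sketch below shows one way to compute them with scikit-learn. It is illustrative only: the data are synthetic, and the number of folds (10), the stratified shuffling, and the `random_state` values are assumptions, since the fold configuration is not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the household data (assumption).
X, y = make_classification(n_samples=385, n_features=10, n_informative=5,
                           random_state=0)

# Scaling lives inside the pipeline so it is re-fitted on each training fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hypothetical fold setup: 10 stratified, shuffled folds.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

mean_acc, std_acc = scores.mean(), scores.std()
print(f"mean accuracy = {mean_acc:.4f}, standard deviation = {std_acc:.4f}")
```
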

| Variable | Logistic Regression Coefficient | SVM Coefficient | Random Forest Importance |
|---|---|---|---|
| Transport Expenditure | 2.347926 | 2.026481 | 0.175443 |
| Household Income | -5.478468 | -5.564908 | 0.314423 |
| Housing Rent | -0.025524 | 0.053763 | 0.165803 |
| Age of Household Head | -0.510877 | -0.421659 | 0.040898 |
| Occupation Status | -0.632006 | -0.314694 | 0.023974 |
| Education Level | -0.797172 | -0.335017 | 0.055182 |
| Gender of Household Head | 0.598190 | 0.276443 | 0.011202 |
| Employment: Unemployed | -0.023989 | -0.083704 | 0.001989 |
| Employment: Entrepreneur | 0.172913 | 0.091994 | 0.030739 |
| Employment: Private Employee | -0.677285 | -0.365117 | 0.010238 |
| Employment: Public Employee | 0.172913 | 0.091994 | 0.021575 |
| Number of Dependents | 0.573524 | 0.415332 | 0.056644 |
| Car Ownership | -0.379830 | 0.431540 | 0.018048 |
| Mobility Rate | 0.059943 | -0.002723 | 0.037358 |
| Residence: Bagira | 1.037223 | 0.903137 | 0.019356 |
| Residence: Ibanda | -0.416353 | -0.451564 | 0.008131 |
| Residence: Kadutu | -0.625747 | -0.451573 | 0.008996 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).