Submitted: 30 May 2025
Posted: 06 June 2025
Abstract
Keywords:
1. Introduction
2. Methods
2.1. Overview of Count Data Models
2.1.1. Robust Zero-Inflated Models
2.1.2. Robust Hurdle Models
2.2. Simulation Study
2.3. Model Comparison
3. Results
3.1. Simulation Study Findings
3.1.1. Initial Assessment of the Simulation Data
3.1.2. AIC Comparison Across Regression Models
3.1.3. Performance under Low Outlier Levels and Increasing Dispersion
3.1.4. Performance under Moderate Outlier Levels and Increasing Dispersion
3.1.5. Influence of Sample Size on Model Performance
3.2. Real Data Application
3.3. Description of the Study Sample
3.3.1. Regression Diagnostics: Multicollinearity Test
3.4. Comparison of the Fitted Count Data Models
3.4.1. Model Evaluation
3.4.2. Vuong Test
3.4.3. Model Comparison Using AIC and BIC
3.5. Robust Count Regression Estimation Results
3.6. The RZIP Model Validation
3.7. Overview of Application Results
3.7.1. Model Formulation of the Study
4. Conclusion
Author Contributions
Acknowledgments
Appendix A
Appendix A.1
| Sample Size | Poisson | NB | RZIP | RZINB | RHP | RHNB |
|---|---|---|---|---|---|---|
| 50 | 266.960 | 153.870 | 268.480 | 199.260 | 268.020 | 199.130 |
| 200 | 589.190 | 461.600 | 489.710 | 470.840 | 488.640 | 469.150 |
| 500 | 1896.760 | 1389.060 | 1650.340 | 1401.300 | 1649.590 | 1400.290 |
| Parameter | RHP Estimate (SE) | RHP P-Value | RHNB Estimate (SE) | RHNB P-Value |
|---|---|---|---|---|
| Count Model | | | | |
| Intercept | -0.8571 (0.1514) | <0.0001 | -1.0239 (0.3020) | <0.0001 |
| BreachDelivery | 0.1159 (0.0496) | 0.0195 | 0.1467 (0.1250) | 0.2404 |
| AssistedDeliveries | -0.0038 (0.0200) | 0.8504 | 0.0072 (0.0315) | 0.8193 |
| EarlyTeenPreg | 0.0568 (0.0250) | 0.0232 | 0.1514 (0.0215) | <0.0001 |
| LateTeenPreg | 0.0219 (0.0150) | 0.1443 | 0.0116 (0.0346) | 0.7376 |
| Zero Model | | | | |
| Intercept | -2.8706 (1.1909) | 0.0159 | -2.8867 (1.2799) | 0.0241 |
| BreachDelivery | -0.1578 (0.4464) | 0.7238 | -0.08897 (0.5621) | 0.8742 |
| AssistedDeliveries | -16.7361 (15.3676) | 0.2761 | -0.01931 (0.0747) | 0.7959 |
| EarlyTeenPreg | 1.0192 (0.7228) | 0.1585 | 0.2751 (1.3401) | 0.8373 |
| LateTeenPreg | 0.1195 (0.0464) | 0.0100 | 0.0700 (0.0699) | 0.3171 |
| Parameter | RHP Estimate (SE) | RHP P-Value | RHNB Estimate (SE) | RHNB P-Value |
|---|---|---|---|---|
| Count Model | | | | |
| Intercept | -0.8527 (0.2694) | 0.0016 | -0.8753 (0.3306) | 0.0081 |
| BreachDelivery | 0.1357 (0.0462) | 0.0033 | 0.1476 (0.1172) | 0.2080 |
| AssistedDeliveries | 0.0046 (0.0242) | 0.8491 | 0.0047 (0.0256) | 0.8534 |
| EarlyTeenPreg | 0.0539 (0.0273) | 0.0484 | 0.0547 (0.0320) | 0.0874 |
| LateTeenPreg | 0.0133 (0.0210) | 0.5263 | 0.0126 (0.0226) | 0.5770 |
| Zero Model | | | | |
| Intercept | -0.5983 (0.1826) | 0.0011 | -0.4389 (0.2206) | 0.0421 |
| BreachDelivery | 0.1003 (0.1335) | 0.4529 | 0.0513 (0.0235) | 0.0582 |
| AssistedDeliveries | 0.0074 (0.0342) | 0.8275 | 0.0654 (0.0412) | 0.05827 |
| EarlyTeenPreg | -0.0943 (0.0507) | 0.0627 | -0.0853 (0.0761) | 0.7681 |
| LateTeenPreg | -0.0226 (0.0149) | 0.1308 | -0.0167 (0.0149) | 0.8135 |
References
- Shahsavari, S. Robust Inference for Zero-Inflated Models with Outliers Applied to the Number of Involved Lymph Nodes in Patients with Breast Cancer. 2023.
- Feng, C.X. A comparison of zero-inflated and hurdle models for modeling zero-inflated count data. Journal of Statistical Distributions and Applications 2021, 8, 8.
- Lambert, D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 1992, 34, 1–14.
- Min, Y.; Agresti, A. Random effect models for repeated measures of zero-inflated count data. Statistical Modelling 2005, 5, 1–19.
- Hall, D.B. Robust estimation for zero-inflated Poisson regression. Scandinavian Journal of Statistics 2010, 37, 237–252.
- Mullahy, J. Specification and testing of some modified count data models. Journal of Econometrics 1986, 33, 341–365.
- Tüzen, F. A simulation study for count data models under varying degrees of outliers and zeros. Communications in Statistics - Simulation and Computation 2018, 49, 1078–1088.
- Tawiah, K.; Iddi, S.; Lotsi, A. On Zero-Inflated Hierarchical Poisson Models with Application to Maternal Mortality Data. International Journal of Mathematics and Mathematical Sciences 2020, 2020, 1407320.
- Bassey, U.E.; Akinyemi, M.I.; Njoku, K.F. On Zero inflated models with applications to maternal healthcare utilization. International Journal of Mathematical Sciences and Optimization: Theory and Applications 2021, 7, 65–75.
- Abonazel, M.R.; El-sayed, S.M.; Saber, O.M. Performance of robust count regression estimators in the case of overdispersion, zero inflated, and outliers: simulation study and application to German health data. Commun. Math. Biol. Neurosci. 2021, 2021, Article-ID.
- Adehi, M.; Yakasai, A.; Dikko, H.; Asiribo, E.; Dahiru, T. Risk of maternal mortality using relative risk ratios obtained from Poisson regression analysis. International Journal of Development Research 2017, 7, 15405–15409.
- Okello, S.; Otieno Omondi, E.; Odhiambo, C.O. Improving performance of hurdle models using rare-event weighted logistic regression: an application to maternal mortality data. Royal Society Open Science 2023, 10, 221226.
- Chau, A.M.H.; Lo, E.C.M.; Wong, M.C.M.; Chu, C.H. Interpreting Poisson regression models in dental caries studies. Caries Research 2018, 52, 339–345.
- Shahsavari, S.; Moghimbeigi, A.; Kalhor, R.; Jafari, A.M.; Bagherpour-kalo, M.; Yaseri, M.; Hosseini, M. Zero-Inflated Count Regression Models in Solving Challenges Posed by Outlier-Prone Data; an Application to Length of Hospital Stay. Archives of Academic Emergency Medicine 2024, 12, e13.
- Huber, P.J. Robust estimation of a location parameter. In Breakthroughs in Statistics: Methodology and Distribution; Springer, 1992; pp. 492–518.
- Cantoni, E.; Zedini, A. A robust version of the hurdle model. Journal of Statistical Planning and Inference 2011, 141, 1214–1223.
- Miranda, M.; Miranda, M.C.; Gomes, M.I. A robust hurdle Poisson model in the estimation of the extremal index. Springer Proceedings in Mathematics & Statistics 2022, pp. 15–28.
- Hamura, Y.; Irie, K.; Sugasawa, S. Robust hierarchical modeling of counts under zero-inflation and outliers. arXiv preprint arXiv:2106.10503 2021.
- Akaike, H. A new look at the statistical model identification. IEEE Transactions on Automatic Control 1974, 19, 716–723.
- Vuong, Q. Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica 1989, 57, 307.



| Factor | Levels |
|---|---|
| Sample Size | 50, 200, 500 |
| Prop of Zeros | 0.5, 0.7, 0.8 |
| Dispersion | 1, 3, 5 |
| Outliers | 0.00, 0.05, 0.10, 0.15 |
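
The design levels above can be turned into simulated data in many ways. The following is a minimal sketch of one possibility, not the authors' generator. It assumes a full factorial grid, a negative binomial count process with variance `mu + alpha * mu**2` (treating the dispersion level as `alpha`), structural zeros added with probability `p_zero`, and outliers created by shifting a fixed share of observations upward. The function name `simulate_counts` and the values `mu=3.0` and `outlier_shift=30` are hypothetical choices made only for illustration.

```python
import itertools
import numpy as np

def simulate_counts(n, p_zero, alpha, p_outlier, mu=3.0, outlier_shift=30, seed=0):
    """Zero-inflated negative binomial counts with a contaminated subset (illustrative)."""
    rng = np.random.default_rng(seed)
    # Gamma-Poisson mixture: mean mu, variance mu + alpha * mu**2 (assumed parameterisation).
    lam = rng.gamma(shape=1.0 / alpha, scale=alpha * mu, size=n)
    y = rng.poisson(lam)
    # Structural zeros: each observation is forced to zero with probability p_zero.
    y[rng.random(n) < p_zero] = 0
    # Contamination: shift a fixed share of observations upward (assumed outlier mechanism).
    n_out = int(round(p_outlier * n))
    idx = rng.choice(n, size=n_out, replace=False)
    y[idx] += outlier_shift
    return y, idx

# Full factorial grid over the tabulated levels (assumed; the source lists levels only).
grid = list(itertools.product([50, 200, 500], [0.5, 0.7, 0.8], [1, 3, 5],
                              [0.00, 0.05, 0.10, 0.15]))

n, p_zero, alpha, p_out = grid[0]
y, idx = simulate_counts(n, p_zero, alpha, p_out)
print(f"{len(grid)} scenarios; first scenario share of zeros = {np.mean(y == 0):.2f}")
```

Because zeros arise both structurally and from the count process itself, the observed share of zeros in a sample will generally exceed `p_zero`.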
| | Assisted | Breach | EarlyTeen | LateTeen | MaternalDeaths |
|---|---|---|---|---|---|
| AssistedDeliveries | 1.000 | 0.001 | 0.064 | 0.199 | 0.035 |
| BreachDelivery | 0.001 | 1.000 | 0.264 | 0.157 | 0.734 |
| EarlyTeenPreg | 0.064 | 0.264 | 1.000 | 0.235 | 0.388 |
| LateTeenPreg | 0.199 | 0.157 | 0.235 | 1.000 | 0.222 |
| MaternalDeaths | 0.035 | 0.434 | 0.388 | 0.222 | 1.000 |
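
A common way to summarise multicollinearity from a correlation matrix is the variance inflation factor (VIF), which for standardised predictors equals the corresponding diagonal element of the inverse predictor correlation matrix. The sketch below applies this to the four predictor rows of the table above. Whether the study used VIFs or another diagnostic is not stated here, so this is an illustration only.

```python
import numpy as np

predictors = ["AssistedDeliveries", "BreachDelivery", "EarlyTeenPreg", "LateTeenPreg"]

# Predictor-by-predictor block of the correlation table (outcome column excluded).
R = np.array([
    [1.000, 0.001, 0.064, 0.199],
    [0.001, 1.000, 0.264, 0.157],
    [0.064, 0.264, 1.000, 0.235],
    [0.199, 0.157, 0.235, 1.000],
])

# VIF_j is the j-th diagonal element of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(R))

for name, value in zip(predictors, vif):
    print(f"{name}: VIF = {value:.3f}")
```

Values close to 1 indicate that a predictor is nearly uncorrelated with the others; rules of thumb usually flag values above 5 or 10.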
| Factors | No death reported (%) | Death reported (%) |
|---|---|---|
| Assisted Delivery | 28.79 | 71.21 |
| Breech Delivery | 52.69 | 47.31 |
| Early Teen Pregnancy | 25.42 | 74.58 |
| Late Teen Pregnancy | 50.88 | 49.12 |
| Model | RZIP | RZINB | RHP | RHNB | POIS | NB | Best Model |
|---|---|---|---|---|---|---|---|
| RZIP | | 0.004*** | 0.002*** | 0.003*** | 0.002*** | 0.000*** | RZIP |
| RZINB | | | 0.012** | 0.011** | 0.000*** | 0.000*** | RZINB |
| RHP | | | | 0.011** | 0.003*** | 0.002*** | RHP |
| RHNB | | | | | 0.003*** | 0.003*** | RHNB |
| POIS | | | | | | 0.003*** | NB |
| NB | | | | | | | |
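
The pairwise entries above appear to be Vuong test p-values. As a rough illustration of how such a value is obtained, the sketch below computes the unadjusted Vuong statistic from per-observation log-likelihoods of two competing models. The function name `vuong_test` and the synthetic Poisson example are assumptions for illustration; the fitted models and data of the study are not reproduced here.

```python
import math
import numpy as np

def vuong_test(loglik1, loglik2):
    """Unadjusted Vuong z statistic and two-sided normal p-value.

    loglik1, loglik2: per-observation log-likelihoods of model 1 and model 2.
    A positive z favours model 1, a negative z favours model 2.
    """
    m = np.asarray(loglik1, dtype=float) - np.asarray(loglik2, dtype=float)
    n = m.size
    z = math.sqrt(n) * m.mean() / m.std(ddof=1)  # sample standard deviation
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

def poisson_logpmf(y, lam):
    """Log of the Poisson probability mass function, element by element."""
    return np.array([k * math.log(lam) - lam - math.lgamma(k + 1) for k in y])

# Synthetic check: data drawn with rate 2, compared under rate 2 versus rate 4.
rng = np.random.default_rng(1)
y = rng.poisson(2.0, size=300)
z, p = vuong_test(poisson_logpmf(y, 2.0), poisson_logpmf(y, 4.0))
print(f"z = {z:.2f}, p = {p:.4g}")
```

Published implementations often add a correction for the difference in the number of parameters; that adjustment is omitted in this sketch.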
| Model | AIC | BIC |
|---|---|---|
| Poisson | 660.2816 | 677.2950 |
| NB | 554.6014 | 575.0174 |
| RZIP | 366.8992 | 400.9260 |
| RZINB | 395.2032 | 432.6327 |
| RHP | 372.6716 | 406.6984 |
| RHNB | 374.6517 | 412.0811 |
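
The ranking implied by the table can be read off by expressing each criterion as a difference from the smallest value. The short sketch below does this with the tabulated AIC and BIC values; the variable names are illustrative.

```python
# AIC and BIC values copied from the table above.
criteria = {
    "Poisson": (660.2816, 677.2950),
    "NB": (554.6014, 575.0174),
    "RZIP": (366.8992, 400.9260),
    "RZINB": (395.2032, 432.6327),
    "RHP": (372.6716, 406.6984),
    "RHNB": (374.6517, 412.0811),
}

best_aic = min(criteria, key=lambda m: criteria[m][0])
best_bic = min(criteria, key=lambda m: criteria[m][1])

# Differences relative to the best model under each criterion, sorted by AIC.
for model, (aic, bic) in sorted(criteria.items(), key=lambda kv: kv[1][0]):
    d_aic = aic - criteria[best_aic][0]
    d_bic = bic - criteria[best_bic][1]
    print(f"{model:8s} dAIC = {d_aic:8.2f}   dBIC = {d_bic:8.2f}")
```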
| Parameter | RZIP Count Model Estimate (SE) | RZIP Count Model P-Value | RZIP Zero Model Estimate (SE) | RZIP Zero Model P-Value |
|---|---|---|---|---|
| Intercept | -0.8571 (0.1514) | <0.0001 | -2.8706 (1.1909) | 0.0159 |
| BreachDelivery | 0.1159 (0.0496) | 0.0195 | -0.1578 (0.4464) | 0.7238 |
| AssistedDeliveries | -0.0038 (0.0200) | 0.8504 | -16.7361 (15.3676) | 0.2761 |
| EarlyTeenPreg | 0.0568 (0.0250) | 0.0232 | 1.0192 (0.7228) | 0.1585 |
| LateTeenPreg | 0.0219 (0.0150) | 0.1443 | 0.1195 (0.0464) | 0.0100 |
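
The reported p-values are consistent with two-sided Wald z-tests (estimate divided by standard error). The sketch below reproduces that calculation for the count-model coefficients and, assuming a log link for the count part, converts each estimate into a rate ratio with an approximate 95% interval. The helper `wald_summary` is a hypothetical name introduced for this illustration.

```python
import math

def wald_summary(estimate, se):
    """Wald z, two-sided normal p-value, and exp(estimate) with a 95% interval."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    lower = math.exp(estimate - 1.96 * se)
    upper = math.exp(estimate + 1.96 * se)
    return {"z": z, "p": p, "ratio": math.exp(estimate), "ci": (lower, upper)}

# Count-model estimates and standard errors from the table above.
rows = {
    "BreachDelivery": (0.1159, 0.0496),
    "AssistedDeliveries": (-0.0038, 0.0200),
    "EarlyTeenPreg": (0.0568, 0.0250),
    "LateTeenPreg": (0.0219, 0.0150),
}

for name, (est, se) in rows.items():
    s = wald_summary(est, se)
    print(f"{name:20s} z = {s['z']:6.3f}  p = {s['p']:.4f}  "
          f"ratio = {s['ratio']:.3f}  95% CI = ({s['ci'][0]:.3f}, {s['ci'][1]:.3f})")
```

Under the log-link assumption, a ratio above 1 means the expected count rises with the predictor, holding the others fixed.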
| Statistic | Value |
|---|---|
| Chi-Square Statistic | 20809.280 |
| Degrees of Freedom | 6.000 |
| P-value | 0.198 |
| Comparison | df1 | df2 | LogLik1 | LogLik2 | Chisq | p-value |
|---|---|---|---|---|---|---|
| RZIP vs RZINB | 10 | 11 | -173.45 | -186.60 | 26.304 | 0.09 |
| RZIP vs RHP | 10 | 10 | -173.45 | -176.34 | 5.7724 | 0.06 |
| RZIP vs RHNB | 10 | 11 | -173.45 | -176.33 | 5.7525 | 0.47 |
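
The Chisq column and the AIC values reported for these models both follow from the log-likelihoods and parameter counts in the table. The sketch below recomputes them using the standard definitions, AIC = 2k - 2 log L and a likelihood-ratio statistic of twice the log-likelihood difference. Small discrepancies from the published figures are expected because the tabulated log-likelihoods are rounded to two decimals.

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik

def lr_statistic(loglik_a, loglik_b):
    """Twice the difference in log-likelihood between model a and model b."""
    return 2 * (loglik_a - loglik_b)

# (log-likelihood, number of parameters) as listed in the table above.
fits = {
    "RZIP": (-173.45, 10),
    "RZINB": (-186.60, 11),
    "RHP": (-176.34, 10),
    "RHNB": (-176.33, 11),
}

for name, (ll, k) in fits.items():
    print(f"{name:6s} AIC = {aic(ll, k):.2f}")

for other in ("RZINB", "RHP", "RHNB"):
    stat = lr_statistic(fits["RZIP"][0], fits[other][0])
    print(f"RZIP vs {other}: statistic = {stat:.2f}")
```

The sketch stops at the statistic itself; a chi-square reference distribution for the p-value is only appropriate when one model is nested in the other.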
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).








