Submitted:
12 February 2025
Posted:
13 February 2025
Abstract
This technical note investigates a p-value paradox that emerges in the conventional proportion test. The paradox is defined as the phenomenon where “decisions made on the same effect size from data of different sample sizes may be inconsistent.” It is illustrated with two examples from clinical trial research. We argue that this p-value paradox stems from the use (or misuse) of p-values to compare two proportions and make decisions. We propose replacing the conventional proportion test and its p-value with estimation statistics that include both the observed effect size and a reliability measure known as the signal content index (SCI).
Keywords:
1. Introduction
2. Two Examples of the p-Value Paradox
Suppose a clinician wanted to test whether the prevalence of a disease was 10%. To do so, the clinician selected a sample of 10 individuals, found that two of the 10 had the disease, and used evidence from the sample (a 20% sample incidence rate) to make inferences about the population prevalence. With p = 0.26, the null hypothesis was not rejected.…, suppose we increased the sample size from 10 to 50, of which 10 had the disease (the sample incidence rate remained at 20%). This yielded a p-value of 0.02. Although the new sample had the same 20% incidence rate, the null hypothesis was now rejected at a significance level of 0.05. The test, however, would still fail to reject the null hypothesis at a significance level of 0.005. Now consider an even larger sample of 100, of which 20 had the disease (the sample incidence rate again remained at 20%). Here the p-value was 0.002, so the null hypothesis was rejected even at the 0.005 level. The same observed effect size thus led to three different decisions, depending only on the sample size.
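The three p-values in this example can be checked with a short calculation. The sketch below assumes a one-sided exact binomial test of the null prevalence of 10% (the upper-tail probability of observing at least as many cases as in the sample); the text does not name the exact test used, so this choice, and the helper name `binom_upper_tail`, are assumptions made for illustration.

```python
from math import comb


def binom_upper_tail(x: int, n: int, p0: float) -> float:
    """One-sided exact p-value: P(X >= x) for X ~ Binomial(n, p0).

    Assumption: the example uses an upper-tail exact binomial test;
    the source text does not state which test produced its p-values.
    """
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))


if __name__ == "__main__":
    p0 = 0.10  # hypothesized prevalence under the null
    # (sample size, number of cases): the sample rate is 20% in every row
    for n, x in [(10, 2), (50, 10), (100, 20)]:
        p = binom_upper_tail(x, n, p0)
        print(f"n={n:3d}  cases={x:2d}  sample rate={x / n:.2f}  p={p:.4f}")
```

Under this assumption, the observed rate stays fixed at 20% while the p-value shrinks as n grows, which is the pattern the example describes.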
3. Resolution to the p-Value Paradox
3.1. What Does the p-Value Produced by the Conventional Proportion Test Really Mean?
3.2. Resolution to the p-Value Paradox
4. Alternative to the Conventional Proportional Test and Its p-Value
5. Conclusion and Recommendation
Conflicts of Interest
References
- Bonovas, S., & Piovani, D. (2023). On p-Values and Statistical Significance. J. Clin. Med., 12, 900.
- Chén, O. Y., Bodelet, J. S., Saraiva, R. G., Phan, H., Di, J., Nagels, G., Schwantje, T., Cao, H., Gou, J., Reinen, J. M., Xiong, B., Zhi, B., Wang, X., & de Vos, M. (2023). The roles, challenges, and merits of the p value. Patterns (New York, N.Y.), 4(12), 100878.
- Dixon, P. (2003). The p-value fallacy and how to avoid it. Canadian Journal of Experimental Psychology = Revue canadienne de psychologie expérimentale, 57(3), 189–202.
- Dunkler, D., Haller, M., Oberbauer, R., & Heinze, G. (2020). To test or to estimate? P-values versus effect sizes. Transpl. Int., 33(1), 50–55. Epub 2019 Oct 21.
- Goodman, S. N. (1999). Toward evidence-based medical statistics. 1: The p value fallacy. Annals of Internal Medicine, 130(12), 995–1004.
- Huang, H. (2019). Signal content index (SCI): a measure of the effectiveness of measurements and an alternative to p-value for comparing two means. Measurement Science and Technology, 31, 045008.
- Huang, H. (2023). Probability of net superiority for comparing two groups or group means. Lobachevskii Journal of Mathematics, 44(11), 42–54.
- Huang, H. (2024). Comments on “The Roles, Challenges, and Merits of the p Value” by Chén et al. Basic and Applied Social Psychology, 1–7.
- Joint Committee for Guides in Metrology (JCGM) (2008). Evaluation of Measurement Data - Guide to the Expression of Uncertainty in Measurement (GUM 1995 with minor corrections). Sèvres, France.
- Nuzzo, R. L. (2015). The inverse fallacy and interpreting P values. PM & R: the journal of injury, function, and rehabilitation, 7(3), 311–314.
| n | Sample proportion | Hypothesized proportion | Observed effect size | RES (%) | u(Δ) | SCI |
| 10 | 0.2 | 0.1 | 0.1 | 66.7 | 0.126 | 0.38 |
| 50 | 0.2 | 0.1 | 0.1 | 66.7 | 0.057 | 0.76 |
| 100 | 0.2 | 0.1 | 0.1 | 66.7 | 0.040 | 0.86 |
| n | | | Observed effect size | RES (%) | u(Δ) | SCI |
| 200 | 0.14 | 0.2 | 0.06 | 35.3 | 0.037 | 0.72 |
| 850 | 0.14 | 0.2 | 0.06 | 35.3 | 0.018 | 0.92 |
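The RES (%), u(Δ), and SCI columns of the two tables above can be reproduced numerically. The sketch below is a reconstruction inferred from the tabulated values, not a statement of the author's definitions: it assumes that the observed effect size Δ is the absolute difference of the two proportions, that RES is Δ relative to the mean of the two proportions, that u(Δ) is the binomial standard uncertainty (one sample in the first table, two independent groups of equal size n in the second), and that SCI = Δ² / (Δ² + u(Δ)²). All function names are hypothetical.

```python
from math import sqrt


def res_percent(p1: float, p2: float) -> float:
    """Assumed relative effect size: |p1 - p2| over the mean of p1 and p2, in %."""
    return 100 * abs(p1 - p2) / ((p1 + p2) / 2)


def u_one_sample(p_hat: float, n: int) -> float:
    """Assumed standard uncertainty of a single sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)


def u_two_sample(p1: float, p2: float, n: int) -> float:
    """Assumed uncertainty of a difference of two independent proportions,
    each estimated from a group of size n (equal sizes are an assumption)."""
    return sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)


def sci(delta: float, u: float) -> float:
    """Assumed signal content index: delta^2 / (delta^2 + u^2)."""
    return delta**2 / (delta**2 + u**2)


if __name__ == "__main__":
    # First table: sample proportion 0.2 against a hypothesized 0.1
    for n in (10, 50, 100):
        u = u_one_sample(0.2, n)
        print(n, round(res_percent(0.2, 0.1), 1), round(u, 3), round(sci(0.1, u), 2))

    # Second table: proportions 0.14 and 0.2, n taken as the size of each group
    for n in (200, 850):
        u = u_two_sample(0.14, 0.2, n)
        print(n, round(res_percent(0.14, 0.2), 1), round(u, 3), round(sci(0.06, u), 2))
```

Under these assumptions the observed effect size and RES do not depend on n, while u(Δ) falls and SCI rises toward 1 as n increases, so the effect estimate and its reliability are reported separately rather than merged into a single p-value.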
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).