Version 1
Received: 22 December 2023 / Approved: 25 December 2023 / Online: 26 December 2023 (01:30:21 CET)
How to cite:
Yoseph, A. Data Reduction Using Principal Component Analysis: Theoretical Underpinnings and Practical Applications in Public Health. Preprints 2023, 2023121767. https://doi.org/10.20944/preprints202312.1767.v1
APA Style
Yoseph, A. (2023). Data Reduction Using Principal Component Analysis: Theoretical Underpinnings and Practical Applications in Public Health. Preprints. https://doi.org/10.20944/preprints202312.1767.v1
Chicago/Turabian Style
Yoseph, A. 2023. "Data Reduction Using Principal Component Analysis: Theoretical Underpinnings and Practical Applications in Public Health." Preprints. https://doi.org/10.20944/preprints202312.1767.v1
Abstract
Large datasets are becoming increasingly common in public health and can be difficult to interpret and apply. Principal component analysis (PCA) is a data reduction method that lowers the dimensionality of such datasets and improves interpretability while minimizing information loss. It does this by creating new, uncorrelated variables that successively maximize variance. PCA is an adaptive data analysis technique because finding these new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem, so the components are determined by the dataset itself rather than specified in advance by the analyst. It is also adaptable in another sense: variants of the method have been developed for different data structures and types. However, there are serious gaps in the theoretical understanding and practical application of PCA among public health researchers, even as its use is becoming popular in developing countries. Therefore, this article, which focuses on using PCA for data reduction, begins by outlining the fundamental concepts of PCA, what it can and cannot do, and when and how to use it. We also discuss the fundamental assumptions, benefits, and drawbacks of PCA. Furthermore, the article demonstrates and resolves practical problems in applying PCA in public health that most scholars are unaware of, such as variable preparation, variable inclusion and exclusion criteria, iteration steps, wealth index analysis, interpretation, and ranking.
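To make the eigenvalue/eigenvector idea in the abstract concrete, the following is a minimal sketch rather than the article's own procedure. It assumes a tiny invented dataset (`rows`), uses the sample covariance matrix, and finds only the first principal component by power iteration, which is a simple stand-in for a full eigendecomposition. The function names `covariance_matrix` and `first_component` are hypothetical.

```python
import math

def covariance_matrix(rows):
    """Return the sample covariance matrix and the mean-centered data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centered

def first_component(cov, iterations=500):
    """Approximate the leading eigenvector and eigenvalue by power iteration.

    Assumes the starting vector is not orthogonal to the leading eigenvector
    and that the covariance matrix is not all zeros.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the eigenvalue for a unit-length v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

# Invented toy data: 5 observations of 3 variables (illustration only).
rows = [
    [1.0, 2.0, 0.0],
    [2.0, 3.0, 1.0],
    [3.0, 5.0, 0.0],
    [4.0, 6.0, 1.0],
    [5.0, 8.0, 0.0],
]

cov, centered = covariance_matrix(rows)
v, lam = first_component(cov)

# Component scores: projection of each centered observation onto v.
scores = [sum(c[j] * v[j] for j in range(len(v))) for c in centered]

# Share of total variance captured by the first component.
share = lam / sum(cov[i][i] for i in range(len(cov)))

print("loadings:", v)
print("eigenvalue:", lam)
print("share of variance:", share)
```

The scores are the values of the new variable for each observation, and their sample variance equals the eigenvalue. In practice a full PCA would extract all components with a linear algebra library and would usually standardize variables measured on different scales first.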
Keywords
principal component analysis; data reduction; wealth index; public health; eigenvalue or eigenvector; communality; complex structure; anti-image; covariance matrix
Subject
Public Health and Healthcare, Other
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.