Preprint; Review; Version 1; Preserved in Portico; This version is not peer-reviewed

Data Reduction Using Principal Component Analysis: Theoretical Underpinnings and Practical Applications in Public Health

Version 1: Received: 22 December 2023 / Approved: 25 December 2023 / Online: 26 December 2023 (01:30:21 CET)

How to cite: Yoseph, A. Data Reduction Using Principal Component Analysis: Theoretical Underpinnings and Practical Applications in Public Health. Preprints 2023, 2023121767. https://doi.org/10.20944/preprints202312.1767.v1

Abstract

Big datasets are becoming increasingly common in public health and can be challenging to understand and apply. Principal component analysis (PCA) is a data reduction method that lowers the dimensionality of such datasets and improves interpretability while minimizing information loss. It achieves this by creating new, uncorrelated variables that successively maximize variance. PCA is an adaptive data analysis technique because finding these new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem, so the components are determined by the dataset at hand rather than defined a priori by the analyst. It is also adaptable in another sense: variants of the method have been designed to accommodate various data structures and types. However, there are serious gaps in the theoretical understanding and practical application of PCA among public health researchers, even as its use is becoming popular in developing countries. Therefore, this article, which focuses on using PCA for data reduction, begins by outlining the fundamental concepts of PCA, covering what it can and cannot do as well as when and how to use it. We also discuss the fundamental assumptions, benefits, and drawbacks of PCA. Furthermore, the article demonstrates and resolves practical problems in applying PCA in public health that most scholars are unaware of, such as variable preparation, variable inclusion and exclusion criteria, iteration steps, wealth index analysis, interpretation, and ranking.
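
The eigenvalue formulation summarized in the abstract can be illustrated with a minimal sketch. It assumes NumPy and a small synthetic dataset; the function name `pca_eig`, the variable names, and the data are hypothetical and are not taken from the article. The sketch standardizes the variables, eigendecomposes their correlation matrix, and projects the data onto the eigenvectors to obtain uncorrelated component scores ordered by decreasing variance.

```python
import numpy as np

def pca_eig(X):
    """Illustrative PCA via eigendecomposition of the correlation matrix."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable (mean 0, sample standard deviation 1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix of the variables (columns).
    R = np.corrcoef(Z, rowvar=False)
    # eigh is intended for symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Reorder so the first component carries the largest variance.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Component scores: projections of the standardized data onto the eigenvectors.
    scores = Z @ eigvecs
    # Share of total variance attributed to each component.
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, scores, explained

# Synthetic data (an assumption for illustration): four noisy indicators
# that all reflect one underlying latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent + 0.3 * rng.normal(size=(200, 1)) for _ in range(4)])

eigvals, eigvecs, scores, explained = pca_eig(X)
print("eigenvalues:", np.round(eigvals, 3))
print("explained variance ratio:", np.round(explained, 3))
```

In this sketch the covariance matrix of the scores is diagonal, which is what "uncorrelated" means here, and the eigenvalues sum to the number of variables because a correlation matrix has ones on its diagonal. The sign of each eigenvector is arbitrary, so scores on the first component may need to be flipped before they are interpreted as an index or used for ranking.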

Keywords

principal component analysis; data reduction; wealth index; public health; eigenvalue or eigenvector; communality; complex structure; anti-image; covariance matrix

Subject

Public Health and Healthcare, Other
