Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Machine Learning-Based Forest Classification and Regression (FCR) for Spatial Prediction of Liver Fluke (Opisthorchis viverrini) Infection in Small Sub Watershed

Version 1 : Received: 28 August 2023 / Approved: 29 August 2023 / Online: 30 August 2023 (07:06:04 CEST)

A peer-reviewed article of this Preprint also exists.

Pumhirunroj, B.; Littidej, P.; Boonmars, T.; Bootyothee, K.; Artchayasawat, A.; Khamphilung, P.; Slack, D. Machine-Learning-Based Forest Classification and Regression (FCR) for Spatial Prediction of Liver Fluke Opisthorchis viverrini (OV) Infection in Small Sub-Watersheds. ISPRS Int. J. Geo-Inf. 2023, 12, 503.

Abstract

Infection with the liver fluke (Opisthorchis viverrini) is partly attributable to the suitability of sub-basin habitats, which allow the intermediate host to persist in the watershed system in all seasons. Spatial monitoring of fluke infection at the small-basin scale is important because it enables analysis of the spatial factors that are involved in and influence infection. The spatial mathematical model was weighted by nine spatial factors, and the analysis was divided into two levels: 1) a sub-basin boundary level, in which an ordinary least squares (OLS) model was used to analyze the spatial factors related to human liver fluke infection according to sub-basin boundaries, and 2) an infection-risk positional level, in which machine learning-based forest classification and regression (FCR) was used to predict and display infection risk locations along stream lines. The analysis produced four prototype models, each importing a different set of independent variables. Model-1 and Model-2 gave the highest AUC (0.964). The variables that most influenced infection risk were distance to stream lines and distance to water bodies, whereas the NDMI and NDVI factors rarely affected accuracy. This FCR machine learning approach can be applied to the analysis of infection risk areas at the sub-basin level, but independent variables must first be screened with a preliminary mathematical model weighted to the spatial units in order to obtain the most accurate predictions.
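
The abstract compares models by AUC (area under the ROC curve). As a minimal sketch of what that metric measures, the snippet below computes AUC from binary infection labels and predicted risk scores using the pairwise-ranking formulation. The function name `auc_score` and the sample labels and scores are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the pairwise-ranking formulation.

    labels: sequence of 0/1 values (1 = infected location, 0 = not infected)
    scores: predicted risk scores, where a higher score means higher risk
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores for illustration only (not study data).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.7, 0.3]
auc = auc_score(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 1.0 means every infected location received a higher score than every uninfected one, and 0.5 corresponds to random ranking, so a value such as the reported 0.964 indicates that almost all infected/uninfected pairs are ordered correctly.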

Keywords

Opisthorchis viverrini; Forest-based classification and regression; Machine learning; Ordinary least square

Subject

Environmental and Earth Sciences, Geography
