Submitted: 24 June 2026
Posted: 25 June 2026
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Study Population
2.2. Statistical Analysis
2.3. Model Construction
2.4. Model Evaluation

3. Results
3.1. Clinical Features and Variable Selection

3.2. Development and Evaluation of Diagnostic Models

3.3. SHAP Analysis of the Random Forest Model

4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Variable | HSIL (n=370) | Cervical cancer (n=332) | Statistic | P value |
|---|---|---|---|---|
| Age | 36.74±4.98 | 39.31±4.68 | t=-7.055 | <0.001 |
| Progesterone (P4) | 11.61 (2.08, 22.86) | 4.60 (1.47, 8.61) | Z=8.725 | <0.001 |
| Thrombin time (TT) | 18.08±1.52 | 17.11±1.28 | t=9.216 | <0.001 |
| Carbohydrate antigen 125 (CA125) | 14.70 (11.80, 18.06) | 20.50 (14.30, 27.90) | Z=-8.617 | <0.001 |
| Lipoprotein(a) [Lp(a)] | 99.00 (50.00, 187.59) | 187.50 (101.95, 299.25) | Z=-7.840 | <0.001 |
| Prothrombin time (PT) | 11.51±0.60 | 11.87±0.57 | t=-8.246 | <0.001 |
| Alpha-fetoprotein (AFP) | 1.79 (1.65, 2.11) | 2.21 (1.76, 3.17) | Z=-7.710 | <0.001 |
| High-density lipoprotein cholesterol (HDL-C) | 1.25±0.27 | 1.09±0.21 | t=9.045 | <0.001 |
| Serum bicarbonate (HCO₃⁻) | 22.82±2.01 | 23.92±2.27 | t=-6.800 | <0.001 |
| Gamma-glutamyl transferase (GGT) | 14.20 (12.00, 21.90) | 21.30 (15.07, 35.57) | Z=-8.307 | <0.001 |
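
The t statistics reported for the mean±SD rows above can be roughly cross-checked from the summary values alone. The sketch below assumes Welch's unequal-variance t-test; the source does not state which t-test variant was used, and the function name `welch_t` is illustrative only. Because the inputs are rounded to two decimals, the recomputed values should be close to the reported ones, not identical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic computed from group means, SDs and sizes."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Group sizes taken from the table header.
N_HSIL, N_CANCER = 370, 332

# (HSIL mean, HSIL SD, cancer mean, cancer SD, reported t) copied from the table.
rows = {
    "Age": (36.74, 4.98, 39.31, 4.68, -7.055),
    "TT": (18.08, 1.52, 17.11, 1.28, 9.216),
    "PT": (11.51, 0.60, 11.87, 0.57, -8.246),
    "HDL-C": (1.25, 0.27, 1.09, 0.21, 9.045),
    "HCO3-": (22.82, 2.01, 23.92, 2.27, -6.800),
}

recomputed = {
    name: welch_t(m1, s1, N_HSIL, m2, s2, N_CANCER)
    for name, (m1, s1, m2, s2, _reported) in rows.items()
}

for name, t_value in recomputed.items():
    reported = rows[name][4]
    print(f"{name}: recomputed t = {t_value:.3f}, reported t = {reported:.3f}")
```

The rows reported with a Z statistic are summarized as a median with two bounds, so they cannot be recomputed this way without the raw data.
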

| Model | AUC | Accuracy | Sensitivity | Specificity | PPV | NPV | F1 |
|---|---|---|---|---|---|---|---|
| KNN | 0.920 | 0.829 | 0.750 | 0.912 | 0.900 | 0.775 | 0.818 |
| LightGBM | 0.975 | 0.876 | 0.778 | 0.980 | 0.977 | 0.806 | 0.866 |
| RF | 0.977 | 0.886 | 0.787 | 0.990 | 0.988 | 0.815 | 0.876 |
| SVM | 0.933 | 0.886 | 0.898 | 0.873 | 0.882 | 0.890 | 0.890 |
| XGBoost | 0.951 | 0.895 | 0.889 | 0.902 | 0.906 | 0.885 | 0.897 |
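
As a consistency check on the table above, F1 is the harmonic mean of precision (PPV) and recall (sensitivity), so each F1 entry can be recomputed from the other two columns. The sketch below uses only the values printed in the table; the function name `f1_from_ppv_sensitivity` is illustrative and is not taken from the source.

```python
def f1_from_ppv_sensitivity(ppv, sensitivity):
    """F1 score as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# (PPV, sensitivity, reported F1) copied from the table.
reported = {
    "KNN": (0.900, 0.750, 0.818),
    "LightGBM": (0.977, 0.778, 0.866),
    "RF": (0.988, 0.787, 0.876),
    "SVM": (0.882, 0.898, 0.890),
    "XGBoost": (0.906, 0.889, 0.897),
}

for model, (ppv, sensitivity, f1_reported) in reported.items():
    f1 = f1_from_ppv_sensitivity(ppv, sensitivity)
    print(f"{model}: recomputed F1 = {f1:.3f}, reported F1 = {f1_reported:.3f}")
```

With these inputs the recomputed F1 values agree with the reported column to three decimals, which suggests the PPV, sensitivity and F1 columns are internally consistent.
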
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).