YOLO-Based Automated Cephalometric Landmark Detection Achieves Expert-Level Skeletal Classification: A Clinical Validation Study

Jacek Kotula; Marcin Konarzewski; Jakub Polkowski; Krzysztof Kotula; Joanna Lis; Rafal Porowski; Anna Ewa Kuc; Beata Kawala; Michal Sarul

doi:10.20944/preprints202605.0059.v1

Submitted: 30 April 2026

Posted: 04 May 2026


Abstract

Automated cephalometric landmark detection using deep learning has the potential to transform routine orthodontic diagnosis. However, the clinical relevance of AI localization accuracy depends critically on how detection errors propagate into derived angular measurements and skeletal classifications. This study presents a systematic clinical validation of 14 YOLO-based model configurations, evaluating the effects of architecture (YOLOv5/YOLOv11), bounding box size (40–150 px), dataset scale (235–4255 images), and training duration on landmark detection accuracy, with specific focus on the four clinically critical landmarks that define the ANB angle: Sella (S), Nasion (N), A-point (A), and B-point (B). The best-performing model (YOLOv11s, 40×40 px bounding box, 4255 training images) achieved a mean radial error of 3.10 ± 1.00 mm and a Successful Detection Rate of 87.2% at the 4 mm threshold for S, N, A, and B. Despite this error magnitude, ANB-based skeletal classification demonstrated 96.9% concordance with expert assessments (95% bootstrap CI: 93.8–99.2%, n = 130 classifications), with all discordances confined to borderline cases within 1° of diagnostic thresholds. Notably, the localization accuracy achieved by the best AI models falls within the inter-operator variability range reported for experienced human clinicians (1.5–3.5 mm), indicating that the AI system has reached a threshold of clinical equivalence for skeletal classification purposes. Bounding box size emerged as the single most influential hyperparameter, with a 3.4-fold increase in mean radial error from 40×40 to 150×150 px configurations. These findings support the clinical deployment of YOLO-based AI systems for automated ANB-based skeletal classification, while highlighting the need for human oversight in borderline cases.
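
As an illustration of the quantities named in the abstract, the following minimal Python sketch computes mean radial error, Successful Detection Rate, and the ANB angle from landmark coordinates. It is not the authors' code. The function names, the `mm_per_px` pixel-spacing parameter, and the Class I range of 0° to 4° used in `classify_anb` are assumptions made for this example; the abstract does not state the diagnostic thresholds or the image calibration used in the study. ANB is taken as SNA minus SNB, the usual cephalometric definition.

```python
import math

def radial_errors_mm(pred, truth, mm_per_px):
    """Euclidean distance between each predicted and reference point, in mm.
    mm_per_px is a hypothetical pixel-spacing value (not given in the abstract)."""
    return [math.dist(p, t) * mm_per_px for p, t in zip(pred, truth)]

def mean_radial_error(errors):
    """Mean of the per-landmark radial errors."""
    return sum(errors) / len(errors)

def successful_detection_rate(errors, threshold_mm=4.0):
    """Fraction of landmarks whose radial error is within the threshold."""
    return sum(e <= threshold_mm for e in errors) / len(errors)

def angle_deg(vertex, p, q):
    """Unsigned angle (0 to 180 degrees) at `vertex` between rays to p and q."""
    a1 = math.atan2(p[1] - vertex[1], p[0] - vertex[0])
    a2 = math.atan2(q[1] - vertex[1], q[0] - vertex[0])
    d = abs(math.degrees(a1 - a2)) % 360
    return 360 - d if d > 180 else d

def anb_angle(s, n, a, b):
    """ANB computed as SNA minus SNB, so it is negative when B lies ahead of A."""
    return angle_deg(n, s, a) - angle_deg(n, s, b)

def classify_anb(anb, lower=0.0, upper=4.0):
    """Skeletal class from ANB. The 0 to 4 degree Class I range is an
    assumed, commonly cited convention, not a value taken from the study."""
    if anb > upper:
        return "Class II"
    if anb < lower:
        return "Class III"
    return "Class I"
```

For example, with S at (0, 0), N at (100, 0), and A and B placed so that SNA is 82° and SNB is 79°, `anb_angle` returns approximately 3°, which `classify_anb` labels Class I under the assumed range. A case with ANB near either boundary would change class under a landmark shift of about a degree, which is the borderline situation the abstract flags for human oversight.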

Keywords: artificial intelligence; orthodontics; cephalometric analysis; YOLO; landmark detection; ANB angle; skeletal classification; deep learning; clinical validation

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
