Preprint
Article

This version is not peer-reviewed.

Multi-Regulatory Domain OTA Compliance Audit and Semantic Rule Automatic Matching System

Submitted: 15 January 2026

Posted: 16 January 2026

Abstract
Accurate prediction of pedestrian intention and future paths is essential for traffic safety, urban planning, and autonomous navigation. This study develops a multimodal prediction model that combines semantic image-text features, motion trajectories, and social interactions. We extract visual-language information from RGB sequences using a CLIP-based encoder and represent group behavior using a Social-GRU network. To improve the reliability of predictions, we apply Bayesian modeling to account for uncertainty. We evaluated the method on the Waymo and ETH/UCY datasets. On the ETH dataset, the model reduced average displacement error (ADE) by 14.2% and final displacement error (FDE) by 17.6% compared with leading baseline methods. The model remained effective in crowded spaces, under unclear visual conditions, and during sudden motion changes. The results indicate that combining visual-language and motion data improves prediction accuracy. This method offers a practical solution for real-world pedestrian analysis in intelligent transport systems.
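For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how average and final displacement error are commonly computed for a single predicted trajectory. It is not the authors' evaluation code: the function name `displacement_errors` and the toy trajectories are illustrative assumptions, and benchmark protocols usually average these values over many pedestrians and scenes.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory.

    pred and truth are equally long lists of (x, y) positions,
    one entry per predicted time step. Hypothetical helper,
    not taken from the paper.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    # Euclidean distance between prediction and ground truth at each step
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all time steps
    fde = dists[-1]                # error at the final time step only
    return ade, fde


# Toy data (made up for illustration): the prediction drifts upward
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
```

A relative reduction such as those quoted in the abstract would then be `(baseline - model) / baseline`, computed on the dataset-level averages of each metric.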
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2026 MDPI (Basel, Switzerland) unless otherwise stated