Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Benchmarking Perception to Streaming Inputs in Vision‐Centric Autonomous Driving
Received: 17 November 2023 / Approved: 17 November 2023 / Online: 21 November 2023 (10:34:50 CET)
A peer-reviewed article of this Preprint also exists.
Jin, T.; Ding, W.; Yang, M.; Zhu, H.; Dai, P. Benchmarking Perception to Streaming Inputs in Vision-Centric Autonomous Driving. Mathematics 2023, 11, 4976.
Abstract
In recent years, vision-centric perception has played a crucial role in autonomous driving tasks, encompassing functions such as 3D detection, map construction, and motion forecasting. However, the deployment of vision-centric approaches in practical scenarios is hindered by substantial latency, which often causes results to deviate significantly from those achieved through offline training. This disparity arises because conventional benchmarks for autonomous driving perception predominantly conduct offline evaluations, largely overlooking the latency concerns prevalent in real-world deployment. While a few benchmarks have addressed this limitation by introducing effective evaluation methods for online perception, they do not adequately consider the intricacies introduced by the complexity of the input information streams. To address this gap, we propose the Autonomous-driving Streaming I/O (ASIO) benchmark, which assesses the streaming-input characteristics and online performance of vision-centric perception in autonomous driving. To facilitate this evaluation across diverse streaming inputs, we first establish a dataset based on the CARLA Leaderboard. In alignment with real-world deployment considerations, we further develop evaluation metrics based on information complexity, tailored specifically to streaming inputs and streaming performance. Experimental results indicate significant variations in model performance and ranking under different major camera deployments, underscoring the necessity of thoroughly accounting for the influence of model latency and streaming-input characteristics during real-world deployment. To maintain strong streaming performance across distinct streaming-input features, we introduce a backbone switcher based on the identified streaming-input characteristics. Experimental validation demonstrates its efficacy in consistently improving streaming performance across varying streaming-input features.
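The abstract's complexity-based metrics relate to the "two-dimensional entropy" named in the keywords. The paper's exact formulation is not given here, so the following is only an illustrative sketch of one common definition of 2D image entropy: the Shannon entropy of the joint distribution of each pixel's intensity and its 3×3 neighbourhood mean, which captures spatial structure that a plain intensity histogram misses. All function and variable names are the author's own for illustration.

```python
import numpy as np

def two_dimensional_entropy(gray, levels=256):
    """Estimate the 2D entropy of a grayscale image.

    Builds the joint histogram of (pixel intensity, mean intensity of
    the 3x3 neighbourhood) and returns its Shannon entropy in bits.
    This is a generic complexity measure, not the paper's exact metric.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # 3x3 neighbourhood mean via edge padding and averaging the nine shifts.
    padded = np.pad(gray, 1, mode="edge")
    neigh = sum(
        padded[i:i + gray.shape[0], j:j + gray.shape[1]]
        for i in range(3) for j in range(3)
    ) / 9.0
    # Joint histogram over (pixel value, neighbourhood mean).
    hist, _, _ = np.histogram2d(
        gray.ravel(), neigh.ravel(),
        bins=levels, range=[[0, levels], [0, levels]],
    )
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins before taking logs
    return float(-(p * np.log2(p)).sum())

# A uniform image carries no information; random noise scores high.
flat = np.zeros((64, 64))
noise = np.random.default_rng(0).integers(0, 256, size=(64, 64))
print(two_dimensional_entropy(flat))   # 0.0
print(two_dimensional_entropy(noise))  # a high value for this noisy input
```

Under a scheme like this, higher entropy marks a more complex input frame, which is the kind of signal a backbone switcher could use to pick a heavier or lighter model for the current stream.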
Keywords
vision-centric perception benchmark; online assessment; streaming inputs; two-dimensional entropy
Subject
Engineering, Control and Systems Engineering
Copyright: This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.