Integration of Multi-Sensor Fusion and Decision-Making Architecture for Autonomous Vehicle Navigation Applied in Vietnam Traffic Conditions


Submitted: 09 October 2025. Posted: 10 October 2025.

Abstract
Autonomous vehicles represent a transformative technology in modern transportation, promising enhanced safety, efficiency, and accessibility in mobility systems. This paper presents a comprehensive autonomous vehicle system designed specifically for Vietnam's traffic conditions, featuring a multi-layered approach to perception, decision-making, and control. The system utilizes dual 2D LiDARs, camera vision, and GPS sensing to navigate complex urban environments. A key contribution is the development of a specialized segmentation model that accurately identifies Vietnam-specific traffic signs, lane markings, road features, and pedestrians. The system implements a hierarchical decision-making architecture, combining long-term planning based on GPS and map data with short-term reactive planning derived from a bird's-eye view transformation of segmentation and LiDAR data. The control system modulates the speed and steering angle through a validated model that ensures stable vehicle operation across various traffic scenarios. Experimental results demonstrate the system's effectiveness in real-world conditions, achieving high accuracy in segmentation and detection and accurate responses in navigation tasks. The proposed system shows robust performance in Vietnam's unique traffic environment, addressing challenges such as mixed traffic flow and country-specific road infrastructure.

1. Introduction

Autonomous vehicles (AVs) have emerged as a revolutionary advancement in transportation technology, promising to enhance road safety, optimize traffic flow, and transform urban mobility. While significant progress has been made in autonomous driving systems for well-structured environments in developed countries, the implementation of AVs in diverse traffic conditions, particularly in Southeast Asian countries like Vietnam, presents unique challenges. These challenges include heterogeneous traffic patterns, varying road infrastructure quality, complex traffic sign systems, and distinct driving behaviors [1]. Traditional autonomous driving approaches often struggle in Vietnam's traffic environment due to several factors. First, the traffic flow is characterized by a mix of vehicles including cars, motorcycles, bicycles, and pedestrians, creating complex interaction scenarios. Second, lane markings and road infrastructure may be inconsistent or deteriorated, making traditional lane-following algorithms less reliable. Third, Vietnam's traffic sign system includes unique elements that are not commonly addressed in existing autonomous driving datasets and models [2]. To address these challenges, this paper presents a comprehensive autonomous vehicle system specifically designed for Vietnam's traffic conditions. Our system integrates multiple sensing modalities, including dual 2D LiDARs, camera vision, and GPS, to ensure robust environmental perception. The key innovations of our work include: a specialized segmentation model trained to recognize Vietnam-specific traffic elements, including unique traffic signs, lane markings, and road features; a hierarchical decision-making system that combines long-term route planning with reactive short-term navigation; a bird's-eye view transformation approach that fuses segmentation results with LiDAR data for enhanced situational awareness; and a validated control system model that ensures stable vehicle operation across various traffic scenarios.
Our experimental results, conducted on Vietnamese traffic scenarios covering different cases, demonstrate the system's effectiveness in handling Vietnam's unique traffic challenges. The proposed approach achieves significant improvements in comprehensive perception, navigation, and decision-making.

2. Related Works on Multi-Sensor Fusion in Autonomous Vehicles

This section reviews prior work relevant to our system: camera segmentation and LiDAR signal representation, decision-making architectures for autonomous vehicles, and route planning and pathfinding.

2.1. Camera Segmentation and LiDAR Signal Representation

Recent advances in autonomous vehicle perception have demonstrated the power of combining camera-based segmentation with LiDAR data. Camera segmentation using deep learning models, particularly YOLO, has shown exceptional performance in detecting and segmenting traffic elements such as lanes, signs, vehicles, and pedestrians [3,4,5,6]. The evolution from previous YOLO versions has brought significant improvements in both accuracy and processing speed, making it suitable for real-time autonomous driving applications. However, autonomous vehicles need comprehensive perception for several tasks, so a model covering multiple object classes is necessary to improve the input information for decision-making.
Figure 1. Vehicles and pedestrians detection [4].
Several studies have explored effective ways to represent and fuse LiDAR data with camera segmentation. In [7], researchers developed a method to combine geometric information from LiDAR with semantic segmentation from cameras to create a more comprehensive understanding of the environment. The fusion of these complementary sensor modalities helps overcome the limitations of each sensor type: cameras provide rich semantic information but lack precise depth measurements, while LiDAR provides accurate spatial information but lacks semantic context [8]. With this method, the authors demonstrated a precise approach to obstacle distance estimation, which can be further improved by combining a comprehensive segmentation model with image processing.
Figure 2. Camera and LiDAR 2D fusion for precise distance estimation [9].
Bird’s-eye view (BEV) representation has emerged as a particularly effective approach for autonomous driving perception. Prakash et al. [9] demonstrated a multi-modal fusion transformer that effectively combines camera and LiDAR data into a unified BEV representation. Similar to our approach, Wang et al. [10] utilized multiple LiDAR sensors to create a comprehensive top-view model, though their work focused on highway scenarios rather than urban environments.

2.2. Decision Making for Autonomous Vehicles

Decision making in autonomous vehicles typically follows a hierarchical structure, separating long-term strategic decisions from short-term tactical controls [11,12]. Comprehensive surveys of decision-making architectures highlight the importance of integrating both rule-based and reactive approaches [13]. Research emphasizes that effective autonomous driving requires both adherence to traffic rules and responsive behavior to dynamic obstacles. For long-term decision making, recent work has focused on incorporating map information and traffic rules into planning frameworks. Studies propose strategic planning systems that consider both static rules (from traffic signs and road markings) and dynamic conditions (from real-time perception) [14]. This approach aligns with our system's use of GPS and detected signs for long-term planning.
In the domain of short-term decision making, several approaches have emerged for converting perception inputs into control commands. Chen et al. [15] demonstrated an end-to-end system that uses segmentation masks to directly generate steering and velocity commands. However, their work primarily relied on camera data, whereas our approach incorporates LiDAR information for additional safety verification. The use of LiDAR for handling undefined obstacles has gained increasing attention. Research shows that LiDAR-based obstacle detection can serve as an effective backup system when camera-based perception might fail, particularly in challenging weather conditions or with unexpected obstacles [16,17]. This multi-layered approach to perception and decision making has proven crucial for robust autonomous operation.

2.3. Route Planning and Path Finding

Recently, significant advancements have been made in constructing and interpreting real-world road networks for autonomous vehicle route planning. Delling et al. [18] provide a general overview of various Dijkstra-based route planning algorithms and the techniques used to improve search efficiency. However, modern route planning approaches incorporate additional factors beyond physical distance, such as traffic conditions, speed limits, and detours, to determine the most cost-effective route [19,20,21]. More recently, machine learning techniques have been introduced to handle the complexity of multiple parameters and estimate optimal paths more effectively [20,21]. However, these methods require extensive training data and real-time inputs, making them computationally demanding.
While route planning generates waypoints to guide the system to the destination, path planning between waypoints is important for allowing the car to handle changes in the environment (e.g., obstacles). Pathfinding algorithms can be broadly categorized into two main approaches: discrete graph-based methods, such as Dijkstra's Algorithm and A*, which model the environment as a network of interconnected nodes and edges [22], and continuous space methods, like RRT and PRM, which explore the environment without predefined reference points to find feasible paths [23]. Although popular and generally well performing for motion planning, graph-based methods have a drawback in time complexity that limits their real-time application. Meanwhile, continuous space methods are computationally expensive [24].

3. System Architecture and Implementation

3.1. System Architecture Proposal

The autonomous vehicle system architecture integrates multiple sensor modalities, including 2D LiDAR, camera, GPS, and wheel encoders, to enable perception, localization, and decision-making. All of the information from the sensors is processed for several tasks simultaneously.
Figure 3. System architecture for the autonomous vehicle.
For segmentation and detection, the returned results are applied to short-term decision-making through masks of important objects on the road. The detected Vietnamese traffic signs are analyzed to set restrictions and conditions for long-term decision-making. The masks of objects on the road are fused with 2D LiDAR signals to improve the perception of the vehicle. The top view provides a visualization of the vehicle's front view, which can be analyzed for short-term decision-making. GPS sensors are used for pathfinding and vehicle positioning. The encoder on each wheel provides information about the velocity, acceleration, and direction of the vehicle, which is essential for the control system.

3.2. Implementations

3.2.1. YOLOv8 Instance Segmentation, 2D LiDAR Fusion, and Perception Visualization

A 2D LiDAR sensor can be exploited to perceive the surrounding environment by analyzing the distance measurements from its emitted rays. The 2D LiDAR sensor provides 541 distance values, each corresponding to a ray spaced at 0.5-degree intervals and together covering a total field of view of 270 degrees. By converting these polar coordinates (angle and distance) into Cartesian coordinates (X, Y), a point cloud representation of the environment can be constructed. This data can be processed to detect obstacles and understand the spatial layout of the surroundings. By continuously updating and analyzing the point cloud over time, the sensor can assist in real-time decision-making for navigation and path planning.
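As a minimal sketch of this polar-to-Cartesian conversion (the angular origin and sensor-frame axis conventions below are our own assumptions, not necessarily those of the actual vehicle):

    import numpy as np

    def scan_to_points(distances):
        """Convert a 541-ray scan (0.5-degree spacing over 270 degrees) to (x, y) points."""
        angles_deg = -135.0 + 0.5 * np.arange(len(distances))  # -135 deg .. +135 deg
        angles = np.deg2rad(angles_deg)
        x = distances * np.cos(angles)                          # assumed forward axis
        y = distances * np.sin(angles)                          # assumed lateral axis
        return np.stack([x, y], axis=1)                         # point cloud of shape (541, 2)

    points = scan_to_points(np.random.uniform(0.5, 20.0, 541))  # synthetic example scan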
Figure 4. A mask from the 2D LiDAR of the autonomous vehicle.
For comprehensive segmentation and detection, we use YOLOv8 segmentation for lanes, markings, vehicles, and pedestrians. Lane segmentation is applied to short-term decision making by determining the accepted area in which to move. Marking segmentation is dedicated to analyzing whether lane changing is permitted. Vehicle and pedestrian segmentation is combined with 2D LiDAR analysis to determine accurate distances to the autonomous vehicle.
Figure 5. YOLOv8 segmentation model [25].
For traffic sign detection, we collected and labelled Vietnamese traffic signs and then trained a YOLOv8-s detection model. The dataset of Vietnamese traffic signs was collected with respect to Vietnamese traffic rules. The collected signs are those important for the navigation and speed of the autonomous vehicle, which require consistency between vehicles on a given road.
The LiDAR signal, segmentation masks, and top-view model are fused to create a comprehensive perception for the autonomous vehicle. Firstly, vehicles and pedestrians are considered obstacles to the autonomous vehicle. From the segmentation masks of vehicles and pedestrians, we determine the angular range of the object relative to the camera. In [26], the authors calculate the angle of incidence of an object as a linear function of the pixel coordinates. However, this calculation is only valid when the camera sensors are arranged in a spherical shape or in a rectangular layout with a narrow field of view. The relationship between the sensor plane and the real-world plane is illustrated in the figure below, as discussed in [27]. We compute it as follows:
Figure 5. Camera view for object's angle estimation.
Consider a camera with a horizontal field of view (HFOV) of α degrees and a horizontal resolution of H pixels. The half-HFOV is α/2, and the image width is H, meaning half of the image corresponds to H/2 pixels. The focal length f in pixels can be derived as [26]:
f = \frac{H}{2 \tan(\alpha/2)}
The principal point, corresponding to the 0-degree angle, is located at the center of the image [26]:
c_x = \frac{H}{2}
where $c_x$ denotes the horizontal center coordinate of the image.
$x_1$ and $x_2$ are the left and right pixel coordinates of the detected object's bounding box. By convention, the extreme right of the image corresponds to an angle of $+\alpha/2$, while the extreme left corresponds to $-\alpha/2$. The angles subtended by the object's left and right boundaries are given by [26]:
\theta_1 = \tan^{-1}\left(\frac{x_1 - c_x}{f}\right)
\theta_2 = \tan^{-1}\left(\frac{x_2 - c_x}{f}\right)
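A minimal numerical sketch of these angle formulas (the image width and horizontal field of view used below are illustrative values, not those of our camera):

    import math

    def bbox_to_angles(x1, x2, hfov_deg, image_width):
        """Return the angles (degrees) subtended by the left/right bounding-box edges."""
        f = (image_width / 2) / math.tan(math.radians(hfov_deg / 2))  # focal length in pixels
        cx = image_width / 2                                          # principal point
        theta1 = math.degrees(math.atan((x1 - cx) / f))
        theta2 = math.degrees(math.atan((x2 - cx) / f))
        return theta1, theta2

    # Example: a 1280-pixel-wide image with a 90-degree HFOV (illustrative values).
    print(bbox_to_angles(400, 700, 90.0, 1280))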
After determining the potential angles of obstacles, the potential angles are converted to two potential rays from the LiDAR by:
ray_{potential\_left} = 2 \times (\theta_1 + 135)
if the determined ray lies to the right of the camera center, and
ray_{potential\_left} = 2 \times (135 - \theta_1)
if it lies to the left of the camera center;
ray_{potential\_right} = 2 \times (\theta_2 + 135)
if the determined ray lies to the right of the camera center, and
ray_{potential\_right} = 2 \times (135 - \theta_2)
if it lies to the left of the camera center.
The determined rays cannot be perfectly precise because of the limits of the camera and the algorithm. However, once the potential angles and rays of the obstacles are estimated, they can be used to extract the precise rays for obstacles by:
ray_{left} = d_{r_i} \quad \text{if} \quad \frac{d_{r_{i+1}} - d_{r_i}}{\theta_{i+1} - \theta_i} > Th
Where:
  • $ray_{left}$ is the precise ray on the left side of an obstacle,
  • $d_{r_i}$ is the distance of ray number $i$,
  • $\theta_i$ is the angle of ray number $i$.
After applying this formula to all potential rays on the left and right sides of obstacles, we retrieve a set of difference values between rays, and the two peaks on the left and right sides are the rays that accurately represent the obstacle.
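The following sketch combines the angle-to-ray mapping and the difference-threshold filtering above; it assumes a signed angle convention and an illustrative threshold, neither of which is taken from the exact implementation used in this work:

    import numpy as np

    def angle_to_ray_index(theta_deg):
        # 2 * (theta + 135) maps a signed angle in [-135, +135] degrees to a ray
        # index in [0, 540] for a 541-ray, 0.5-degree-resolution scan (assumed convention).
        return int(round(2 * (theta_deg + 135.0)))

    def refine_object_rays(distances, theta1_deg, theta2_deg, threshold=1.0):
        """Select rays in the potential window whose range change per degree exceeds the threshold."""
        i_left = angle_to_ray_index(min(theta1_deg, theta2_deg))
        i_right = angle_to_ray_index(max(theta1_deg, theta2_deg))
        d = np.asarray(distances, dtype=float)
        # Difference between adjacent rays divided by the 0.5-degree angular step.
        grad = np.abs(np.diff(d[i_left:i_right + 1])) / 0.5
        peaks = np.where(grad > threshold)[0] + i_left
        return (peaks[0], peaks[-1]) if peaks.size else (i_left, i_right)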

3.2.2. Long- and Short-Term Decision-Making Architecture Based on Sensor Exploitation

Given the resource and data limitations of implementing route planning on an embedded platform with OpenStreetMap data, we opt for the traditional A* algorithm as the most practical and efficient choice for our application. To search for a path, we first input the geocode of the current position of the car and the name or geocode of the target point. The name of the target point is translated to a geocode if available. These geocodes are then connected to the road map vertices. Next, we apply the A* algorithm to find the shortest path from the vertex closest to the start point to the vertex closest to the end point.
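A toy sketch of this A* search on a hand-made road graph (the node coordinates, edge weights, and use of the networkx library are purely illustrative; the real system operates on the OpenStreetMap road layer):

    import math
    import networkx as nx

    # Hypothetical intersections with planar coordinates in metres.
    coords = {"start": (0, 0), "a": (300, 50), "b": (250, 400), "goal": (600, 420)}

    G = nx.Graph()
    G.add_weighted_edges_from([
        ("start", "a", 310), ("start", "b", 480),
        ("a", "b", 360), ("a", "goal", 480), ("b", "goal", 355),
    ])

    def straight_line(u, v):
        """Admissible heuristic: Euclidean distance between node coordinates."""
        (x1, y1), (x2, y2) = coords[u], coords[v]
        return math.hypot(x2 - x1, y2 - y1)

    route = nx.astar_path(G, "start", "goal", heuristic=straight_line, weight="weight")
    print(route)  # shortest route by the given weights, here ['start', 'a', 'goal']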
Figure 6. Hanoi city map and road layer (hidden).
Figure 7. Schema for long-term decision-making of the autonomous vehicle.
Based on the detected traffic signs, the vehicle automatically sets restrictions and conditions according to Vietnamese traffic rules. The supervisor collects information about the velocity, position, direction, and conditions of the vehicle to ensure the system respects these rules and sends feedback to the control system. The restrictions and conditions conveyed by the detected signs are processed by the traffic-rule module, which ensures that vehicles progress consistently without traffic violations. The supervisor also plays an important role in short-term decision-making, processing the conditions and restrictions so that they are applied properly by the autonomous system. After the supervisor's processing, all of the checking results and system requirements are sent to short-term decision-making for execution. The long-term decision-making model is specifically designed for Vietnamese traffic conditions, where the meanings of several signs must be combined to make precise decisions. For a turning prohibition, the restriction expires after the next intersection or turning branch on the road, whereas the restrictions and conditions from zone and one-direction signs remain in effect.
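As a minimal sketch of this sign-to-restriction logic (the sign labels and rule fields below are illustrative placeholders, not the actual rule set of the system):

    # Hypothetical mapping from detected sign labels to restrictions.
    RULES = {
        "speed_limit_40": {"max_speed_kmh": 40, "expires": "next_speed_sign"},
        "no_left_turn":   {"forbid": "left_turn", "expires": "after_next_intersection"},
        "one_way":        {"forbid": "opposite_direction", "expires": "end_of_section"},
    }

    def update_restrictions(active, detected_signs, passed_intersection):
        """Keep the active restriction set consistent with the latest detections."""
        # Turning prohibitions expire after an intersection or turning branch,
        # while zone and one-way restrictions persist until explicitly lifted.
        if passed_intersection:
            active = [r for r in active if r.get("expires") != "after_next_intersection"]
        for sign in detected_signs:
            if sign in RULES:
                active.append(dict(RULES[sign]))
        return active

    state = update_restrictions([], ["speed_limit_40", "no_left_turn"], passed_intersection=False)
    state = update_restrictions(state, [], passed_intersection=True)  # the no-left-turn rule expires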
Figure 8. Flowchart for short-term decision-making of the autonomous vehicle.
In the short-term decision-making process, the vehicle continually evaluates its immediate surroundings and adjusts its trajectory and speed to ensure safe and efficient navigation. After verifying that long-term conditions and restrictions are satisfied, the system proceeds to assess the lane geometry and detect obstacles using sensors such as LiDAR and cameras. It then determines the optimal velocity and steering angle by analyzing real-time data on lane positions, distances to nearby vehicles or objects, and any potential lane deviations or urgent stopping scenarios [28,29]. If the analysis indicates unsafe conditions, the system refines its perception through instance segmentation and updates its understanding of the environment accordingly. This improves the vehicle's ability to adapt to Vietnamese traffic conditions, where obstacles are occasionally missed by the model or appear unexpectedly from alleys along the road. The LiDAR signal is not only combined with the camera to obtain precise distances to objects but also used to maintain a safe distance between the vehicle and the surrounding environment. We consider all unfiltered rays from the LiDARs to belong to the surrounding environment, and their distances are evaluated in clusters to assess unexpected risks. The distance of an individual ray is sensitive to tiny objects, so we use a sliding array to calculate the risk posed to the vehicle by unidentified objects, as sketched below.
R = \frac{\sum_{n_{rays}} d}{n_{rays}} \times \beta
where:
  • R is the risk to the vehicle,
  • d is the distance from an individual ray to the LiDAR,
  • $n_{rays}$ is the number of values in the sliding array,
  • β is a risk coefficient corresponding to the chosen $n_{rays}$.
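A minimal sketch of this sliding-array risk estimate (the window sizes and β values below are illustrative assumptions):

    import numpy as np

    def sliding_risk(distances, n_rays, beta):
        """R = (sum of ray distances in the window / n_rays) * beta, for every window position."""
        d = np.asarray(distances, dtype=float)
        mean_d = np.convolve(d, np.ones(n_rays) / n_rays, mode="valid")
        return mean_d * beta

    scan = np.random.uniform(0.5, 20.0, 541)  # synthetic 541-ray scan
    risks = {n: sliding_risk(scan, n, beta) for n, beta in [(5, 1.0), (11, 0.8), (21, 0.6)]}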
We calculate the risk with different sliding arrays and collect different risk values for the vehicle. Based on those risk estimates, the autonomous vehicle can make short-term decisions to ensure safety in unexpected traffic situations in Vietnam. Finally, the calculated control commands (steering angle and velocity) are applied, and feedback from the vehicle's response is continually monitored to maintain safe driving performance. This loop of perception, analysis, and actuation occurs in rapid cycles to adapt to changing traffic conditions and complements the broader constraints set by the long-term decision-making framework. After short-term decision-making, the control system executes the processed commands. To achieve efficient turning, we follow Ackermann steering geometry, which ensures that all wheels follow circular paths around a common instantaneous center of rotation (ICR). To further optimize efficiency, we adjust each wheel's velocity to match the expected velocity profile dictated by Ackermann steering.
Our inputs are turning angle δ and current speed. From the turning angle, we compute the central turning radius that the center of mass should follow:
R = \sqrt{a^2 + l^2 \cot^2(\delta)}
where:
  • a is the lateral offset (if applicable),
  • l is the wheelbase (distance between front and rear axles),
  • cot(δ) is the cotangent of the steering angle.
To compute the turning radius for each wheel, we determine R and then derive the radii for each wheel as follows.
Rear axle center radius:
R_{rear,center} = \sqrt{R^2 - \frac{l^2}{4}}
Rear wheel radii:
R_{rear,inner} = R_{rear,center} - \frac{\omega}{2}
R_{rear,outer} = R_{rear,center} + \frac{\omega}{2}
Front wheel radii:
R_{front,inner} = \sqrt{R_{rear,inner}^2 + l^2}
R_{front,outer} = \sqrt{R_{rear,outer}^2 + l^2}
where:
  • ω is the track width of the vehicle.
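A minimal numerical sketch of these radius formulas, together with a per-wheel velocity scaling consistent with a common instantaneous center of rotation (the vehicle dimensions and the proportional speed scaling are illustrative assumptions, not the validated control model itself):

    import math

    def ackermann_radii(delta_rad, wheelbase, track, lateral_offset=0.0):
        """Turning radii of the four wheels for a steering angle delta (radians)."""
        R = math.sqrt(lateral_offset**2 + (wheelbase / math.tan(delta_rad))**2)
        r_rear_center = math.sqrt(R**2 - wheelbase**2 / 4.0)
        radii = {
            "rear_inner": r_rear_center - track / 2.0,
            "rear_outer": r_rear_center + track / 2.0,
        }
        radii["front_inner"] = math.sqrt(radii["rear_inner"]**2 + wheelbase**2)
        radii["front_outer"] = math.sqrt(radii["rear_outer"]**2 + wheelbase**2)
        radii["center"] = R
        return radii

    def wheel_speeds(v_center, radii):
        # Each wheel's speed is scaled by its radius so that all wheels share one ICR.
        return {k: v_center * r / radii["center"] for k, r in radii.items() if k != "center"}

    radii = ackermann_radii(math.radians(15), wheelbase=1.2, track=0.8)  # illustrative dimensions
    print(wheel_speeds(5.0, radii))                                      # speeds for a 5 km/h center speed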
The kinematic design of vehicle steering systems impacts handling, stability, and tire wear. Three common configurations are parallel steering, Ackermann steering, and anti-Ackermann steering, each suited to different applications. Parallel steering turns both front wheels at the same angle. While simple to implement, it causes tire scrubbing and excessive wear, making it impractical for most vehicles. It finds limited use in autonomous robotic platforms where wheel slip is negligible. Ackermann steering ensures that the inner and outer wheels follow concentric paths during a turn, reducing slip and improving traction. It enhances maneuverability and reduces tire wear, making it ideal for road vehicles. However, it can contribute to understeer at high speeds and requires precise linkage design.
Anti-Ackermann steering, where the outer wheel turns more than the inner wheel, is used in high-speed racing to optimize tire load distribution and improve cornering grip. While beneficial for performance, it increases low-speed tire scrubbing and is unsuitable for regular road vehicles. Parallel steering is simple but inefficient. Ackermann steering is best for general vehicles, balancing maneuverability and tire wear. Anti-Ackermann steering benefits high-speed racing but is impractical for normal driving due to increased tire wear.

4. Experiments and Results

4.1. Results of YOLOv8 Instance Segmentation, 2D LiDAR Fusion, and Top View for Vehicle Front-View Visualization

We developed and trained two deep learning models based on datasets independently collected and annotated to reflect the unique characteristics of the Vietnamese traffic environment. The first dataset focuses on instance segmentation and includes four key classes: lane markings, road markings, pedestrians, and vehicles (covering the different kinds of vehicles in the Vietnamese traffic environment), capturing the complexity of real-world road scenarios in Vietnam. The second dataset is dedicated to Vietnamese traffic signs, covering diverse and localized sign types under varying lighting and environmental conditions. As a result, we conducted two separate training processes, one for segmentation and one for detection, whose performance is visualized in the respective graphs. The graphs in Figure 9 indicate stable training; the detection model reaches approximately 97% accuracy for traffic signs, and the segmentation model achieves favourable metrics: 92% precision, 95% accuracy, and an mAP90 of 0.75.
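A minimal sketch of how such a pair of models can be trained with the Ultralytics YOLOv8 API (the dataset YAML names, epoch count, and image size are illustrative assumptions, not the exact settings used in this work):

    from ultralytics import YOLO

    # Instance segmentation: lanes, road markings, pedestrians, vehicles.
    seg_model = YOLO("yolov8s-seg.pt")
    seg_model.train(data="vietnam_road_seg.yaml", epochs=100, imgsz=640)

    # Detection: Vietnamese traffic signs.
    det_model = YOLO("yolov8s.pt")
    det_model.train(data="vietnam_signs.yaml", epochs=100, imgsz=640)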
Figure 9. Graphs of the segmentation and detection training processes.
The A* algorithm successfully searches for a route from the Vietnam Academy of Science and Technology to the President Ho Chi Minh Mausoleum, as shown in Figure 10. The generated path was efficient in terms of distance and computational time, demonstrating the algorithm's effectiveness in urban route planning.
Figure 10. Path finding on the map in Hanoi.
Figure 11. Result for top-view of the autonomous vehicle.
Modeling the top view plays an important role in controlling the system, since path-finding algorithms can be applied to it. In our results, we focus mainly on the properties of the road and the objects on the road because they are covered by the segmentation model. The front lane is estimated and combined with the segmentation masks to find the optimal path for the vehicle.
Figure 12. Result of traffic sign detection.
In the experimental scenarios, traffic signs are detected, and their meanings are subsequently analyzed by the long-term decision-making model. Based on this analysis, restrictions and regulations are applied to the short-term decision-making model. The signs are arranged as in the Vietnamese traffic environment, where the system must integrate diverse traffic rule information due to the variety of vehicles on the road.
Figure 13. Result of lane, marking, and vehicle segmentation.
The masks cover the instances in the image precisely, including the vehicle in the middle of the road. Hence, the instance masks can be applied accurately for different purposes. Our dataset was acquired in Vietnam's traffic environment, where different lanes and vehicles can appear in a single frame, so diverse vehicle segmentation is required.
Figure 14. Selected rays for obstacle distance estimation from segmentation masks.
The potential angle of the detected object is exploited to extract the precise rays corresponding to the object. The figure above shows the potential rays, whose values are applied to the ray-filtering formula in Section 3.2.1 to extract the precise rays of objects.
Figure 15. 2D LiDAR signal analysis from filtered obstacle rays from segmentation masks.
The precise rays reflecting the distance to objects are extracted by the difference between the distances of adjacent rays. Figure 16 indicates the role of filtering potential rays from the environment, where the peaks created by objects’ rays are easily separated from others.
Figure 16. Precise rays represent the object.
The extracted rays precisely reflect the distance from Lidar 2D to the object. With precise distances, the vehicle’s system can evaluate the appropriate speed and steering angle.

4.2. Results of System Response

Figure 17. The system perceives objects and makes decisions.
The system adapts well to multiple detected objects of the environment that the fusion model above perceives and analyzes. By evaluating the appropriate speed and steering angle, the vehicle can avoid detected obstacles, move to the proper lane, and evaluate the surrounding environment in real-time.
Figure 18. The system avoids objects and remains stable.
After overcoming obstacles and moving to the proper lane, the vehicle remains at a stable speed, respects the restriction from signs, and continuously collects and processes information.
Figure 19. The vehicle's velocity in an experiment.
The graph shows a clear acceleration phase starting from rest, reaching a peak velocity of approximately 6 km/h. The car maintains this speed with moderate fluctuations before decelerating smoothly to a near stop. Minor variations in velocity during the steady-state phase suggest adaptive control behavior, possibly in response to environmental factors or trajectory adjustments.

5. Conclusions

This research presents the development and integration of a comprehensive autonomous vehicle system tailored to Vietnam's complex traffic conditions. Through multi-sensor fusion and a hierarchical decision-making architecture, the system demonstrates strong perception, navigation, and control capabilities. The proposed models accurately segment lanes, markings, vehicles, and pedestrians, and detect Vietnam-specific traffic signs, ensuring robust situational awareness. The decision-making modules, integrating long-term route planning with short-term reactive behavior, enable the vehicle to navigate dynamically while adhering to traffic rules. Experimental results validate the system's ability to perceive, plan, and act reliably in real-world environments, achieving stable vehicle operation, accurate obstacle avoidance, and compliance with traffic regulations. The outcomes demonstrate the system's adaptability for deployment in Vietnam's diverse traffic scenarios. Future work will focus on enhancing system precision in more scenarios, including adverse weather conditions, and on improving stability.

Abbreviations

The following abbreviations are used in this manuscript:
LiDAR Light Detection and Ranging
GPS Global Positioning System
YOLO You Only Look Once
BEV Bird’s-eye view

References

  1. Ghraizi, D.; Talj, R.; Francis, C. An overview of decision-making in autonomous vehicles. IFAC-PapersOnLine 2023, 56(2), 10971–10983.
  2. Hong, Q.P.; Luong, T.N.; Xuan, T.P.; Duc, M.T.; Van Bach, N.P.; Minh, T.P.; Trong, T.B.; Huy, H.L. Study on a method for detecting and tracking multiple traffic signals at the same time using YOLOv7 and SORT object tracking. International Conference on Robotics and Automation Engineering 2023, 8, 293–297.
  3. Tung, P.X.; Thien, N.L.; Ngoc, P.V.B.; Vu, M.H. Research and Development of a Traffic Sign Recognition Module in Vietnam. Engineering, Technology & Applied Science Research 2024, 14(1), 12740–12744.
  4. Wang, B.; Li, Y.Y.; Xu, W.; Wang, H.; Hu, L. Vehicle–pedestrian detection method based on improved YOLOv8. Electronics 2024, 13(11), 2149.
  5. Zhang, F.; Li, X.; Wang, Y. Improved YOLOv8 for small traffic sign detection under complex environmental conditions. Franklin Open 2024, 8(1), 100167.
  6. Liu, C.; Ma, J.; Zhao, P. SDG-YOLOv8: Single-domain generalized object detection based on domain diversity in traffic road scenes. Displays 2025, 87(1), 102944.
  7. Dai, Z.; Guan, Z.; Chen, Q.; Xu, Y.; Sun, F. Enhanced object detection in autonomous vehicles through LiDAR–camera sensor fusion. World Electric Vehicle Journal 2024, 15(7), 297.
  8. Yurtsever, E.; Lambert, J.; Carballo, A.; Takeda, K. A survey of autonomous driving: Common practices and emerging technologies. IEEE Access 2020, 8, 58443–58469.
  9. Prakash, A.; Chitta, K.; Geiger, A. Multi-modal fusion transformer for end-to-end autonomous driving. arXiv 2021.
  10. Wang, D.; Fu, W.; Song, Q.; Zhou, J. Potential risk assessment for safe driving of autonomous vehicles under occluded vision. Scientific Reports 2022, 12, 1–14.
  11. Zhang, Y.; Li, X.; Wang, Q. Evolutionary decision-making and planning for autonomous driving based on safe and rational exploration and exploitation. Engineering 2024, 33, 108–120.
  12. Gupta, A.K.; Singh, P.K.; Kumar, R. An overview of decision-making in autonomous vehicles. IFAC-PapersOnLine 2023, 56(2), 10971–10983.
  13. Badue, C.; Guidolini, R.; Carneiro, R.V.; Azevedo, P.; Cardoso, V.B.; Forechi, A.; Jesus, L.; Berriel, R.; Paixão, T.; Mutz, F.; Veronese, L.; De Souza, A.F. Self-driving cars: A survey. arXiv 2019.
  14. Li, G.; Yang, Y.; Li, S.; Qu, X.; Lyu, N.; Li, S.E. Decision making of autonomous vehicles in lane change scenarios: Deep reinforcement learning approaches with risk awareness. Transportation Research Part C: Emerging Technologies 2022, 134, 103452.
  15. Chen, D.; Koltun, V.; Krähenbühl, P. Learning to drive from a world on rails. arXiv 2021.
  16. Duhautbout, T.; Talj, R.; Cherfaoui, V.; Aioun, F.; Guillemard, F. Efficient speed planning in the path-time space for urban autonomous driving. In Proceedings of the 25th IEEE International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 2022; pp. 1268–1274.
  17. Sprenger, F. Microdecisions and autonomy in self-driving cars: Virtual probabilities. AI & Society 2022, 37, 619–634.
  18. Delling, D.; Sanders, P.; Schultes, D.; Wagner, D. Engineering route planning algorithms. In Algorithmics of Large and Complex Networks; Lerner, J., Wagner, D., Zweig, K.A., Eds.; Springer, 2009; pp. 117–139.
  19. Fahmin, A.; Shen, B.; Cheema, M.A.; Toosi, A.N.; Ali, M.E. Efficient alternative route planning in road networks. IEEE Transactions on Intelligent Transportation Systems 2024, 25(3), 1234–1245.
  20. Geng, Y.; Li, Q.; Chen, H.; Zhang, X. Deep reinforcement learning based dynamic route planning for minimizing travel time. In Proceedings of the 2021 IEEE ICC Workshops, 2021; pp. 1–6.
  21. Verbytskyi, Y. Delivery routes optimization using machine learning algorithms. Eastern Europe: Economy, Business and Management 2023, 38, 14.
  22. Alton, K.; Mitchell, I.M. Optimal path planning under different norms in continuous state spaces. In Proceedings of the 2006 IEEE International Conference on Robotics and Automation (ICRA), 2006; pp. 866–872.
  23. Choudhary, A. Sampling-based path planning algorithms: A survey. arXiv preprint.
  24. Ojha, P.; Thakur, A. Real-time obstacle avoidance algorithm for dynamic environment on probabilistic road map. In Proceedings of the 2021 International Symposium of Asian Control Association on Intelligent Robotics and Industrial Automation (IRIA), 2021; pp. 57–62.
  25. Khan, F.; Rafique, S.; Khan, S.; Hasan, L. Smart Fire Safety: Real-Time Segmentation and Alerts Using Deep Learning. International Journal of Innovative Science and Technology (IJIST) 2024, 6, 105–115.
  26. Mulyanto, A.; Borman, R.I.; Prasetyawana, P.; Sumarudin, A. 2D LiDAR and camera fusion for object detection and object distance measurement of ADAS using Robotic Operating System (ROS). JOIV.
  27. Szeliski, R. Computer Vision: Algorithms and Applications; Springer, 2010; p. 66.
  28. Li, Y.; Guan, H.; Jia, X. An interpretable decision-making model for autonomous driving. Advances in Mechanical Engineering 2024, 16, 16878132241255455.
  29. Li, D.; Zhao, Y.; Wang, W.; et al. Localization and Mapping Based on Multi-feature and Multi-sensor Fusion. International Journal of Automotive Technology 2024, 25, 1503–1515.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.