Autonomous Driving Implementation in an Experimental Environment

Autonomous systems require identifying the environment and it has a long way to go before putting it safely into practice. In autonomous driving systems, the detection of obstacles and traffic lights are of importance as well as lane tracking. In this study, an autonomous driving system is developed and tested in the experimental environment designed for this purpose. In this system, a model vehicle having a camera is used to trace the lanes and avoid obstacles to experimentally study autonomous driving behavior. Convolutional Neural Network models were trained for Lane tracking. For the vehicle to avoid obstacles, corner detection, optical flow, focus of expansion, time to collision, balance calculation, and decision mechanism were created, respectively.


Introduction
The autonomous driving system is one of the most popular smart autonomous systems recently. Nowadays, it is aimed to minimize driver-related errors with autonomous driving systems. Today, we can say that autonomous driving systems have speed control with radar and distance sensors, lane tracking, and lane change after cameras on the vehicle. Autonomous vehicles are one of the most effective use cases where hardware and software work together. The hardware enables the vehicle to move and communicate with a range of cameras, sensors, while the software processes information and provides control.
Today, many automobile companies are attempting to produce cars with autonomous driving systems. We can say Tesla company as the leading company. The cars they produce have a full automation driving system. The data set is collected in realtime from approximately 1 million vehicles. 70,000 GPU's are trained per hour. It is capable of semantic segmentation, object recognition, depth estimation. There are 1000 different estimates per step each time. Some companies use the LIDAR device to model depth prediction and 3D perception. Depth prediction is a fundamental task in perceiving the 3D environment around us [1].
In this study, lane tracking, which is one of the two most important abilities in autonomous vehicles, and the ability to avoid obstacles for the robot vehicle to drive freely without hitting any obstacle are discussed. The main purpose of this research was to develop lane tracking and obstacle avoidance capabilities with different methods and solutions for an autonomous driving system on an experimental vehicle.
The robot vehicle, remote control module, and experimental map that constitute the hardware part of the project were prepared. Raspberry Pi module on the vehicle forms the brain of the vehicle. Raspberry PI communicates with the remote-control module via wireless network (RF24) and computer via embedded software. The Raspberry PI module, which plays the role of the brain of the system in the later stages of the project, was renewed with the Coral Dev Board [2] device developed by Google for artificial intelligence model's due to its inadequate performance.
With the help of a remote-control device, the robot vehicle was moved along the track and dataset collection was carried out through the camera on the vehicle. This dataset created consists of images and action information taken at the time of that image. Then, the images taken were simplified with image processing techniques, and the strip lines were brought to the fore. At this stage, Convolutional Neural Network [7] was used while creating an artificial intelligence model. CNN is a type of artificial neural network developed to solve problems such as image classification, object detection, and style transfer. Since the images are our main data source, it was decided to use CNN.
The target problem for avoiding obstacles is the calculation of contact time or time to collision. The most important feature focused here was the calculation of the time until the collision, ie the contact time. In this direction, corner detection, optical flow focus of expansion, and collision time were calculated instantaneously on the image taken from the camera [8], [9], [10]. The balance calculation has been made for the right and left body of the robot vehicle and a decision mechanism has been created to avoid obstacles. The robot vehicle has been provided to move without hitting any obstacle.

Lane Tracking
In this section, simplification of lane information with computer vision techniques, and data set collection are given initially to enable the robot vehicle to move autonomously by following the lane information on the experimental map. Next, a detailed description of the designed network architecture and training is provided. To achieve the successful model, hyper parameter tuning and data augmentation, and finally, the testing process is explained.
The dataset collection process will be performed by moving the robot vehicle over the experimental environment with the help of a remote control. The images taken from the camera correctly positioned on the vehicle will first be recorded in the filing system by simplifying the lane information with image processing techniques. At the same time, the action information of the car at the time the image is taken is recorded in the filing system simultaneously with the images.

Simplifying Lane Information with Computer Vision Techniques
Preparing the images in the data set that we will give to the neural network in accordance with the purpose is the most influential factor in the result of the developed neural network model. If the image is messy, difficult to understand, and the neural network is not able to distinguish the features in the image, the error values of the model will be high and the operation is nothing but a waste of time. For this purpose, the images taken from the camera on the robot vehicle were first simplified with image processing techniques. First, the color space change was performed on the image. Many color spaces are supported in the OpenCV library and you can convert between them. In the first step, the image was converted from RGB color space to grayscale color space [11].   The image taken from the camera has been successfully simplified and made ready for the use of the neural network. If you pay attention to the upper left corner of the figure, you can see the angle and speed, which are the action information of the vehicle at the time the image is taken from the camera.

Data Set Collection and Editing
The collection of the data set will be carried out by moving the robot vehicle over the experimental environment with the remote control. In the Supervised Learning machine learning approach we will use, the data to be trained should be given to the learner as x and y outputs. While the vehicle is controlled remotely on the experimental environment we have prepared before, the image from the camera, the current servo angle and engine speed are recorded in the filing system simultaneously. Figure 3: Experimental map on which the robot vehicle will be moved.
In the Supervised Learning approach, the data should be given to the trainer as x and y outputs. While the robot vehicle was being moved over the experimental environment, the servo angle information, which is the current action information, was recorded in the filing system along with the camera image. According to the general structure of the experimental map, the servo x angle values in the data collected vary between 1270 and 600. Angle data collected at this stage has a complex structure and needs to be simplified. For this purpose, while the robot vehicle is moving on the experimental map, the angle information as FLAT, MIDDLE, and SHARP is updated in the filing system by simultaneously looking at its location on the map and the instant angle information from the computer. Alternatively, the angle information, which is a parameter of the data set, can be compressed between 0 and 1 for linear regression [13], [14], allowing linear estimation.

Data Set Collection and Editing
While the robot vehicle is in motion, it should analyze the environmental conditions and make control predictions. Environmental conditions consist of data collected in the previous topic.
The robot vehicle needs a system that can use this data and make predictions. Artificial neural networks are Artificial Intelligence structures that are trained with the given data and can make predictions according to the information they learn.
In this subsection, the design of the architecture and the training of the model presented initially. Next, the hyper parameter tuning [15], [16], and data augmentation [17], and finally the testing and result are explained in detail.

Designing and Training the Model
Before creating a neural network model, the neural network structure to be used is decided by considering the data set, project conditions and properties. The main source of data consists of images. The neural network is required to be predicted according to the images and action information taken from the camera.
The neural network will distinguish the features in the images taken from the camera and perform the learning and prediction processes. Therefore, it was decided to use convolutional neural networks at this stage of the project.
Data set consists of binary color pictures simplified with Canny edge detection algorithm and angle information, which is the action information at the time the picture is taken. The stored images were resized to 128 x 128 pixels before being transferred to the model. The action information, FLAT, MIDDLE and SHARP, are updated to correspond to 0, 1, and 2, respectively. Due to the general structure of neural networks, the complexity of the structure is directly proportional to the estimation time. Therefore, it is important that the model to be designed has a simple structure. On the other hand, the education period of the models with a simple structure is short and time saving is obtained. Another issue in neural networks is that there are no rules for establishing the best model. For this reason, until we find the model with which we have achieved high performance, the models have been designed by taking the available data into consideration.
The steps to be taken during the training of our artificial neural network model are as follows. The first step is to read and store the data set and mix it randomly. The second step is to separate 70test data. After separating the training and test data set, an image from the training data set is given to our model and the weights are updated according to the error value. Figure  5. shows the flow chart representing the training process of the model. The first layer of the first model prepared is a convolution layer with 32 x 32 depth and 3 x 3 filter dimensions. Input data is 128 x 128 x 2 size simplified image with Canny edge detection algorithm. Relu, the next layer activation function, has been applied. Two Max-Pooling layers were then applied. Filter dimensions of the Max-Pooling layers are determined as 2 x 2. By applying Max-Pooling layers in succession, which yields a feature map of the 32 × 32 × 32 size. Next, the Flatten layer added and then the Dropout layer have been added to prevent overfitting [18]. The dropout layer is a regularization approach [19] that helps reduce dependent learning between neurons. Figure 7. shows the loss graph of the first designed neural network. When the loss graph in Figure 7. is examined, it is seen that during the batch of 32 pictures each, it goes to overfitting quickly [20]. It has been observed that the loss of the neural network model rapidly approaches 0 at the end of 1 epoch.
The result we will get here is that the dropout layer used to reduce overfitting is insufficient. In order to eliminate this problem caused by the fact that the dataset consists of few and similar images, data diversity will be increased by data augmentation. Thus, a more general model that can respond to real-life problems will be obtained by considering parameters such as lighting conditions and noise in the image.

Hyper Parameter Tuning and Data Augmentation
While designing a model in artificial neural networks, there is no rule to reach a successful model. There is no rule to be followed in line with the information obtained from the studies conducted on this subject in the world so far. The improvement of the model is done by techniques such as trial-and-error method, hyper parameter tuning [15] and data augmentation [17].
In another neural network model prepared, the first layer is a convolution layer with 32 x 32 depth and 3 x 3 filter dimensions. The second layer is also a convolution layer of 16 x 16 depth and 3 x 3 filter dimensions. Relu activation function has been applied in the convolution layers. Next comes Flatten layer and then Dense layer, which uses the Softmax activation function. Figure 8: Layers of neural network whose hyper parameters are tuned.
The Dropout layer was insufficient to prevent overfitting in the previous architecture. For this reason, overfitting is prevented by generating synthetic data with the data augmentation technique [21].  . shows the synthetic image obtained after applying flip, shift, and zoom to the real image. These variations in the data set enable the trained model to achieve a similar performance in different image conditions. Thus, a more general solution will be achieved. The loss graph of the model trained after data augmentation is shown in Figure 10. There is a obvious improvement in overfitting [20] rate compared to previous training. It is seen that the loss ratio that converges rapidly to 0 at the batch level before is now decreasing at the epoch level. At the end of 100 epoch, the lowest loss value was reached as 0.25 in the 90th epoch.
New synthetic data were generated by making changes in the augmentation parameters to reduce the loss value even smaller. Zoom ratio decreased from 0.4 to 0.2, flip angle from 40 degrees to 10 degrees. The changes applied here will make less distortion of the lane information in the image and help achieve the goal of obtaining a more general model. The highest performing neural network model has been retrained with the new dataset, and the expected reduction in loss data occurred. Figure 11: Loss graph of the most successful CNN model.
As seen in the graph, the 95th epoch has reached 0.17 loss value. Thus, it was observed that the change made in the data augmentation parameters had an effect on the decrease of the loss value.

Obstacle Avoidance with Optical Flow
While an autonomous robot vehicle is moving in a constant velocity, the time until the collision can be found without any knowledge of the distance to be traveled or the velocity the robot is moving [8]. Calculating the time to collision is one of the practical optical flow uses. The optical flow knowledge is extracted from the image sequence taken from the Google camera placed in the robotic vehicle, and then the time until the robot reaches a particular area is determined. Calculated collision times are considered separately as collision times on the left and right of the image. Depending on whether the difference between the collision times of the left and right side is higher or lower than a certain threshold value, the vehicle is ordered to ignore the obstacle in front of it or to take action. Figure 12: The flow char representing obstacle avoidance procedure [22].
In this section, corner detection, and calculating optical flow are introduced in the first place. After, the focus of expansion and time to collision calculation procedures are explained. Next, the balance strategy and decision mechanism are explained in detail and the movement of the vehicle is presented according to the decision produced by the mechanism.

Corner Detection with Fast (Features from Accelerated Segment Test)
t is necessary to extract the optical flow information from the image sequence taken from the camera. To find the optical flow between consecutive frames, the motion of a pixel feature set should be tracked. Features in the image are points of interest that provide rich picture content information, and these points are not affected by intensity changes in the image [23].
Using the FAST [24] algorithm, which is known for its high performance in real-time images, corner detection performed in the real-time image sequence.

Calculating Optical Flow
For the optical flow to be computable, a selected point on the first image must change its location on the next image. While the selected point is moving, the shape of the light reflected on that point is constantly changing and optical flow occurs. In other words, the vehicle must be moving in order to obtain optical flow with the robot vehicle.
The most widely used Lucas-Kanade [25] method was used to calculate the optical flow between consecutive frames. The vector containing the vertices detected by the FAST corner detection algorithm is given to the function and it returns two vectors containing the (x, y) coordinates of the previous and next points. Now that the changing coordinates of a corner point in the previous and ongoing frame are known, an arrow can be drawn from the previous position to the next position. In other words, an arrow is drawn in the direction of the point's movement in consecutive frames if the tracked corner point exists (detected) in the next frame. Suppose (x 1 , y 1 ) and (x 2 , y 2 ) are the coordinates of the point in the previous and next squares.

Calculating Focus of Expansion
The motions of objects moving around are projected to the eyes of the observer as two fundamental motions. An optical flow field is formed as a result of the projection of the translation and rotation fundamental motions into an image plane [8]. differences between the previous and current position of the optical flow vector.
In matrix A it should be known as a i 0 = -v and a i 1 = u, and in matrix B, each value is obtained by b i = xv -yu. The focus of expansion is calculated using the least-squares method and inversion of the matrices.
The OpenCV library has the necessary functionality for the matrix inversion method. The function is given matrices A and b and an empty matrix of 2x1 dimensions. Additionally, the DECOMP_QR flag was added for QR decomposition [27].

Calculating Time to Collision
The most valuable information we can obtain for the robot vehicle to avoid obstacles is the determination of the contact time or the time until the collision. This information can be found without requiring any information about the distance to travel or the velocity the robot is moving [8].
The studies carried out up to this stage were to obtain information to be used in the TTC because when calculating the TTC, optical flow vectors and FOE are required. Let p = (x, y) be the x and y positions of an optical flow vector and FOE = (x, y) be the x and y positions of a focus of expansion. Let v = (u, v) be the x and y coordinate differences between the previous and current positions of the optical flow vector.
The  After calculating the TTC of each flow vector, it is collected at one of the closest A matrix points in the image. In matrix B, the number of TTCs collected at each matrix point is stored.
ttc i is the time until the collision of a flow vector, and a in matrix B is the number of flow vectors. Using matrices A and b, the average TTC for each matrix point can be calculated.

Balance Calculation and Decision Mechanism
The basic idea is that when the robot is in motion, close objects move faster than farther objects on the retina. Also, closer objects cover the field of view more, causing greater optical flows.
In the region where the optical flow is greater, the collision time is low and the robot vehicle must go to the other side and avoid from the obstacle. Kachluche Souhila and Achour Karim in their article [22], they present a different perspective in which the robot car moves away from the side where there is greater optical flow.
The image taken from the camera is divided into two parts to give the vehicle balance. Balance can be achieved by minimizing the difference between collision times on the left and right side of the vehicle. Figure 19: Dividing the image into two parts.
The following control formula is used to calculate the difference between collision times on the left and right side of the vehicle.
Here (F L -F R ) is the difference between the forces on both sides of the robot body and TTC is the average of the collision time in the visual half-field on one side. The difference between the forces calculated in equation 8 is a linear number and varies between 0 and 1.The balance mechanism was applied to the robot vehicle. Due to the environmental conditions of the experimental environment, the threshold value of the difference between the left and right TTC was determined as 0.5. In cases where the threshold value is exceeded, the robot vehicle will be given the necessary rotation order. As shown in Figure 21, the left collision time is calculated as (3895.2), the right collision time (101820.7) and the difference in forces (0.9) were found. It is decided to turn right, because the left collision time is less than the right, and the difference in forces is greater than 0.5. In Figure 21, it can be seen that the robot vehicle is given a forward motion command. Left collision time (993.2), right collision time (339.8) and difference in forces (0.1) were found.

Conclusion
This article presented develop lane tracking and obstacle avoidance capabilities with different methods and solutions for an autonomous driving system on an experimental vehicle.
Training of the neural network model was done with simplified images. During the test stage, the images taken from the camera should be similar to the images used in the training stage of the neural network. The similarity emphasized here is that the images are in the same color space and simplified. For this reason, the images taken from the camera during the test stage are instantly simplified and then transmitted to the neural network. The prepared neural network model produces 0, 1, and 2 as output. These correspond to the values for FLAT, MEDIUM, and SHARP, respectively. Since these labels are obtained by simplifying the rotation angle of the servo, the rotational motion is provided by converting the predictions back to the servo angle with the help of an algorithm.
Neural network models were tested on the experimental map and the output analyzed. Although the first model did not apply data augmentation and was less trained than other models, it achieved 50trained after hyper parameter adjustment and data augmentation was observed to complete the map completely. Considering the light conditions on the map and the external environmental factors detected by the camera, it has been observed that the data augmentation process eliminates this problem and helps the vehicle to complete the map under different light and environmental conditions.
Time to Collision, which is the target problem of obstacle avoidance, was calculated. For this purpose, corner detection, optical flow, and focus of expansion were calculated, respectively. Finally, balance calculation were made for the left and right body of the robot vehicle and a decision mechanism was created to avoid obstacles.