1. Introduction
Amidst the rapid advance of smart technologies, their integration into agriculture to counter labor scarcity and an aging workforce has become critical. Contemporary crop management practices, ranging from surveillance to irrigation and fertilization, still rely predominantly on manual effort. This dependency not only demands continuous human intervention but also heightens the risk of human error, which can lead to considerable agricultural losses. The diversity of farmland sizes calls for appropriately scaled equipment: investing in oversized machinery carries costs beyond the purchase price, including insurance and depreciation, and thus inflates operational expenses. The proven efficacy of automation in performing monotonous tasks has driven its adoption across many sectors, and its applicability in agriculture is equally broad, encompassing activities such as weeding and harvesting [1]. Traditional small-scale agricultural robots navigated on the basis of sensory feedback and were suited to structured environments [2]. However, with the evolution of precision agriculture, technologies such as the Real-Time Kinematic (RTK)-assisted Global Navigation Satellite System (RTK-GNSS) and machine vision have assumed central roles in navigation automation. RTK-GNSS provides precise positioning, velocity, and timing measurements, aiding the careful choreography of robot trajectories [3,4]. This precision allows task sequences to be preprogrammed in familiar terrain, promoting autonomous fieldwork and refined error mitigation strategies [5].
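To make the role of RTK-GNSS positioning concrete, the sketch below converts GNSS fixes to a local planar frame with an equirectangular approximation and measures the robot's deviation from a preprogrammed path segment. The coordinates and helper names (`to_local_xy`, `cross_track_error`) are illustrative only and are not part of the system described in this paper.

```python
import math

def to_local_xy(lat, lon, lat0, lon0):
    """Convert a GNSS fix to local planar coordinates (metres) around a reference
    point using an equirectangular approximation, adequate for field-sized areas."""
    R = 6378137.0  # WGS-84 equatorial radius in metres
    x = math.radians(lon - lon0) * R * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * R
    return x, y

def cross_track_error(p, a, b):
    """Signed lateral distance (m) of point p from segment a->b;
    positive when p lies to the right of the direction of travel."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    return ((px - ax) * dy - (py - ay) * dx) / math.hypot(dx, dy)

# Example: robot fix versus a straight ridge segment defined by two RTK waypoints.
lat0, lon0 = 24.0000, 120.0000                         # local origin (illustrative values)
wp_a = to_local_xy(24.00000, 120.00000, lat0, lon0)
wp_b = to_local_xy(24.00090, 120.00000, lat0, lon0)    # roughly 100 m due north
fix  = to_local_xy(24.00045, 120.00002, lat0, lon0)    # current RTK-GNSS fix
print(f"cross-track error: {cross_track_error(fix, wp_a, wp_b):+.3f} m")
```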
Moreover, machine vision offers a cost-effective, maintainable, and versatile alternative to human supervision for object recognition and tracking, and it has transformed image processing [6]. A recent trend is the proliferation of compact agricultural robots that use machine vision, particularly to identify navigational paths for field operations [7,8,9,10,11], signifying a paradigm shift in agricultural practice. Choi et al. developed a morphology-based approach for extracting guidance lines in rice fields, enabling autonomous weeding robots to operate without damaging crops [7]. Their method first converts images to grayscale and then applies Otsu thresholding and thinning to delineate plant edges. Experimental results showed robust performance, with the extracted guidance lines deviating by less than 1°. Suriyakoon et al. addressed the problem of indistinct navigation lines in cornfields [8]. Because of exposure variations and crop occlusion, traditional image processing often failed, leaving navigation lines indiscernible. By treating the soil visible in the images as guiding points through specialized image processing, their study showed that navigation lines could be identified accurately even when parts of the frame were obscured by corn leaves or shadows, or the horizon of the image was tilted.
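As a minimal illustration of the grayscale, Otsu-thresholding, and thinning steps described above, the following sketch uses scikit-image on a hypothetical image file; it demonstrates the class of pipeline only and does not reproduce the cited algorithm.

```python
import numpy as np
from skimage import io, color, filters, morphology

# Load a field image (hypothetical file name) and convert it to grayscale.
img = io.imread("rice_row.png")
gray = color.rgb2gray(img[..., :3]) if img.ndim == 3 else img

# Otsu's method picks a global threshold separating plants from soil.
thresh = filters.threshold_otsu(gray)
plant_mask = gray > thresh          # may need inverting depending on the scene

# Remove small specks, then thin the plant regions to one-pixel-wide curves
# from which a guidance line can later be fitted.
plant_mask = morphology.remove_small_objects(plant_mask, min_size=64)
skeleton = morphology.skeletonize(plant_mask)

ys, xs = np.nonzero(skeleton)
print(f"{len(xs)} skeleton pixels available for guidance-line fitting")
```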
Bonadies et al. proposed two methodologies for the autonomous guidance of unmanned vehicles in fields [9]. The first identified the edges of the paths between crops, while the second recognized the lateral crop lines. After identification, the Hough transform was used to detect these edges, and the median line between them served as the navigation path guiding the unmanned vehicle through field operations. A navigation path-fitting method for agricultural robots was introduced by Chen et al., improving the computational efficiency of the traditional Hough transform and addressing the precision limitations of the least squares method [10]. After segmenting plants from soil via grayscale processing, predictive points were marked through a regression equation, and path fitting was performed with the Hough transform. The results showed a maximum deviation of less than 0.5° for Hough-transform-based navigation paths, a marked improvement over the 10.25° average error of the least squares method, while the longest processing time was reduced to 17.92 ms, a saving of 35.20 ms.
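To make the contrast between the two line-fitting strategies tangible, this sketch fits a synthetic set of crop feature points with a least-squares regression and, after rasterizing the points, with OpenCV's standard Hough transform; all data and parameter values are invented for illustration.

```python
import numpy as np
import cv2

# Synthetic crop feature points along a slightly tilted row, with noise.
rng = np.random.default_rng(0)
ys = np.arange(40, 440, 10)
xs = 0.05 * ys + 160 + rng.normal(0, 1.0, ys.size)

# Least-squares fit: x = a*y + b (regressing x on y suits near-vertical rows).
a, b = np.polyfit(ys, xs, 1)
print(f"least squares: x = {a:.3f}*y + {b:.1f}")

# Hough transform fit: rasterize the points and vote for the dominant line.
mask = np.zeros((480, 640), dtype=np.uint8)
mask[ys.astype(int), np.clip(xs, 0, 639).astype(int)] = 255
lines = cv2.HoughLines(mask, rho=1, theta=np.pi / 180, threshold=10)
if lines is not None:
    rho, theta = lines[0][0]
    print(f"Hough: rho = {rho:.1f}, theta = {np.degrees(theta):.1f} deg")
```

In the comparison cited above, the Hough transform gave the lower maximum deviation, at the cost of heavier computation than a direct least-squares fit.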
Ruan et al. proposed a classification algorithm that combines YOLO-R with density-based spatial clustering of applications with noise (DBSCAN) to determine the number of crop rows and the individual crops within each row [12]. Finally, a least squares fit is applied to the crop row lines, achieving a crop row recognition rate of at least 89.8%. For guidance line recognition, the two predominant methods are the Hough transform and the least squares method [13]. The Hough transform stands out for its robustness to interference and its proficiency in extracting edge lines from which navigation paths can be derived, whereas the least squares method offers a mathematical approach for fitting navigation paths with high precision. Both methods, however, are sensitive to changes in illumination, which affect color, brightness, saturation, contrast, reflections, shadows, and noise; such variability can hinder the accurate extraction of guidance lines. Moreover, fluctuating wind in the field can set plants in motion, blurring images and causing crop center points to be misidentified [14,15,16,17]. With advances in high-speed computing, deep learning has gained traction for navigation line extraction [18,19]. Hu and Huang proposed a crop row centerline extraction method based on an enhanced Tiny-YOLOv4, in which MobileNetv3 replaces the original CSPDarknet53 backbone of the YOLOv4 model [18]. The image regions within the detection boxes undergo binarization and mean filtering to extract crop feature points, and the final centerline is fitted with the least squares method. Furthermore, de Silva et al. introduced a U-Net deep learning model capable of detecting crops under favorable weather, good lighting, minimal weeds, and neat, uninterrupted planting rows [19]; the Hough transform is then applied to identify the guidance lines. These advances underscore the ongoing innovation in agricultural technology, with deep learning and advanced image processing leading improvements in precision and efficiency in the field.
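The detect-then-fit idea in these studies can be sketched as follows: bounding boxes from some crop detector are reduced to centre points and a centerline is fitted by least squares. The detections and the heading/offset computation below are fabricated for illustration and are not the method of any cited work.

```python
import numpy as np

# Hypothetical crop detections as (x_min, y_min, x_max, y_max) boxes in pixels.
boxes = np.array([
    [300, 60, 340, 110],
    [306, 150, 348, 205],
    [312, 245, 352, 300],
    [318, 340, 360, 395],
    [322, 430, 366, 470],
], dtype=float)

# Reduce each detection to a feature point (box centre).
cx = (boxes[:, 0] + boxes[:, 2]) / 2
cy = (boxes[:, 1] + boxes[:, 3]) / 2

# Least-squares centerline x = a*y + b through the feature points.
a, b = np.polyfit(cy, cx, 1)

# Angle of the fitted row relative to the image's vertical centerline and the
# lateral offset at mid-height of a 640x480 frame: the kind of cue a steering
# controller can consume.
angle_deg = np.degrees(np.arctan(a))
offset_px = (a * 240 + b) - 320
print(f"row angle: {angle_deg:+.2f} deg, lateral offset: {offset_px:+.1f} px")
```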
Deep learning methods are also frequently employed in robotics for the identification of weeds and crops, and several studies highlight their effectiveness in precision agriculture. Hu et al. used the YOLOv4 model to detect 12 types of weeds in rice fields, reporting a detection accuracy of 97%, a recall of 81%, an F1-score of 0.89, and an average detection time of 377 ms [20]. Ruigrok et al. focused on applying a weed detection algorithm to plant spraying [21]. Using a YOLOv3 model trained for this purpose, they accurately detected 83% of potato plants; field tests demonstrated effective control of 96% of weeds, with only a 3% error rate in crop termination. Chang et al. presented a weed removal robot that employs artificial intelligence techniques [22]. The YOLOv3 model was used for weed identification, achieving a weed recognition rate of 90.7%, and the weeding robot removed weeds with an efficiency of 88.6% at a speed of 15 cm per minute. Wang et al. applied YOLOv5 to weed detection, with their model achieving an accuracy of 95% and a recall of 90% [23]. Chen et al. employed the YOLO-sesame model to identify weeds and crops in sesame fields, achieving a mean average precision (mAP) of 96.1% at a frame rate of 36.8 frames per second (fps) [24]. Umar Farooq et al. introduced a lightweight deep learning model, Tiny-YOLOv4, for weed detection, addressing the trade-off between algorithmic cost-effectiveness and performance [25].
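Since these studies report precision, recall, and F1-scores, a brief reminder of how those metrics follow from detection counts may be useful; the counts in the example are invented.

```python
def detection_metrics(tp: int, fp: int, fn: int):
    """Precision, recall, and F1-score from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1

# Illustrative counts for a single weed-detection run (not from any cited study).
p, r, f1 = detection_metrics(tp=168, fp=14, fn=32)
print(f"precision={p:.3f}, recall={r:.3f}, F1={f1:.3f}")
```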
In a related study, Ruigrok et al. used a YOLOv3 model for weed detection, training it on image data from 20 different fields and testing it in five different arable fields [26]. The results indicated that increasing the variance of the training data while keeping the sample size constant can reduce the generalization error during detection. Numerous heading control methods have been reported in agricultural robotics. Suriyakoon et al. proposed a method for autonomous navigation in cornfields [27]. The approach combines image processing with Proportional-Integral-Derivative (PID) and fuzzy logic control, and MATLAB was used to simulate the robot's movement in the field with a speed differential model. Specifically, PID regulated the wheel speeds of the robot, while fuzzy logic controlled its heading, ensuring autonomous navigation in the cornfield. Qiu et al. introduced a four-wheel-drive agricultural mobile robot based on the Ackermann Steering Principle (ASP) [28]. A control model first estimated the steering angles of the front and rear wheels; the optimal position of the steering center was then determined from the minimal deviation of the inner steering angle; finally, the linear velocity of the robot was generated according to ASP principles. Experimental results showed that the enhanced model reduced overall motor energy consumption and minimized robot slippage during testing. Bonadies et al. developed an unmanned ground vehicle for autonomous, visually guided navigation in fields [29]. Combining PID with fuzzy logic control enables the vehicle to navigate farmland autonomously.
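The division of labour in these controllers, PID for wheel speed and fuzzy logic for heading, can be sketched as follows; the gains, membership breakpoints, rule set, and the weighted-average (Sugeno-style) defuzzification are illustrative assumptions rather than the parameters used in the cited work or in this paper.

```python
class PID:
    """Discrete PID controller for one wheel's speed loop."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, setpoint, measured):
        err = setpoint - measured
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv


def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


def fuzzy_steering(heading_err_deg):
    """Map heading error to a steering angle with three rules and a
    weighted-average defuzzification (illustrative memberships and outputs)."""
    rules = [
        (tri(heading_err_deg, -40.0, -20.0, 0.0), -10.0),  # large negative error
        (tri(heading_err_deg, -10.0,   0.0, 10.0),  0.0),  # near-zero error
        (tri(heading_err_deg,   0.0,  20.0, 40.0), +10.0), # large positive error
    ]
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0


# One control step: hold 0.5 m/s wheel speed and correct a 6-degree heading error.
speed_loop = PID(kp=1.2, ki=0.4, kd=0.05, dt=0.05)
print("wheel command:", round(speed_loop.update(setpoint=0.5, measured=0.42), 3))
print("steering angle:", round(fuzzy_steering(6.0), 2), "deg")
```

The controller described later in this paper instead uses Mamdani max-min inference with center-of-gravity defuzzification (Figures 10-12); the sketch above simplifies that scheme for brevity.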
Tu et al. designed a sliding mode controller (SMC) with an inverse calculation method based on kinematic, front-wheel steering, and coordinated steering models to enhance system robustness [30]. Tests in various environments demonstrated the stability and robustness of the integrated controller in trajectory tracking. Zhang et al. proposed a state-feedback navigation control system based on the Ackermann steering model for a single-cylinder diesel tracked vehicle [31]. The system combines the Ackermann steering control model with a PWM-regulated proportional steering approach; path points are delineated with RTK-GNSS and smoothed by cubic spline interpolation. An effective guidance and control system for agricultural robots must execute tasks such as spraying and weeding precisely, and employing machine vision integrated with deep learning to identify distinct agronomic elements, such as crops, proves to be a promising strategy for navigational guidance.
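For the Ackermann-based steering and the spline-smoothed RTK-GNSS paths mentioned above, the sketch below shows the basic front-wheel geometry and a cubic-spline pass over a few waypoints. The dimensions, turning radius, and waypoints are made up, and the formulation assumes the turning radius is measured at the centre of the rear axle.

```python
import math
import numpy as np
from scipy.interpolate import CubicSpline

def ackermann_front_angles(wheelbase, track, turn_radius):
    """Inner and outer front-wheel angles (deg) for a turn of the given radius,
    with the radius measured at the centre of the rear axle."""
    inner = math.atan(wheelbase / (turn_radius - track / 2))
    outer = math.atan(wheelbase / (turn_radius + track / 2))
    return math.degrees(inner), math.degrees(outer)

print("front-wheel angles:", ackermann_front_angles(wheelbase=0.8, track=0.6, turn_radius=2.0))

# Cubic-spline smoothing of sparse RTK-GNSS path points (local x/y in metres).
waypoints = np.array([[0.0, 0.0], [1.0, 0.2], [2.0, 0.1], [3.0, 0.4], [4.0, 0.3]])
s = np.linspace(0.0, 1.0, len(waypoints))        # uniform parameterisation of the path
spline_x = CubicSpline(s, waypoints[:, 0])
spline_y = CubicSpline(s, waypoints[:, 1])
dense = np.linspace(0.0, 1.0, 50)
smooth_path = np.column_stack([spline_x(dense), spline_y(dense)])
print("smoothed path samples:", smooth_path.shape)
```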
Nevertheless, a significant proportion of existing methods remain confined to offline simulation or laboratory experimentation, with little empirical evidence supporting their applicability in real-world agricultural operations. There is also a conspicuous lack of literature on the in-field testing and validation of integrated methodologies for steering control and task execution. Compounding these challenges, training datasets often rely on a single crop type, critically limiting the universality and adaptability of the recognition models. This study presents a machine vision-based method for the autonomous guidance and field operation of agricultural robots, leveraging deep learning for the guidance, control, and operational systems. The proposed system automatically identifies potential navigation paths on field ridges and modulates travel speed and heading angle with PID controllers and fuzzy logic. Adopting a YOLOv4-based architecture, the system is tailored to various field recognition tasks, including crop identification, drip irrigation line detection, field ridge recognition, weed detection, and crop nutrient deficiency identification. For field operations, the robot's spraying mechanism is designed to accurately dispense liquid fertilizer to crops showing nutrient deficiencies and herbicides to targeted weeds.
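The guidance-and-operation loop proposed in this study can be summarized, at a purely illustrative level, as follows; every name and value below (`Detection`, `fit_guidance_line`, the proportional steering gain) is a placeholder standing in for components detailed in Section 2, not an actual interface of the system.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    label: str
    box: Tuple[int, int, int, int]   # x_min, y_min, x_max, y_max in pixels

def fit_guidance_line(dets: List[Detection]) -> float:
    """Illustrative stand-in: the average horizontal centre of ridge/crop boxes
    gives a lateral offset (px) from the centreline of a 640-pixel-wide image."""
    xs = [(d.box[0] + d.box[2]) / 2 for d in dets if d.label in ("ridge", "crop")]
    return (sum(xs) / len(xs) - 320.0) if xs else 0.0

def navigation_step(dets: List[Detection]) -> Tuple[float, List[Detection]]:
    """One loop iteration: derive a steering cue and pick spray targets."""
    lateral_offset = fit_guidance_line(dets)
    steer_cmd = -0.05 * lateral_offset          # placeholder proportional steering
    targets = [d for d in dets if d.label in ("weed", "nutrient_deficiency")]
    return steer_cmd, targets

# Fabricated detections for one frame.
frame_dets = [
    Detection("ridge", (280, 0, 380, 480)),
    Detection("crop", (300, 200, 360, 260)),
    Detection("weed", (150, 320, 190, 360)),
]
steer, spray_targets = navigation_step(frame_dets)
print(f"steer command: {steer:+.2f}, spray targets: {[d.label for d in spray_targets]}")
```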
The organization of this paper is as follows: Section 2 presents the methodology, including the configuration of the modules inside the robot, the motion model and steering modes of the four-wheeled mobile robot, path planning during field operations, the visual guidance method for generating field row lines, and the control methods for robot speed and heading. Section 3 discusses the experimental results, including tests of the robot's autonomous guidance, weed and nutrient-deficient crop identification, and performance tests of the spraying system. Finally, Section 4 concludes the paper and summarizes the key findings of the study.
Figure 1.
The appearance and functional module configuration of the robot. ➊ Two RTK-GNSS positioning modules; ➋ controller box; ➌ spraying module; ➍ electric cylinders; ➎ webcam; ➏ battery; ➐ DC motors.
Figure 2.
Steering mechanism of the robot.
Figure 3.
The motion modes of the robot. (a) Linear motion mode; (b) rotation mode.
Figure 4.
Schematic of the path planning of the robot moving in the field. Black dots represent turning points or start/end points (ST/END), and dotted arrows represent movement paths.
Figure 5.
Schematic representation of single-point detection and line segment formation methods. (a) single point; (b) single line segment; (c) multiple line segments.
Figure 6.
The framework of the YOLOv4 model.
Figure 7.
Illustration of the prediction frame, the real target frame (ground truth), and the minimum enclosing frame that contains both.
Figure 8.
Block diagram of fuzzy logic controller.
Figure 9.
Membership function and mathematical expressions. (a) Triangular membership function; (b) ladder membership function.
Figure 10.
Mamdani model with max-min composition.
Figure 11.
The inference engine using Mamdani model.
Figure 12.
(a) Defuzzification (center of gravity method); (b) 3-D surface view of fuzzy rules for FLC.
Figure 13.
Leaf appearance due to nutritional deficiencies. (a) Sample 1; (b) sample 2.
Figure 14.
The results of object detection on the ridge and reference line detection. (a) Detection results for crops (orange boxes) and furrows (green boxes); (b) the central vertical line of the image (black), irrigation lines (blue), ridge lines (red), and crop lines (orange). The green boxes indicate object bounding boxes for crops, the drip irrigation belt, or the ridge.
Figure 15.
The robot movement trajectories (dark purple and light blue dots) received by the two sets of receivers during different time segments. (a) 9 AM to 11 AM (weather: sunny and cloudy); (b) 12 PM to 2 PM (weather: sunny); (c) 3 PM to 5 PM (weather: cloudy). Each grid square measures 1 m × 1 m. The green dot marks the starting point, which is also the end point.
Figure 16.
Velocity control with PID (morning segments). (a) Changes in the rotational velocity of the four wheels while moving in a straight line (blue area) and changing furrows (orange area); (b) changes in the velocity of the four wheels during straight-line movement (enlargement of Figure 16a, 8 s to 268 s).
Figure 17.
Changes in heading angle on different types of guidance lines (morning segment). The angular variation in the mean heading angle is approximately 1 degree (green line).
Figure 18.
The crops at the top, middle and bottom of the image are not parallel to the center vertical line of the ridge. Irrigation line (blue color); crop line (orange color); ridge line (red color).
Figure 19.
Comparison of irrigation line recognition performance at different robot movement speeds.
Figure 20.
The feature recognition results for weeds and crop nutrient deficiency symptoms on the ridge. (a) Weeds (in green boxes); (b) nutrient deficiency symptoms (in purple boxes).
Figure 21.
Recognition rates of weeds and crop nutrient deficiencies in different time segments (a) and under different weather conditions (b).
Figure 22.
The result of spraying performed by the spray module (brown color).
Figure 23.
Comparison of spraying performance at different time segments.
Table 1.
Definition of PID parameters.
Table 2.
Parameters of the membership functions of the input and output variables in the fuzzy logic controller. Crisp intervals are listed for the triangular [α, γ, β] and ladder membership functions together with their linguistic labels.

| Input: Heading Angle (θ) | | | Input: ( ) | | | Output: Steering Angle (δ) | | |
| Triangular | Ladder | Label | Triangular | Ladder | Label | Triangular | Ladder | Label |
| – | [-100, -100, -20, 0] | LO | – | [-100, -100, -25, 0] | N | – | [-17, -17, -7, 0] | L |
| [-10, 0, 10] | – | M | [-20, 0, 20] | – | Z | [-5, 0, 5] | – | M |
| – | [0, 20, 100, 100] | RO | – | [0, 25, 100, 100] | P | – | [0, 7, 17, 17] | R |
Table 3.
The performance of the YOLOv4 model in detecting objects of different categories.

| Type | Average precision (%) | Recall (%) | F1-score (%) |
| Black drip irrigation belt | 99.2 | 99.0 | 96.0 |
| Crop | 99.1 | 99.0 | 96.3 |
| Ridge | 98.5 | 99.0 | 96.1 |
| Crop with nutritional deficiencies | 90.0 | 81.3 | 85.4 |
| Weed | 91.2 | 84.2 | 88.5 |