Preprint Article (this version is not peer-reviewed)

Design and Implementation of an Autonomous Surgical Robotic Aspirator

Submitted: 01 April 2026
Posted: 02 April 2026
Abstract
Robotic assistance in minimally invasive surgery has significantly improved precision and dexterity; however, many supportive tasks, such as blood aspiration, still rely on manual operation. This work presents the design and implementation of an autonomous robotic aspirator capable of detecting and removing intraoperative bleeding without continuous human intervention. The proposed system integrates a perception module based on a convolutional neural network for real-time blood segmentation, a task planner for the execution of high-level actions, and a control strategy based on artificial potential fields for autonomous navigation. Additionally, a mixed-reality human–robot interaction interface is incorporated to enable system supervision and seamless transition to teleoperation when required. The system was experimentally validated with a set of in-vitro experiments under three representative bleeding scenarios, evaluating four suction strategies that differ in the computation method used for target selection. Results demonstrate fast reaction times (below 0.04 s) and high blood removal rates (above 80% in all cases). The comparative analysis reveals that the performance of the suction strategies is scenario-dependent and highlights a trade-off between suction efficiency and removed area. These findings support the feasibility of autonomous robotic aspiration and provide insights into the design of adaptive strategies for surgical assistance, contributing toward increased autonomy and improved workflow efficiency in minimally invasive procedures.
Subject: Engineering - Bioengineering

1. Introduction

Robotic assistance has become a fundamental component of modern minimally invasive surgery, enabling enhanced precision, tremor filtering, and improved dexterity compared to conventional laparoscopy. Robotic platforms such as the da Vinci Surgical System have achieved widespread clinical adoption, with more than 8,600 systems installed worldwide and over 14 million procedures performed to date. Despite these advances, the predominant paradigm in robotic surgery remains master–slave teleoperation, in which the surgeon directly controls the surgical instruments while repetitive supportive tasks are still performed manually. For this reason, a growing research trend in medical robotics focuses on introducing increasing levels of autonomy into surgical assistants in order to reduce surgeon workload and improve workflow efficiency [1].
Early efforts in surgical automation have primarily addressed supportive tasks that exhibit a high degree of standardization. Among them, endoscope guidance has received considerable attention, as it is a simple task that can reduce surgeon cognitive load while maintaining a stable visual field. Several approaches have been proposed based on explicit surgeon commands, including voice interfaces [2], head-motion tracking [3], and gaze-based control [4]. Although these solutions eliminate the need for a human assistant, they still require continuous attention from the surgeon to guide the camera. More recently, research has explored camera guidance approaches in which the robot directly analyzes the surgical scene. Vision-based strategies that track surgical instruments have demonstrated the potential of this paradigm by endowing the robot with greater autonomy without continuous supervision by the surgeon [5,6,7].
Other surgical tasks that have been subject to automation are those common to a wide range of surgical procedures, and that exhibit a high degree of standardization, such as organ or tissue retraction [8,9,10], suturing [11,12,13], and knot tying [14]. These works demonstrate that automating supportive tasks can significantly streamline the surgical workflow.
Within this context, intraoperative blood aspiration represents a particularly suitable candidate for automation. During minimally invasive procedures, the accumulation of blood can rapidly degrade the visual field and obscure critical anatomical structures, requiring a human assistant to manually operate a suction tool to remove blood. This manual intervention introduces variability and coordination challenges that an autonomous robotic counterpart could mitigate. Previous studies have demonstrated that semi-autonomous suction can improve surgical performance by preventing states of high cognitive demand for the main surgeon [15]. However, achieving autonomous blood aspiration requires addressing several technological challenges: robust real-time blood detection, a safe control strategy, and efficient trajectory generation to optimize suction.
Several works have addressed automated blood removal. Richter et al. proposed an autonomous suction framework for detecting and removing blood using the da Vinci Research Kit, incorporating autonomous trajectory generation based on temporal information (pixel age) to prioritize active bleeding regions [16]. Ou et al. tested a reinforcement learning strategy for rinsing and cleaning surgical fields [17]. Their approach is based on customized simulation environments, where two autonomous agents are trained using robot learning approaches and transferred to the real world. Although the results are very promising, depth information, inherent to real surgical scenarios, is not considered in this work. Barragan et al. studied the surgeon workload with a semi-autonomous suction robotic assistant [15]. They computed the centroids and areas of detected blood blobs using a fully convolutional model. Then, a straight-line trajectory from the current position to the computed target position was executed by the robot. In this work, collision avoidance is not directly addressed; instead, the surgeon is informed of the robot's target position so that collisions can be avoided.
From the perception perspective, automatic bleeding detection in endoscopic imagery has significantly advanced in recent years. Early approaches relied on color filtering and handcrafted features [18], whereas modern methods predominantly employ deep learning models. Convolutional neural networks (CNNs) have demonstrated strong performance in detecting bleeding in laparoscopic videos. Horita et al. developed a YOLOv7-based detector capable of identifying active bleeding in real time during laparoscopic procedures [19]. Similarly, Hua et al. have proposed a hybrid spatiotemporal architecture combining RGB images and optical flow to improve robustness in dynamic surgical scenes [20]. Rabbani et al. have also proposed a ResNet-50-based space-time memory network with positional encoding for video-based bleeding source segmentation via adversarial domain adaptation between real and synthetic data [21].
In addition to perception, autonomous navigation of surgical instruments poses unique challenges due to the constraints of laparoscopic manipulation. Learning-based motion planning strategies, such as Deep Reinforcement Learning (DRL), have been explored for complex robotic manipulation tasks [22,23]. However, these approaches typically require large datasets and extensive training procedures, and they often lack the transparency and predictability required for safety-critical medical applications. Alternatively, deterministic motion planning techniques are attractive alternatives for surgical robotics. Among them, Artificial Potential Fields (APF), originally formulated by Khatib [24], provide a computationally efficient method for reactive navigation by combining attractive forces toward the goal and repulsive forces from obstacles.
The APF algorithm is based on the generation of artificial potential fields in Cartesian space, where the goal location induces an attractive force and obstacles generate repulsive forces that influence the robot motion. This technique is widely applied in robotic applications for obstacle avoidance [25,26], providing real-time low-level path planning and control. APF-based strategies have also demonstrated strong performance in scenarios involving dual-arm manipulators, enabling stabilized motion planning with smoother trajectories and optimized spatial separation [27]. These characteristics make APFs particularly suitable for MIS applications, where instruments operate within a confined workspace and unexpected collisions between instruments or with surrounding anatomical structures may occur. In this context, Tang et al. proposed an APF method for master-slave surgical robots in which virtual fixtures are automatically generated based on marked points by the surgeon before the intervention [28]. Similarly, Hao et al. proposed a novel path planning algorithm for surgical robots based on APFs to guide the tool to the goal position and a primal-dual neural network to minimize the angular velocity [29].
Motivated by these challenges, this work proposes an autonomous surgical robotic aspirator with a mixed-reality HRI interface for human supervision. An overview of this scenario is illustrated in Figure 1, showing a conventional minimally invasive procedure augmented with a robotic aspirator for autonomous blood removal. The system consists of a suction tool attached to an external robot, enabling autonomous control of the instrument. Supervision of the robot’s performance is done through a human-robot interaction (HRI) interface, consisting of a virtual reality (VR) headset for visualizing the intraoperative endoscopic image. Relevant information regarding the intervention and the performance of the robotic aspirator is overlaid onto the endoscopic view within a mixed-reality (MR) environment. This concept is applicable for either manual or robot-assisted management of the surgical tools, controlled by the main surgeon.
Thus, the main contributions of this paper are:
  • The design and implementation of a unified framework for autonomous surgical blood aspiration, integrating perception, a task planner for high-level actions, a navigation controller based on artificial potential fields, together with a mixed-reality human-robot interaction interface for human supervision and teleoperation if required.
  • Analysis of different suction strategies based on centroid-based computation methods for the target selection, including a novel evaluation of their spatial discrepancy and its impact on the robotic navigation behavior.
  • An extensive experimental validation under multiple representative bleeding scenarios, providing a systematic comparison of four centroid-based strategies in terms of reaction time, suction efficiency, and removal performance.
The paper is organized as follows. Section 2 describes the overall system architecture and details the bleeding detection and autonomous navigation algorithms. This section concludes with the software and hardware setup used for the validation of the system, along with the experimental design. Section 3 analyzes the experimental results and a discussion is presented in Section 4. Finally, Section 5 summarizes the conclusions of the work.

2. Material and Methods

Figure 2 shows the overall system architecture and workflow of the autonomous robotic aspirator proposed in this work. First, endoscopic images are processed in the perception layer, which performs blood segmentation using a pretrained convolutional neural network (CNN) that outputs a mask of the detected blood regions. Then, region analysis is performed to compute the following blood descriptors for each region: (1) the area, used to filter small blood regions and to control suction deactivation, (2) the centroid, which is used as a candidate target for tool navigation, and (3) a temporal persistence map, which represents the age of the blood and provides an indication of the blood flow.
Based on this information, the task planner generates the high-level navigation and suction actions. The mode selector module allows switching between autonomous navigation (task planner input) and teleoperation (haptic controller input) in case of emergency. Navigation towards a target position is implemented through a local planner based on an Artificial Potential Field (APF) approach, which generates collision-free trajectories towards the target blood region while avoiding interactions with other surgical instruments and anatomical structures, such as organs or suturing needles. The robot's low-level controller ensures that the tool motion satisfies the remote-center-of-motion (RCM) constraint imposed by the fulcrum point. Once the tool reaches the target position, the suction control module activates the suction tool. The system then monitors the area of the target region and deactivates suction once the blood has been removed.
To ensure safety, human supervision is provided through a human–robot interaction (HRI) interface. In this work, we propose a mixed-reality (MR) supervisory console based on a virtual reality (VR) headset and a haptic controller. Within the headset, the virtual environment is replaced by a real-time video stream captured by the laparoscopic endoscope, allowing the operator to directly visualize the intraoperative scene. This view is augmented with overlaid information about the scene (detected blood regions) and system performance (triggered high-level actions). The interface enables both passive supervision and direct teleoperation. Under normal operation, the robotic aspirator performs the task autonomously while the human assistant supervises its behavior through the HRI interface. If necessary, the operator can take direct control of the instrument by switching to teleoperation mode using a button on the haptic controller. In this mode, instrument motion is controlled through the hand motions of the controller, allowing intuitive manipulation of the tool.

2.1. Blood Segmentation and Region Analysis

The perception pipeline is summarized in Figure 3. For blood segmentation, we employ a CNN based on the U-Net architecture, a well-established model for medical image segmentation. This architecture is particularly suitable due to its ability to capture both global and local features, enabling precise blood segmentation. For each incoming endoscopic frame $I_t$, the CNN model produces a pixel-wise probability map indicating the likelihood of blood presence. By applying a predefined threshold, this probability map is converted into a binary mask $M_t$ representing the segmented blood regions.
This binary mask is then analyzed to extract descriptive features for each detected blood region, which are aggregated into a blood descriptor tuple $D_r$ as:
$$D_r = (r,\, A_r,\, {}^{C}P_r,\, H_r)$$
where $r$ denotes the region identifier, $A_r$ is the area of the region, ${}^{C}P_r$ represents the 3D coordinates of its centroid with respect to the camera, and $H_r$ is the temporal persistence descriptor encoding the age of the region pixels.
First, to improve robustness against noise and small artifacts, area filtering is applied to the mask: regions whose area is smaller than a predefined threshold $A_{th}$ are discarded to avoid spurious detections. In addition to spatial information, the algorithm maintains a temporal persistence map $H_t(i)$, which accumulates the number of consecutive frames in which each pixel $i$ is classified as blood. This allows the system to estimate the temporal age of bleeding regions and distinguish persistent bleeding from transient artifacts. Then, the maximum persistence value within each region is computed as:
$$H_r = \max_{i \in r} H_t(i)$$
This value is used by the task planner to prioritize blood regions in the presence of simultaneous bleeding.
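The region analysis above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a boolean segmentation mask and uses a simple BFS-based connected-component labeling as a stand-in for an OpenCV or SciPy routine; the area threshold is arbitrary.

```python
import numpy as np
from collections import deque

def update_persistence(H_prev, mask):
    # Pixels still classified as blood age by one frame; all others reset to 0.
    return np.where(mask, H_prev + 1, 0)

def label_regions(mask):
    # Minimal 4-connected component labeling (stand-in for cv2/scipy).
    labels = np.zeros(mask.shape, dtype=int)
    n = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        n += 1
        labels[seed] = n
        q = deque([seed])
        while q:
            y, x = q.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = n
                    q.append((ny, nx))
    return labels, n

def region_descriptors(mask, H, area_th=4):
    # Build D_r = (r, A_r, 2D centroid in pixels, H_r) per region,
    # discarding regions smaller than the area threshold A_th.
    labels, n = label_regions(mask)
    descriptors = []
    for r in range(1, n + 1):
        ys, xs = np.nonzero(labels == r)
        if ys.size < area_th:
            continue
        descriptors.append((r, int(ys.size),
                            (float(xs.mean()), float(ys.mean())),
                            int(H[labels == r].max())))
    return descriptors
```

Running `update_persistence` once per frame keeps the age map current, and `region_descriptors` then yields one tuple per surviving region.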
Finally, the 3D centroid of each blood region, ${}^{C}P_r$, is transformed into the robot frame and used as the target location for the navigation action. As this position evolves with blood flow and progressive removal, it defines the dynamic trajectory of the aspirator. Thus, different centroid definitions can be employed, resulting in different suction strategies: while a geometric centroid provides a purely spatial estimate, a persistence-weighted centroid, computed from pixel age, guides the target either towards the bleeding source (older pixels) or towards the region boundaries. In this work, we explore both approaches (see Section 2.5.2) to evaluate their impact on the suction strategy.
The centroid $p(x_c, y_c)$ is initially computed in the 2D image space, as shown in Figure 4, which illustrates the complete geometric model of the system. The 3D centroid ${}^{C}P_r = (x_r, y_r, z_r)$ is obtained by back-projecting the image coordinates using the pinhole camera model of an RGB-D camera as:
$$x_r = \frac{(x_c - o_x)\,Z}{f_x}, \qquad y_r = \frac{(y_c - o_y)\,Z}{f_y}, \qquad z_r = Z$$
where $Z$ corresponds to the depth value at the centroid location, $(f_x, f_y)$ are the focal lengths in pixels, and $(o_x, o_y)$ denote the optical center of the camera. The transformation from the camera frame $\{C\}$ to the robot base frame $\{R\}$ is detailed in Section 2.2.
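The back-projection above amounts to a one-line pinhole deprojection; a minimal sketch follows (the intrinsic values used in the example are placeholders, not those of the actual camera):

```python
def back_project(xc, yc, Z, fx, fy, ox, oy):
    # Back-project a 2D centroid (pixels) plus depth Z to 3D camera
    # coordinates using the pinhole model.
    xr = (xc - ox) * Z / fx
    yr = (yc - oy) * Z / fy
    return (xr, yr, Z)
```

A centroid at the optical center maps to the camera's optical axis, i.e., $(0, 0, Z)$.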

2.2. Task Planner

The task planner triggers the system's high-level actions, namely navigation toward a target position (NavigateTo(pos)) and suction activation or deactivation (Suction(on/off)). The task planner workflow is described in Algorithm 1. When the perception layer identifies blood regions in the image, whose descriptors are stored in $D_r$, the blood region with the largest area is selected as the target region to be removed ($r^*$), i.e.:
$$r^* = \arg\max_{r} A_r$$
While blood of the target region remains, i.e., $A_{r^*}$ is larger than a certain threshold $A_{stop}$, the system continuously updates the descriptor of the selected region $r^*$, i.e., $D_{r^*} = (r^*,\, A_{r^*},\, {}^{C}P_{r^*},\, H_{r^*})$, and computes the target position used to navigate the robot, $P_r$, as:
$$P_r = {}^{R}T_{C} \begin{bmatrix} {}^{C}P_{r^*} \\ 1 \end{bmatrix}$$
where ${}^{R}T_{C}$ is the homogeneous transformation between the camera reference frame $\{C\}$ and the robot base frame $\{R\}$. A navigation command is then issued to guide the suction tool toward $P_r$ (note that the low-level control is handled by the control layer, see Section 2.3). The system evaluates the distance between the tool tip, $P_{tool}$, and the target position and activates suction when $\|P_{tool} - P_r\| \le \delta$, where $\delta$ is a distance threshold. Once the target blood region has been removed, suction is deactivated. If no blood regions are detected, the robot is commanded to move to its idle position.
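One iteration of this planning loop can be sketched as follows. This is a simplified illustration, not the actual planner: the descriptor layout, threshold values, and action encoding are assumptions, and the 3D centroid ${}^{C}P_{r^*}$ is taken as already computed by the perception layer.

```python
import numpy as np

def target_in_robot_frame(T_RC, cP):
    # P_r = T_RC @ [cP; 1]: homogeneous transform from camera to robot frame.
    return (T_RC @ np.append(cP, 1.0))[:3]

def plan_step(descriptors, T_RC, p_tool, A_stop=20, delta=0.005):
    # descriptors: list of (r, A_r, cP_r, H_r) tuples from the perception layer.
    if not descriptors:
        return ("NavigateTo", "idle", "Suction", "off")
    r, A, cP, H = max(descriptors, key=lambda d: d[1])  # largest-area region
    if A <= A_stop:                                     # target region removed
        return ("NavigateTo", "idle", "Suction", "off")
    P_r = target_in_robot_frame(T_RC, np.asarray(cP))
    # Activate suction only once the tool tip is within delta of the target.
    suction = "on" if np.linalg.norm(p_tool - P_r) <= delta else "off"
    return ("NavigateTo", P_r, "Suction", suction)
```

Calling `plan_step` once per perception update reproduces the select-navigate-suction cycle of Algorithm 1.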
Algorithm 1: Task planner workflow

2.3. Autonomous Navigation Method

Figure 5 illustrates the implementation of the navigation action (NavigateTo(pos)). The local planner takes as input the target location of the blood region to be removed, $P_r$, and generates the desired reference position for the robot, $P_d$. The low-level controller then maps this reference into the corresponding robot end-effector position, $P_{EF}$, while enforcing the remote center of motion (RCM) constraint imposed by the fulcrum point, which is inherent to minimally invasive procedures. This autonomous navigation mode can be switched to teleoperation mode through the haptic controller of the HRI interface.
The local planner guides the suction tool toward the target position using Artificial Potential Fields (APF). This is a motion planning method in which the robot is guided by an artificial potential function defined over the workspace. Thus, the robot moves under the influence of a virtual force derived from this potential, which attracts it toward the goal and repels it away from obstacles.
Let $P(x, y, z)$ denote the current position of the suction tool tip, and $P_r(x_r, y_r, z_r)$ the desired target position. The attractive potential is defined as a quadratic function of the distance to the goal:
$$U_{att}(x, y, z) = \frac{1}{2} K_{att} \|P - P_r\|^2$$
where $K_{att}$ is a positive scalar gain that determines the strength of the attractive potential toward the goal. The repulsive potential associated with the $m$-th obstacle is defined as:
$$U_{rep}^{m}(x, y, z) = \begin{cases} \dfrac{1}{2} K_{rep} \left( \dfrac{1}{\rho_m} - \dfrac{1}{\rho_0} \right)^2, & \rho_m < \rho_0 \\ 0, & \rho_m \ge \rho_0 \end{cases}$$
where $K_{rep}$ is a positive gain that determines how strongly the robot is pushed away from obstacles, $\rho_m$ is the distance between the robot tool tip and the $m$-th obstacle, and $\rho_0$ is the obstacle influence radius. The total potential field is defined as the sum of the attractive and repulsive contributions:
$$U(x, y, z) = U_{att}(x, y, z) + \sum_{m} U_{rep}^{m}(x, y, z)$$
The robot motion is driven by the negative gradient of the potential field. In this work, a kinematic formulation is adopted, where the commanded velocity of the tool tip is defined as:
$$v_c(x, y, z) = -\nabla U(x, y, z)$$
Therefore, the desired position of the tool tip, $P_d$, commanded to the low-level controller of the robot is computed as:
$$P_d(k+1) = P_d(k) + v_c \, \Delta t$$
In practice, several known limitations of classical APF methods are alleviated in the proposed implementation. In particular, oscillatory behavior near the goal and excessive velocities are reduced by introducing a dead zone around the target and saturating the commanded velocity. Additionally, robustness against outdated visual references is improved through a timeout mechanism that prevents the robot from reacting to stale target updates. It should be noted, however, that the present formulation corresponds primarily to an attractive-field implementation, and therefore obstacle-induced local minima are not explicitly addressed through repulsive interactions in the current system.
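A minimal sketch of this velocity law, including the dead-zone and saturation safeguards just described, is given below. The gains and thresholds are illustrative, not the values used in the experiments, and the repulsive term of the general formulation is included for completeness.

```python
import numpy as np

def apf_velocity(P, P_r, obstacles, K_att=1.0, K_rep=1e-4, rho0=0.05,
                 dead_zone=0.002, v_max=0.02):
    # Commanded tool-tip velocity v_c = -grad U (attractive + repulsive).
    e = P_r - P
    if np.linalg.norm(e) < dead_zone:
        return np.zeros(3)                 # dead zone: no chattering at goal
    v = K_att * e                          # -grad of (1/2) K_att ||P - P_r||^2
    for P_m in obstacles:
        d = P - P_m
        rho = np.linalg.norm(d)
        if 0.0 < rho < rho0:               # no repulsion beyond rho0
            # -grad of (1/2) K_rep (1/rho - 1/rho0)^2
            v += K_rep * (1.0 / rho - 1.0 / rho0) / rho**2 * (d / rho)
    n = np.linalg.norm(v)
    return v if n <= v_max else v * (v_max / n)   # velocity saturation
```

Integrating this velocity over the control period reproduces the discrete update $P_d(k+1) = P_d(k) + v_c \, \Delta t$.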
Finally, the desired reference position is transformed into the corresponding command of the robot end-effector, $P_{EF}$, considering the RCM constraints imposed by the fulcrum position, $P_{fulcrum}$. Estimation of this position is implemented following the algorithm described in our previous work [30], which relies on the wrench, $W_{EF}$, measured by a force/torque sensor at the end-effector of the robot.

2.4. Experimental Setup

The experimental setup implemented in this work is illustrated in Figure 6. The robotic aspirator was implemented using a UR3 robot (Universal Robots) together with a conventional surgical aspirator donated by the Hospital Materno-Infantil of Málaga (Spain), which was modified to allow automatic activation and deactivation of the suction process.
To automate the suction process, a solenoid valve (model ST-DA 1/8 brass FKM, JP Fluid Control) controlled by a microcontroller (ESP32) was integrated into the system. The solenoid valve is connected on one side to the laparoscopic suction tool and on the other side to a suction pump (New Aspirate, CA-MI), allowing the suction flow to be automatically activated or deactivated through commands generated by the ESP32. To integrate the suction mechanism with the robotic manipulator, a dedicated tool adapter was designed. This component, shown in Figure 6(a), includes an internal cavity to house the solenoid valve and an outlet port for the suction tool. The rear part of the adapter is designed to be mechanically attached to the robot end effector, enabling the robot to manipulate the suction tool. Although the perception system has been evaluated using real surgical images (see Section 3.1), artificial bleeding was employed for laboratory experimentation. Bleeding was simulated using a custom-made fluid composed of water, milk, glycerin, and red dye, contained in a transparent reservoir.
As the vision system, an Intel RealSense D405 RGB-D camera (Intel Corp., USA) was used. This device provides both RGB images and depth information by combining two infrared sensors for stereo vision with an infrared projector. The camera offers a maximum RGB resolution of 1920 × 1080 pixels and a depth resolution of 1280 × 720 pixels, with a frame rate of up to 30 frames per second. The depth sensing range extends approximately from 0.2 m to 10 m, making it suitable for close-range perception tasks in the experimental surgical setup. The RGB-D camera is mounted on a second robot in order to facilitate the computation of the homogeneous transformation between the camera reference frame and the robot base frame, ${}^{R}T_{C}$, of Equation 5.
As shown in Figure 6(b), the supervisory interface was implemented using the Meta Quest 3 platform. The headset provides a per-eye resolution of 2064 × 1920 pixels and a refresh rate of 90 Hz, enabling real-time visualization of the surgical field. The handheld controllers provide six degrees of freedom and include inertial sensors and programmable buttons, which are used to trigger teleoperation and to enable aspirator suction in this mode. The scene information integrated into the HRI interface includes:
  • Navigation: on/off. This indicator informs the human assistant that a bleeding region has been detected and that the system has initiated tool navigation towards the target position.
  • Suction: on/off. This indicator is set to on when the suction tool is activated, and to off otherwise.
The software integration of the different modules of the system was implemented using the Robot Operating System (ROS) as the common middleware platform. The mixed-reality application that merges the image from the vision system and the scene information for the HRI interface was developed in the Unity environment and deployed on the Meta Quest 3 headset. The integration with the rest of the system was implemented using a ROS bridge server to subscribe to the system topics, and a ROSConnection plugin to publish the information from the haptic controller.

2.5. Evaluation Methodology

Evaluation of the autonomous robotic aspirator proposed in this work was carried out following a twofold approach: (i) assessment of the bleeding segmentation module using real surgical images, and (ii) evaluation of the overall system performance in an in-vitro scenario.

2.5.1. Blood Segmentation Evaluation

The blood segmentation model described in Section 2.1 was evaluated using the dataset presented in [21], which contains 750 images from real gynecological surgeries, annotated with bleeding masks by junior surgeons and reviewed by expert clinicians. The dataset was split into training, validation, and test sets with proportions of 80%, 10%, and 10%, respectively. To improve generalization, data augmentation techniques including horizontal flipping and image resizing were applied. All images and corresponding masks were resized to 256 × 256 pixels. Images were processed in RGB format, while masks were converted to grayscale and binarized. The network was trained using the Binary Cross-Entropy loss function and optimized with Adam. Training was conducted for 45 epochs with a batch size of 16. All experiments were performed on a system equipped with an NVIDIA Tesla V100 GPU (32 GB).
The performance of the segmentation model was evaluated using standard metrics in medical image analysis, including the Dice coefficient, Intersection over Union (IoU), precision, and recall. Since the model outputs a pixel-wise probability map, different binarization thresholds τ were analyzed to determine the optimal operating point.
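These metrics can be computed directly from the probability map and the ground-truth mask; a straightforward sketch follows (the smoothing constant `eps`, which avoids division by zero on empty masks, is an implementation choice of ours):

```python
import numpy as np

def seg_metrics(prob, gt, tau=0.5, eps=1e-8):
    # Binarize the probability map at threshold tau, then compare pixel-wise
    # against the boolean ground-truth mask gt.
    pred = prob >= tau
    tp = np.logical_and(pred, gt).sum()        # true positives
    fp = np.logical_and(pred, ~gt).sum()       # false positives
    fn = np.logical_and(~pred, gt).sum()       # false negatives
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return dice, iou, precision, recall
```

Sweeping `tau` over a grid and plotting the resulting Dice or IoU curve identifies the optimal operating point.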

2.5.2. Robotic Aspirator Evaluation

The performance and robustness of the autonomous robotic aspirator were evaluated under three distinct bleeding scenarios (see Figure 7). These scenarios, designed to emulate representative intraoperative conditions, were defined based on the relative position between the bleeding source and the resulting blood accumulation, which is influenced by gravity and the surface inclination of the bleeding container:
  • Source-centered accumulation (S1): the inclination of the blood container is such that blood remains localized in close proximity to the bleeding source. This condition represents a flat or near-horizontal surgical field where gravitational effects are minimal, allowing blood to pool around the origin of bleeding.
  • Downstream flow accumulation (S2): the blood container is inclined such that blood flows away from the bleeding source, accumulating at a distal location. This scenario mimics surgical conditions where the patient’s anatomy or positioning introduces a slope, such as in laparoscopic procedures or surgeries involving angled anatomical structures, where fluids tend to migrate due to gravity.
  • Bilateral flow distribution (S3): in this case, the bleeding source is positioned such that blood spreads semi-symmetrically to both sides of the source, creating multiple accumulation regions and a more complex spatial distribution of blood. This situation reflects cases where anatomical features or tissue geometries cause blood to bifurcate, such as around raised structures, cavities, or during procedures with uneven surfaces where fluid disperses in multiple directions.
For each bleeding scenario, four different suction strategies were implemented and compared, namely: geometric centroid (C1), source-oriented centroid (C2), front-oriented centroid (C3), and deepest-point target (C4). These strategies differ in the method used to compute the centroid of the detected blood region ($p(x_c, y_c)$ in Figure 4), which directly determines the target position of the suction tool. Let $B$ denote the set of pixels belonging to the bleeding region, with cardinality $N = |B|$. Let $(x_i, y_i)$ denote the image-plane coordinates of each pixel $i \in B$, and let $p = (x_c, y_c)$ denote the 2D target point. The computation of $(x_c, y_c)$ for each suction strategy is performed as follows:
  • Geometric centroid (C1): This method provides a global estimate of the spatial distribution of the bleeding region. The centroid is computed as:
    $$x_c = \frac{1}{N} \sum_{i \in B} x_i, \qquad y_c = \frac{1}{N} \sum_{i \in B} y_i$$
  • Source-oriented centroid (C2): This strategy exploits the temporal persistence map $H_t(i)$ to prioritize pixels in the vicinity of the bleeding source. Let $B_{core}$ denote the subset of persistent pixels:
    $$B_{core} = \{ i \in B \mid H_t(i) \ge P_p(H_t) \}$$
    where $P_p(H_t)$ denotes the $p$-th percentile of persistence values within the bleeding region ($p = 80$ in our implementation). The centroid is then computed as:
    $$x_c = \frac{1}{|B_{core}|} \sum_{i \in B_{core}} x_i, \qquad y_c = \frac{1}{|B_{core}|} \sum_{i \in B_{core}} y_i$$
  • Front-oriented centroid (C3): In contrast to the previous strategy, this method prioritizes pixels with lower persistence, which are assumed to correspond to the advancing bleeding front. Each pixel is assigned a weight $w(i)$ inversely proportional to its persistence value:
    $$w(i) = \frac{1}{H_t(i) + 1}$$
    The weighted centroid is then computed as:
    $$x_c = \frac{\sum_{i \in B} x_i \, w(i)}{\sum_{i \in B} w(i)}, \qquad y_c = \frac{\sum_{i \in B} y_i \, w(i)}{\sum_{i \in B} w(i)}$$
  • Deepest-point target (C4): This strategy selects the most interior point of the bleeding region by exploiting the distance transform of the segmented mask. Let $D(i)$ denote the distance from pixel $i \in B$ to the nearest boundary of the bleeding region. The target index $i^*$ is defined as:
    $$i^* = \arg\max_{i \in B} D(i)$$
    The corresponding image-plane coordinates are then given by:
    $$x_c = x_{i^*}, \qquad y_c = y_{i^*}$$
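The four target computations above can be sketched jointly in Python. This is a minimal illustration assuming a boolean mask and an integer persistence map; a BFS distance-to-boundary stands in for a proper distance transform (e.g., OpenCV's `distanceTransform`).

```python
import numpy as np
from collections import deque

def deepest_point(mask):
    # Multi-source BFS from the background: gives each blood pixel its
    # (Manhattan) distance to the region boundary, then picks the maximum.
    h, w = mask.shape
    dist = np.full(mask.shape, -1, dtype=int)
    q = deque()
    for y in range(h):
        for x in range(w):
            if not mask[y, x]:
                dist[y, x] = 0
                q.append((y, x))
    while q:
        y, x = q.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and dist[ny, nx] < 0:
                dist[ny, nx] = dist[y, x] + 1
                q.append((ny, nx))
    iy, ix = np.unravel_index(np.argmax(dist), dist.shape)
    return (int(ix), int(iy))

def strategy_targets(mask, H_t, p=80):
    # Compute the C1-C4 target points (x_c, y_c) from a boolean mask and
    # the per-pixel persistence map H_t.
    ys, xs = np.nonzero(mask)                        # pixel set B
    h = H_t[ys, xs].astype(float)
    c1 = (xs.mean(), ys.mean())                      # C1: geometric centroid
    core = h >= np.percentile(h, p)                  # C2: persistent core
    c2 = (xs[core].mean(), ys[core].mean())
    w = 1.0 / (h + 1.0)                              # C3: front-oriented weights
    c3 = ((xs * w).sum() / w.sum(), (ys * w).sum() / w.sum())
    c4 = deepest_point(mask)                         # C4: most interior pixel
    return c1, c2, c3, c4
```

With a uniform persistence map the first three targets coincide, while C4 still snaps to the most interior pixel of the region.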
Figure 8 illustrates the temporal evolution of the target position computation using the four centroid-based methods as the bleeding region progressively expands. This comparison highlights the distinct behaviors of the proposed strategies under dynamic conditions. From left to right, the sequence shows how each method responds to changes in the spatial distribution of blood over time. The geometric centroid (C1) remains centered within the overall bleeding region, providing a global representation of its distribution. The source-oriented centroid (C2) focuses on areas with sustained bleeding, remaining close to the bleeding source. In contrast, the front-oriented centroid (C3) shifts toward newly emerging regions, following the expansion of the bleeding front. Finally, the deepest-point target (C4) consistently identifies the most interior region of the accumulated blood, avoiding situations in which the estimated target falls outside the active bleeding flow, as may occur with other methods.
For each suction strategy, 20 independent trials were conducted in each bleeding scenario, with a trial defined as a single experimental run that begins with the manual injection of approximately 10 ml of artificial blood and ends when the suction process is completed. Considering the four strategies and three scenarios, this resulted in a total of 240 trials, with 80 trials per scenario (see Table 1). This design enables a comprehensive assessment of how centroid estimation methods perform under varying fluid dynamics and spatial distributions, which are critical factors in real surgical environments.
To quantitatively assess the performance of the proposed system, the following metrics were computed for each trial:
  • Reaction time (s): time elapsed between the detection of a blood region and the start of the robot's navigation.
  • Suction time (s): the duration required to complete the removal of the target bleeding region. To ensure comparability across trials with different initial blood distributions, suction time was normalized by multiplying it by the factor $A_{\mathrm{mean}}/A$, where $A_{\mathrm{mean}}$ represents the average blood area across trials within the same bleeding scenario, and $A$ is the maximum area within each trial.
  • Suction efficiency (pixels/s): defined as the rate of blood removal, computed as the maximum blood area within each trial, $A$, divided by the non-normalized suction time.
  • Removed area (%): represents the percentage of blood that was successfully removed upon completion of the suction process.
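The per-trial metrics above can be expressed compactly. The following is a minimal sketch assuming the blood area over the trial is logged as a pixel-count series; the function and argument names are illustrative, not the authors' implementation:

```python
def trial_metrics(reaction_time, suction_time, area_series, a_mean):
    """Compute per-trial performance metrics (illustrative sketch).

    reaction_time: s between blood detection and start of navigation.
    suction_time:  s to complete the removal (non-normalized).
    area_series:   blood area (pixels) sampled over the trial; its max is A.
    a_mean:        average maximum area over trials of the scenario (A_mean).
    """
    a_max = max(area_series)
    return {
        "reaction_time_s": reaction_time,
        # Normalized suction time: t * A_mean / A.
        "suction_time_s": suction_time * a_mean / a_max,
        # Efficiency uses the non-normalized suction time: A / t.
        "suction_efficiency_px_s": a_max / suction_time,
        # Removed area: fraction of the peak area cleared by trial end.
        "removed_area_pct": 100.0 * (1.0 - area_series[-1] / a_max),
    }
```

For example, a trial that peaks at 400 pixels of blood, ends with 50 residual pixels, and takes 2 s of suction yields an efficiency of 200 pixels/s and 87.5% removed area.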
To further analyze the differences between the centroid-based strategies, an additional geometric metric was defined to quantify the spatial discrepancy between target positions. Let $\mathbf{p}_i = (x_c^i, y_c^i)$ denote the 2D coordinates of the centroid obtained with strategy $C_i$. The pairwise distance between centroids $i$ and $j$ is defined as:
$$d_{ij} = \sqrt{(x_c^i - x_c^j)^2 + (y_c^i - y_c^j)^2}$$
This metric provides a quantitative measure of the spatial variability between centroid-based strategies. For each bleeding scenario, the reported distances correspond to the mean values computed over the 80 trials performed under that scenario.
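This pairwise-distance computation can be sketched as follows, assuming the per-strategy targets are available as 2D image coordinates (the function name and key format are illustrative):

```python
import math

def pairwise_centroid_distances(targets):
    """Pairwise Euclidean distances d_ij between strategy targets.

    targets: dict mapping strategy name (e.g. "C1") to an (x_c, y_c) tuple.
    Returns a dict keyed "d_12", "d_13", ... over all unordered pairs.
    """
    names = sorted(targets)
    out = {}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            (xa, ya), (xb, yb) = targets[names[a]], targets[names[b]]
            # d_ij = sqrt((x_i - x_j)^2 + (y_i - y_j)^2)
            out[f"d_{a + 1}{b + 1}"] = math.hypot(xa - xb, ya - yb)
    return out
```

Averaging these dictionaries over the 80 trials of a scenario produces rows of the form reported in Table 2.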

3. Results

3.1. Blood Segmentation Results

The performance of the bleeding segmentation model was evaluated on an independent test set held out from the gynecological image dataset used to train the model. In terms of pixel-wise prediction accuracy, the network achieved values of 98.9% and 98.1% on the validation and test sets, respectively, indicating a reliable distinction between bleeding and non-bleeding regions.
To further analyze the segmentation performance, the influence of the binarization threshold $\tau$ was evaluated. As shown in Figure 9, the Dice coefficient remains stable within the range $\tau \in [0.24, 0.27]$. Although the maximum Dice value is achieved at $\tau = 0.24$, a threshold of $\tau = 0.27$ was selected as it provides a better balance between precision and recall while maintaining comparable segmentation performance.
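A threshold analysis of this kind reduces to a simple sweep. The sketch below assumes the network output is a per-pixel probability map in [0, 1] and computes the Dice coefficient at each candidate $\tau$; the function names are illustrative, not the authors' code:

```python
import numpy as np

def dice(pred_bin, gt):
    """Dice coefficient between a binary prediction and boolean ground truth."""
    inter = np.logical_and(pred_bin, gt).sum()
    return 2.0 * inter / (pred_bin.sum() + gt.sum())

def sweep_threshold(prob_map, gt, taus):
    """Dice as a function of the binarization threshold tau.

    prob_map: H x W network output in [0, 1]; gt: H x W boolean mask.
    Returns a list of (tau, dice) pairs, as plotted in Figure 9.
    """
    return [(t, dice(prob_map >= t, gt)) for t in taus]
```

Raising $\tau$ trades recall for precision: fewer low-confidence pixels survive binarization, so borderline false positives drop first.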
The quantitative results obtained on the test dataset show that the model achieved a Dice coefficient of 0.686 and an IoU of 0.565, with precision and recall values of 0.72 and 0.73, respectively. These results indicate consistent detection of the main bleeding regions across different surgical scenes. During training, the Binary Cross-Entropy loss decreased steadily, while accuracy converged above 98% for both training and validation sets. The similar behavior observed in both cases indicates stable convergence and no significant overfitting.
Qualitative examples of the segmentation results are shown in Figure 10. The model successfully identifies bleeding regions under varying conditions, including changes in illumination, tissue appearance, and the presence of surgical instruments. Although minor inaccuracies are observed at region boundaries, the detected regions are sufficiently accurate for the extraction of the geometric descriptors required by the robotic control system. In addition to the quantitative evaluation, a representative visualization of the perception layer operating on real surgical images is provided in the accompanying multimedia material. The video illustrates the real-time performance of the blood segmentation module, where the predicted blood mask and the geometric centroid of each region are overlaid onto the endoscopic image.

3.2. Robotic Aspirator Performance

This section presents the experimental results obtained for the evaluation of the autonomous robotic aspirator. First, a geometric analysis of the centroid-based strategies is provided to assess the spatial differences between target positions. Then, the overall system performance is evaluated using quantitative metrics. Finally, the variability of the results is analyzed through distribution-based representations.
A complementary video is provided to qualitatively illustrate the performance of the complete system. The first part of the video presents the system behavior in a single bleeding source scenario, followed by an experiment with multiple bleeding sources, demonstrating the capability of the autonomous robotic aspirator to detect, prioritize, and remove multiple blood regions. For each experiment, both an external view of the full robotic setup and the real-time visualization delivered to the HRI interface are shown, enabling a comprehensive understanding of the system operation from both perspectives. Finally, the temporal evolution of the centroids computed using the four suction strategies is shown for a representative bleeding scenario as the blood flows and accumulates. This visualization highlights the distinct target selection behaviors of each method and their influence on the resulting navigation of the robotic aspirator.

3.2.1. Geometric Analysis of Centroid Strategies

Table 2 reports the average pairwise distances between centroid-based strategies for each bleeding scenario. The results show that the centroid strategies produce distinct target locations, with varying levels of spatial separation depending on the bleeding scenario. In all cases, non-zero distances are observed between strategies, confirming that each method leads to different navigation targets. Nevertheless, strategies C1 and C3 yield very similar target positions in all bleeding scenarios evaluated. Larger distances are observed between certain pairs of strategies, which will be discussed in the following section.

3.2.2. Quantitative Performance Evaluation

Table 3, Table 4 and Table 5 present the average values of the performance metrics, computed over the 80 trials conducted for each bleeding scenario, as defined in Section 2.5.2. The reaction time and removed area provide an overall indication of the system effectiveness, reflecting the responsiveness and the capability to remove the bleeding region. In contrast, suction time and suction efficiency are used to compare the performance of the four suction strategies across different scenarios.
The results show that all strategies achieve consistently low reaction times, with values below 0.04 s in all scenarios, indicating fast system response regardless of the selected method. Additionally, all strategies achieve a high removed area, with values ranging approximately from 80% to 94% across the evaluated scenarios, demonstrating the capability of the system to effectively remove the bleeding regions. It should be noted that full (100%) removal is not achieved, as the suction process is intentionally stopped when the remaining area falls below a predefined threshold ($A_{\mathrm{stop}} = 500$ pixels in the implemented algorithm) in order to avoid unnecessary tool motion and reduce oscillatory behavior around small residual regions.
In the source-centered accumulation scenario (S1), the lowest suction times and highest efficiency values are obtained with strategies C2 and C4, while C1 and C3 exhibit longer execution times and lower efficiency. However, C1 and C3 achieve higher removed area percentages. In the downstream flow accumulation scenario (S2), larger differences between strategies are observed. Strategy C4 achieves the lowest suction time and highest efficiency, whereas C1 and C2 present significantly longer execution times and lower efficiency values. In the bilateral flow distribution scenario (S3), performance variability increases across strategies. Strategy C1 achieves the lowest suction time and highest efficiency, while C2, C3, and C4 exhibit longer execution times. In terms of removed area, all strategies achieve high removal percentages, with C2 and C4 obtaining the highest values.

3.2.3. Distribution Analysis of Performance Metrics

Figure 11 and Figure 12 present the distribution of removed area (%) and suction efficiency (pixels/s), respectively, for the four centroid-based strategies across the three bleeding scenarios. Overall, the boxplots highlight differences in variability between centroid-based strategies, which are not fully captured by the average values reported in Table 3, Table 4 and Table 5.
Regarding the removed area, the distributions show generally high values for all strategies, with median values above 85% in most cases. In scenario S1, strategies C1 and C3 exhibit more concentrated distributions around higher values, whereas C2 and C4 present a wider spread. In scenario S2, all strategies show relatively compact distributions, although some outliers with lower removed area are observed, particularly for C4. In scenario S3, all strategies achieve high median values, with increased variability in C1.
For suction efficiency, greater variability is observed across both strategies and scenarios. In scenario S1, C2 presents the highest median efficiency with relatively low dispersion, while C3 shows lower median values and higher variability. In scenario S2, C4 achieves the highest efficiency values with a broader distribution. In scenario S3, the efficiency distributions are more dispersed across all strategies, with C1 showing higher median values compared to the others.

4. Discussion

The results obtained in this work highlight the impact of the centroid estimation strategy on the performance of the autonomous robotic aspirator. Although all strategies achieve high removed area values and low reaction times, significant differences are observed in suction time and efficiency depending on both the selected strategy and the bleeding scenario.
From the geometric analysis, the pairwise centroid distances reveal that different strategies generate distinct target positions, with consistent similarities observed between certain methods. In particular, strategies C1 and C3 exhibit the smallest spatial discrepancies across all scenarios, indicating similar target selection behavior. This geometric similarity does not directly translate into equivalent performance, as reflected in the quantitative results. This can be explained by the fact that the centroid distance captures a static spatial relationship, whereas the robotic aspirator operates in a dynamic environment.
The quantitative evaluation shows that no single centroid strategy provides optimal performance across all scenarios. In the source-centered accumulation scenario (S1), strategies that prioritize persistent or central regions, such as C2 and C4, achieve higher efficiency and lower suction times. In contrast, in the downstream flow scenario (S2), the deepest-point strategy (C4) clearly outperforms the others, suggesting that targeting interior regions of accumulated blood is advantageous when fluid displacement is significant. For the bilateral flow scenario (S3), the geometric centroid (C1) yields the best efficiency, indicating that a global representation of the bleeding region is more effective under complex spatial distributions.
An important observation is the existence of a trade-off between suction efficiency and removed area. Strategies that achieve lower suction times and higher efficiency tend to remove a smaller percentage of the bleeding region, whereas strategies with longer execution times achieve higher removal percentages. This behavior suggests that faster strategies focus on the most relevant regions of the bleeding, while slower strategies continue operating on smaller residual areas, increasing the overall removal percentage.
The distribution analysis further reveals differences in robustness across strategies. While some methods achieve high average performance, their variability across trials can differ significantly. Strategies such as C2 in S1 and C4 in S2 exhibit both high efficiency and relatively low dispersion, indicating stable behavior. In contrast, increased variability is observed in more complex scenarios such as S3, where blood distribution is less structured and multiple accumulation regions exist. This highlights the importance of considering not only average performance but also consistency when evaluating autonomous robotic systems.
Overall, these results demonstrate that the optimal centroid strategy is scenario-dependent. This suggests that adaptive approaches, capable of selecting or combining centroid strategies based on the characteristics of the bleeding, could further improve system performance. Additionally, the observed relationship between spatial target selection and task efficiency reinforces the importance of integrating perception and control strategies in a unified framework for autonomous surgical assistance.

5. Conclusion

This paper presented the design and experimental validation of an autonomous robotic aspirator for intraoperative blood removal in minimally invasive surgery. The proposed system integrates perception, decision-making, and control modules within a unified framework, enabling autonomous navigation and suction based on visual information extracted from endoscopic images.
The experimental results demonstrate that the system is capable of reliably detecting and removing blood regions, achieving high removed area values and fast reaction times across all evaluated scenarios. The comparison of different suction strategies shows that the choice of the centroid computation method has a significant impact on the efficiency and execution time of the suction process. A key finding of this work is that no single centroid strategy provides optimal performance across all bleeding scenarios. Instead, the results reveal a scenario-dependent behavior, where different strategies perform better depending on the spatial distribution and dynamics of the bleeding. Additionally, a trade-off between suction efficiency and removed area has been identified, highlighting the balance between rapid intervention and complete blood removal.
These findings suggest that future work should focus on adaptive strategies capable of selecting or combining centroid estimation methods based on the current surgical context. Additional research may also explore the integration of more advanced perception models and the extension of the system to in-vivo validation scenarios. Overall, the proposed system represents a step forward toward increasing autonomy in surgical assistance, contributing to the reduction of surgeon workload and the improvement of surgical workflow efficiency.
Despite the promising results, several limitations of the present study should be acknowledged. First, the experimental validation was conducted under controlled in vitro conditions using simulated bleeding, which may not fully capture the complexity and variability of real surgical environments. Factors such as tissue deformation, occlusions, and lighting variability were not explicitly addressed. Second, the perception module relies on a pre-trained segmentation model evaluated on a specific dataset, and its generalization to different surgical procedures or imaging conditions may require further validation. Additionally, the current approach assumes reliable depth estimation, which may be affected by noise or sensor limitations in real scenarios. Finally, the navigation strategy is based on a deterministic centroid selection without incorporating predictive or adaptive mechanisms. As shown in the results, the performance of each centroid strategy is scenario-dependent, which suggests that a fixed strategy may not be optimal under all conditions. Moreover, in the experimental setup we have not considered obstacles during the tool navigation, which may limit its applicability in real surgical environments where interactions with other instruments or anatomical structures are expected.

Author Contributions

Conceptualization, I.R.-B.; methodology, I.R.-B., E.G.-R. and C.L.-C.; software, E.G.-R. and A.G.-C.; validation, E.G.-R.; formal analysis, I.R.-B. and V.M.; writing—original draft preparation, I.R.-B. and E.G.-R.; writing—review and editing, I.R.-B. and I.G.-M.; funding acquisition, I.R.-B. and V.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Spanish Ministry of Science and Innovation under grant numbers PID2021-125050OA-I00 and PID2022-138206OB-C31.

Data Availability Statement

The code and multimedia material supporting the results of this study are publicly available at: https://github.com/SurgicalRoboticsUMA/autonomous_aspirator.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yang, G.Z.; Cambias, J.; Cleary, K.; Daimler, E.; Drake, J.; Dupont, P.E.; Hata, N.; Kazanzides, P.; Martel, S.; Patel, R.V.; et al. Medical robotics-Regulatory, ethical, and legal considerations for increasing levels of autonomy. Science Robotics 2017, 2.
  2. Estebanez, B.; del Saz-Orozco, P.; García-Morales, I.; Muñoz, V.F. Interfaz multimodal para un asistente robótico quirúrgico: uso de reconocimiento de maniobras quirúrgicas. Revista Iberoamericana de Automática e Informática Industrial RIAI 2011, 8, 24–34.
  3. Stolzenburg, J.U.; Franz, T.; Kallidonis, P.; Minh, D.; Dietel, A.; Hicks, J.; Nicolaus, M.; Al-Aown, A.; Liatsikos, E. Comparison of the FreeHand® robotic camera holder with human assistants during endoscopic extraperitoneal radical prostatectomy. BJU International 2011, 107, 970–974.
  4. Noonan, D.; Mylonas, G.; Shang, J.; Payne, C.; Darzi, A.; Yang, G.Z. Gaze contingent control for an articulated mechatronic laparoscope. In Proceedings of the 2010 3rd IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics; IEEE, 2010; pp. 759–76.
  5. Laina, I.; Rieke, N.; Rupprecht, C.; Vizcaíno, J.P.; Eslami, A.; Tombari, F.; Navab, N. Concurrent segmentation and localization for tracking of surgical instruments. In Lecture Notes in Computer Science; Springer, 2017; Vol. 10434 LNCS, pp. 664–672.
  6. Du, X.; Kurmann, T.; Chang, P.L.; Allan, M.; Ourselin, S.; Sznitman, R.; Kelly, J.D.; Stoyanov, D. Articulated multi-instrument 2-D pose estimation using fully convolutional networks. IEEE Transactions on Medical Imaging 2018, 37, 1276–1287.
  7. Rivas-Blanco, I.; Lopez-Casado, C.; Perez-del Pulgar, C.J.; Garcia-Vacas, F.; Fraile, J.C.; Munoz, V.F. Smart Cable-Driven Camera Robotic Assistant. IEEE Transactions on Human-Machine Systems 2018, 48, 183–196.
  8. Attanasio, A.; Scaglioni, B.; Leonetti, M.; Frangi, A.F.; Cross, W.; Biyani, C.S.; Valdastri, P. Autonomous Tissue Retraction in Robotic Assisted Minimally Invasive Surgery - A Feasibility Study. IEEE Robotics and Automation Letters 2020, 5, 6528–6535.
  9. Nguyen, N.D.; Nguyen, T.; Nahavandi, S.; Bhatti, A.; Guest, G. Manipulating soft tissues by deep reinforcement learning for autonomous robotic surgery. In Proceedings of the 2019 13th Annual IEEE International Systems Conference (SysCon); IEEE, 2019.
  10. Seita, D.; Krishnan, S.; Fox, R.; McKinley, S.; Canny, J.; Goldberg, K. Fast and Reliable Autonomous Surgical Debridement with Cable-Driven Robots Using a Two-Phase Calibration Procedure. In Proceedings of the IEEE International Conference on Robotics and Automation; IEEE, 2018; pp. 6651–6658.
  11. Shademan, A.; Decker, R.S.; Opfermann, J.D.; Leonard, S.; Krieger, A.; Kim, P.C. Supervised autonomous robotic soft tissue surgery. Science Translational Medicine 2016, 8.
  12. Saeidi, H.; Opfermann, J.D.; Kam, M.; Wei, S.; Leonard, S.; Hsieh, M.H.; Kang, J.U.; Krieger, A. Autonomous robotic laparoscopic surgery for intestinal anastomosis. Science Robotics 2022, 7.
  13. Mikada, T.; Kanno, T.; Kawase, T.; Miyazaki, T.; Kawashima, K. Suturing Support by Human Cooperative Robot Control Using Deep Learning. IEEE Access 2020, 8, 167739–167746.
  14. Chow, D.L.; Newman, W. Improved knot-tying methods for autonomous robot surgery. In Proceedings of the 2013 IEEE International Conference on Automation Science and Engineering (CASE); IEEE, 2013; pp. 461–465.
  15. Barragan, J.A.; Yang, J.; Yu, D.; Wachs, J.P. A neurotechnological aid for semi-autonomous suction in robotic-assisted surgery. Scientific Reports 2022, 12, 4504.
  16. Richter, F.; Shen, S.; Liu, F.; Huang, J.; Funk, E.K.; Orosco, R.K.; Yip, M.C. Autonomous Robotic Suction to Clear the Surgical Field for Hemostasis Using Image-Based Blood Flow Detection. IEEE Robotics and Automation Letters 2021, 6, 1383–1390.
  17. Ou, Y.; Tavakoli, M. Learning Autonomous Surgical Irrigation and Suction With the da Vinci Research Kit Using Reinforcement Learning. IEEE Transactions on Automation Science and Engineering 2025, 22, 16753–16767.
  18. Garcia-Martinez, A.; Vicente-Samper, J.M.; Sabater-Navarro, J.M. Automatic detection of surgical haemorrhage using computer vision. Artificial Intelligence in Medicine 2017, 78, 55–60.
  19. Horita, K.; Hida, K.; Itatani, Y.; Fujita, H.; Hidaka, Y.; Yamamoto, G.; Ito, M.; Obama, K. Real-time detection of active bleeding in laparoscopic colectomy using artificial intelligence. Surgical Endoscopy 2024, 38, 3461–3469.
  20. Hua, S.; Gao, J.; Wang, Z.; Yeerkenbieke, P.; Li, J.; Wang, J.; He, G.; Jiang, J.; Lu, Y.; Yu, Q.; et al. Automatic bleeding detection in laparoscopic surgery based on a faster region-based convolutional neural network. Annals of Translational Medicine 2022, 10.
  21. Rabbani, N.; Seve, C.; Bourdel, N.; Bartoli, A. Video-based Computer-aided Laparoscopic Bleeding Management: a Space-time Memory Neural Network with Positional Encoding and Adversarial Domain Adaptation. Proceedings of Machine Learning Research 2022, 172, 1–14.
  22. Richter, F.; Zhang, Y.; Zhi, Y.; Orosco, R.K.; Yip, M.C. Augmented reality predictive displays to help mitigate the effects of delayed telesurgery. In Proceedings of the IEEE International Conference on Robotics and Automation; IEEE, 2019; pp. 444–450.
  23. Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
  24. Khatib, O. Real-time obstacle avoidance for manipulators and mobile robots. In Proceedings of the IEEE International Conference on Robotics and Automation; IEEE, 1985; pp. 500–505.
  25. Xia, X.; Li, T.; Sang, S.; Cheng, Y.; Ma, H.; Zhang, Q.; Yang, K. Path Planning for Obstacle Avoidance of Robot Arm Based on Improved Potential Field Method. Sensors 2023, 23.
  26. Chen, Q.; Liu, Y.; Chen, Z.; Zhou, Y. An Autonomous Obstacle Avoidance Path Planning Method Involving PSO for Dual-Arm Surgical Robot. In Proceedings of the 2022 5th International Conference on Mechatronics, Robotics and Automation (ICMRA); 2022; pp. 1–6.
  27. Surya Prakash, S.K.; Prajapati, D.; Narula, B.; Shukla, A. iAPF: an improved artificial potential field framework for asymmetric dual-arm manipulation with real-time inter-arm collision avoidance. Frontiers in Robotics and AI 2025, 12.
  28. Tang, A.; Cao, Q.; Pan, T. Spatial motion constraints for a minimally invasive surgical robot using customizable virtual fixtures. The International Journal of Medical Robotics and Computer Assisted Surgery 2014, 10, 447–460.
  29. Hao, L.; Liu, D.; Du, S.; Wang, Y.; Wu, B.; Wang, Q.; Zhang, N. An improved path planning algorithm based on artificial potential field and primal-dual neural network for surgical robot. Computer Methods and Programs in Biomedicine 2022, 227, 107202.
  30. Galan-Cuenca, A.; De Luis-Moura, D.; Herrera-Lopez, J.M.; Rollon, M.; Garcia-Morales, I.; Muñoz, V.F. Sutura automatizada para una plataforma robotica de asistencia a la cirugia laparoscopica. Revista Iberoamericana de Automatica e Informatica Industrial, 21.
Figure 1. Overview of the surgical scenario proposed in this work. A conventional minimally invasive procedure is augmented with an autonomous robotic aspirator for blood removal without direct human control. Supervision of the system is performed through a human-robot interaction (HRI) interface, consisting of a VR headset to visualize the endoscopic image and a haptic controller for robot teleoperation if necessary.
Figure 2. Overall system architecture and workflow of the autonomous robotic aspirator. The system processes endoscopic images for blood segmentation and region analysis, which are used by the task planner to generate the high-level navigation and suction actions to control the robotic aspirator. Human supervision is provided through a mixed-reality HRI interface, enabling direct robot teleoperation when necessary.
Figure 3. Perception layer pipeline. Endoscopic images $I_t$ are processed by a CNN to obtain a blood mask $M_t$. Then, candidate blood regions are analyzed to extract a descriptive tuple $D_r$ containing their area, a temporal persistence map, and their centroid.
Figure 4. Geometric model of the system. The 2D image centroid $p(x_c, y_c)$ is deprojected into 3D space as $P_r$ using depth information from the RGB-D camera. This position is transformed from the camera reference frame $\{C\}$ into the robot base frame $\{R\}$ to guide the suction tool tip, $P_{\mathrm{tool}}$.
Figure 5. Control architecture of the proposed autonomous navigation method based on artificial potential fields, including low-level remote center of motion (RCM) control of the robot, and integration of the teleoperation mode through the haptic controller of the HRI.
Figure 6. Experimental setup of the proposed robotic aspirator system. (a) The robotic aspirator consists of a conventional surgical aspirator modified to allow automatic activation and deactivation and to be mounted onto the end-effector of a UR3e robotic arm. The vision system is an RGB-D camera mounted onto a second robot. (b) The supervisory interface is implemented using the VR Meta Quest 3 platform, which consists of a VR headset for visualization and a haptic controller to interact with the system.
Figure 7. Bleeding scenarios simulated during the experimentation: (a) Source-centered accumulation (S1); (b) Downstream flow accumulation (S2); and (c) Bilateral flow distribution (S3). The bleeding source is represented with a green circle.
Figure 8. Temporal evolution of the four centroid-based methods to compute the suction target position. From left to right, the images show the expansion of the bleeding region over time.
Figure 9. Dice coefficient versus segmentation threshold τ , showing stable performance in the selected range.
Figure 10. Qualitative bleeding segmentation results. Each pair shows an endoscopic frame (left) and the corresponding predicted blood mask (right). Four representative frames from the test dataset are shown.
Figure 11. Comparative results of the removed area (%) for the four suction strategies (C1, C2, C3, C4) and for the three bleeding scenarios: (a) Bleeding scenario 1, (b) Bleeding scenario 2, and (c) Bleeding scenario 3.
Figure 12. Comparative results of the suction efficiency (pixels/s) for the four suction strategies (C1, C2, C3, C4) and for the three bleeding scenarios: (a) Bleeding scenario 1, (b) Bleeding scenario 2, and (c) Bleeding scenario 3.
Table 1. Summary of experimental trials per bleeding scenario and centroid strategy.

| Bleeding scenario | C1 | C2 | C3 | C4 | Total |
|---|---|---|---|---|---|
| S1 | 20 | 20 | 20 | 20 | 80 |
| S2 | 20 | 20 | 20 | 20 | 80 |
| S3 | 20 | 20 | 20 | 20 | 80 |
| Total | 60 | 60 | 60 | 60 | 240 |
Table 2. Average pairwise centroid distances (pixels) for each bleeding scenario (mean over 80 trials).

| Bleeding scenario | d12 | d13 | d14 | d23 | d24 | d34 |
|---|---|---|---|---|---|---|
| S1 | 22.49 | 1.31 | 17.32 | 23.25 | 13.40 | 18.18 |
| S2 | 48.52 | 4.33 | 40.21 | 51.25 | 38.14 | 41.24 |
| S3 | 30.41 | 2.65 | 47.04 | 31.12 | 51.34 | 47.58 |
| Mean | 33.81 | 2.76 | 34.86 | 35.21 | 34.29 | 35.67 |
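The pairwise distances reported in Table 2 amount to Euclidean distances between the four candidate suction targets computed per frame. A minimal sketch of that computation is shown below; the function name and the `(x, y)` tuple representation for centroids are our own assumptions, not the authors' code.

```python
from itertools import combinations
import math

def pairwise_centroid_distances(centroids):
    """Euclidean distances (pixels) between every pair of centroids.

    `centroids` is a sequence of four (x, y) positions for the
    strategies C1..C4; returns a dict keyed by ('Ci', 'Cj').
    """
    dists = {}
    for (i, p), (j, q) in combinations(enumerate(centroids, start=1), 2):
        dists[(f"C{i}", f"C{j}")] = math.hypot(p[0] - q[0], p[1] - q[1])
    return dists
```

Averaging these per-frame dictionaries over the 80 trials of a scenario yields one row of Table 2.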
Table 3. Comparison of the system performance for the source-centered accumulation scenario (S1).

| Centroid | Reaction time (s) | Removed area (%) | Suction time (s) | Suction efficiency (pixels/s) |
|---|---|---|---|---|
| C1 | 0.0337 | 90.39 | 3.9010 | 1658.5 |
| C2 | 0.0307 | 80.51 | 2.3877 | 2668.9 |
| C3 | 0.0260 | 90.96 | 4.5476 | 1444.8 |
| C4 | 0.0325 | 81.31 | 2.7165 | 2356.8 |
Table 4. Comparison of the system performance for the downstream flow accumulation scenario (S2).

| Centroid | Reaction time (s) | Removed area (%) | Suction time (s) | Suction efficiency (pixels/s) |
|---|---|---|---|---|
| C1 | 0.0318 | 90.82 | 11.2268 | 660.6 |
| C2 | 0.0272 | 89.03 | 11.6913 | 677.8 |
| C3 | 0.0353 | 89.31 | 7.2107 | 1156.0 |
| C4 | 0.0349 | 87.69 | 3.9516 | 1893.8 |
Table 5. Comparison of the system performance for the bilateral flow distribution scenario (S3).

| Centroid | Reaction time (s) | Removed area (%) | Suction time (s) | Suction efficiency (pixels/s) |
|---|---|---|---|---|
| C1 | 0.0309 | 89.51 | 5.5110 | 1915.6 |
| C2 | 0.0283 | 93.82 | 9.9348 | 1235.4 |
| C3 | 0.0316 | 91.86 | 11.9343 | 1282.7 |
| C4 | 0.0294 | 92.86 | 10.3484 | 1057.7 |
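The two derived metrics in Tables 3–5 follow directly from the segmentation masks: removed area is the percentage of the initial blood region eliminated, and suction efficiency is the number of blood pixels removed per second of active suction. The helpers below are a hypothetical sketch of these definitions (the function names and arguments are ours), not the authors' evaluation code.

```python
def removed_area_pct(initial_pixels, remaining_pixels):
    """Removed area (%): fraction of the initial blood region eliminated."""
    if initial_pixels <= 0:
        raise ValueError("initial blood region must be non-empty")
    return 100.0 * (initial_pixels - remaining_pixels) / initial_pixels

def suction_efficiency(removed_pixels, suction_time_s):
    """Suction efficiency (pixels/s): blood pixels removed per second of suction."""
    if suction_time_s <= 0:
        raise ValueError("suction time must be positive")
    return removed_pixels / suction_time_s
```

Under these definitions, the scenario-dependent trade-off in the tables is visible directly: strategies with short suction times score high on efficiency even when their removed-area percentage is lower.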
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.