1. Introduction
The Varroa mite (
Varroa destructor) presents a significant risk to honey bee colonies, which can lead to their decline and eventual collapse within a few years [
1]. This issue affects not only beekeeping but also broader agricultural sectors reliant on bee pollination. Detected in Western Europe in the 1970s [
2], the Varroa mite has since become a global concern. By June 2022, it had reached Australia, the last region previously unaffected, indicating its worldwide spread [
3].
Various strategies are available for managing Varroa mites, often combined within an integrated pest management (IPM) framework. These strategies can be classified into cultural, mechanical, biological, and chemical [
4]. Cultural controls involve breeding resistant bee lines selected for their genetic capacity to reduce mite populations, though achieving optimal effectiveness with this approach may take decades. Mechanical controls include techniques such as hyperthermia, screened bottom boards, and drone brood trapping, all of which physically remove mites from colonies. Biological controls employ parasitic fungi [
5,
6] and natural predators like pseudoscorpions [
7] to organically reduce mite levels. Lastly, chemical methods utilize both natural treatments, such as formic and oxalic acids, and synthetic treatments like Amitraz, which offer more aggressive mite control.
Each method varies in terms of risk, efficacy, and response time. Cultural approaches typically pose the lowest risks and may be more sustainable in the long term, but their effectiveness is currently limited and needs improvement. Chemical methods are highly effective but are associated with higher risks, including the potential for residue accumulation in wax and honey, environmental hazards [
2], and the development of resistant mite populations. The hyperthermia method, a mechanical approach, provides a balanced option, offering moderate effectiveness with manageable risks. Thermal treatments exploit the tolerance difference between bees (°45C) and mites (°41C), aiming to eliminate mites without harming bees, yet precise temperature control remains challenging [
2].
A
need-based strategy, closely monitoring the infection ratio and responding accordingly, is recommended [
2]. An implementation of this approach called
traffic light warning system categorizes infestation rates into three thresholds (red, orange, green) tailored to specific regions and times, guiding appropriate actions [
2]. However, traditional infestation assessments are labor-intensive, requiring manual sampling of about 300 worker bees to detect mites [
8,
9], stressing bees, and potentially missing early infestation stages [
10]. This inefficiency has motivated efforts towards automation and precision, mainly through image or video analysis [
10,
11,
12,
13,
14] or other sensor technologies [
15,
16], aiming for early and accurate detection.
Incorporating these detection methods into a wireless sensor network (WSN) could streamline beehive health monitoring, reducing manual inspection errors and labor [
17,
18,
19]. WSN offers a robust framework for continuous health surveillance in beehives.
From another angle, modeling disease spread and population impacts across colonies provides a comprehensive understanding of infestation dynamics and underpins effective management strategies. Machine learning and mathematical approaches allow analysis of mite reproduction, bee population dynamics, and environmental drivers of Varroa spread. For example, a mathematical model captures the coupled spread of Varroa mites and associated diseases within colonies [
20]. Other models offer insight into colony behavior and dynamics under Varroa pressure [
21,
22].
Exploring the spread of Varroa mite also involves analyzing how infestations move between colonies. Spatial network analysis offers a method to model these interactions, focusing on the pathways through which Varroa mite invades and affects bee populations [
23]. Such models are instrumental in identifying effective control measures by understanding the spatial dynamics of disease spread.
Digital twins sit at the intersection of predictive modeling and sensor networking for effective Varroa management . A Digital twin is a virtual model that mirrors a physical entity, combining various data inputs for processing and ensuring two-way data exchange between the virtual and physical realms. It requires synchronization to accurately reflect changes in the physical object’s condition or actuate processes or mechanisms within them [
24]. Digital twins transcend the notion of being mere digital replicas; they can be augmented through the integration of simulations and predictive models [
25].
Digital twins, are being applied in various agricultural sectors [
26], extended to beekeeping, offering insights into hive anomalies and broader ecological considerations [
27]. Though current applications are broad and less focused, they suggest development of targeted and effective Varroa mite management strategies, emphasizing precision and proactive intervention using digital twins.
This study presents the design of a digital twin system aimed at effectively controlling the spread of Varroa mites. The proposed system incorporates a WSN for real-time data acquisition and integrates generative artificial intelligence (AI) models for time series to predict both vertical and horizontal transmission pathways of the Varroa mite. Furthermore, these models facilitate the development of treatments, such as hyperthermia. The system dynamically updates the machine learning models based on the latest data collected from the WSN, ensuring the provision of accurate forecasts of population dynamics and optimal treatment recommendations. The effectiveness of the proposed design has been validated through simulations.
2. Design Components
Figure 1 depicts the main components of our digital twin design, where
Figure 2 shows the main workflow.
As illustrated in
Figure 1, the digital twin system consists of the physical beehives (
Figure 1 A) and their corresponding digital twin counterparts (
Figure 1 B). Some of the beehives are equipped with sensors, and additional methods are employed to monitor environmental conditions (
Figure 1 C). The digital twin instances are integrated with models for predicting infection spread, running what-if simulations, and recommending treatments (
Figure 1 E). Furthermore, the system includes components for implementing planned interventions and treatments (
Figure 1 D). Finally, a wireless sensor network facilitates communication between the components (
Figure 1 F).
The workflow of our digital twin system is structured as a loop, indicated by the flow of thick blue lines. The current status of the beehives, particularly infection levels, is used to predict potential future scenarios and assess responses to various treatment strategies, as illustrated in
Figure 2.
In this workflow (
Figure 2), the infection status of beehives is continuously monitored by sampling data through a WSN (1). The digital twin instances are then updated with the latest beehive conditions (2). These updates are derived from either direct sensor readings, extrapolation of sampled data, or models, with parameters adjusted based on the most recent sensor data.
Based on the updated status, prediction about the progression of Varroa mite infection is generated (4). To address these predicted scenarios, a response must be developed, taking into account environmental and infrastructural constraints, as well as overarching agricultural and environmental policies. To formulate this response, various feasible target scenarios are envisioned, each reflecting different trade-offs between capacity and environmental considerations (5-).
Each envisioned scenario results in a response characterized by varying levels of cost, impact, and feasibility (6-). After the initial cycle in the workflow, the effectiveness of prior interventions –reflected in actual outcomes— becomes an additional parameter (6) that is updated alongside other parameters (3).
Finally, based on policy considerations, managerial decisions, and the assessed cost, impact, and feasibility of each scenario, one is selected as the optimal choice (7). The chosen intervention is then implemented (8), often through remote commands to applicators in the beehives to perform specific actions, such as adjusting the temperature for hyperthermia treatment.
The intervention or the natural progression of the situation will result in new conditions, which will be sensed and fed back into the system as part of the loop (1).
4. Modeling Vertical and Horizontal Dynamics
The spread of mites within a single bee colony, referred to as
vertical spread, is often studied using population models. By contrast, the spread between colonies, or
horizontal spread, is typically examined with network-based models. Both processes can be investigated through mathematical or computational approaches [
36]. Mathematical models, particularly those based on differential equations, offer a structured framework to describe disease transmission using parameters such as infection, recovery, and mortality rates [
37,
38]. Computational approaches, on the other hand, frequently rely on simulations –such as agent-based, stochastic, or network models– to capture the more complex dynamics that emerge from individual behaviors and spatial interactions [
39].
In both cases, the complexity of biological processes is simplified into a smaller set of essential parameters, such as survival rates, transmission rates, and grooming rates, along with their interactions. This reductionist approach has its challenges, as it may not fully reflect the complexities of real-world conditions. By reducing processes to a small set of parameters or clear-cut interactions, these models may overlook relevant variables and the natural variability that occurs in actual settings, leading to large discrepancies between model predictions and actual outcomes.
Furthermore, traditional models often rely on coarse abstractions and are generally poorly suited for what-if analyses. Such analyses depend on many drivers beyond within-colony dynamics or disease transmission between colonies. Important drivers include but not limited to beekeeping practices, monitoring and response routines, agricultural policies at regional and national levels, broader economic and industrial trends that shape land use and transportation, and both short-term variability and long-term trends in climate. While data and stand-alone predicting models may be available for many of these factors, integrating them into a single model of Varroa mite spread remains a significant challenge.
To address these challenges, we adopt two complementary directions. First, we develop digital twins that capture the real-time states of hives and other key entities. These twins continuously aggregate and update measurements, providing a consistent foundation for monitoring, visualization, forecasting, and what-if analysis. Second, we employ generative models for time series to project future trajectories and simulate counterfactual interventions. Such models learn from historical data, avoid restrictive assumptions about factor interactions, and can uncover important dependencies that may be locally relevant or overlooked in previous analyses. They also enable the integration of diverse drivers, including biological, environmental, and operational factors, into a single predictive framework. Because of their generative nature, these models can be conditioned on alternative policies, practices, interventions, and timings, thus supporting rigorous what-if evaluation without the risks and costs of real-world trials.
A digital twin for each hive and other key entities (such as apiaries, treatment devices, or local climate inputs) provides a living, data-driven representation of the system. It continuously integrates sensor streams and field logs to maintain up-to-date estimates of colony conditions, including brood temperature, forager activity, and inferred mite load, with quantified uncertainty. This unified perspective supports high-resolution monitoring and facilitates the early detection of abnormal patterns. The twin also records interventions and contextual information, ensuring traceability and auditability of outcomes. When combined with generative models for time series, it can simulate alternative conditions and schedules, creating a practical workspace for what-if analysis and decision support. Moreover, the approach can be scaled to large numbers of hives with minimal additional effort, delivering consistent summaries, insights, and alerts that enable timely action by beekeepers and policymakers.
Several classes of neural network based generative models for time series are currently prominent and applicable to predicting and performing counterfactual analyses of beehive conditions. At present, transformers are the most widely used sequence models. However, they have certain characteristics that may limit their suitability for modeling Varroa mite spread. Transformers require fixed time steps in their input data and demand substantial computational resources when extending the attention window to capture long sequences, where distant past events can strongly influence current conditions. While transformers have revolutionized language-related sequence modeling and serve as the backbone of large language models (LLMs), with famous implementations such as ChatGPT, these same characteristics make them less practical for modeling beehive dynamics, where events and sampling often occur at irregular intervals and time series can extend over long horizons.
For the above reasons, in modeling Varroa mite dynamics we favor state-space families of machine learning models. Examples include structured state-space sequence models (such as S4 [
40] and Mamba [
41]), neural ordinary differential equation (N-ODE) [
42], and deep Markov model (DMM) [
43]. We chose this family for three main reasons. First, beehive data is typically irregular and multi-rate: hive measurements, environmental covariates, agricultural and beekeeping policies, and planned interventions arrive on different timelines with non-uniform sampling. State-space models naturally support event-time updates and continuous-time formulations. Second, we require long-horizon forecasts at high temporal resolution. These state-space model families follow approaches that avoid the window-size computational challenges of transformers [
40,
41,
42,
43]. Third, the datasets contain many temporal and spatial gaps in hive health variables as well as environmental and policy data. State-space models can handle missing data through filtering, smoothing, and latent-state inference, although different subfamilies achieve this in different ways. In practice, only a subset of hives can be fully instrumented with measurement devices, and deploying sensors to every colony is infeasible. For many colonies, we must extrapolate from instrumented hives or rely on sparse, opportunistic observations (e.g., spot checks of mite loads and other health indicators). These characteristics make state-space models particularly suitable for data with both temporal and spatial missingness.
To provide a deeper understanding of how these state-space neural network models support prediction and what-if analysis in beehive colonies, we now focus on the architecture of DMM.
Figure 3 illustrates a DMM during training, while
Figure 4 shows its application for prediction and counterfactual analysis.
During the training phase, we have a set of observations describing beehive states along with environmental and policy indicators. These are represented as vectors denoted by
, where
corresponds to the vector of all observations recorded at time or step
t. The DMM model receives these observations as input (shown at the bottom of
Figure 3) and produces outputs denoted by
(shown at the top of
Figure 3). The model parameters are optimized so that the generated
series closely match their corresponding
series. The
series represents intervention inputs, which may include intentional actions such as Varroa mite treatments or unintentional factors such as weather changes. The vector
encodes baseline characteristics of each hive that generally remain constant over time. The vector
denotes the internal (latent) state of the model at time or step
t. Although it is usually difficult to map the elements of
directly to real-world indicators –since they are automatically learned– these latent variables can be interpreted as analogous to important features in traditional modeling, such as grooming rate, mite reproductive rate, or queen egg-laying rate. The sequence neural network that processes the
series is a predictive architecture (e.g., recurrent neural network (RNN), gated recurrent unit (GRU), or long short-term memory (LSTM)) that learns temporal dependencies. However, these architectures alone do not naturally capture probabilistic or continuous transitions of the system’s internal state, nor are they well suited for what-if simulations. Therefore, only their hidden states
are used to construct
. The transition neural network models the dynamics of changes in the latent state
conditioned on interventions (
) and baseline characteristics (
), while the emission neural network is trained to translate latent states into observable outputs.
In prediction mode, we retain the transition and emission neural networks that have already been trained to capture how the internal state of the system (the beehive alone, or the beehive together with its surrounding environment) evolves over time under different interventions (
) and baseline characteristics (
). The emission network has already learned to translate these internal states into observable outputs about the beehives, which we now denote as
at each time step. For prediction tasks, we provide new sets of actual interventions (
) and baseline characteristics (
). For what-if (counterfactual) analysis, we instead supply alternative interventions and baseline characteristics. It should be noted that both the treatment model and the vertical spread model, depicted in component E of
Figure 1, are implemented within the same underlying architecture shown in
Figure 3 and
Figure 4; they represent different uses of the same model by providing different intervention input data.
An important feature of this approach is that, within the digital twin system, training does not occur only once but is continuously repeated. This iterative process tunes the model to improve its predictive accuracy and to capture factors and dynamics that may have been previously overlooked. It corresponds to the feedback loop illustrated in
Figure 1 between components B and E. Such a loop also helps address data drift and concept drift [
44], where the underlying distributions and dynamics evolve over time and the model must be updated to reflect these changes.
When it comes to vertical spread modeling, there is a growing and promising body of research [
45,
46] that applies neural networks, particularly graph neural network (GNN), and more specifically temporal–spatial graph neural network (TS-GNN). There are also approaches that employ frameworks such as exponential random graph model (ERGM) enhanced with neural network-based estimators of parameters or posteriors [
47,
48]. While these newer models may require time to mature and demonstrate consistent applicability, a comprehensive modeling of Varroa mite dynamics within each colony can already provide valuable inputs for more traditional ERGM-based approaches.
To summarize, digital twins of hives and other key entities, combined with generative state-space time series models, provide a unified framework that integrates biological, environmental, and operational factors. This framework enables high-resolution, near-real-time monitoring of current colony conditions while also supporting forecasting and counterfactual what-if analysis of events, including different treatment decisions. Such what-if analysis not only allows simulation of alternative scenarios but also assists in intervention scheduling and provides insights by quantifying uncertainty in the predicted results.
5. Pretraining the Models for Population Dynamics and Mite Spread
A common challenge in applying neural networks to model Varroa spread, both within colonies (vertical) and between colonies (horizontal), is the large volume of training data required and the time needed to collect it from beehives. To address this challenge, we propose using existing mathematical and computational models to generate synthetic simulation data for the initial training phase. This strategy provides a well-informed baseline for the digital twin models, ensuring that they start close to realistic operating conditions. In turn, it can accelerate deployment and shorten the time needed to reach higher predictive accuracy.
There are several approaches for mathematically and computationally modeling the dynamics of vertical (within-colony) and horizontal (between-colony) spread of Varroa mites. To generate initial data for vertical (intra-colony) dynamics, we draw on the mathematical models of Torres and Torres [
22] and Messan et al. [
49], which we describe in the next section. An alternative that can be used in the same way is the agent-based model BEEHAVE [
21,
36,
50], which simulates honey bee colony dynamics under Varroa pressure alongside other stressors such as viruses, pesticides, and landscape change. The BEEHAVE model is a viable option to consider for future studies and for generating synthetic training data.
For horizontal (inter-colony) spread, we model infection transmission with an ERGM. In this formulation, colonies are nodes and ties represent opportunities for Varroa transfer (e.g., drifting or robbing bees, shared forage, equipment movement, or managed colony relocations). ERGMs specify the probability of these ties as a function of network structure (such as distance, clustering, or shared apiary) and colony attributes. The fitted model thus captures how spatial arrangement and management connectivity shape Varroa transmission between hives and enables simulation of spread under alternative network configurations or interventions.
5.1. Vertical Spread Model, Within a Colony
Torres’s model uses systems of differential equations to track daily bee populations and account for caste-specific survival rates [
22]. Messan’s study employs nonlinear delay differential equations (DDEs), incorporating explicit time delays and seasonal forcing, to analyze honey bee-mite population dynamics [
49]. Either model can be combined with a horizontal spread component, for example by casting the infectious states within an susceptible, exposed, infectious (SEI) or susceptible, exposed, infected, recovered (SEIR) framework for epidemiological invasion.
The population model for each colony was based on Torres’ model (Equation (1)), where
represents the bee population at age
i,
is the survival rate for that age group, and
is the rate at which hive bees become foragers.
The following equation describes the population dynamics of healthy hive bees, accounting for their life cycle, mite infestation, and recovery rates [
22]:
In this equation,
represents the transition from healthy to infested bees, with
indicating the proportion of infested bees, showing the infection’s spread within the colony. For infested hive bees, the dynamics are given by:
This equation captures the effect of mite infestation on survival rates and the potential for recovery, with
indicating the number of infested bees and
their reduced survival rate. To represent the mite population, we used this equation:
Here, is the mite population at age i, with as their daily survival rate, which varies with the seasons. accounts for mite deaths from grooming and the mortality of pupae in capped cells. The simulation began with an infected node at the network’s centroid to study the impact of a central infection on the wider health of bee colonies. By adjusting parameters such as infection range (r) and infection rate (), the model adapts to the changing landscape of Varroa mite transmission, providing valuable training data for training the digital twin models.
There is already research applying these equations, beginning with the original formulation by Torres and Torres [
22] and extending to studies that explore different contexts and treatment schedules [
19,
51]. For example, Dasyam et al. [
19] show that varying treatment intensity and timing can markedly alter population trajectories, enabling recovery and sustained colony health when interventions are guided by what-if analysis.
5.2. Horizontal Spread Model, Among Colonies
We integrated an ERGM (using EpiModel in R) which is approximated on geographic coordinates of bee apiaries in southern Sweden [
52] to simulate regional Varroa transmission. The model represents three transmission pathways:
self-mediated dispersal, where infection spreads to nearby nodes according to a negative exponential distance kernel;
intra-locality dispersal, which captures spread within a defined area driven by local interactions (for example, contacts among beekeepers and markets); and
inter-locality dispersal, which accounts for long-distance transmission, potentially via trade, migratory beekeeping, or equipment movement. The layout of these nodes is shown in
Figure 5, with distribution statistics provided in
Table 1.
Repeating the simulations with varying base infection and spread rates revealed several important patterns. As expected, the number of infected and high-risk colonies increases significantly when colonies are located in close proximity. Enhanced surveillance of high-risk nodes allows for rapid identification and management of new infection epicenters. The model further demonstrates that early interventions can substantially slow the spread of infection. In addition, the results indicate the presence of critical thresholds beyond which containment measures become less effective, emphasizing the importance of timely detection and intervention. We anticipate that a digital twin system enhanced with predictive models would be capable of identifying these critical thresholds and epicenters, thereby making timely detection and intervention more feasible.
5.3. Pretraining the Models for Treatment Effects
In addition to modeling population dynamics and mite spread, it is also important to pretrain the models on the effects of treatments used to control Varroa infestations (
Figure 6). Among the available strategies, hyperthermia, a heat-based method that reduces mite populations by exposing them to controlled elevated temperatures, provides a well-studied example [
53]. While a single hyperthermia session does not eradicate mites completely, it significantly lowers their numbers and thus changes the trajectory of colony dynamics. By incorporating such treatment events into simulations, we extend the training data beyond natural progression and include realistic intervention scenarios.
This approach allows digital twins to capture not only how infestations develop but also how different treatment strategies can alter outcomes. For example, simulations without intervention often lead to colony collapse within a relatively short period, whereas regular treatments can enable recovery and long-term colony survival (
Figure 7). Such contrasting scenarios enrich the training data with both successful and unsuccessful outcomes, which are crucial for robust model pretraining.
Finally, environmental conditions (e.g., seasonal cycles) play a major role in shaping bee population trajectories, with fluctuations driven by changes in birth rates and resource availability. Incorporating these cycles alongside intervention events makes the training data more representative of real-world conditions.
7. Feasibility of Implementing Digital Twins
Our digital twin design employs a WSN for sensing, data exchange, and dispatching intervention commands. The network spans a large area and consists of nodes with heterogeneous resource levels. A typical node includes (i) a camera for Varroa mite detection, (ii) a communication module for uplink and downlink, (iii) temperature sensors to monitor hive and treatment conditions when hyperthermia is used, and (iv) a heating applicator to deliver hyperthermia. To ensure practicality, we quantify per-node energy consumption to assess the feasibility of continuous monitoring and on-demand treatment at a single hive, since node-level budgets ultimately determine system-level viability.
Power estimates in our feasibility analysis draw on measurements reported for widely used components. For the camera (e.g., ESP32-CAM), reported consumption is
in deep sleep and about
–
during active image capture [
54]. For the WSN uplink, an NB-IoT modem (e.g., Quectel BC66) typically draws
at
during uplink (
), with idle/paging currents around
and power-saving-mode currents in the microamp range [
55]. For temperature sensing, low-power digital sensors such as the DHT22 draw roughly 1–
while measuring and tens of microamps in standby [
56]. Heaters used for Varroa control span a broad range, from low-power silicone pads around
to vaporizer-class devices near
, depending on the treatment method [
57]. These sources provide practical baselines for sizing monitoring and treatment energy in the proposed digital twin system.
Table 2 shows the energy used during a single monitoring cycle by various components and operations. Using this data, we can estimate the annual energy requirements, considering both regular and increased monitoring schedules.
Treatment requires using a heater for 2.5 hours per cycle [
58].
Table 3 details the energy consumption for one treatment cycle, including both the heater and microcontroller operations.
It’s important to note that these energy consumption calculations are based on typical values, considering certain fixed parameters specific to regions and devices. An advantage of digital twins is their ability to provide more accurate and factual calculations and predictions of these consumption patterns by incorporating variable environmental and device factors.