3.1. Electric Vehicle Routing Problem with Time Windows
The EVRPTW extends the classical VRP by considering electric vehicles and time-constrained service at customers. The formulation introduced by [9] has become the standard reference model, and most existing EVRPTW studies are based on this framework. Each customer must be serviced within a predefined time window, while routing decisions are constrained by limited battery capacity, energy consumption, and recharging requirements. Battery recharging is significantly slower than refueling a conventional vehicle, which introduces additional temporal constraints and complicates route planning. As a result, routing decisions involve not only customer assignment and sequencing, but also battery feasibility, charging scheduling, and adherence to customer time windows.
The problem is defined on a fully connected symmetric graph $G = (N, E)$, where the nodes in $N$ represent the depot, customers, and charging stations. A single depot is assumed, serving as both the start and end location for all routes. Each link $(i, j) \in E$ is associated with a distance $d_{ij}$, which determines the corresponding travel time, while energy consumption is assumed proportional to traveled distance.
To formally define the EVRPTW setting considered in this work, the following notation is used:
$V$: set of customer nodes;
$F$: set of charging-station nodes;
0: depot node;
$N = V \cup F \cup \{0\}$: set of all nodes;
$d_{ij}$: distance between nodes $i$ and $j$;
$q_i$: nominal demand of customer $i$;
$s_i$: nominal service time of customer $i$;
$[e_i, l_i]$: service time window associated with customer $i$;
$C$: vehicle cargo capacity;
$Q$: vehicle battery capacity;
$v$: nominal vehicle speed;
$r$: energy-consumption rate;
$g$: charging rate;
$a_i$: arrival time at customer $i$;
$c_i$: completion time of service at customer $i$;
$T_i$: tardiness at customer $i$, i.e., the nonnegative lateness relative to the due time $l_i$.
Each customer must be serviced exactly once by a single vehicle; split deliveries are prohibited. Service at customer $i$ requires service duration $s_i$ and must be performed within the associated time window $[e_i, l_i]$, where $e_i$ denotes the earliest allowable service start time and $l_i$ denotes the due time. Vehicles arriving before $e_i$ must wait until service can begin. Depending on the formulation, time windows may be treated as either hard, where late arrivals are infeasible, or soft, where late service is permitted but penalized.
A homogeneous fleet of vehicles is available at the depot. Each vehicle has cargo capacity C, battery capacity Q, and travels at nominal speed v. Vehicles depart from the depot fully loaded and fully charged. Cargo capacity decreases as customers are served and cannot be replenished during route execution, while battery consumption is assumed proportional to traveled distance.
Charging stations allow vehicles to recharge their batteries during route execution. A full recharging strategy is assumed, meaning that each charging visit restores the battery to full capacity. Charging time is modeled linearly using charging rate g, such that recharging duration is proportional to the replenished energy amount. Charging stations are assumed to have unlimited capacity, allowing simultaneous charging of multiple vehicles.
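The linear consumption and full-recharge rules can be illustrated with a minimal sketch. The function names are hypothetical, and the sketch assumes that the charging rate g is expressed as time per unit of energy, which the text does not specify; if g were energy per unit of time, the duration would be the replenished energy divided by g.

```python
def energy_used(distance: float, r: float) -> float:
    """Energy consumed over a link, proportional to distance (rate r)."""
    return r * distance


def full_recharge_time(battery_level: float, Q: float, g: float) -> float:
    """Time needed to restore the battery to full capacity Q.

    Assumption: g is time per unit of energy, so the duration is
    proportional to the replenished amount Q - battery_level.
    """
    if not 0.0 <= battery_level <= Q:
        raise ValueError("battery level must lie in [0, Q]")
    return g * (Q - battery_level)
```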
Three optimization objectives are considered independently: minimization of fleet size, total energy consumption, and total tardiness, subject to customer-service, cargo-capacity, battery, charging, and time-window constraints.
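The total tardiness objective is defined as
\[ f_T = \sum_{i \in V} T_i, \]
where $T_i$ is the tardiness at customer $i$ and $V$ is the set of customer nodes. Total energy consumption is computed as
\[ f_E = \sum_{(i,j) \in A} r \, d_{ij}, \]
where $A$ denotes the set of traversed route links, $d_{ij}$ is the distance between nodes $i$ and $j$, and $r$ is the energy-consumption rate. The fleet-size objective is defined as
\[ f_V = |M|, \]
where $M$ denotes the set of vehicles used to service all customers.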
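As a hedged illustration, the three objectives can be computed from a set of completed routes as sketched below. The data layout (routes as node sequences, a distance dictionary keyed by node pairs, and precomputed per-customer tardiness values) is an assumption made for this example, not the representation used in the study.

```python
from typing import Dict, Hashable, List, Tuple

Node = Hashable


def total_energy(routes: List[List[Node]],
                 dist: Dict[Tuple[Node, Node], float],
                 r: float) -> float:
    """Sum r * d_ij over all traversed links of all routes."""
    return r * sum(dist[(a, b)]
                   for route in routes
                   for a, b in zip(route, route[1:]))


def total_tardiness(tardiness: Dict[Node, float]) -> float:
    """Sum of per-customer tardiness values (assumed precomputed)."""
    return sum(tardiness.values())


def fleet_size(routes: List[List[Node]]) -> int:
    """One vehicle per route (assumption of this sketch)."""
    return len(routes)
```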
3.2. Stochastic Extension
In this work, a stochastic extension of the EVRPTW with soft time windows is considered. Uncertainty is introduced through multiplicative random factors applied independently to customer demand, service time, and vehicle speed, denoted by $\xi_q$, $\xi_s$, and $\xi_v$, respectively. These sources of uncertainty follow established stochastic VRP formulations [14, 15, 31], where variability in demand, service duration, and travel time forms the standard modeling basis. Travel-time uncertainty is implemented indirectly via stochastic vehicle speed, preserving the network structure while inducing variability in traversal times.
The variability of each stochastic factor is controlled through a normalized variability parameter, while the deterministic parameter is preserved as the expected value of the corresponding stochastic variable. Consequently, the deterministic instance represents the mean-case scenario, whereas increasing the variability parameter introduces progressively stronger stochastic perturbations around the nominal values. In this study, the parameter values are chosen to represent moderate to pronounced uncertainty levels while maintaining stable and interpretable routing behavior.
Two distributions are considered in order to model qualitatively different uncertainty regimes. The first is a bounded symmetric uncertainty model implemented using a uniform distribution, which represents controlled fluctuations around the nominal value. The second is a positively skewed uncertainty model implemented using a lognormal distribution, allowing strictly positive and asymmetric deviations with occasional larger realizations, better reflecting rare but substantial operational disruptions. Such parametric uncertainty models are commonly adopted in stochastic VRP and EVRP literature, where variability is typically represented through analytically defined probability distributions rather than real-world observational data [14, 15, 17].
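One way to obtain mean-preserving multiplicative factors for the two regimes is sketched below. The parametrization is an assumption made for illustration: the uniform factor is drawn from [1 - delta, 1 + delta], and the lognormal factor uses mu = -sigma^2/2 so that its expected value equals 1. How the study maps its variability parameter to delta and sigma is not recoverable from the text, and the function names are hypothetical.

```python
import random


def uniform_factor(rng: random.Random, delta: float) -> float:
    """Bounded symmetric factor on [1 - delta, 1 + delta]; its mean is 1."""
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1) so the factor stays positive")
    return rng.uniform(1.0 - delta, 1.0 + delta)


def lognormal_factor(rng: random.Random, sigma: float) -> float:
    """Strictly positive, positively skewed factor with mean 1.

    A lognormal variable with log-scale parameters (mu, sigma) has mean
    exp(mu + sigma**2 / 2), so mu = -sigma**2 / 2 yields a mean of 1.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    mu = -0.5 * sigma ** 2
    return rng.lognormvariate(mu, sigma)
```

Multiplying a nominal value by either factor leaves its expected value unchanged, which matches the requirement that the deterministic instance represents the mean case.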
The stochastic parameters are defined as
\[ \tilde{q}_i = \xi_q \, q_i, \qquad \tilde{s}_i = \xi_s \, s_i, \qquad \tilde{v} = \xi_v \, v, \]
where $q_i$, $s_i$, and $v$ are the nominal demand, service time, and speed, and $\xi_q$, $\xi_s$, and $\xi_v$ are independently sampled random variables representing demand, service-time, and speed variability, respectively. Correlated stochastic processes are not considered in this work, as the objective is to isolate the individual and combined effects of the principal uncertainty sources while maintaining a controlled and computationally tractable experimental setting.
Vehicle travel times are computed using the realized speed as
\[ t_{ij} = \frac{d_{ij}}{\tilde{v}} = \frac{d_{ij}}{\xi_v \, v}. \]
A new realization of $\xi_v$ is independently sampled for each vehicle movement, inducing stochastic travel times throughout route execution. In contrast, the realized customer demand $\tilde{q}_i$ and service time $\tilde{s}_i$ become known only upon arrival at customer $i$, reflecting the standard assumption of online information revelation in stochastic VRP models [33, 34, 35].
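The per-movement sampling of the speed factor can be sketched as follows. The uniform factor on [1 - delta, 1 + delta] and the function names are assumptions made for illustration only.

```python
import random


def travel_time(distance: float, v: float, xi_v: float) -> float:
    """Travel time over a link at realized speed xi_v * v."""
    if xi_v <= 0.0 or v <= 0.0:
        raise ValueError("speed and speed factor must be positive")
    return distance / (xi_v * v)


def simulate_leg(distance: float, v: float,
                 rng: random.Random, delta: float) -> float:
    """Draw a fresh speed factor for this movement and return its travel time.

    Assumption: the factor is uniform on [1 - delta, 1 + delta].
    """
    xi_v = rng.uniform(1.0 - delta, 1.0 + delta)
    return travel_time(distance, v, xi_v)
```

Because a new factor is drawn on every call, two traversals of the same link generally yield different travel times.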
3.3. Routing Policy
Since this study considers a stochastic variant of the EVRPTW, a static precomputed routing plan is not an appropriate solution paradigm. In deterministic settings, routes can be constructed offline under the assumption that all input parameters are known in advance. In contrast, in the stochastic environment considered here, key parameters such as customer demand, service time, and vehicle speed are revealed only during route execution. Consequently, routes that appear feasible or near-optimal under nominal values may become suboptimal or infeasible once uncertainty is realized.
Instead of searching for a fixed routing solution, the objective is to develop a dynamically adaptive routing policy that makes sequential decisions online based on the current system state and realized stochastic information. This approach builds upon the routing policy framework introduced for the deterministic EVRPTW in [20], and extends it to the stochastic setting considered in this work. This perspective is consistent with evolutionary systems in nature, where adaptive behavior emerges from local decision rules interacting with a changing environment.
The proposed methodology consists of two components:
1. Route Generation Scheme (RGS), which defines how routes are incrementally constructed;
2. Priority Function (PF), which determines which customer should be selected next at each decision point.
Figure 1 provides an overview of the proposed framework, including the offline evolution of priority functions and the online adaptive route construction process under stochastic realizations. The route construction process defined by the RGS is detailed in the following section, while the representation and learning of the priority function are described in Section 3.5.
3.4. Route Generation Scheme
The RGS defines the mechanism by which vehicle routes are constructed. In this study, five variants are considered: the Serial, Semi-parallel, and Parallel schemes adopted from [20], along with two extensions proposed in this work, namely the Semi-parallel-B and Parallel-B variants.
3.4.1. Common Feasibility and Operational Rules
Energy feasibility is verified before each movement decision. A vehicle may proceed to a destination only if it has sufficient energy to (i) reach the destination and (ii) subsequently reach the nearest charging station, preventing infeasible states without access to recharging. If this condition is not satisfied, the vehicle first travels to a feasible charging station, recharges, and then continues its route. Charging stations are selected by minimizing the total energy required to travel from the current location to the station and from the station to the intended destination.
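The energy check and the station-selection rule can be sketched as follows. The sketch assumes Euclidean distances over node coordinates and interprets a "feasible" charging station as one reachable with the current battery level; both are assumptions made for illustration, and all names are hypothetical.

```python
import math
from typing import Dict, Hashable, List, Optional, Tuple

Node = Hashable
Coords = Dict[Node, Tuple[float, float]]


def dist(coords: Coords, a: Node, b: Node) -> float:
    """Euclidean distance between two nodes (assumed metric)."""
    return math.dist(coords[a], coords[b])


def can_move(coords: Coords, stations: List[Node], r: float,
             battery: float, current: Node, dest: Node) -> bool:
    """True if the vehicle can reach dest and then the station nearest to dest."""
    to_dest = dist(coords, current, dest)
    to_station = min(dist(coords, dest, f) for f in stations)
    return battery >= r * (to_dest + to_station)


def choose_station(coords: Coords, stations: List[Node], r: float,
                   battery: float, current: Node, dest: Node) -> Optional[Node]:
    """Reachable station minimizing energy for current -> station -> dest.

    Energy is r times distance, so minimizing total distance is equivalent.
    Returns None if no station is reachable with the current battery level.
    """
    reachable = [f for f in stations
                 if r * dist(coords, current, f) <= battery]
    if not reachable:
        return None
    return min(reachable,
               key=lambda f: dist(coords, current, f) + dist(coords, f, dest))
```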
Under stochastic demand, the realized demand of a selected customer may exceed the vehicle’s remaining capacity. In such cases, the service attempt is aborted, the vehicle returns to the depot (via a charging station if necessary), and the route is finalized. The customer remains unserved and is reconsidered later. This mechanism ensures feasibility under stochastic realizations and is applied implicitly in all RGS variants, although omitted from Algorithm 1 for clarity.
Time windows are modeled as soft. Under stochastic variations in service time and travel speed, enforcing hard time windows would frequently lead to infeasible routes, introducing additional feasibility-repair mechanisms that are orthogonal to the focus of this study. Moreover, strict hard time windows are often unrealistic in practical settings, where delays can occur and are typically tolerated to some extent. Therefore, late arrivals are allowed but penalized in the objective function.
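In the Semi-parallel and Parallel variants, $m_{\min}$ denotes a capacity-based lower bound on the required number of vehicles, computed as
\[ m_{\min} = \left\lceil \frac{\sum_{i \in V} q_i}{C} \right\rceil, \]
where $V$ is the set of customer nodes, $q_i$ is the nominal demand of customer $i$, and $C$ is the vehicle cargo capacity. This bound represents the minimum number of vehicles required to satisfy total nominal customer demand under cargo-capacity constraints.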
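As a small worked example, the capacity-based bound is the total nominal demand divided by the cargo capacity, rounded up. The function name is hypothetical.

```python
import math
from typing import Iterable


def capacity_lower_bound(nominal_demands: Iterable[float], C: float) -> int:
    """Minimum number of vehicles needed to carry the total nominal demand."""
    if C <= 0:
        raise ValueError("cargo capacity must be positive")
    return math.ceil(sum(nominal_demands) / C)
```

For nominal demands of 30, 40, and 50 units and a capacity of 100, the total demand of 120 exceeds one vehicle load, so the bound is 2.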
3.4.2. Baseline RGS Variants
The Serial, Semi-parallel, and Parallel RGS variants adopted from [20] differ in how vehicles are activated and extended during route construction. In the Serial scheme, routes are constructed sequentially by activating one vehicle at a time. In contrast, the Semi-parallel and Parallel schemes extend multiple routes concurrently, initially activating a number of vehicles equal to the capacity-based lower bound. The Semi-parallel variant selects the earliest available vehicle for extension and may switch to sequential construction if all active vehicles complete their routes. The Parallel variant maintains continuous parallelism by immediately activating a new vehicle whenever one completes its route. All variants follow the same route construction logic: routes are incrementally extended by selecting an unserved customer, verifying capacity and energy feasibility, and either continuing the route or returning the vehicle to the depot. The differences between variants lie solely in vehicle initialization and selection policies.
3.4.3. Extended RGS with Candidate Set Selection
The proposed Semi-parallel-B and Parallel-B variants extend their respective base schemes by modifying the vehicle selection mechanism. Instead of selecting only the earliest available vehicle, the set of the K earliest available vehicles is considered, and the vehicle with the largest remaining capacity is chosen. In this study, K is set to 3, providing a small but diverse candidate set that allows capacity-aware selection while keeping the decision process local and computationally efficient. This modification improves robustness under stochastic demand by reducing the likelihood of premature capacity saturation and unnecessary route termination.
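The candidate-set rule can be sketched as follows. The `Vehicle` record and its field names are hypothetical, and the `capacity_aware` flag is added here only to contrast the B variants with the base schemes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vehicle:
    vid: int
    available_at: float        # time at which the vehicle can be extended next
    remaining_capacity: float  # cargo capacity still available


def select_vehicle(active: List[Vehicle], K: int = 3,
                   capacity_aware: bool = True) -> Vehicle:
    """Pick the vehicle to extend next.

    Base schemes: the earliest available vehicle.
    B variants: among the K earliest available vehicles, the one with the
    largest remaining capacity.
    """
    if not active:
        raise ValueError("no active vehicles")
    earliest = sorted(active, key=lambda veh: veh.available_at)[:K]
    if not capacity_aware:
        return earliest[0]
    return max(earliest, key=lambda veh: veh.remaining_capacity)
```

With K = 3, a vehicle that becomes available slightly later but has more spare capacity is preferred over the earliest one, while vehicles outside the three earliest are never considered.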
3.4.4. Unified RGS Framework
All variants described above can be expressed within a single unified route construction framework. The variants differ only in vehicle initialization, selection, and update rules, while the underlying route construction logic remains identical. This unified procedure is presented in Algorithm 1, with variant-specific rules summarized in Table 1.
Algorithm 1 Unified RGS framework
Require: variant $\in$ {Serial, Semi-parallel, Parallel, Semi-parallel-B, Parallel-B}
 1: $U \leftarrow V$  ▹ set of unserved customers
 2: …
 3: …
 4: …
 5: while $U \neq \emptyset$ do
 6:     if … then
 7:         …
 8:     end if
 9:     select a customer $c$ from $U$ using the PF
10:     if $k$ has sufficient remaining capacity for the nominal demand of $c$ then
11:         $n \leftarrow c$
12:     else
13:         $n \leftarrow 0$
14:     end if
15:     if $k$ does not have sufficient battery charge to reach $n$ and then the nearest charging station in $F$ then
16:         select a charging station $f$ from $F$
17:         move $k$ to $f$
18:         recharge $k$
19:     end if
20:     move $k$ to $n$
21:     if $n \neq 0$ then
22:         serve $c$
23:         $U \leftarrow U \setminus \{c\}$
24:     else
25:         add the completed route of $k$ to $R$
26:         …
27:         …
28:     end if
29: end while
30: return $R$
3.5. Genetic Programming for Evolving Priority Functions
While the RGS defines how routes are constructed, the quality of the routing policy depends on the rule used to select the next customer at each decision point. In this work, this rule is represented by a priority function (PF), which assigns a numerical score to each unserved customer, and the customer with the highest score is selected.
The PF is automatically evolved using Genetic Programming (GP), which constructs decision rules by combining problem-specific features through mathematical operators [21, 45]. Each GP individual represents a candidate PF encoded as an expression tree, where terminal nodes correspond to features of the current system state and internal nodes represent mathematical operators.
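A minimal sketch of how such an expression tree can be evaluated and used to rank customers is given below. The nested-tuple encoding, the operator subset, and the feature names `slack` and `dist` are assumptions made for illustration; they do not reproduce the primitive set or the tree representation used in the study.

```python
import operator
from typing import Dict, Hashable, Union

# Internal nodes are tuples (operator_name, child, child);
# terminals are feature names (str) or numeric constants.
Tree = Union[str, float, tuple]

FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree: Tree, features: Dict[str, float]) -> float:
    """Recursively evaluate an expression tree on one customer's features."""
    if isinstance(tree, str):
        return features[tree]
    if isinstance(tree, (int, float)):
        return float(tree)
    op, *children = tree
    return FUNCTIONS[op](*(evaluate(child, features) for child in children))


def select_customer(tree: Tree,
                    candidates: Dict[Hashable, Dict[str, float]]) -> Hashable:
    """Return the unserved customer with the highest priority score."""
    return max(candidates, key=lambda c: evaluate(tree, candidates[c]))
```

For example, the tree `("sub", "slack", ("mul", 2.0, "dist"))` encodes the score slack - 2 * dist, which favors customers with ample slack that are close to the vehicle.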
During route construction, the expression tree is evaluated for each candidate customer. Since some features depend on stochastic quantities (e.g., travel times or arrival times), a single evaluation may be sensitive to random realizations. To mitigate this effect, each candidate is evaluated multiple times under independent stochastic samples. In this work, the evaluation is repeated five times, providing a small number of samples that stabilizes the selection while keeping the computational cost manageable. The most frequently preferred customer is then selected. In the case of ties, preference is given to the customer selected by the smaller GP expression tree. This sampling-based decision mechanism improves robustness while maintaining computational efficiency.
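The sampling-based selection can be sketched as a majority vote over repeated noisy evaluations. The callback `sample_score` is a hypothetical stand-in for evaluating the PF under one stochastic sample, and ties are broken here by candidate order as a simplifying assumption rather than by the tie-breaking rule described above.

```python
import random
from collections import Counter
from typing import Callable, Hashable, Sequence


def vote_select(candidates: Sequence[Hashable],
                sample_score: Callable[[Hashable, random.Random], float],
                rng: random.Random,
                n_samples: int = 5) -> Hashable:
    """Select the customer that wins most often across n_samples evaluations."""
    if not candidates:
        raise ValueError("no candidate customers")
    votes: Counter = Counter()
    for _ in range(n_samples):
        # One round: score every candidate under fresh randomness, record the winner.
        winner = max(candidates, key=lambda c: sample_score(c, rng))
        votes[winner] += 1
    top = max(votes.values())
    # Tie-break by candidate order (assumption of this sketch).
    return next(c for c in candidates if votes[c] == top)
```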
The GP procedure follows a standard generational evolutionary scheme, where a population of candidate PFs is iteratively improved through selection, crossover, and mutation. The initial population consists of expression trees generated using the full method [21], with a depth of five. During evolution, the tree depth is limited to a fixed maximum, and individuals exceeding this limit are considered invalid and removed from the population. The evolutionary process runs for a fixed number of generations. The configuration largely follows standard GP settings, consistent with common practice [21] and with prior GP-based routing approaches [20]. The general workflow is shown in Algorithm 2, and the configuration parameters used in the experiments are summarized in Table 2. The implementation is based on the Jenetics evolutionary computation library (version 8.3.0).
Algorithm 2 Descriptive evolutionary workflow in Jenetics
 1: Generate the initial population $P_0$
 2: Evaluate the fitness of all individuals in $P_0$
 3: while the termination criterion is not satisfied do
 4:     Increment the generation counter: $t \leftarrow t + 1$
 5:     Select the survivor population $S_t$ from the previous population $P_{t-1}$
 6:     Select the offspring population $O_t$ from the previous population $P_{t-1}$
 7:     Apply genetic alterations (crossover and mutation) to $O_t$
 8:     Remove invalid individuals from $S_t$ and $O_t$
 9:     Construct the new population $P_t$ by combining the filtered survivors and offspring
10:     Evaluate the fitness of all individuals in $P_t$
11: end while
3.5.1. Terminal and Function Nodes
The primitive set builds upon the terminals and functions proposed in [20], which are retained to preserve the original EVRPTW routing policy framework. To address the stochastic setting considered in this work, the terminal set is extended with three stochastic descriptors intended to capture expected variability in customer demand, travel time, and service time, as well as four global-state and coordination descriptors intended to provide information about the remaining workload, available vehicle resources, and interactions between concurrently active vehicles. In addition, a terminal is included to provide information about the remaining time-window slack available for servicing a customer.
For objective-specific optimization, one additional terminal is introduced for the vehicle minimization objective to capture the minimum remaining capacity among active vehicles, while another is introduced for tardiness minimization to capture the available temporal slack of the currently considered vehicle for customer $i$, computed as the difference between the customer due time and the estimated vehicle arrival time.
Due to the online nature of the stochastic setting, realized stochastic values are not available during customer selection. Consequently, all terminal nodes whose computation depends on stochastic quantities are evaluated using nominal parameter values. The complete set of terminal nodes is summarized in Table 3, while their relevance and usage patterns within evolved routing policies are further analyzed in Section 4.6.
The function nodes used in the GP representation are summarized in Table 4. The first row contains binary operators, and the second row contains unary operators. To ensure numerical stability during tree evaluation, the operators div, log, and sqr are implemented as safe functions. Specifically, div returns 0 when the denominator is close to zero, while log and sqr return 0 for non-positive arguments.
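The safe operators can be sketched as follows. The threshold `EPS` for "close to zero" is an assumed value, and sqr is interpreted here as the square root, which is an assumption based on its protection against non-positive arguments.

```python
import math

EPS = 1e-9  # assumed threshold for a denominator being "close to zero"


def safe_div(a: float, b: float) -> float:
    """Protected division: returns 0 when the denominator is near zero."""
    return 0.0 if abs(b) < EPS else a / b


def safe_log(x: float) -> float:
    """Protected natural logarithm: returns 0 for non-positive arguments."""
    return math.log(x) if x > 0.0 else 0.0


def safe_sqr(x: float) -> float:
    """Protected square root (assumed meaning of sqr): 0 for non-positive arguments."""
    return math.sqrt(x) if x > 0.0 else 0.0
```

Returning a fixed value instead of raising an exception keeps every randomly generated tree evaluable, so no individual is discarded solely because of a numerical error.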