GPS: Improvement in the formulation of the TSP for its generalizations type QUBO

Saúl González-Bermejo, Guillermo Alonso-Linaje, and Parfait Atchade-Adelomou 3 Universidad de Valladolid C/Plaza de Santa Cruz, 8, 47002 Valladolid (Spain) ∗ Engineering Department Research Group on Data Science for the Digital Society La Salle Universitat Ramon Llull Carrer de Sant Joan de La Salle, 42 08022 Barcelona (Spain) † Lighthouse Disruptive Innovation Group, LLC 7 Broadway Terrace, Apt 1 Cambridge MA 02139 Middlesex County, Massachusetts (USA) ‡ (Dated: Oct 2021)

We propose a new binary formulation of the Travelling Salesman Problem (TSP), with which we overcame the best formulation of the Vehicle Routing Problem (VRP) in terms of the minimum number of necessary variables. Furthermore, we present a detailed study of the constraints used and compare our model (GPS) with other frequent formulations (MTZ and native formulation). Finally, we have carried out a coherence and efficiency check of the proposed formulation by running it on a quantum annealing computer, D-Wave 2000Q6.

I. INTRODUCCIÓN
It is known that the best current QUBO formulation of the Traveling Salesman Problem (TSP) requires N 2 binary variables (IX A)). However, when this formulation is extended to more general problems such as Vehicle Routing Problem (VRP), its modelling is no longer efficient in terms of the number of variables because it requires the use of new variables to transform it into QUBO formulation. The number of the qubits, are crucial in today's quantum computing, so it is necessary to develop models focused on reducing the number of required variables.
The VRP encompasses two different problems: one in which the distance travelled by vehicles subject to capacity restrictions is minimized [1][2][3][4] and another, in which the time spent it takes vehicles to complete their routes is minimized. In this article, everything related to the VRP will optimize the time to complete these routes.
When modelling this problem, it is necessary to establish relationships between the distances travelled by the vehicles. For this reason, it is required that the distance calculation function between journeys has to be a linear function if we want to reduce the number of auxiliary variables used.
The best QUBO models of the TSP that can represent distance linearly use more than N 3 variables (native TSP formulation). There is indeed a formulation that uses N 2 log 2 (N ) variables (MTZ [5]), but its results are not as satisfactory when applying the annealing algorithms on it.
Our purpose in this work is to present a new QUBO model of the TSP whose the travelled distance's calculation is linear, and that uses only 3N 2 variables, considerably improving the existing TSP models (both of N 3 and N 2 log 2 (N ) variables). Furthermore, this new formulation of the TSP will be generalized to define the most efficient formulation in terms of the number of variables of the VRP.
The document is organized as follows. Section (II) shows previous work on the TSP algorithm and its derivatives. Section (III) will explore adiabatic computation. In the section (IV), the general formulations of TSP and MTZ will be explored. Section (V) presents our TSP proposal with the improvements in numbers of qubits for quantum projects of this NISQ era and beyond. A generalization of our contribution is seen in Section (VI) where we propose an optimal VRP at the qubit level. In Section (VII), we present the results of the entire study carried out. In the section (VIII), we discuss the results and contrast the contributions. Finally, Section (IX) concludes the work carried out, and we open ourselves to some lines of the future of the proposed model.

II. RELATED WORK
In the mid-1920s, these two references, [6,7] were the first articles to provide a solution to the minimal spanning tree (MST) problem. Based on these works, the mathematical researcher, Joseph B. Kruskal Jr, applied these solutions to the TSP [8], giving life to some of the first solutions to this problem that will arise during the next decades.
Almost at the end of the sixties, this work [9] offered a compilation and synthesis of the research on the travelling salesman problem. Its authors began by defining the problem and presenting several relevant theorems. They also classified and detailed the solution techniques and computational results. Before that, in the mid-1960s, the TSP started to emerge in many different contexts. This article [10] highlights some applications that began to gain space in everyday life, such as vehicle routing or job shop scheduling problems. Other applications such as planning, logistics and the manufacture of electronic circuits became of particular interest.
By making a few small modifications to the original TSP, we could apply it in many fields such as SWP [11] and DNA sequencing [12,13] among others. In this last application, the concept of 'city' would come to be fragments of DNA and the idea of 'distance', a measure of similarity between the pieces of DNA. In many applications, additional restrictions such as resource limits or time windows make the problem considerably difficult.
Computationally, the TSP [14] is an NP-Hard problem within combinatorial optimization. As an NP-Hard problem, it is computationally complex, and heuristics are continually being developed to get as close as possible to the optimal solution. However, considering the computational complexity nature of these problems, the new approach that quantum computing takes is very promising.
Many works are related to the standard/native TSP or some related variant in a quantum environment within this new approach. For example, in this work [15], the authors introduced a different quantum annealing scheme based on a path-integral Monte Carlo processes to address the symmetric version of the Travelling Salesman Problem (sTSP). In these other articles [16,17], the authors did a comparative study using the D-Wave platform to evaluate and compare the efficiency of quantum annealing with classical methods for solving standard TSP.
In this reference, [18] several comparisons of heuristic techniques were made for some TSPLIB problems, both symmetric and asymmetric, and their results have been compared to other methods such as Self Organizing Maps and Simulated Annealing. In both cases, the local search technique was applied to the results found with Wang's Recurrent Neural Network with "Winner Takes All". The computational complexity of their heuristics method got O(n 2 + n) [19], considered competitive when compared to Self Organizing Maps, which has a complexity of O(n 2 ) [20]. The authors went on to show that the CAN technique has a computational complexity of O(n 2 log(n)) [21], while the Simulated Annealing technique has a complexity of O(n 4 log(n)) [22]. Results that are quite far from what we propose in this article.
One of the generalizations of the TSP, known as the VRP, was studied on the D-Wave platform [23,24]. In tasks where routing and planning capacity (time) is required, the TSP with time windows (TSPTW) was generalized [25,26], and has high inherent complexity and presents enormous resolution difficulties. In these references [11,27,28], the authors modelled combinatorial optimization problems in which social workers visit their patients at their respective homes and attend to them at a specific time, called Social Workers' Problem (SWP). SWP is a significant problem because additional time constraints allow more realistic scenarios to be modelled than native TSP. The optimal or near-optimal solution for such issues is important in minimizing distance and time and environmental problems such as reducing fuel consumption.
The generalization of the TSP that we will use in work will be the VRP. However, there are other TSP derivatives, such as the Job Shop Scheduling Problem (JSSP) [29] that are not included in the study of this work.
During state of the art carried out, we have found several articles [23,24,30] that solve the TSP and VRP (focusing on minimizing distance and not time) for annealing computers. However, the number of variables is still intractable for the current size of quantum computers. For this reason, we propose a new TSP formulation with a representation of the linear distance that uses only 3N 2 variables, which we will use to outperform the current best VRP modelling in terms of the number of required variables.

III. ADIABATIC QUANTUM COMPUTERS
Adiabatic computation was born from the use of the adiabatic theorem [31,32] to perform the calculations. Using the tunnel effect to go from the global minimum of a simple Hamiltonian to the global minimum of the problem of interest. One of the market leaders for this type of computing is D-Wave, which roughly solves the quadratic unconstrained binary optimization problem (QUBO). The QUBO formulation (1) is suitable for running a D-Wave architecture; however, QUBO can be mapped to the Ising [33] model and thus be used in computers based on quantum gates, for example, IBMQ, Rigetti, Xanadu (strawberryfields), etc.
The problems that D-Wave quantum computers are prepared to solve are those that consist in finding the minimum of a function of the following form: where the variables x i ∈ {0, 1} and the coefficients b i , q i,j ∈ R.
We have then that, given a problem, we need to model it with the above structure where the variables that form the solution will only take the values 0 or 1. Let us observe that, by taking the variables x i the values 0 or 1, it is true that x 2 i = x i . Therefore, we can group the linear terms with the quadratic terms and express the above equation in matrix format: with x ∈ {0, 1} n and Q ∈ M n×n which is compactly representing the QUBO formulation.

IV. TSP FORMULATION
As discussed in the introduction, we will develop the QUBO model of the TSP in which we have worked and which improves the number of variables required from the previous models whose distance calculation function is linear. To do this, first, we will present the models that we will improve by analyzing their strengths and weaknesses.

A. Native Formulation
In this section, we will recall the formulation of the native TSP [30] to analyze it in detail. This modelling, which has been defined in [34], despite appearing in a very natural way which facilitates its understanding, requires N 3 variables to be implemented. For today's quantum computers, due to qubit number limitations, we could hardly solve cases with several cities less than 15.
The variables that appear in this model are the variables x i,j,t such that i, j ∈ {0, ..., N + 1} and t ∈ {0, ..., N }. Let us consider that the variables x i,i,t do not exist in this model. The interpretation of the variables x i,j,t is simple, since x i,j,t = 1 if at instant t we traverse the edge that connects the cities i and j, and x i,j,t = 0 for all other cases.
We can define the objective function of the native (Native in the sense of general, the most used) TSP [30] as: where d u,v represents the distance between nodes u and v. This objective function is subject to a series of restrictions: • Constraint 1. The salesman must leave each city once.
• Constraint 3. If the salesman leave a city, he cannot return to it later. This constraint ensures that no unconnected cycles are formed as a solution. There are two ways of posing this constraint.
-Imposing that once he leaves a city he cannot return to it. For each u ∈ {1, ..., N + 1}: -Imposing that once he arrives in a city, at the next moment, he must leave it.
This formulation requires N 3 variables. Next, we will analyze another model used to define the TSP, which, although it uses fewer variables, offers worse results when working with annealing.
The idea of this formulation is to consider the variables x i,j = 1 if the edge that connects the cities i and j appears in the solution path, where x i,j = 0 for all other cases. Once we have these variables, we can establish order on the route employing a set of variables that will represent the moment the salesman arrives at that city (the variable u i expressed in binary format, will take the integer value t if the city i is reached in the t -th position.). This model requires N 2 log 2 (N ), greatly improving the number of variables in the general formulation. However, when implemented using annealing it presents surprisingly bad results. This is because the annealing algorithm gets stuck trying to minimize the part of the objective function generated by the constraint of the equation (11), since the representation of integers in their binary format has the disadvantage that close numbers such as 2 n − 1 and 2 n differ by a large number of qubits, so from the annealing they are perceived as very different solutions. Next, let us define the variables needed for this model.
• x i,j such that i, j ∈ {0, ..., N + 1}. They will represent the edges that appear in the route.
• Let H := log 2 (N + 1) + 1 and then define u i := H h=0 u i,h such that i ∈ {0, ..., N +1}. As discussed in the previous paragraph, these variables represent the order in which the cities are travelled.
• To be able to handle the previous variables, it is necessary to include the following variables: These auxiliary variables will work as slack variables to transform inequalities into equalities.
Once the model variables have been defined, let us analyze the constraints that must be met. Similar to the native TSP, we can define the MTZ objective function as follows: where d i,j represents the distance between cities i and j.
In this case, the restrictions to be satisfied are as follows: • Constraint 1 : The salesman must leave each city once.
For each j ∈ {1, ..., N + 1}: • Constraint 3 : There must be an order between the cities so that there is a single cycle that runs through all of them. Then, for each i, j ∈ {0, ..., N + 1}: This restriction guarantees that if x i,j = 1, what means, from city i we travel to city j, then the value of u j must be greater than u i . This ensures that the u i variables provide a meaningful order.
Once the two most common QUBO models of the TSP have been presented, let us analyze the formulation with which we improve the number of variables of the previous two.

V. GPS FORMULATION
To develop this model, we take the variables x i,j,r with i, j ∈ {0, ..., N +1} and r ∈ {0, 1, 2}. In all the modelling, the variables x i,j,r such that i = j are not considered. We work with directional edges, that is, if in the model the edge (i, j) appears, we will understand that first we go through node i and immediately after that we go to j.
Let us analyze what each variable represents: • x i,j,0 = 1 means that the edge (i, j) does not appear in the path and the node i is reached earlier than the j.
• x i,j,1 = 1 means that the edge (i, j) appears in the path, so the node i is reached earlier than the j.
• x i,j,2 = 1 means that the edge (i, j) does not appear in the path, and the node j is reached earlier than the i.
From the definition of our variables, we can define the distance travelled through the following objective function as: The constraints that must be met are: • Constraint 1 : For each i, j one and only one of the 3 cases of r must be given, so For all i, j: 2 r=0 x i,j,r = 1.
• Constraint 2 : Each node must be exited once.
• Constraint 3 : Each node must be reached once.
For each j ∈ {1, ..., N + 1}: • Constraint 4 : If node i is reached before j, then node j is reached after i, so, for all i, j ∈ {0, ..., N + 1} such that i = j: It would also have to be specified for r = 0 and r = 1, however this restriction is sufficient since by (13) it is implicit.
• Constraint 5 : If node i is reached before node j and node j is reached before node k, then node i must be reached before k. This condition will prevent the route from returning to a node from which it had already exited, thus preventing cycles from forming. We then arrive at the penalty function equation (17).
(17) With only the cases in which i = j, i = k and j = k will be taken in the summation and in the annex (X) we will provide the approach followed to arrive at it. The following is deduced from the equation (17).
We have x i,j,2 = 0 if i is reached before j and x i,j,2 = 1 in the case where j is reached before i. Thus, with the previous equation we penalize these following cases in which x i,j,2 = 0, x j,k,2 = 0 and x i,k,2 = 1 and x i,j,2 = 1, x j,k,2 = 1 and x i,k,2 = 0 which lead to cases in which it would be forming cycles (for these two situations we have that the value of the parentheses is 1 and for the rest 0). In (31) we have a similar situation for our VRP formulation, where we offer more details. For this condition, we must have directly constructed a penalty function that avoids erroneous cases without first posing linear conditions through which to generate its corresponding penalty function.
Formulated in this way we have managed to reduce the number of variables required from N 2 log 2 N to 3N 2 , achieving very noticeable reductions when working with large problems. Once we have this formulation, let us see how we can generalize it to the VRP.

VI. VRP FORMULATION
As discussed in the introduction, this model is oriented to be optimal concerning the number of qubits used. However, this generalization does not appear as naturally as expected because it requires a delicate step to get the constraints of the equation (31). To do this, we will detail each step and explain each constraint step by step.
A. Original formulation 5N 2 Q For this VRP, we will consider that N is the number of cities and Q is the number of available vehicles. We first present the variables that will form the problem. We then take the following set of variables.
x i,j,r,q with i, j ∈ {0, ..., N + 1}, r ∈ {0, 1, 2, 3, 4} and q ∈ {1, ..., Q} In all the modelling, the variables x i,j,r such that i = j are not considered. The variables i, j refer to the cities must travel to, and the variable q refers to the vehicle. The nodes 0 and N + 1 correspond to the starting and ending points. Note that they may be the same node but we will separate them for convenience in the formulation. The values d i,j with i, j ∈ {0, ..., N + 1} correspond to the distance between node i and j. Let us dive into the interpretation of each variable: • x i,j,0,q = 1 means that the vehicle q travels to the cities i and j, does not travel across the edge (i, j) and arrives at the city i before the j.
• x i,j,1,q = 1 means that the vehicle q travels to the cities i and j travels across the edge (i, j) (that is, once it passes through the city i the next city it reaches is the j) and therefore the city i is reached earlier than the city j.
• x i,j,2,q = 1 means that the vehicle q travels through the cities i and j and arrives at the city j earlier than at the i.
• x i,j,3,q = 1 means that the vehicle q does not go through the cities i and j, and the city i is reached earlier than the city j. Note that x i,j,3,q can take the value 1 whether the vehicle q passes through one of both cities or neither of them.
• x i,j,4,q = 1 means that the vehicle q does not travel to the cities i and j, and the city j is reached earlier than the city i.
Even if no vehicle passes through the objects i and j, the formulation must establish an order between them. However, this restriction does not make the modelling meaningless, since we can assume that if the vehicles are ordered in the order of {1, ..., Q}, then i will be reached before j if the vehicle that passes through node i has a lower number than the one that passes through node j.
Once the interpretation of each variable is explained, let us analyse the constraints that must be met.
• Constraint 1: For each i, j, q, one and only one of the possibilities must be met for r, so: For all i, j, q: • Constraint 2: Each vehicle has to fulfill that it leaves the starting position. For this situation, we are going to impose that: For all q: No vehicle can return to the starting position from a city, so: For all q: • Constraint 3: Every vehicle must finish in the final position. For this, it must be fulfilled that: No vehicle can leave the final position. We then have that: x N +1,j,1,q = 0.
Vehicles that do not travel on any road will meet all constraints when taking the following condition: x 0,N +1,1,q = 1.
• Constraint 4: The vehicle must leave once and only once from each city, then: For each i ∈ {1, ..., N }: • Constraint 5: The vehicle must arrive once and only once to each city, then: For each j ∈ {1, ..., N }: • Constraint 6: The city i is reached before the city j does not depend on each vehicle. Therefore, for all the vehicles that either arrive at city i earlier than j, or arrive at city j earlier than i. Introducing the auxiliary variables a i,j , we have the following constraint. For all i, j ∈ {1, ..., N }: It will then be true that for each i, j or a i,j = 1, which means that the city i is reached earlier than the city j and therefore for each q we will have x i,j,r,q = 1 for any value of the r in which i is reached before j, or a i,j = 0, and, we will have x i,j,r,q = 0 for all the vehicles and for values r where i is reached before j.
• Constraint 7: If the vehicle q arrives in the city j, then the vehicle q must leave the city j.
• Constraint 9: If city i is reached before j and city j is reached before city k, then city i must be reached before city k. This condition will prevent the vehicle from returning to a city it has already passed through and therefore prevents a cycle from forming.
To introduce this constraint, we will directly calculate a penalty function worth 0 in the correct cases and 1 in those that are not. To facilitate the understanding of the penalty function, we are going to take, for i, j, k, q, the following variables: Remember that it is not necessary to introduce these conditions because the constraint (26) establishes the correct values of the variables a i,j . Therefore, a i,j = 1 means that the city i is reached before the city j and the same with j and k. Also, it is very important to remember that due to the same constraint (26), we can take any of the vehicles as a reference. In this case, we have taken the first vehicle as a reference.
In this way, fixed i, j, k, we have the 3 variables a i,j , a j,k , a i,k . Remember that a i,j , a j,k , a i,k only take the values 0 or 1. Also, let us note that the cases that lead to values of the variables for which cycles can be formed and that we must discard are (a i,j , a j,k , a i,k ) = (0, 0, 1) and (a i,j , a j,k , a i,k ) = (1, 1, 0).
In the case (0, 0, 1) we would have that the city j is reached after the i, the k after the j, and yet the city k is reached rather than i, which is absurd. The case (1, 1, 0) cannot be given either, since it reaches i before j and j before k, so it cannot be that we also reach k before i. We therefore must construct a penalty function so that for f (a i,j , a j,k , a i,k ) it holds that f (0, 0, 1) > 0, f (1, 1, 0) > 0 y f (a i,j , a j,k , a i,k ) = 0 for all other cases. A function that satisfies these conditions is from the equation (30).
then, adding to the cost function the equation (31) (31) we will have that the best solutions will be those that comply with this constraint.
• Constraint 10: The objective we seek is to minimize vehicle travel time. What we could do is to see how long each vehicle takes to complete the route and try to minimize as much of the time as possible. However, this function soon becomes complex so we have decided to develop a different idea that simplifies the process and smoothes the objective function. If we impose the condition that all vehicles travel less distance than the distance travelled by vehicle number 1, we will have that minimizing the maximum of the distances will be equivalent to minimizing the distance travelled by the first vehicle. We then have the following condition. For each q ∈ {2, ..., Q}: We transform this inequality into equality by taking once again D max := n i=0 max j {d i,j } and the variables b h,q (the variables b h,q are like in (11) slack variables and they are in their binary expression) in: Under these conditions the function to be minimized corresponds to: This condition has the disadvantage that we are eliminating solutions where it is another vehicle that travels the longest distance. Let us explore how to avoid this problem and get more flexibility in the model to make it easier for the Quantum Annealing to find the optimum one. We can establish an auxiliary variable D and we set that the distance travelled by each vehicle must be less than this variable, that is to say: The variable D is an integer, so we must treat it in some way in order to include it in the model. As we explained in the introduction of the section dedicated to the formulation of the MTZ model (IV B), it is convenient to try to avoid the binary representation of integer variables. To do so, we can express D as a combination of the distances between edges by taking D = Thus after imposing the constraint (35) we have that the function to minimize is D.
Thanks to this modelling of the VRP we have been able to reduce the number of variables required to the order of 5N 2 Q. However, we have managed to reduce it even further to 3N 2 Q, which is detailed in (IX B). However, we have preferred to present this other model due to its easy understanding.

VII. RESULTS
Before analyzing the results, we are pleased with our GPS formulation's positive results achieved. Let us observe in the different tables the comparison of the number of qubits, time during which the D-Wave Quantum Annealing simulator has been executed, and the length of the path found. The sign "-" represents that the algorithm did not find a possible way during the elapsed time (in minutes). In this examples, the cities which form the TSP to solve are the vertex of the regular polygon with these number of vertex.  Table I. In this scenario of 4 cities, we set comparison with the 3 models, MTZ, native TSP and GPS. The comparison is based on the number of times to find the solution, the distance travelled, and the number of qubits. We can appreciate the good performance of our GPS model, and above all the savings it offers us in the number of qubits.

Number of qubits 147 294 266
Elapsed Time (min) 0.337 0.39 1.338 Path Length (m) 6.00 6.00 8.46 Table II. In this scenario of 6 cities, we set comparison with the 3 models, MTZ, native TSP and GPS. The comparison is based on the number of times to find the solution, the distance travelled, and the number of qubits. We can appreciate the good performance of our GPS model, and above all the savings it offers us in the number of qubits.
These results have been obtained using a simulator because we would require access to a quantum computer for a time similar to that needed to perform the simulations (in some cases more than an hour). However, it is the benefits of modeling with few qubits (such as GPS modelling) will be much more notable when these problems are implemented in real quantum computers. Other studies that did not require many hours of the quantum computer were carried out on the DW 2000Q 6. In the discussion section, we detail some interesting cases.  Table V. In this scenario of 12 cities, we set comparison with the 3 models, MTZ, native TSP and GPS. The comparison is based on the number of times to find the solution, the distance travelled, and the number of qubits.

VIII. DISCUSSIONS
Once the different models had been implemented, we achieved the following results. Through the results of the figure (1) up to the figure (4), the good performance of our formulation compared to the general TSP [30] can be observed. An almost identical operation is seen with the generic TSP, except that we are improving at least the number of qubits for the same cases in our proposal. Although the time difference is not significant again, the difference between path lengths is. Let us remember that the advantage of the formulation in which we have worked is based on improving the number of qubits used. We then have that the larger the problems we are working on, the better this difference will be appreciated in the number of variables.
The MTZ model does not offer positive results. This is since Annealing presents many difficulties to find minimum expressions in which the representation of integers appears in their binary format. This is because although the numbers 2 k − 1 and 2 k are close, they are not close in their binary form since they differ in k variables, so the annealing tends to present bad results. Apart from that adjusting, the Lagrange coefficient of this type of constraint is also a complicated task.
General and GPS modelling show better results. While it is true that general modelling gives slightly better results, it requires the use of a higher number of qubits. This may be since the function to be optimized for this model has a smaller number of local minima where the Annealing can get stuck or to a bad fit of the Lagrange coefficients.
The problem on which the simulations are carried out consists in finding the optimal path when the points are placed on the vertices of the regular polygons that have the same number of vertices as nodes in our problem. In this graph, we see how the length of the solution paths for the case of 9 cities is very similar so that both models give good results.
One of the behaviours and results that we believe is important to mention is the following. We realized that it is even more important to consider the number of edges that our model generates. The vertex/connections in a quantum computer are limited and define our quantum computer's typology and quality for error mitigation. Thus, a model that produces many edges (direct links) may request more from a computer than another generates few. The figure (5) offers us a comparative study between our GPS model and the native TSP. This figure shows the exponential behaviour and the number of interconnections that each model offers. Our model improves the number of qubits and gives us a great result reducing the number of connections a lot. The native TSP behaves as 0.8(N + 2) 5 while the GPS as 2(N + 2) 3 . . This graph shows the time taken to carry out the executions in the case of 9 cities. Although it seems that there is a lot of difference, it only represents 10% of the total time, which, as we have seen in other experiences, is not significant. Figure 3. Path length comparison for N = 11. For the example of 11 cities, we can observe that the outcomes are quite similar. Although the time difference is not significant again, the difference between path lengths is. Let us remember that the advantage of the modelling we have worked is based on improving the number of qubits used. We then have that the larger the problems we are working on, the better that difference will be appreciated in the number of variables.
One aspect of GPS worth commenting on here is to generalize it also to be used for the Cutting-plane method. We must change the current constraint (17) since this methodology only works with linear constraints. The way to do this is as follows. For each i, j, k: In these equations, the variables w p i,j,k are auxiliaries. The purpose of these variables is to satisfy the said constrains. These two restrictions are satisfied by all cases of  (x j,i,2 , x k,j,2 , x k,i,2 ) except for (0, 0, 1) (because it doesn't satisfy the second constraint) and (1, 1, 0) (because it doesn't satisfy the first constraint).

IX. CONCLUSIONS AND FURTHER WORK
The modelling of both the TSP and the VRP developed in this work has given outstanding results to be used to solve both problems and the tasks that derive from them. We have improved the number of variables over other similar ones that provide good results in the condition of having a linear distance calculation function. The improvement in our models represents a fairly significant order of magnitude because we went from N 3 variables to 3N 2 . The figure (6) and (7) summarizes the major contribution of this article.  The results obtained from our VRP formulation and all the experiments carried out maintain the number of variables QN 2 and allow us to offer the community new formulations that minimize the time it takes for vehicles to travel. Future work will apply the ideas developed in the QUBO model of these problems to similar ones. In particular, we will look for other variants of the TSP to use the modelling of this that we have carried out.

Compliance with Ethics Guidelines
Funding: This research received no external funding.
Institutional review: This article does not contain any studies with human or animal subjects.
Informed consent: Informed consent was obtained from all individual participants included in the study.
Data availability: Data sharing not applicable. No new data were created or analyzed in this study. Data sharing is not applicable to this article. There is a TSP modeling that requires N 2 variables, where these are the following: x i,t such as i ∈ {0, ..., N +1} and t ∈ {0, ..., N +1}. (36) Under this formulation x i,t = 1 denotes that the city i is reached at position t. The distance calculation function with this formulation is as follows where d i,j represents the distance between the node i and the node j. This expression has the problem that the distance formulation has terms of degree two and when trying to generalize this idea to other types of problems such as the VRP it will become a 4 degree constraint making use of a large number of auxiliary variables to convert it to QUBO type format.
In the previous modelling, we can improve the number of variables used from 5N 2 Q to 3N 2 Q since certain variables are redundant. Let us see how we can do this. Let us take the set of variables as follows: x i,j,r,q with i < j ∈ {0, .., N +1}, r ∈ {0, 1, 2} and q ∈ {1, .., Q} In all the modelling, the variables x i,j,r such that i = j are not considered. Let us analyze the interpretation of each variable. For each edge (i, j), different cases depending on whether a vehicle passes through both cities, which city is visited before the other and whether the edge is travelled or not.
• x i,j,0,q = 1 means that the city i is reached earlier than the j and the edge (i, j) is not travelled.
• x i,j,1,q = 1 means that the vehicle q travels the cities i and j, it reaches the city i before the j and it travels the edge (i, j).
• x i,j,2,q = 1 means that the city j is reached earlier than the i and the edge (j, i) is not travelled.
This new simplification keeps constraints (20), (22), (24), (25), (27) and (32) defined in the same way as the first proposal of the VRP formulation, so we will only focus on the changes of the remaining constraints: • Constraint 1: For each i, j, q, one and only one of the possibilities must be met for r, so: For all i, j, q: 2 r=0 x i,j,r,q = 1, • Constraint 6: That the city i is reached before the city j does not depend on each vehicle. Therefore, for all the vehicles that either arrive at city i earlier than j, or arrive at city j earlier than i. Introducing the auxiliary variables a i,j , we have the following constraint. For all i, j ∈ {1, ..., N }: Q q=1 x i,j,0,q + x i,j,1,q = a i,j Q.
• Constraint 8: It must be fulfilled that either the vehicle pass through the city i before the j or arrive before to the city j than the i. Therefore, it must be verified that, for i ∈ {0, ..., N }, j ∈ {1, ..., N } and q ∈ {1, ..., Q}: x i,j,0,q + x i,j,1,q = 1 − (x j,i,0,q + x j,i,1,q ). (40) • Constraint 9: If the city i is reached before j and the city j is reached before the city k, then the city i must be reached before the city k. This condition will prevent the vehicle from returning to a city it has already passed through and therefore prevents a cycle from forming.
(a i,j a j,k − a i,j a i,k − a j,k a i,k + a 2 i,k ), (41)

X. RESTRICTION PENALTY
Let us analyze the system that must be solved to build the penalty function from the equation (17). Our penalty function P (a i,j , a j,k , a i,k ) must satisfy that P (0, 0, 1) = 1, P (1, 1, 0) = 1 and P (a i,j , a j,k , a i,k ) = 0 for the rest of the cases. Let us call the variables a i,j = x, a j,k = y, a i,k = z to simplify the notation. Then, we arrive at the quadratic function P , as is shown as follows: P (x, y, z) = c 1 x 2 + c 2 xy + c 3 xz + c 4 y 2 + c 5 yz + c 6 z 2 Imposing the previous restrictions, we have the following system of equations.
So far, we have a system of 6 equations with six certain compatible unknowns. First, however, an additional restriction must be verified. Let us verify if it is met.
We then have that the following function which is a penalty function for the constraint (17). P (a i,j , a j,k , a i,k ) = a i,j a j,k − a i,j a i,k − a j,k a i,k + a 2 i,k