Preprint

Article


A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Ant Colony Optimization (ACO) is a stochastic optimization algorithm inspired by the foraging behavior of ants. We investigate a simplified computational model of ACO in which ants sequentially answer binary-choice questions and leave pheromone on their choices. The quantity of pheromone an ant leaves is its number of correct answers. We examine the impact of a key parameter of the ACO algorithm, the exponent $\alpha$ that governs how the pheromone levels enter the stochastic choice function. In the absence of pheromone evaporation, the system is modeled as a multivariate nonlinear P\'{o}lya urn, which undergoes a phase transition as $\alpha$ varies. The probability of selecting the correct answer for each question asymptotically approaches a stable fixed point of the nonlinear P\'{o}lya urn. The system has two stable fixed points for $\alpha\ge \alpha_c$ and a single stable fixed point for $\alpha<\alpha_c$. When pheromone evaporates over a time scale $\tau$, the phase transition does not occur; instead, the stationary distribution of the probabilities is bimodal for $\alpha\ge \alpha_c$ and monomodal for $\alpha<\alpha_c$.

Subject: Physical Sciences - Theoretical Physics

Socio-physics emerged in the 1970s and has evolved into a captivating research field within statistical physics [1,2]. In particular, herding behavior, or the inclination to follow the majority, has captured the attention of many researchers due to its pivotal role in understanding social phenomena [3,4,5,6,7,8]. Various probabilistic models have been proposed to describe herding behavior, with one notable example being the ant recruitment model. This model explains the intermittent oscillation observed in ants when they are presented with two identical food sources [9,10]. It incorporates a straightforward herding mechanism in which a randomly selected ant chooses one of the two food sources based on the number of ants that have already made the same choice. Scouts play a crucial role by exploring the terrain to locate food sources [11,12]. When a scout discovers food, it returns to the nest, leaving a pheromone trail in its wake. Other ants are drawn to these pheromone marks and consequently become recruited to forage at the food source.

Ant Colony Optimization (ACO) is a model-based meta-heuristic inspired by the foraging behavior of ants in their search for the shortest path to food sources [13,14,15]. While ants may not be highly intelligent individually, they collectively find the shortest path by following pheromone trails left by their fellow ants. The optimal path is determined by the route on which the maximum number of ants travel[12]. Consider a classic problem known as the traveling salesman problem (TSP), which involves finding the shortest possible route that visits each city in a given list exactly once and returns to the origin city. In ACO, ants make decisions about their next city to visit based on a concept called ’pheromone.’ Pheromone represents the preference for a particular choice and is collaboratively learned by the ants during their search process [16]. In the context of TSP, pheromone values are typically associated with pairs of cities and reflect the preference for traveling from one city to another within the pair. These pheromone values are learned through a reinforcement strategy, where each ant reinforces its chosen paths based on the quality of the solution constructed. This quality is often determined by the inverse of the total length of the route. ACO has found successful applications in various industrial and academic constraint optimization problems and has become one of the most popular methods.

ACO has seen significant improvements, and modern ACO algorithms deviate substantially from the original ACO[17]. The fundamental modification lies in controlling the diversity of solutions and achieving convergence [18,19]. In this context, ’convergence’ refers to the tendency of ants to cluster around similar solutions in the neighborhood and ultimately converge toward the same solution. Early convergence to a small region of the search space leaves large sections of the search space unexplored and fails to find good solutions. On the other hand, very slow convergence means that the computational cost required to reach good solutions is high, rendering the search inefficient. Diversity control aims to prevent complete convergence by slowing down the search process.

Many algorithms have been proposed for controlling the diversity of the ACO algorithm. One of the diversity control mechanisms involves modifying the probabilistic decision function [17,20]. Meyer studied the influence of $\alpha $, the exponent on the pheromone level in the selection function, and suggested that $\alpha $ qualitatively determines diversity and convergence behavior. Additionally, he introduced a dynamic $\alpha $ that changes throughout the search process to enhance search efficiency, a technique known as $\alpha $-"annealing". In this paper, we study a simple model of ACO in which ants sequentially answer a series of two-choice quizzes. We investigate the phase transition and the qualitative change of the convergence behavior by varying $\alpha $. In Section 1, we introduce a model and derive stochastic differential equations (SDEs) using the diffusion approximation. In Section 2, we investigate the time evolution and examine the effect of $\alpha $ on the convergence properties of the solutions. Section 3 provides a summary of the results. Appendix A explains the estimation of the initial conditions for the SDEs.

There are N two-choice quizzes, each of which is answered by a large number of ants sequentially[21]. These quizzes are labeled by $n=1,\cdots ,N$. The answer provided by the t-th ant is denoted as $X(n,t)\in \{0,1\}$, where $X(n,t)=1$ indicates a correct answer, and $X(n,t)=0$ indicates an incorrect answer. Each ant receives 1 point for a correct answer. After ant t has answered all N questions, the total points (TP) earned by the ant are

$$\mathrm{TP}(t)=\sum _{n=1}^{N}X(n,t).$$

Ant t deposits pheromones on its answer $X(n,t)\in \{0,1\}$. The amount of the pheromones is $\mathrm{TP}(t)$. We assume that the pheromones evaporate and decrease by a factor of ${e}^{-1/\tau}$ per unit time. Here $\tau $ represents the time scale of the pheromone evaporation.

Ant $t+1$ observes the total values of the pheromones associated with question m for each choice $X(m,t+1)\in \{0,1\}$. We denote the total value of the pheromones that remain on all the questions after ant t has answered as $S(t)$,

$$S(t)\equiv \sum _{s=1}^{t}\mathrm{TP}(s){e}^{-(t-s)/\tau}=\sum _{s=1}^{t}\sum _{n=1}^{N}X(n,s){e}^{-(t-s)/\tau}.$$

Then, the remaining pheromone on $X(m,s)=1,1\le s\le t$ is,

$$S(m,t)\equiv \sum _{s=1}^{t}\mathrm{TP}(s)X(m,s){e}^{-(t-s)/\tau}=\sum _{s=1}^{t}\sum _{n=1}^{N}X(n,s)X(m,s){e}^{-(t-s)/\tau}.$$

The remaining pheromone on $X(m,s)=0,1\le s\le t$ is given by $S(t)-S(m,t)$.

Ants are not highly intelligent, and the probability of them making the correct choice in the two-choice quizzes by themselves is 1/2. The information provided by $\mathrm{TP}(s)$ gives them an indirect clue about the correct choice. If $\mathrm{TP}(s)>N/2$, the posterior probability for $X(m,s)=1$ is larger than $1/2$. Similarly, if $S(m,t)$ is greater than $S(t)/2$, the posterior probability for $X(m,t+1)=1$ is greater than $1/2$. In ACO, a decision function is introduced that uses the values of the pheromones as follows,

$$P(X(m,t+1)=1)=(1-\epsilon)\frac{S{(m,t)}^{\alpha}}{S{(m,t)}^{\alpha}+{(S(t)-S(m,t))}^{\alpha}}+\frac{1}{2}\epsilon.$$

Here, the exponent $\alpha $ determines the response of the choice to the values of the pheromones. $\epsilon>0$ is a small positive constant to avoid the absorbing states $S(m,t)=0$ and $S(m,t)=S(t)$ of the stochastic process.
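The model defined so far can be sketched as a direct simulation. The following is a minimal illustration, not the authors' code: the function names, the warm-up argument `t0` (ants that use $\alpha=0$), and the rule that an ant chooses with probability 1/2 before any pheromone exists are assumptions made here for the example.

```python
import math
import random


def choice_probability(s_m, s_total, alpha, eps):
    """P(X(m,t+1)=1) from the pheromone S(m,t) on the correct choice and the total S(t)."""
    if s_total <= 0.0:
        return 0.5  # assumption: unbiased choice before any pheromone exists
    on_correct = s_m ** alpha
    on_wrong = max(s_total - s_m, 0.0) ** alpha
    return (1.0 - eps) * on_correct / (on_correct + on_wrong) + 0.5 * eps


def simulate(n_quiz, n_ants, tau, alpha, eps, t0=0, seed=0):
    """Run the quiz model; ants 1..t0 use alpha = 0 (independent guesses).

    Returns the final ratios Z(m,t) = S(m,t) / S(t) for every question m.
    """
    rng = random.Random(seed)
    decay = math.exp(-1.0 / tau)  # tau = math.inf gives decay = 1 (no evaporation)
    s_total = 0.0
    s_correct = [0.0] * n_quiz
    for t in range(1, n_ants + 1):
        a = 0.0 if t <= t0 else alpha
        answers = [
            1 if rng.random() < choice_probability(s_correct[m], s_total, a, eps) else 0
            for m in range(n_quiz)
        ]
        tp = sum(answers)  # TP(t): pheromone deposited on each chosen answer
        s_total = s_total * decay + tp
        for m in range(n_quiz):
            s_correct[m] = s_correct[m] * decay + tp * answers[m]
    if s_total == 0.0:
        return [0.5] * n_quiz
    return [s / s_total for s in s_correct]
```

For example, `simulate(n_quiz=10, n_ants=300, tau=math.inf, alpha=1.0, eps=0.01, t0=100)` returns ten ratios in $[0,1]$; fixing `seed` makes a run reproducible.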

We denote the ratio of the remaining pheromones on the correct choices as $Z(m,t)$,

$$Z(m,t)\equiv \frac{S(m,t)}{S(t)}.$$

We divide both the denominator and numerator of eq.(3) by $S{(t)}^{\alpha}$, and the probability of the correct choice $X(m,t+1)=1$ is expressed as,

$$P(X(m,t+1)=1)=(1-\epsilon)\frac{Z{(m,t)}^{\alpha}}{Z{(m,t)}^{\alpha}+{(1-Z(m,t))}^{\alpha}}+\frac{1}{2}\epsilon=f(Z(m,t)).$$

Here, $f(z)$ is defined as

$$f(z)\equiv (1-\epsilon)\frac{{z}^{\alpha}}{{z}^{\alpha}+{(1-z)}^{\alpha}}+\frac{1}{2}\epsilon.$$

We note that $f(1/2)=1/2$ and $f(1-x)=1-f(x)$. The slope of $f(x)$ at $x=1/2$ is $(1-\epsilon)\alpha $. We also introduce the discount factor $D(t)$ and the ratio of correct answers $Z(t)$ as,

$$\begin{array}{ccc} D(t)& =& \sum _{n=1}^{N}\sum _{s=1}^{t}{e}^{-(t-s)/\tau}=N\left(\frac{1-{e}^{-t/\tau}}{1-{e}^{-1/\tau}}\right),\\ Z(t)& \equiv & \frac{S(t)}{D(t)}.\end{array}$$
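The stated properties of the decision function can be checked numerically. The snippet below is a small sketch; the parameter values $\alpha=2$, $\epsilon=0.1$ and the finite-difference step `h` are arbitrary choices made for this example.

```python
def f(z, alpha, eps):
    """Decision function f(z) = (1 - eps) z^a / (z^a + (1 - z)^a) + eps / 2."""
    a = z ** alpha
    b = (1.0 - z) ** alpha
    return (1.0 - eps) * a / (a + b) + 0.5 * eps


alpha, eps = 2.0, 0.1   # illustrative values
h = 1e-6                # step of the central difference

midpoint = f(0.5, alpha, eps)                       # expected: 1/2
symmetry = f(0.3, alpha, eps) + f(0.7, alpha, eps)  # expected: 1, since f(1-x) = 1 - f(x)
slope = (f(0.5 + h, alpha, eps) - f(0.5 - h, alpha, eps)) / (2.0 * h)  # expected: (1 - eps) * alpha
```

The slope at $z=1/2$ is the quantity that later decides the stability of the symmetric fixed point.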

First, we derive the recursive relation for $S(t)$ and $D(t)$. According to the definition, $D(t+1)$ and $S(t+1)$ obey the next recursive relations,

$$\begin{array}{ccc} D(t+1)& =& N+D(t){e}^{-1/\tau}\\ S(t+1)& =& \sum _{i=1}^{N}X(i,t+1)+S(t){e}^{-1/\tau}.\end{array}$$

If $\tau $ is finite, for $t\gg \tau \gg 1$, we have $D(t)\simeq N\tau $. In the limit $\tau \to \infty $, the pheromone does not evaporate, and we have $D(t)=Nt$.
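As a quick consistency check, the recursion for $D(t)$ can be iterated and compared with the closed form and with the two limits quoted in the text. This is a sketch; the function names and the test values $N=100$, $\tau=20$ are chosen here for illustration.

```python
import math


def d_closed(t, n, tau):
    """Closed form D(t) = N (1 - e^{-t/tau}) / (1 - e^{-1/tau}); D(t) = N t without evaporation."""
    if math.isinf(tau):
        return float(n * t)
    return n * (1.0 - math.exp(-t / tau)) / (1.0 - math.exp(-1.0 / tau))


def d_recursive(t, n, tau):
    """Iterate D(t+1) = N + D(t) e^{-1/tau}, starting from D(0) = 0."""
    decay = math.exp(-1.0 / tau)  # equals 1.0 when tau = math.inf
    d = 0.0
    for _ in range(t):
        d = n + d * decay
    return d


d_long = d_recursive(500, 100, 20.0)   # t >> tau: close to N * tau = 2000
d_inf = d_recursive(50, 100, math.inf)  # no evaporation: N * t = 5000
```

For $\tau=20$ the long-time value exceeds $N\tau$ by about 2.5%, which shows how good the approximation $D(t)\simeq N\tau$ is when $\tau$ is only moderately large.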

For $N\gg 1$, we can replace the sum of $X(i,t+1),i=1,\cdots ,N$ with the sum of the expected values using the law of large numbers. We obtain,

$$S(t+1)\simeq \sum _{i=1}^{N}f\left(Z(i,t)\right)+S(t){e}^{-1/\tau}.$$

We denote the average of $f(Z(i,t))$ as $\overline{f(Z(i,t))}$,

$$\overline{f(Z(i,t))}\equiv \frac{1}{N}\sum _{i=1}^{N}f\left(Z(i,t)\right).$$

We have

$$S(t+1)\simeq S(t){e}^{-1/\tau}+N\overline{f(Z(i,t))}.$$

The recursive relation for $Z(t)$ is

$$Z(t+1)=\frac{S(t+1)}{D(t+1)}\simeq \frac{D(t+1)-N}{D(t+1)}\cdot Z(t)+\frac{N}{D(t+1)}\cdot \overline{f(Z(i,t))},$$

where we used $S(t)=D(t)Z(t)$ and $D(t){e}^{-1/\tau}=D(t+1)-N$. $\Delta Z(t)\equiv Z(t+1)-Z(t)$ is given as,

$$\Delta Z(t)\simeq \frac{N}{D(t+1)}\left(\overline{f(Z(i,t))}-Z(t)\right).$$

In the continuous time limit $dt=1\to 0$, we obtain

$$\frac{d}{dt}Z(t)=\frac{N}{D(t+1)}\left(\overline{f(Z(i,t))}-Z(t)\right).$$

One sees that $Z(t)$ converges to $\overline{f(Z(i,t))}$. When the pheromones evaporate and $\tau <\infty $, $D(t)\simeq N\tau $ and the prefactor of the differential equation is $1/\tau $. If one assumes that the dynamics of $Z(i,t)$ is faster than that of $Z(t)$ (adiabatic approximation), the time scale of the convergence is given by $\tau $, as $\overline{f(Z(i,t))}-Z(t)\propto {e}^{-t/\tau}$. When the pheromone does not evaporate and $\tau \to \infty $, $D(t)\simeq Nt$. The prefactor of the differential equation is $1/t$, and $\overline{f(Z(i,t))}-Z(t)\propto {t}^{-1}$. The convergence becomes extremely slow.
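The two convergence laws can be illustrated by iterating the discrete relation $\Delta Z(t)\simeq \frac{N}{D(t+1)}(\overline{f}-Z(t))$ directly. The sketch below assumes, as in the adiabatic argument, that $\overline{f}$ is held fixed; the function name and the initial gap `gap0` are hypothetical choices for this example.

```python
import math


def relax_gap(t_max, n, tau, gap0=0.25):
    """Return the gap fbar - Z(t) for t = 1..t_max, with fbar held constant.

    Each step multiplies the gap by 1 - N / D(t+1), where D(t+1) = N + D(t) e^{-1/tau}.
    """
    decay = math.exp(-1.0 / tau)  # equals 1.0 when tau = math.inf
    d = float(n)                  # D(1) = N
    gap = gap0                    # gap at t = 1
    gaps = [gap]
    for _ in range(1, t_max):
        d = n + d * decay         # D(t+1)
        gap *= 1.0 - n / d
        gaps.append(gap)
    return gaps


gaps_no_evap = relax_gap(1000, 100, math.inf)  # gap(t) = gap0 / t
gaps_evap = relax_gap(1000, 100, 20.0)         # gap shrinks by e^{-1/tau} per step at late times
```

Without evaporation the gap follows $1/t$ exactly in this idealized setting, while with $\tau=20$ the ratio of successive gaps approaches $e^{-1/\tau}$, i.e. exponential relaxation on the time scale $\tau$.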

Next, we study the dynamics of $Z(m,t)$. The recursive relation for $S(m,t)$ is,

$$S(m,t+1)=S(m,t){e}^{-1/\tau}+X(m,t+1)\left(\sum _{i\ne m}X(i,t+1)+1\right).$$

$Z(m,t+1)$ is then estimated as,

$$\begin{array}{ccc} Z(m,t+1)& =& \frac{S(m,t+1)}{S(t+1)}=\frac{S(t){e}^{-1/\tau}}{S(t+1)}\cdot Z(m,t)+\frac{{\sum}_{i\ne m}X(i,t+1)+1}{S(t+1)}\cdot X(m,t+1)\\ & \simeq & \frac{S(t+1)-N\overline{f(Z(i,t))}}{S(t+1)}\cdot Z(m,t)+\frac{N\overline{f(Z(i,t))}+(1-f(Z(m,t)))}{S(t+1)}\cdot X(m,t+1).\end{array}$$

Using $S(t+1)=D(t+1)Z(t+1)$, we obtain

$$\Delta Z(m,t)=\frac{N\overline{f(Z(i,t))}}{D(t+1)Z(t+1)}\left(\left(1+\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}\right)X(m,t+1)-Z(m,t)\right).$$

We denote the history of $Z(s),{\left\{Z(i,s)\right\}}_{i=1,\cdots ,N},s=1,\cdots ,t$ as ${H}_{t}$, and the conditional expected value of $\Delta Z(m,t)$ is estimated as

$$E(\Delta Z(m,t)|{H}_{t})=\frac{N\overline{f(Z(i,t))}}{D(t+1)Z(t+1)}\left(\left(1+\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}\right)f(Z(m,t))-Z(m,t)\right).$$

Likewise, the conditional variance of $\Delta Z(m,t)$ can be approximated as,

$$V(\Delta Z(m,t)|{H}_{t})\simeq {\left(\frac{N\overline{f(Z(i,t))}}{D(t+1)Z(t+1)}\right)}^{2}f(Z(m,t))\left(1-f(Z(m,t))\right).$$

Here, we neglect the subleading terms in $1/N$. We read the drift and diffusion terms from these results, and the SDEs are,

$$dZ(m,t)=E(\Delta Z(m,t)|{H}_{t})dt+\sqrt{V(\Delta Z(m,t)|{H}_{t})}dW(t),m=1,\cdots ,N.$$

Here, $W(t)$ is the Wiener process. Eq.(5) and eq.(7) describe the dynamics of the system. The system can be described as a multi-variate Pólya urn process.

We note that $\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}$ in eq.(6) breaks the ${Z}_{2}$ symmetry of the system. If one neglects the term, $E(\Delta Z(m,t)|{H}_{t})$ is proportional to $f(Z(m,t))-Z(m,t)$. As $f(1-x)=1-f(x)$, $f(1-x)-(1-x)=-(f(x)-x)$ holds. $Z(m,t)$ and $1-Z(m,t)$ then obey the same dynamics, and we call this symmetry the ${Z}_{2}$ symmetry. The term is always positive and drives $Z(m,t)$ in the positive direction. As the term is proportional to $1/N$, the strength of the ${Z}_{2}$-symmetry breaking field becomes smaller as N becomes larger.

We analyze the SDEs given in eq.(7) and investigate the convergence properties of $Z(m,t)$. As the convergence behavior relies on the initial value of $Z(m,t={t}_{0})$ in the context of the non-linear Pólya urn model, we commence by examining the distribution of $Z(m,t)$.

We assume that ants adopt $\alpha =0$ and do not respond to the values of the pheromones for $t=1,\cdots ,{t}_{0}$. The ants answer the questions independently and $P(X(i,t)=1)=1/2$. We estimate $Z(t)$ and $Z(m,t)$ for $\tau <t\le {t}_{0}$ as follows.

$$\begin{array}{ccc} Z(t)& \sim & \mathrm{N}\left(\frac{1}{2},\frac{{D}_{h}(t)}{4D{(t)}^{2}}\right),\\ Z(m,t)& \sim & \mathrm{N}\left(\frac{1}{2}+\frac{1}{2N},\frac{N{D}_{h}(t)}{4D{(t)}^{2}}\right),\\ {D}_{h}(t)& \equiv & N\sum _{s=1}^{t}{e}^{-2(t-s)/\tau}=N\cdot \frac{1-{e}^{-2t/\tau}}{1-{e}^{-2/\tau}}.\end{array}$$

The details of the calculations are given in Appendix A. If $\tau $ is finite, we have ${D}_{h}(t)\simeq N\tau /2$ for $t\gg \tau \gg 1$. In the limit $\tau \to \infty $, the pheromone does not evaporate and we have ${D}_{h}(t)=Nt$.

The essential differences between $Z(t)$ and $Z(m,t)$ include a shift of the expected value by $1/2N$ and the presence of a factor of N in the numerator of the variance of $Z(m,t)$. The shift of $1/2N$ arises from the fact that $X(m,s)$ in $S(m,t)$ is 1, which is larger than $E[X(i,s)]=1/2$ for $i\ne m$. The value of the pheromone contains information about the correct choice, leading to $E[S(m,t)]>\frac{1}{2}E[S(t)]$. However, in the "cheating" process, the variables $X(i,s),i=1,\cdots ,N$ are combined by $X(m,s)$ as in eq.(2), resulting in a larger variance for $S(m,t)$. The factor of N in the numerator of the variance of $Z(m,t)$ is a consequence of this combination process.

From the distribution of $Z(m,{t}_{0})$, one can determine the values of $\tau $ or ${t}_{0}$ that guarantee that $Z(m,{t}_{0})$ is greater than 0.5 for the limits $t\gg \tau \gg 1$ and $\tau \to \infty $, respectively. At a significance level of 1%, i.e., requiring the shift $1/2N$ of the mean to exceed 2.58 standard deviations of $Z(m,{t}_{0})$, $\tau $ and ${t}_{0}$ should satisfy the following conditions:

$$\begin{array}{ccc}& & t\gg \tau \gg 1:\frac{1}{2N}\ge \frac{2.58}{2\sqrt{2\tau}}\to \tau \ge 3.33{N}^{2}\\ & & \tau \to \infty :\frac{1}{2N}\ge \frac{2.58}{2\sqrt{{t}_{0}}}\to {t}_{0}\ge 6.66{N}^{2}.\end{array}$$
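The coefficients 3.33 and 6.66 follow from simple arithmetic, reproduced in the sketch below. It assumes that the standard deviation of $Z(m,{t}_{0})$ is $1/(2\sqrt{2\tau})$ for $t\gg \tau \gg 1$ and $1/(2\sqrt{{t}_{0}})$ for $\tau \to \infty $, which is what the quoted coefficients imply; the function names are hypothetical.

```python
Z_SCORE = 2.58  # normal quantile used in the text for the 1% level


def min_tau(n_quiz):
    """Smallest tau with 1/(2N) >= Z_SCORE / (2 sqrt(2 tau)), i.e. tau >= Z_SCORE^2 N^2 / 2."""
    return Z_SCORE ** 2 / 2.0 * n_quiz ** 2


def min_t0(n_quiz):
    """Smallest t0 with 1/(2N) >= Z_SCORE / (2 sqrt(t0)), i.e. t0 >= Z_SCORE^2 N^2."""
    return Z_SCORE ** 2 * n_quiz ** 2


coeff_tau = min_tau(1)  # about 3.33
coeff_t0 = min_t0(1)    # about 6.66
t0_for_100 = min_t0(100)  # about 6.66e4 for N = 100
```

For $N=100$ the warm-up period must therefore last roughly $6.7\times {10}^{4}$ ants in the no-evaporation limit.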

We take the limit $\tau \to \infty $ in eq.(5) and eq.(7). We replace both $D(t+1)$ and ${D}_{h}(t+1)$ with $N(t+1)$. This results in the following equations:

$$\begin{array}{ccc} \frac{d}{dt}Z(t)& =& \frac{1}{t+1}\left(\overline{f(Z(i,t))}-Z(t)\right),\\ dZ(m,t)& =& \frac{\overline{f(Z(i,t))}}{(t+1)Z(t+1)}\left(\left(1+\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}\right)f(Z(m,t))-Z(m,t)\right)dt\\ & +& \frac{\overline{f(Z(i,t))}}{(t+1)Z(t+1)}\sqrt{f(Z(m,t))\left(1-f(Z(m,t))\right)}dW(t),m=1,\cdots ,N.\end{array}$$

The initial conditions of $Z(t)$ and $Z(m,t)$ at $t={t}_{0}$ are as follows:

$$\begin{array}{ccc} Z({t}_{0})& \sim & \mathrm{N}\left(\frac{1}{2},\frac{1}{4N{t}_{0}}\right),\\ Z(m,{t}_{0})& \sim & \mathrm{N}\left(\frac{1}{2}+\frac{1}{2N},\frac{1}{4{t}_{0}}\right).\end{array}$$

In the adiabatic approximation, where the time development of $Z(m,t)$ is much faster than that of $Z(t)$, the structure of the SDE for each $Z(m,t)$, where $m=1,\cdots ,N$, is the same as that of a non-linear Pólya urn [22,23]. The probability for $Z(m,t)$ to converge to a stable solution of the following equation is positive[24].

$$\left(1+\frac{1-f(z)}{N\overline{f(Z(i,t))}}\right)f(z)=z.$$

Here, a stable (unstable) solution of eq.(9) means that the curve of the left-hand side of the equation crosses the diagonal $y=z$ in the downward (upward) direction[24]. We denote the stable and unstable solutions as ${z}_{s}$ and ${z}_{u}$, respectively.

In the limit $N\to \infty $, the positive driving force $\frac{1-f(z)}{N\overline{f(Z(i,t))}}$ in eq.(9) disappears. The system becomes ${Z}_{2}$ symmetric and $z=1/2$ is a solution, as $f(1/2)=1/2$. If N is finite, the positive driving force breaks the ${Z}_{2}$ symmetry. We plot $(1+b(1-f(z)))f(z)$ in Figure 1. Here $b>0$ corresponds to $1/(N\overline{f(Z(i,t))})$, and $b\simeq 0.01\sim 0.02$ for $N=100$ and $\overline{f(Z(i,t))}=1/2\sim 1$. In the left and right panels, we adopt $b=0,\epsilon=0.1$ and $b=0.2,\epsilon=0.1$, respectively.

In the ${Z}_{2}$ symmetric case ($b=0$), the stability of the solution $z=1/2$ depends on the slope of $f(z)$ at $z=1/2$. As ${f}^{\prime}(1/2)=(1-\epsilon)\alpha $, the critical value of $\alpha $ is ${\alpha}_{c}=1/(1-\epsilon)$. If $\alpha <(>){\alpha}_{c}$, $z=1/2$ is (un)stable. If $\alpha ={\alpha}_{c}$, the curve of $f(z)$ is tangential to the diagonal. $Z(m,t)$ converges to $1/2$ for $\alpha \le {\alpha}_{c}$, as $z=1/2$ is the unique stable solution ${z}_{s}$. For $\alpha >{\alpha}_{c}$, there appear two stable solutions ${z}_{s},{z}_{s}^{\prime}$: one (${z}_{s}^{\prime}$) is in $(0,0.5)$ and the other (${z}_{s}$) is in $(0.5,1.0)$. $z=0.5$ becomes the unstable solution ${z}_{u}$.

The dotted lines in Figure 2 show the solutions vs. $\alpha $ for the ${Z}_{2}$ symmetric case. For $\alpha <{\alpha}_{c}$, $z=1/2$ (black dotted line) is the stable solution. For $\alpha >{\alpha}_{c}$, $z=1/2$ (gray dotted line) becomes unstable (${z}_{u}=1/2$) and the two stable solutions ${z}_{s},{z}_{s}^{\prime}$ depart from $z=1/2$ continuously as $\alpha $ increases beyond ${\alpha}_{c}$. Which stable solution $Z(m,t)$ converges to depends on the initial value $Z(m,{t}_{0})$. In general, if $Z(m,{t}_{0})$ is greater (smaller) than $1/2$, the probability of the convergence to ${z}_{s}$ is greater (smaller) than $1/2$. ${z}_{u}$ determines the "attractive domains" for the stable solutions ${z}_{s},{z}_{s}^{\prime}$. The susceptibility of the expected value of $Z(m,t)$ to the initial value $Z(m,{t}_{0})$ is the order parameter of the non-linear Pólya urn[22,23]. As the order parameter is proportional to the difference of the two stable states, the order parameter is a continuous function of $\alpha $ and the phase transition is continuous.

In the ${Z}_{2}$ asymmetric case ($b\ne 0$), for small values of $\alpha $, there is a stable solution ${z}_{s}$ in the range $(0.5,1)$. ${z}_{s}$ increases with $\alpha $, and at some critical value ${\alpha}_{c}$ of $\alpha $, the curve $f(z)(1+b(1-f(z)))$ becomes tangential to the diagonal at $z={z}_{t}$. ${z}_{t}$ is known as a touchpoint and is stable[25]. As $\alpha $ continues to increase beyond ${\alpha}_{c}$, two significant changes occur: a new stable solution ${z}_{s}^{\prime}$ emerges from the touchpoint ${z}_{t}$, while an unstable solution ${z}_{u}$ also becomes apparent.

When $\alpha <{\alpha}_{c}$, only one stable solution, ${z}_{s}$, exists within the range $1/2<{z}_{s}<1$. Conversely, for $\alpha >{\alpha}_{c}$, the specific stable solution to which $Z(m,t)$ converges depends on the initial values of $Z(m,{t}_{0})$. At $\alpha ={\alpha}_{c}$, both the stable fixed point ${z}_{s}$ and the touchpoint ${z}_{t}$ remain stable. Which solution $Z(m,t)$ converges to is determined by the initial values of $Z(m,{t}_{0})$ in this case as well. For $\alpha >{\alpha}_{c}$, the situation mirrors that of $\alpha ={\alpha}_{c}$. Once $\alpha \ge {\alpha}_{c}$, the order parameter turns positive, and the phase transition becomes discontinuous.
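The fixed-point structure described above can be explored numerically by locating the crossings of $(1+b(1-f(z)))f(z)$ with the diagonal. The sketch below uses a grid scan followed by bisection, a method chosen here for illustration; it only detects transversal crossings, so a touchpoint exactly at $\alpha ={\alpha}_{c}$ would be missed. The function names and the grid size are assumptions.

```python
def f(z, alpha, eps):
    """Decision function of the model."""
    a = z ** alpha
    b = (1.0 - z) ** alpha
    return (1.0 - eps) * a / (a + b) + 0.5 * eps


def g(z, alpha, eps, b):
    """Left-hand side of the fixed-point equation minus z."""
    fz = f(z, alpha, eps)
    return (1.0 + b * (1.0 - fz)) * fz - z


def fixed_points(alpha, eps, b=0.0, n_grid=997, tol=1e-12):
    """Return [(z, kind)] with kind 'stable' for a downward crossing, 'unstable' for an upward one."""
    roots = []
    grid = [k / n_grid for k in range(n_grid + 1)]  # odd n_grid keeps z = 1/2 off the grid
    for lo, hi in zip(grid, grid[1:]):
        g_lo, g_hi = g(lo, alpha, eps, b), g(hi, alpha, eps, b)
        if g_lo * g_hi >= 0.0:
            continue  # no sign change in this cell
        kind = "stable" if g_lo > 0.0 else "unstable"
        while hi - lo > tol:  # bisection keeps the sign change inside [lo, hi]
            mid = 0.5 * (lo + hi)
            if (g(mid, alpha, eps, b) > 0.0) == (g_lo > 0.0):
                lo = mid
            else:
                hi = mid
        roots.append((0.5 * (lo + hi), kind))
    return roots


symmetric_below = fixed_points(1.0, 0.1)       # alpha < alpha_c: single stable point at 1/2
symmetric_above = fixed_points(2.0, 0.1)       # alpha > alpha_c: stable, unstable, stable
asymmetric = fixed_points(1.0, 0.1, b=0.2)     # b > 0: single stable point above 1/2
```

With $\epsilon =0.1$ the symmetric case has ${\alpha}_{c}=1/0.9\simeq 1.11$, so $\alpha =1$ and $\alpha =2$ sit on opposite sides of the transition.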

Figure 3 shows the results of the numerical studies in the limit $\tau \to \infty $. We sampled a trajectory of $Z(m,t)$ and $Z(t)$ for $1\le t\le {10}^{9}$ with $N={10}^{2}$ and $\epsilon=0.01$. In the left figure, we present the distribution of $Z(m,{t}_{0})$ for two different values of ${t}_{0}$, namely, ${t}_{0}\in \{{10}^{3},{10}^{6}\}$. The mean value of $Z(m,{t}_{0})$ is approximately $1/2+1/2N$, which aligns with the theoretical prediction. The variance of $Z(m,{t}_{0})$ is given by $1/4{t}_{0}$, so the variance for ${t}_{0}={10}^{3}$ is about ${10}^{3}$ times larger than that for ${t}_{0}={10}^{6}$. Consequently, if we choose ${t}_{0}={10}^{3}$, a significant proportion of $m\in \{1,\cdots ,N\}$ will have $Z(m,{t}_{0})<{z}_{u}\simeq 0.5$, resulting in a high probability that $Z(m,t)$ converges to ${z}_{s}^{\prime}<1/2$ for $\alpha =2.0$. In such cases, $Z(t)$ cannot reach 1 due to the convergence of $Z(m,t)$ to ${z}_{s}^{\prime}$. On the other hand, if we set ${t}_{0}={10}^{5}$, the ratio of $m\in \{1,\cdots ,N\}$ with $Z(m,{t}_{0})<{z}_{u}\simeq 0.5$ is zero, ensuring that $Z(m,t)$ always converges to ${z}_{s}>1/2$. As a result, $Z(t)$ monotonically increases towards 1 for $\alpha =2.0$. For $\alpha =1.0<{\alpha}_{c}$, where only one stable state ${z}_{s}\simeq 1$ exists, $Z(m,t)$ consistently converges to ${z}_{s}$. It is evident that $Z(t)$ monotonically approaches ${z}_{s}$ with time for both the ${t}_{0}={10}^{3}$ and ${t}_{0}={10}^{5}$ cases within the range $t\le {10}^{9}$. In the case of $\alpha =0.5$, where ${z}_{s}\simeq 0.5$, $Z(t)$ experiences relatively little change.

In the case where $\tau $ is finite, we make the assumption that $t\gg \tau \gg 1$ and replace $D(t+1)$ with $N\tau $ in eq.(5) and eq.(7). This leads to the following equations:

$$\begin{array}{ccc} \frac{d}{dt}Z(t)& =& \frac{1}{\tau}\left(\overline{f(Z(i,t))}-Z(t)\right),\\ dZ(m,t)& =& \frac{\overline{f(Z(i,t))}}{\tau Z(t+1)}\left(\left(1+\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}\right)f(Z(m,t))-Z(m,t)\right)dt\\ & +& \frac{\overline{f(Z(i,t))}}{\tau Z(t+1)}\sqrt{f(Z(m,t))\left(1-f(Z(m,t))\right)}dW(t),m=1,\cdots ,N.\end{array}$$

The initial conditions for $Z(t)$ and $Z(m,t)$ at $t={t}_{0}\gg \tau $ are given as follows:

$$\begin{array}{ccc} Z({t}_{0})& \sim & \mathrm{N}\left(\frac{1}{2},\frac{1}{8N\tau}\right),\\ Z(m,{t}_{0})& \sim & \mathrm{N}\left(\frac{1}{2}+\frac{1}{2N},\frac{1}{8\tau}\right).\end{array}$$

The dynamics of ${\{Z(m,t)\}}_{m=1,\cdots ,N}$ are coupled through $Z(t)$ and $\overline{f(Z(i,t))}$. To simplify and analyze this coupled system, we focus on the stationary state of $Z(m,t)$ in the limit $t\to \infty $.

We anticipate that $Z(m,t),m=1,\cdots ,N$ will fluctuate around the stable fixed points of $f(z)$ in the stationary state. As we observed earlier in the case of $\tau \to \infty $, for $\alpha <{\alpha}_{c}$, there is only one stable fixed point, and for $\alpha \ge {\alpha}_{c}$, two stable fixed points exist, one of which is near 1. The stationary distribution is unimodal for $\alpha <{\alpha}_{c}$ and bimodal for $\alpha \ge {\alpha}_{c}$. We denote the stationary distribution and the mean value of $Z(m,t)$ as ${P}_{st}(z)$ and ${\mu}_{st}$, respectively. As $f(Z(m,t))$ is the probability for $X(m,t+1)=1$, we can assume $\overline{f(Z(i,t))}=Z(t)={\mu}_{st}$ in the stationary state. The SDEs in eq.(10) can be simplified as follows when replacing $\overline{f(Z(i,t))}$ and $Z(t)$ with ${\mu}_{st}$:

$$\begin{array}{ccc} dZ(m,t)& =& \frac{1}{\tau}\left(\left(1+\frac{1-f(Z(m,t))}{N{\mu}_{st}}\right)f(Z(m,t))-Z(m,t)\right)dt+\frac{1}{\tau}\sqrt{f(Z(m,t))\left(1-f(Z(m,t))\right)}dW(t)\\ & =& A(Z(m,t))dt+B(Z(m,t))dW(t),\\ A(z)& =& \frac{1}{\tau}\left(\left(1+\frac{1-f(z)}{N{\mu}_{st}}\right)f(z)-z\right),\\ B(z)& =& \frac{1}{\tau}\sqrt{f(z)(1-f(z))}.\end{array}$$

The stationary state with reflecting boundary conditions is determined by a potential solution[26], which can be expressed as:

$$\begin{array}{ccc} {P}_{st}(z)& \propto & \frac{1}{B{(z)}^{2}}\exp \left({\int}_{1/2}^{z}\frac{2A(y)}{B{(y)}^{2}}dy\right).\end{array}$$

The integrand splits as $\frac{2A(y)}{B{(y)}^{2}}=\frac{2\tau (f(y)-y)}{f(y)(1-f(y))}+\frac{2\tau}{N{\mu}_{st}}$. The second term, $\frac{2\tau}{N{\mu}_{st}}$, arises from the ${Z}_{2}$ symmetry-breaking field and causes a shift in the stationary distribution in the positive direction.

Figure 4 shows ${P}_{st}(z)$ in eq.(13) for $\epsilon=0.01,N={10}^{2}$. ${\mu}_{st}$ is chosen so that the mean value of ${P}_{st}(z)$ coincides with ${\mu}_{st}$,

$${\mu}_{st}={\int}_{0}^{1}{P}_{st}(z)zdz.$$

The parameters $(\alpha ,{\mu}_{st})$ are $(0.0,0.50)$, $(0.5,0.51)$, $(0.9,0.54)$, $(0.99,0.67)$, $(1/0.99,0.74)$ and $(2.0,0.54)$. As $\alpha $ increases, the peak position shifts in the positive direction, as expected from the dependence of the stable solution ${z}_{s}$ on $\alpha $ in Figure 2. If $\alpha =1/(1-\epsilon)$, the peak appears at $z=1$, since there is only one stable fixed point near 1 in Figure 1. When $\alpha =2$, there are two stable fixed points and the stationary distribution is bimodal.
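The potential solution and the self-consistency condition for ${\mu}_{st}$ can be evaluated numerically. The sketch below is one possible approach, not the authors' procedure: it integrates $2A/{B}^{2}$ with the trapezoidal rule on a uniform grid, works with the logarithm of the density to avoid overflow, and finds ${\mu}_{st}$ by fixed-point iteration. It assumes $\epsilon >0$ so that $f(z)(1-f(z))$ never vanishes; the function names, grid size and iteration settings are hypothetical.

```python
import math


def f(z, alpha, eps):
    """Decision function of the model."""
    a = z ** alpha
    b = (1.0 - z) ** alpha
    return (1.0 - eps) * a / (a + b) + 0.5 * eps


def stationary_density(alpha, eps, n_quiz, tau, mu, n_grid=2000):
    """Return (zs, density, mean) of P_st(z) on [0, 1] for a given trial value mu of mu_st."""
    dz = 1.0 / n_grid
    zs = [k * dz for k in range(n_grid + 1)]
    fs = [f(z, alpha, eps) for z in zs]
    # Integrand 2A/B^2 = 2 tau [ (f - z) / (f (1 - f)) + 1 / (N mu) ].
    h = [2.0 * tau * ((fz - z) / (fz * (1.0 - fz)) + 1.0 / (n_quiz * mu))
         for z, fz in zip(zs, fs)]
    # log P = -log(f (1 - f)) + integral of h; constants drop out on normalisation.
    log_p = [-math.log(fs[0] * (1.0 - fs[0]))]
    acc = 0.0
    for k in range(1, n_grid + 1):
        acc += 0.5 * (h[k - 1] + h[k]) * dz
        log_p.append(acc - math.log(fs[k] * (1.0 - fs[k])))
    top = max(log_p)
    p = [math.exp(v - top) for v in log_p]
    trap = [0.5 if k in (0, n_grid) else 1.0 for k in range(n_grid + 1)]
    norm = sum(c * v for c, v in zip(trap, p)) * dz
    p = [v / norm for v in p]
    mean = sum(c * v * z for c, v, z in zip(trap, p, zs)) * dz
    return zs, p, mean


def self_consistent_mu(alpha, eps, n_quiz, tau, mu=0.5, n_iter=200, tol=1e-10):
    """Iterate mu <- mean of P_st(z; mu) until the value stops changing."""
    for _ in range(n_iter):
        _, _, new_mu = stationary_density(alpha, eps, n_quiz, tau, mu)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


mu_alpha_0 = self_consistent_mu(0.0, 0.01, 100, 100.0)
mu_alpha_half = self_consistent_mu(0.5, 0.01, 100, 100.0)
```

For $\alpha =0$ the density is Gaussian and the iteration settles near $1/2+1/2N$; whether the iteration converges equally well in the bimodal regime is not examined here.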

In order to derive the dependence of ${\mu}_{st}$ and the variance of ${P}_{st}(z)$ on $\alpha $, we assume that $Z(m,t)$ fluctuates around ${\mu}_{st}\simeq \frac{1}{2}$ for $\alpha <{\alpha}_{c}$. We linearize $f(z)$ in the vicinity of $z=1/2$ as,

$$f(z)=\frac{1}{2}+(1-\epsilon)\alpha \left(z-\frac{1}{2}\right).$$

We also approximate $f(z)(1-f(z))$ in $B{(z)}^{2}$ as ${\mu}_{st}(1-{\mu}_{st})=1/4$, and ${P}_{st}(z)$ becomes

$${P}_{st}(z)\propto \exp \left(-\frac{{(z-{\mu}_{st})}^{2}}{2{\sigma}_{st}^{2}}\right),\quad {\mu}_{st}=\frac{1}{2}+\frac{1}{2N(1-(1-\epsilon)\alpha )},\quad {\sigma}_{st}^{2}=\frac{1}{8\tau (1-(1-\epsilon)\alpha )}.$$

In the case $\alpha =0$, the ants do not use the information of the pheromones and decide by themselves. The expected value and the variance are consistent with the results for the initial state in eq.(11). The expected value and the variance increase with $\alpha $ for $0\le \alpha <1/(1-\epsilon)$.
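The Gaussian form can be traced step by step. The following is a sketch of the calculation under the approximations stated above (linearized $f$, $f(1-f)\simeq 1/4$, and ${\mu}_{st}\simeq 1/2$ inside the symmetry-breaking term); the shorthand $a\equiv (1-\epsilon)\alpha $ is introduced here for brevity.

$$\frac{2A(z)}{B{(z)}^{2}}\simeq 8\tau \left(f(z)-z+\frac{1}{2N}\right)=8\tau \left(-(1-a)\left(z-\frac{1}{2}\right)+\frac{1}{2N}\right),$$

$${\int}_{1/2}^{z}\frac{2A(y)}{B{(y)}^{2}}dy\simeq -4\tau (1-a){\left(z-\frac{1}{2}-\frac{1}{2N(1-a)}\right)}^{2}+\mathrm{const}.$$

Reading off the mean and the variance gives $\frac{1}{2}+\frac{1}{2N(1-a)}$ and $\frac{1}{8\tau (1-a)}$. At $\alpha =0$ these reduce to $\frac{1}{2}+\frac{1}{2N}$ and $\frac{1}{8\tau}$, and both grow as $a\to 1$, that is, as $\alpha \to 1/(1-\epsilon)$, where the linearized treatment breaks down.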

The shape of the stationary distribution changes from the monomodal shape for $\alpha <{\alpha}_{c}$ to the bimodal shape for $\alpha \ge {\alpha}_{c}$. Figure 5 shows the stationary distribution of $Z(m,t)$ for $\alpha \in \{0.0,0.5,0.9,0.99,1/0.99,2.0\},\epsilon=0.01,\tau =100$ and $N={10}^{2}$. We also plot ${P}_{st}(z)$ in eq.(13) with solid curves. Except for the $\alpha =2.0$ case, the numerical results agree with the theoretical ones. As $\alpha $ increases from 0 to $1/0.99$, the mean value and the variance of $Z(m,t)$ increase. For $\alpha =1/0.99$, the distribution of $Z(m,t)$ has a peak at $z=1$. For $\alpha =2.0$, the distribution becomes bimodal with two peaks near $z=0$ and $z=1$, and the equilibration time to reach the stationary state becomes extremely long. We think this is the reason for the discrepancy between the numerical and theoretical results.

We have studied a simple model for ACO and the convergence properties of the solutions. Ants answer many two-choice quizzes in sequence and deposit pheromone as they choose. As the amount of the pheromones is the number of correct answers, the following ants can receive information or hints about the correct choices. We have shown that the model reduces to a multi-variate non-linear Pólya urn process and that the pheromones break the ${Z}_{2}$ symmetry of the process. In the limit $\tau \to \infty $, varying the exponent $\alpha $ of the decision function of the ants induces a phase transition in the convergence of the probability of choosing the correct answer for each question. For $\tau <\infty $, the stationary distribution changes between the monomodal and the bimodal shape as we vary $\alpha $.

Previous studies have adopted values of $\alpha =1$ or smaller in solving real problems, like TSP. In $\alpha $-annealing, $\alpha $ increases gradually, as shown in previous research[17]. In our study, we have shown that the duration of the period with $\alpha =0$ should be long enough to ensure that the initial value of $Z(m,t)$ is in the attractive domain of the good stable state ($Z(m,t)>{z}_{u}$) in the case $\tau =\infty $. Subsequently, with $\alpha \ge 1$ in effect, $Z(t)$ converges to a value close to 1. In the case of $\tau <\infty $, the timescale for pheromone evaporation, represented by $\tau $, should be sufficiently long to maintain the same initial conditions. However, $Z(m,t)$ does not converge to a specific value; instead, it follows a stationary distribution that is bimodal or monomodal depending on the value of $\alpha $. To achieve a distribution of $Z(t)$ with a prominent peak near $z=1$, the $\alpha $-annealing process is an effective strategy. Both the stable solution ${z}_{s}$ and the distribution of $Z(m,t)$ suggest that, after a lengthy period with $\alpha =0$, it is advantageous to gradually increase $\alpha $ from 1. However, the efficiency of the annealing process depends on the specific problem being addressed. Future studies should clarify the efficient $\alpha $-annealing schedule.

Conceptualization, S.M. and M.H.; methodology, S.M.; software, S.N.; validation, S.M. and S.N.; formal analysis, S.M. and K.N.; investigation, S.N.; resources, S.M.; data curation, S.N.; writing—original draft preparation, S.M.; writing—review and editing, S.M., S.N. K.N. and M.H.; visualization, S.M.; supervision, S.M.; project administration, S.M.; funding acquisition, S.M. All authors have read and agreed to the published version of the manuscript.

This research was funded by JSPS KAKENHI [Grant No. 22K03445].

We performed numerical simulations using Julia 1.7.3. The code is available on GitHub[27].

This work was supported by JPSJ KAKENHI [Grant No. 22K03445].

The authors declare no conflict of interest.

The following abbreviations are used in this manuscript:

SDE | stochastic differential equation |

ACO | ant colony optimization |

iid | independent and identically distributed |

We assume $X(i,s)$ are iid Bernoulli random variables with $P(X(i,s)=1)=1/2$ for $i=1,\cdots ,N$ and $s\le {t}_{0}$. As $E[X(i,s)]=1/2$ and $V(X(i,s))=1/4$, we have

$$\begin{array}{ccc}\hfill E[S(t)]& =& \sum _{i=1}^{N}\sum _{s=1}^{t}E[X(i,s)]{e}^{-(t-s)/\tau}=\frac{1}{2}D(t)\hfill \\ \hfill V(S(t))& =& \sum _{i=1}^{N}\sum _{s=1}^{t}V\left(X(i,s){e}^{-(t-s)/\tau}\right)=\frac{1}{4}N\sum _{s=1}^{t}{e}^{-2(t-s)/\tau}=\frac{1}{4}{D}_{h}(t).\hfill \end{array}$$

Here, we define ${D}_{h}(t)$ as

$${D}_{h}(t)=N\sum _{s=1}^{t}{e}^{-2(t-s)/\tau}.$$

Applying the central limit theorem, we conclude that $Z(t)=S(t)/D(t)$ is approximately normally distributed,

$$Z(t)\sim \mathrm{N}\left(\frac{1}{2},\frac{{D}_{h}(t)}{4D{(t)}^{2}}\right).$$

$S(m,t)$ in eq.(2) is rewritten as

$$S(m,t)=\sum _{s=1}^{t}X(m,s)\left(1+\sum _{i\ne m}X(i,s)\right){e}^{-(t-s)/\tau}.$$

The conditional and unconditional expected values of $S(m,t)$ are

$$\begin{array}{ccc}\hfill E\left[S(m,t)|{\{X(m,s)\}}_{s=1,\cdots ,t}\right]& =& \sum _{s=1}^{t}X(m,s)\left(\frac{1}{2}(N+1)\right){e}^{-(t-s)/\tau}\hfill \\ \hfill E[S(m,t)]& =& E\left[E\left[S(m,t)|{\{X(m,s)\}}_{s=1,\cdots ,t}\right]\right]=\left(\frac{1}{4}+\frac{1}{4N}\right)D(t).\hfill \end{array}$$

The conditional variance of $S(m,t)$ is

$$V\left(S(m,t)|{\{X(m,s)\}}_{s=1,\cdots ,t}\right)=\sum _{s=1}^{t}X(m,s)\frac{1}{4}(N-1){e}^{-2(t-s)/\tau}.$$

The unconditional variance is

$$\begin{array}{ccc}\hfill V(S(m,t))& =& E\left[V\left(S(m,t)|{\{X(m,s)\}}_{s=1,\cdots ,t}\right)\right]+V\left(E\left[S(m,t)|{\{X(m,s)\}}_{s=1,\cdots ,t}\right]\right)\hfill \\ & =& \left(\frac{1}{16}N+\frac{1}{4}-\frac{1}{16N}\right){D}_{h}(t)\simeq \frac{1}{16}N\cdot {D}_{h}(t).\hfill \end{array}$$

We estimate the variance of $Z(m,t)=S(m,t)/S(t)$ by neglecting the fluctuation of $S(t)$ as

$$V(Z(m,t))\simeq \frac{V(S(m,t))}{E{[S(t)]}^{2}}=\frac{N{D}_{h}(t)}{4D{(t)}^{2}}.$$

By the central limit theorem, $Z(m,t)=S(m,t)/S(t)$ behaves as

$$Z(m,t)\sim \mathrm{N}\left(\frac{1}{2}+\frac{1}{2N},\frac{N{D}_{h}(t)}{4D{(t)}^{2}}\right).$$
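The appendix results can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function names `discount_sums` and `warmup_moments` are illustrative, and `tau=math.inf` is used here as a convention for the no-evaporation limit.

```python
import math

def discount_sums(N, t, tau):
    """Return (D(t), D_h(t)) for N questions, t ants and evaporation time tau.

    tau = math.inf is treated as the no-evaporation limit, where both sums equal N*t.
    """
    if math.isinf(tau):
        return float(N * t), float(N * t)
    D = N * (1 - math.exp(-t / tau)) / (1 - math.exp(-1 / tau))
    Dh = N * (1 - math.exp(-2 * t / tau)) / (1 - math.exp(-2 / tau))
    return D, Dh

def warmup_moments(N, t, tau):
    """(mean, variance) of Z(t) and of Z(m,t) when ants guess independently (alpha = 0)."""
    D, Dh = discount_sums(N, t, tau)
    z_all = (0.5, Dh / (4 * D ** 2))
    z_one = (0.5 + 1 / (2 * N), N * Dh / (4 * D ** 2))
    return z_all, z_one
```

For example, `warmup_moments(100, 1000, math.inf)` gives a variance of $1/(4N{t}_{0})$ for $Z(t)$ and $1/(4{t}_{0})$ for $Z(m,t)$, matching the $\tau \to \infty $ limit of the formulas above.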

- Galam, S. Sociophysics: A review of Galam models. Int. J. Mod. Phys. C **2008**, 19, 409–440.
- Galam, S. A physicist’s modeling of psycho-political phenomena; Springer: New York, 2012.
- Galam, S. Majority rule, hierarchical structures and democratic totalitarism: a statistical approach. J. of Math. Psychology **1986**, 30, 426–434.
- Arthur, W.B. Competing Technologies, Increasing Returns, and Lock-In by Historical Events. Econ. Jour. **1989**, 99, 116–131.
- Bikhchandani, S.; Hirshleifer, D.; Welch, I. A Theory of Fads, Fashion, Custom, and Cultural Changes as Informational Cascades. J. Polit. Econ. **1992**, 100, 992–1026.
- Mori, S.; Hisakado, M.; Takahashi, T. Phase transition to two-peaks phase in an information cascade voting experiment. Phys. Rev. E **2012**, 86, 026109–026118.
- Nakayama, K.; Hisakado, M.; Mori, S. Nash Equilibrium of Social-Learning Agents in a Restless Multiarmed Bandit Game. Sci. Rep. **2017**, 7, 1937.
- Galam, S.; Cheon, T. Asymmetric contrarians in opinion dynamics. Entropy **2020**, 22(1), 25.
- Kirman, A. Ants, rationality and recruitment. Q. J. Econ. **1993**, 108, 137–156.
- Hisakado, M.; Mori, S. Information cascade, Kirman’s ant colony model, and kinetic Ising model. Physica A **2015**, 417, 63–75.
- Pasteels, J.; Deneubourg, J.; Detrain, C. Information processing in social insects; Birkhauser Verlag: Basel, 2007.
- Camazine, S.; Deneubourg, J. Self-organization in biological systems; Princeton University Press: NJ, 2001.
- Dorigo, M. Optimization, learning and Natural algorithms. PhD thesis, Politecnico di Milano, 1992.
- Dorigo, M.; Caro, G.D. The ant colony meta-heuristic. In Proceedings of the New Ideas in Optimization; Corne, D., Dorigo, M., Glover, F., Eds.; McGraw Hill: London, 1999; pp. 11–32.
- Cordon, O.; Herrera, F.; Stutzle, T. A review on the ant colony optimization metaheuristic. Mathware and Soft Computing **2002**, 9, 141–175.
- Dorigo, M.; Gambardella, L. Ant Colonies for the Travelling Salesman Problem. Biosystems **1997**, 43, 73–81.
- Meyer, B. On the convergence behaviour of ant colony search. Complexity **2005**, 12, 73–81.
- Gutjahr, W. ACO algorithms with guaranteed convergence to the optimal solution. Information Processing Letters **2002**, 82, 145–153.
- Nakamichi, Y.; Arita, T. Diversity control in ant colony optimization. Artificial Life and Robotics **2001**, 7, 198–204.
- Randall, M.; Tonkes, E. Intensification and diversification strategies in ant colony system. Complexity International **2002**, 9, 1–7.
- Hisakado, M.; Hino, M. Between Ant Colony Optimization and Genetic Algorithm. IPSJ TOM **2016**, 9(3), 8–14.
- Mori, S.; Hisakado, M. Correlation function for generalized Pólya urns: Finite-size scaling analysis. Phys. Rev. E **2015**, 92, 052112–052121.
- Nakayama, K.; Mori, S. Universal function of the non-equilibrium phase transition of nonlinear Pólya urn. Phys. Rev. E **2021**, 104, 014109–014118.
- Hill, B.; Lane, D.; Sudderth, W. A strong law for some generalized urn processes. Ann. Probab. **1980**, 8, 214–226.
- Pemantle, R. When are touchpoints limits for generalized Pólya urns? Proc. Amer. Math. Soc. **1991**, 113, 235–243.
- Gardiner, C. Stochastic Methods: A handbook for the Natural and Social Science, 4th ed.; Springer: Berlin, 2009.
- Sampling programs for Phase transition in ACO. Available online: https://github.com/LABO-M/ACO.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Submitted:

17 October 2023

Posted:

19 October 2023

A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

"Socio-physics emerged in the 1970s and has evolved into a captivating research field within statistical physics [1,2]. In particular, herding behavior, or the inclination to follow the majority, has captured the attention of many researchers due to its pivotal role in understanding social phenomena [3,4,5,6,7,8]. Various probabilistic models have been proposed to describe herding behavior, with one notable example being the ant recruitment model. This model explains the intermittent oscillation observed in ants when they are presented with two identical food sources [9,10]. When ants choose a food source among two food sources, it incorporates a straightforward herding mechanism in which a randomly selected ant chooses one of the two based on the number of ants that have already made the same choice. Scouts play a crucial role by exploring the terrain to locate food sources [11,12]. When a scout discovers food, it returns to the nest, leaving a pheromone trail in its wake. Other ants are drawn to these pheromone marks and consequently become recruited to forage at the food source."

Ant Colony Optimization (ACO) is a model-based meta-heuristic inspired by the foraging behavior of ants in their search for the shortest path to food sources [13,14,15]. While ants may not be highly intelligent individually, they collectively find the shortest path by following pheromone trails left by their fellow ants. The optimal path is determined by the route on which the maximum number of ants travel[12]. Consider a classic problem known as the traveling salesman problem (TSP), which involves finding the shortest possible route that visits each city in a given list exactly once and returns to the origin city. In ACO, ants make decisions about their next city to visit based on a concept called ’pheromone.’ Pheromone represents the preference for a particular choice and is collaboratively learned by the ants during their search process [16]. In the context of TSP, pheromone values are typically associated with pairs of cities and reflect the preference for traveling from one city to another within the pair. These pheromone values are learned through a reinforcement strategy, where each ant reinforces its chosen paths based on the quality of the solution constructed. This quality is often determined by the inverse of the total length of the route. ACO has found successful applications in various industrial and academic constraint optimization problems and has become one of the most popular methods.

ACO has seen significant improvements, and modern ACO algorithms deviate substantially from the original ACO[17]. The fundamental modification lies in controlling the diversity of solutions and achieving convergence [18,19]. In this context, ’convergence’ refers to the tendency of ants to cluster around similar solutions in the neighborhood and ultimately converge toward the same solution. Early convergence to a small region of the search space leaves large sections of the search space unexplored and fails to find good solutions. On the other hand, very slow convergence means that the computational cost required to reach good solutions is high, rendering the search inefficient. Diversity control aims to prevent complete convergence by slowing down the search process.

Many algorithms have been proposed for controlling the diversity of the ACO algorithm. One of the diversity control mechanisms involves modifying the probabilistic decision function [17,20]. Meyer studied the influence of $\alpha $, the exponent on the pheromone level in the selection function, and suggested that $\alpha $ qualitatively determines diversity and convergence behavior. Additionally, he introduced a dynamic $\alpha $ that changes throughout the search process to enhance search efficiency, a technique known as $\alpha $-"annealing". In this paper, we study a simple model of ACO in which ants sequentially answer a series of two-choice quizzes. We investigate the phase transition and the qualitative change of the convergence behavior by varying $\alpha $. In Section 1, we introduce a model and derive stochastic differential equations (SDEs) using the diffusion approximation. In Section 2, we investigate the time evolution and examine the effect of $\alpha $ on the convergence properties of the solutions. Section 3 provides a summary of the results. Appendix A explains the estimation of the initial conditions for the SDEs.

There are N two-choice quizzes, each of which is answered sequentially by a large number of ants [21]. These quizzes are labeled by $n=1,\cdots ,N$. The answer provided by the t-th ant is denoted as $X(n,t)\in \{0,1\}$, where $X(n,t)=1$ indicates a correct answer, and $X(n,t)=0$ indicates an incorrect answer. Each ant receives 1 point for a correct answer. After ant t has answered all N questions, the total points (TP) earned by the ant are given by

$$\mathrm{TP}(t)=\sum _{n=1}^{N}X(n,t).$$

Ant t deposits pheromones on its answer $X(n,t)\in \{0,1\}$. The amount of the pheromones is $\mathrm{TP}(t)$. We assume that the pheromones evaporate and decrease by a factor of ${e}^{-1/\tau}$ per unit time. Here $\tau $ represents the time scale of the pheromone evaporation.

Ant $t+1$ observes the total values of the pheromones associated with question m for each choice $X(m,t+1)\in \{0,1\}$. We denote the total value of pheromones that remains on all the questions after ant t has answered as $S(t)$,
$$S(t)\equiv \sum _{s=1}^{t}\mathrm{TP}(s){e}^{-(t-s)/\tau}=\sum _{s=1}^{t}\sum _{n=1}^{N}X(n,s){e}^{-(t-s)/\tau}.$$
Then, the remaining pheromone on $X(m,s)=1,1\le s\le t$ is,
$$S(m,t)\equiv \sum _{s=1}^{t}\mathrm{TP}(s)X(m,s){e}^{-(t-s)/\tau}=\sum _{s=1}^{t}\sum _{n=1}^{N}X(n,s)X(m,s){e}^{-(t-s)/\tau}.$$
The remaining pheromone on $X(m,s)=0,1\le s\le t$ is given by $S(t)-S(m,t)$.
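The bookkeeping of the pheromone totals can be sketched as follows. This is an illustrative helper, not the authors' implementation; the name `pheromone_totals` and the list-of-lists layout of the answers are assumptions made for the example.

```python
import math

def pheromone_totals(X, tau):
    """X[s][n] in {0, 1}: answer of ant s+1 to question n+1 (1 = correct).

    Returns (S, S_correct) after the last ant: S is the total remaining
    pheromone S(t), and S_correct[m] is S(m, t), the part on the correct choice.
    """
    N = len(X[0])
    decay = math.exp(-1.0 / tau)          # equals 1.0 for tau = math.inf
    S = 0.0
    S_correct = [0.0] * N
    for answers in X:
        tp = sum(answers)                 # TP(s): points earned by this ant
        S = S * decay + tp                # older pheromone evaporates, new is added
        S_correct = [sc * decay + tp * a for sc, a in zip(S_correct, answers)]
    return S, S_correct
```

The pheromone on the incorrect choice of question m is then `S - S_correct[m]`.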

Ants are not highly intelligent, and the probability of them making the correct choice in the two-choice quizzes by themselves is 1/2. The information provided by $\mathrm{TP}(s)$ gives them an indirect clue about the correct choice. If $\mathrm{TP}(s)>N/2$, the posterior probability for $X(m,s)=1$ is larger than $1/2$. Similarly, if $S(m,t)$ is greater than $S(t)/2$, the posterior probability for $X(m,t+1)=1$ is greater than $1/2$. In ACO, a decision function is introduced that uses the values of the pheromones as follows,
$$P(X(m,t+1)=1)=(1-\epsilon )\frac{S{(m,t)}^{\alpha}}{S{(m,t)}^{\alpha}+{(S(t)-S(m,t))}^{\alpha}}+\frac{1}{2}\epsilon .$$
Here, the exponent $\alpha $ determines the response of the choice to the values of the pheromones. $\epsilon >0$ is a small positive constant to avoid the absorbing states $S(m,t)=0$ and $S(m,t)=S(t)$ of the stochastic process.
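A minimal sketch of the decision function follows. The names `p_correct` and `next_answer` are hypothetical, and the code assumes $S(t)>0$ so that the denominator does not vanish.

```python
import random

def p_correct(S_m, S, alpha, eps):
    """Probability that the next ant answers question m correctly.

    S_m is the pheromone on the correct choice, S - S_m that on the wrong one.
    Assumes S > 0.
    """
    up = S_m ** alpha
    down = (S - S_m) ** alpha
    return (1 - eps) * up / (up + down) + 0.5 * eps

def next_answer(S_m, S, alpha, eps, rng=random):
    """Draw X(m, t+1) in {0, 1} from the decision function."""
    return 1 if rng.random() < p_correct(S_m, S, alpha, eps) else 0
```

With equal pheromone on both choices the probability is $1/2$ for any $\alpha $, and with $\alpha =0$ it is $1/2$ regardless of the pheromone values.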

We denote the ratio of the remaining pheromones on the correct choices as $Z(m,t)$,

$$Z(m,t)\equiv \frac{S(m,t)}{S(t)}.$$

We divide both the denominator and numerator of eq.(3) by $S{(t)}^{\alpha}$, and the probability of the correct choice $X(m,t+1)=1$ is expressed as,
$$P(X(m,t+1)=1)=(1-\epsilon )\frac{Z{(m,t)}^{\alpha}}{Z{(m,t)}^{\alpha}+{(1-Z(m,t))}^{\alpha}}+\frac{1}{2}\epsilon =f(Z(m,t)).$$
Here, $f(z)$ is defined as
$$f(z)\equiv (1-\epsilon )\frac{{z}^{\alpha}}{{z}^{\alpha}+{(1-z)}^{\alpha}}+\frac{1}{2}\epsilon .$$
We note that $f(1/2)=1/2$ and $f(1-x)=1-f(x)$. The slope of $f(x)$ at $x=1/2$ is $(1-\epsilon )\alpha $. We also introduce the discount factor $D(t)$ and the ratio of correct answers $Z(t)$ as,
$$\begin{array}{ccc}\hfill D(t)& =& \sum _{n=1}^{N}\sum _{s=1}^{t}{e}^{-(t-s)/\tau}=N\left(\frac{1-{e}^{-t/\tau}}{1-{e}^{-1/\tau}}\right)\hfill \\ \hfill Z(t)& \equiv & \frac{S(t)}{D(t)}.\hfill \end{array}$$
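The stated properties of the choice function $f(z)$ can be checked numerically. The sketch below is illustrative only; `slope_at_half` uses a central finite difference with a step size chosen for the example.

```python
def f(z, alpha, eps):
    """Probability of a correct answer as a function of the pheromone ratio z."""
    num = z ** alpha
    return (1 - eps) * num / (num + (1 - z) ** alpha) + 0.5 * eps

def slope_at_half(alpha, eps, h=1e-6):
    """Central finite-difference estimate of f'(1/2); expected to be (1 - eps) * alpha."""
    return (f(0.5 + h, alpha, eps) - f(0.5 - h, alpha, eps)) / (2 * h)
```

For instance, with $\alpha =2$ and $\epsilon =0.1$ the estimated slope is close to $1.8$, and $f(0.3)+f(0.7)=1$ up to rounding.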

First, we derive the recursive relations for $S(t)$ and $D(t)$. By definition, $D(t+1)$ and $S(t+1)$ obey the following recursive relations,
$$\begin{array}{ccc}\hfill D(t+1)& =& N+D(t){e}^{-1/\tau}\hfill \\ \hfill S(t+1)& =& \sum _{i=1}^{N}X(i,t+1)+S(t){e}^{-1/\tau}.\hfill \end{array}$$
If $\tau $ is finite, for $t\gg \tau \gg 1$, we have $D(t)\simeq N\tau $. In the limit $\tau \to \infty $, the pheromone does not evaporate, and we have $D(t)=Nt$.
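The recursion for $D(t)$ and its closed form can be compared directly. This is a small sketch with illustrative function names; the parameter values in the usage note are arbitrary.

```python
import math

def D_closed(N, t, tau):
    """Closed form D(t) = N (1 - e^{-t/tau}) / (1 - e^{-1/tau}) for finite tau."""
    return N * (1 - math.exp(-t / tau)) / (1 - math.exp(-1 / tau))

def D_recursive(N, t, tau):
    """Iterate D(s+1) = N + D(s) e^{-1/tau} starting from D(0) = 0."""
    D = 0.0
    for _ in range(t):
        D = N + D * math.exp(-1 / tau)
    return D
```

With `N=10`, `tau=20` and `t=500` the two agree to rounding error, and the value lies within a few percent of $N\tau $, as expected for $t\gg \tau \gg 1$.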

For $N\gg 1$, we can replace the sum of $X(i,s),i=1,\cdots ,N$ with the sum of the expected values using the law of large numbers. We obtain,

$$S(t+1)\simeq \sum _{i=1}^{N}f\left(Z(i,t)\right)+S(t){e}^{-1/\tau}.$$

We denote the average of $f(Z(i,t))$ as $\overline{f(Z(i,t))}$,

$$\overline{f(Z(i,t))}\equiv \frac{1}{N}\sum _{i=1}^{N}f\left(Z(i,t)\right).$$

We have

$$S(t+1)\simeq S(t){e}^{-1/\tau}+N\overline{f(Z(i,t))}.$$

The recursive relation for $Z(t)$ is
$$Z(t+1)\simeq \frac{S(t+1)}{D(t+1)}=\frac{D(t+1)-N}{D(t+1)}\cdot Z(t)+\frac{N}{D(t+1)}\cdot \overline{f(Z(i,t))}.$$
$\Delta Z(t)\equiv Z(t+1)-Z(t)$ is given as,
$$\Delta Z(t)\simeq \frac{N}{D(t+1)}\left(\overline{f(Z(i,t))}-Z(t)\right).$$
In the continuous time limit $dt=1\to 0$, we obtain
$$\frac{d}{dt}Z(t)=\frac{N}{D(t+1)}\left(\overline{f(Z(i,t))}-Z(t)\right).$$
One sees that $Z(t)$ relaxes toward $\overline{f(Z(i,t))}$. When the pheromones evaporate and $\tau <\infty $, $D(t)\simeq N\tau $ and the prefactor of the differential equation is $1/\tau $. If one assumes that the dynamics of $Z(i,t)$ is faster than that of $Z(t)$ (adiabatic approximation), the time scale of the convergence is given by $\tau $, as $\overline{f(Z(i,t))}-Z(t)\propto {e}^{-t/\tau}$. When the pheromone does not evaporate and $\tau \to \infty $, $D(t)\simeq Nt$. The prefactor of the differential equation is $1/t$, and $\overline{f(Z(i,t))}-Z(t)\propto {t}^{-1}$. The convergence becomes extremely slow.
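The two relaxation regimes of $Z(t)$ can be illustrated by iterating the recursion $\Delta Z(t)\simeq \frac{N}{D(t+1)}(\overline{f}-Z(t))$. The sketch below makes a strong simplifying assumption that is not part of the model: the average $\overline{f}$ is frozen at a constant `target`. The function name and default values are illustrative.

```python
import math

def relax_gap(N, tau, steps, target=0.9, z0=0.5):
    """Iterate Z(t+1) = Z(t) + N / D(t+1) * (target - Z(t)).

    `target` stands in for the mean of f, held constant (an assumption).
    Returns the gaps target - Z(t) for t = 1..steps.
    """
    decay = 1.0 if math.isinf(tau) else math.exp(-1.0 / tau)
    D = float(N)                  # D(1) = N
    z = z0
    gaps = [target - z]
    for _ in range(steps - 1):
        D = N + D * decay         # D(t+1)
        z += N / D * (target - z)
        gaps.append(target - z)
    return gaps
```

Without evaporation (`tau=math.inf`) the gap decays exactly as $1/t$ in this simplified setting, whereas for finite $\tau $ the ratio of consecutive gaps approaches ${e}^{-1/\tau}$, i.e. exponential relaxation on the time scale $\tau $.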

Next, we study the dynamics of $Z(m,t)$. The recursive relation for $S(m,t)$ is,

$$S(m,t+1)=S(m,t){e}^{-1/\tau}+X(m,t+1)\left(\sum _{i\ne m}X(i,t+1)+1\right).$$

$Z(m,t+1)$ is then estimated as,
$$\begin{array}{ccc}\hfill Z(m,t+1)& =& \frac{S(m,t+1)}{S(t+1)}=\frac{S(t){e}^{-1/\tau}}{S(t+1)}\cdot Z(m,t)+\frac{{\sum}_{i\ne m}X(i,t+1)+1}{S(t+1)}\cdot X(m,t+1)\hfill \\ & \simeq & \frac{S(t+1)-N\overline{f(Z(i,t))}}{S(t+1)}\cdot Z(m,t)+\frac{N\overline{f(Z(i,t))}+(1-f(Z(m,t)))}{S(t+1)}\cdot X(m,t+1).\hfill \end{array}$$
Using $S(t+1)=D(t+1)Z(t+1)$, we obtain
$$\Delta Z(m,t)=\frac{N\overline{f(Z(i,t))}}{D(t+1)Z(t+1)}\left(\left(1+\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}\right)X(m,t+1)-Z(m,t)\right).$$
We denote the history of $Z(s),{\{Z(i,s)\}}_{i=1,\cdots ,N},s=1,\cdots ,t$ as ${H}_{t}$, and the conditional expected value of $\Delta Z(m,t)$ is estimated as
$$E(\Delta Z(m,t)|{H}_{t})=\frac{N\overline{f(Z(i,t))}}{D(t+1)Z(t+1)}\left(\left(1+\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}\right)f(Z(m,t))-Z(m,t)\right).$$
Likewise, the conditional variance of $\Delta Z(m,t)$ can be approximated as,
$$V(\Delta Z(m,t)|{H}_{t})\simeq {\left(\frac{N\overline{f(Z(i,t))}}{D(t+1)Z(t+1)}\right)}^{2}f(Z(m,t))\left(1-f(Z(m,t))\right).$$
Here, we neglect the subleading terms in $1/N$. We read off the drift and diffusion terms from these results, and the SDEs are,
$$dZ(m,t)=E(\Delta Z(m,t)|{H}_{t})dt+\sqrt{V(\Delta Z(m,t)|{H}_{t})}dW(t),m=1,\cdots ,N.$$
Here, $W(t)$ is the Wiener process. Eq.(5) and eq.(7) describe the dynamics of the system. The system can be described as a multi-variate Pólya urn process.
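For comparison with the diffusion approximation, the underlying discrete model can be simulated directly. The following is a minimal sketch under stated assumptions, not the authors' Julia code: the function name `simulate_aco`, the warm-up length `t0` with independent guessing, and the fallback to guessing while no pheromone exists are choices made for this example.

```python
import math
import random

def simulate_aco(N, T, alpha, eps, tau=math.inf, t0=0, seed=0):
    """Direct simulation of the quiz model; returns [Z(m, T) for m = 1..N].

    For t <= t0 (or while no pheromone exists) ants guess with probability 1/2;
    afterwards they use the decision function with exponent alpha.
    Assumes at least one point has been scored by time T, so that S(T) > 0.
    """
    rng = random.Random(seed)
    decay = 1.0 if math.isinf(tau) else math.exp(-1.0 / tau)
    S = 0.0
    S_m = [0.0] * N
    for t in range(1, T + 1):
        a = 0.0 if (t <= t0 or S == 0.0) else alpha
        answers = []
        for m in range(N):
            if a == 0.0:
                p = 0.5
            else:
                up, down = S_m[m] ** a, (S - S_m[m]) ** a
                p = (1 - eps) * up / (up + down) + 0.5 * eps
            answers.append(1 if rng.random() < p else 0)
        tp = sum(answers)                                   # TP(t)
        S = S * decay + tp                                  # S(t)
        S_m = [s * decay + tp * x for s, x in zip(S_m, answers)]  # S(m, t)
    return [s / S for s in S_m]
```

A call such as `simulate_aco(N=20, T=200, alpha=1.0, eps=0.01, t0=50, seed=1)` returns one value of $Z(m,T)$ per question; averaging over seeds would be needed before comparing with the SDE predictions.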

We note that the term $\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}$ in eq.(6) breaks the ${Z}_{2}$ symmetry of the system. If one neglects the term, $E(\Delta Z(m,t)|{H}_{t})$ is proportional to $f(Z(m,t))-Z(m,t)$. As $f(1-x)=1-f(x)$, $f(1-x)-(1-x)=-(f(x)-x)$ holds. $Z(m,t)$ and $1-Z(m,t)$ then obey the same dynamics, and we call this symmetry the ${Z}_{2}$ symmetry. The term is always positive and drives $Z(m,t)$ in the positive direction. As the term is proportional to $1/N$, the strength of the ${Z}_{2}$-symmetry breaking field becomes smaller as N becomes larger.

We analyze the SDEs given in eq.(7) and investigate the convergence properties of $Z(m,t)$. As the convergence behavior relies on the initial value of $Z(m,t={t}_{0})$ in the context of the non-linear Pólya urn model, we commence by examining the distribution of $Z(m,t)$.

We assume that ants adopt $\alpha =0$ and do not respond to the values of the pheromones for $t=1,\cdots ,{t}_{0}$. The ants answer the questions independently and $P(X(i,t)=1)=1/2$. We estimate $Z(t)$ and $Z(m,t)$ for $\tau <t\le {t}_{0}$ as follows.
$$\begin{array}{ccc}\hfill Z(t)& \sim & \mathrm{N}\left(\frac{1}{2},\frac{{D}_{h}(t)}{4D{(t)}^{2}}\right),\hfill \\ \hfill Z(m,t)& \sim & \mathrm{N}\left(\frac{1}{2}+\frac{1}{2N},\frac{N{D}_{h}(t)}{4D{(t)}^{2}}\right),\hfill \\ \hfill {D}_{h}(t)& \equiv & N\sum _{s=1}^{t}{e}^{-2(t-s)/\tau}=N\cdot \frac{1-{e}^{-2t/\tau}}{1-{e}^{-2/\tau}}.\hfill \end{array}$$
The details of the calculations are given in Appendix A. If $\tau $ is finite, we have ${D}_{h}(t)\simeq N\tau /2$ for $t\gg \tau \gg 1$. In the limit $\tau \to \infty $, the pheromone does not evaporate and we have ${D}_{h}(t)=Nt$.

The essential differences between $Z(t)$ and $Z(m,t)$ are a shift of the expected value by $1/2N$ and the presence of a factor of N in the numerator of the variance of $Z(m,t)$. The shift of $1/2N$ arises from the fact that $X(m,s)$ in $S(m,t)$ is 1, which is larger than $E[X(i,s)]=1/2$ for $i\ne m$. The value of the pheromone contains information about the correct choice, leading to $E[S(m,t)]>\frac{1}{2}E[S(t)]$. However, in the "cheating" process, the variables $X(i,s),i=1,\cdots ,N$ are combined by $X(m,s)$ as in eq.(2), resulting in a larger variance for $S(m,t)$. The factor of N in the numerator of the variance of $Z(m,t)$ is a consequence of this combination process.

From the distribution of $Z(m,{t}_{0})$, one can determine the values of $\tau $ or ${t}_{0}$ that guarantee that $Z(m,{t}_{0})$ is greater than 0.5 for the limits $t\gg \tau \gg 1$ and $\tau \to \infty $, respectively. At a significance level of 1%, $\tau $ and ${t}_{0}$ should satisfy the following conditions:
$$\begin{array}{ccc}& & t\gg \tau \gg 1:\frac{1}{2N}\ge \frac{2.58}{2\sqrt{2\tau}}\to \tau \ge 3.33{N}^{2}\hfill \\ & & \tau \to \infty :\frac{1}{2N}\ge \frac{2.58}{2\sqrt{{t}_{0}}}\to {t}_{0}\ge 6.66{N}^{2}.\hfill \end{array}$$
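The two thresholds follow from requiring the shift $1/2N$ to exceed 2.58 standard deviations of $Z(m,{t}_{0})$. A short worked computation, with illustrative function names:

```python
Z_CRIT = 2.58  # normal quantile used in the text

def min_tau(N, z=Z_CRIT):
    """Smallest tau with 1/(2N) >= z / (2 sqrt(2 tau)), i.e. tau >= z^2 N^2 / 2."""
    return z ** 2 * N ** 2 / 2

def min_t0(N, z=Z_CRIT):
    """Smallest t0 with 1/(2N) >= z / (2 sqrt(t0)), i.e. t0 >= z^2 N^2 (tau -> infinity)."""
    return z ** 2 * N ** 2
```

For $N=100$ this gives a warm-up length of about $6.66\times {10}^{4}$ ants without evaporation, or an evaporation time scale of about $3.33\times {10}^{4}$ with evaporation.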

We take the limit $\tau \to \infty $ in eq.(5) and eq.(7). We replace both $D(t+1)$ and ${D}_{h}(t+1)$ with $N(t+1)$. This results in the following equations:
$$\begin{array}{ccc}\hfill dZ(m,t)& =& \frac{\overline{f(Z(i,t))}}{(t+1)Z(t+1)}\left(\left(1+\frac{1-f(Z(m,t))}{N\overline{f(Z(i,t))}}\right)f(Z(m,t))-Z(m,t)\right)dt\hfill \\ & +& \frac{\overline{f(Z(i,t))}}{(t+1)Z(t+1)}\sqrt{f(Z(m,t))\left(1-f(Z(m,t))\right)}dW(t),m=1,\cdots ,N.\hfill \end{array}$$
The initial conditions of $Z(t)$ and $Z(m,t)$ at $t={t}_{0}$ are as follows:
$$\begin{array}{ccc}\hfill Z({t}_{0})& \sim & \mathrm{N}\left(\frac{1}{2},\frac{1}{4N{t}_{0}}\right),\hfill \\ \hfill Z(m,{t}_{0})& \sim & \mathrm{N}\left(\frac{1}{2}+\frac{1}{2N},\frac{1}{4{t}_{0}}\right).\hfill \end{array}$$

In the adiabatic approximation, where the time development of $Z(m,t)$ is much faster than that of $Z(t)$, the structure of the SDE for each $Z(m,t)$, where $m=1,\cdots ,N$, is the same as that of a non-linear Pólya urn [22,23]. The probability for $Z(m,t)$ to converge to a stable solution of the following equation is positive [24].
$$\left(1+\frac{1-f(z)}{N\overline{f(Z(i,t))}}\right)f(z)=z.$$
Here, a stable (unstable) solution of eq.(9) means that the curve of the left-hand side of the equation crosses the diagonal line $y=z$ in the downward (upward) direction [24]. We denote the stable and unstable solutions as ${z}_{s}$ and ${z}_{u}$, respectively.
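The stable and unstable solutions can be located numerically by looking for sign changes of the left-hand side minus $z$. The sketch below is illustrative: it treats $1/(N\overline{f})$ as a constant parameter `b` (an assumption of this example), and the grid size and bisection depth are arbitrary choices.

```python
def f(z, alpha, eps):
    num = z ** alpha
    return (1 - eps) * num / (num + (1 - z) ** alpha) + 0.5 * eps

def fixed_points(alpha, eps, b=0.0, cells=1001):
    """Solve (1 + b (1 - f(z))) f(z) = z on [0, 1].

    Returns a list of (z, kind) with kind 'stable' for a downward crossing of
    the diagonal and 'unstable' for an upward crossing.
    """
    def g(z):
        fz = f(z, alpha, eps)
        return (1 + b * (1 - fz)) * fz - z

    roots = []
    for k in range(cells):
        lo, hi = k / cells, (k + 1) / cells
        g_lo, g_hi = g(lo), g(hi)
        if g_lo * g_hi >= 0:
            continue                      # no sign change in this cell
        kind = "stable" if g_lo > 0 else "unstable"
        for _ in range(60):               # bisection inside the cell
            mid = 0.5 * (lo + hi)
            if (g(mid) > 0) == (g_lo > 0):
                lo = mid
            else:
                hi = mid
        roots.append((0.5 * (lo + hi), kind))
    return roots
```

With `b=0`, `eps=0.1` the call `fixed_points(2.0, 0.1)` returns a stable, an unstable and a stable solution in increasing order of $z$, with the unstable one at $z=1/2$, while `fixed_points(0.5, 0.1)` returns only the stable solution $z=1/2$.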

In the limit $N\to \infty $, the positive driving force $\frac{1-f(z)}{N\overline{f(Z(i,t))}}$ in eq.(9) disappears. The system becomes ${Z}_{2}$ symmetric and $z=1/2$ is a solution, as $f(1/2)=1/2$. If N is finite, the positive driving force breaks the ${Z}_{2}$ symmetry. We plot $(1+b(1-f(z)))f(z)$ in Figure 1. $b>0$ corresponds to $1/(N\overline{f(Z(i,t))})$, and $b\simeq 0.01\sim 0.02$ for $N=100$ and $\overline{f(Z(i,t))}=1/2\sim 1$. In the left and the right figure, we adopt $b=0,\epsilon =0.1$ and $b=0.2,\epsilon =0.1$, respectively.

In the ${Z}_{2}$ symmetric case ($b=0$), the stability of the solution $z=1/2$ depends on the slope of $f(z)$ at $z=1/2$. As ${f}^{\prime}(1/2)=(1-\epsilon )\alpha $, the critical value of $\alpha $ is ${\alpha}_{c}=1/(1-\epsilon )$. If $\alpha <(>){\alpha}_{c}$, $z=1/2$ is (un)stable. If $\alpha ={\alpha}_{c}$, the curve of $f(z)$ is tangential to the diagonal. $Z(m,t)$ converges to $1/2$ for $\alpha \le {\alpha}_{c}$, as $z=1/2$ is the unique stable solution ${z}_{s}$. For $\alpha >{\alpha}_{c}$, there appear two stable solutions ${z}_{s},{z}_{s}^{\prime}$: one (${z}_{s}^{\prime}$) is in $(0,0.5)$ and the other (${z}_{s}$) is in $(0.5,1.0)$. $z=0.5$ becomes the unstable solution ${z}_{u}$.

The dotted lines in Figure 2 show the solutions vs. $\alpha $ for the ${Z}_{2}$ symmetric case. For $\alpha <{\alpha}_{c}$, $z=1/2$ (black dotted line) is the stable solution. For $\alpha >{\alpha}_{c}$, $z=1/2$ (gray dotted line) becomes unstable (${z}_{u}=1/2$) and the two stable solutions ${z}_{s},{z}_{s}^{\prime}$ depart from $z=1/2$ continuously. Which stable solution $Z(m,t)$ converges to depends on the initial value $Z(m,{t}_{0})$. In general, if $Z(m,{t}_{0})$ is greater (smaller) than $1/2$, the probability of convergence to ${z}_{s}$ is greater (smaller) than $1/2$. ${z}_{u}$ determines the "attractive domains" for the stable solutions ${z}_{s},{z}_{s}^{\prime}$. The susceptibility of the expected value of $Z(m,t)$ to the initial value $Z(m,{t}_{0})$ is the order parameter of the non-linear Pólya urn [22,23]. As the order parameter is proportional to the difference of the two stable states, the order parameter is a continuous function of $\alpha $ and the phase transition is continuous.

In the ${Z}_{2}$ asymmetric case ($b\ne 0$), for small values of $\alpha $, there is a stable solution ${z}_{s}$ in the range $(0.5,1)$. ${z}_{s}$ increases with $\alpha $, and at some critical value ${\alpha}_{c}$ of $\alpha $, the curve $f(z)(1+b(1-f(z)))$ becomes tangential to the diagonal at $z={z}_{t}$. ${z}_{t}$ is known as a touchpoint and is stable [25]. As $\alpha $ continues to increase beyond ${\alpha}_{c}$, two significant changes occur: a new stable solution ${z}_{s}^{\prime}$ emerges from the touchpoint ${z}_{t}$, while an unstable solution ${z}_{u}$ also appears.
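The critical exponent ${\alpha}_{c}$ can be bracketed numerically by counting the solutions of the fixed-point equation as $\alpha $ varies. The following is a rough sketch under assumptions made for this example: $b$ is treated as a fixed constant, solutions are counted by sign changes on a uniform grid (so a tangency is only detected once the new solutions have separated by more than a grid cell), and the initial bracket $[0.5,5]$ is arbitrary.

```python
def f(z, alpha, eps):
    num = z ** alpha
    return (1 - eps) * num / (num + (1 - z) ** alpha) + 0.5 * eps

def count_solutions(alpha, eps, b, cells=1001):
    """Count sign changes of (1 + b (1 - f(z))) f(z) - z on a grid over [0, 1]."""
    def g(z):
        fz = f(z, alpha, eps)
        return (1 + b * (1 - fz)) * fz - z
    values = [g(k / cells) for k in range(cells + 1)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

def critical_alpha(eps, b, lo=0.5, hi=5.0, tol=1e-4):
    """Bisect on alpha; returns (lo, hi) with fewer than three solutions at lo
    and at least three at hi."""
    if count_solutions(lo, eps, b) >= 3 or count_solutions(hi, eps, b) < 3:
        raise ValueError("initial bracket does not straddle the transition")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_solutions(mid, eps, b) >= 3:
            hi = mid
        else:
            lo = mid
    return lo, hi
```

In the symmetric case `b=0` the bracket closes around $1/(1-\epsilon )$, which provides a simple consistency check before applying the same routine to $b>0$.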

When $\alpha <{\alpha}_{c}$, only one stable solution, ${z}_{s}$, exists within the range $1/2<{z}_{s}<1$. Conversely, for $\alpha >{\alpha}_{c}$, the specific stable solution to which $Z(m,t)$ converges depends on the initial values of $Z(m,{t}_{0})$. At $\alpha ={\alpha}_{c}$, both the stable fixed point ${z}_{s}$ and the touchpoint ${z}_{t}$ remain stable. Which solution $Z(m,t)$ converges to is determined by the initial values of $Z(m,{t}_{0})$ in this case as well. For $\alpha >{\alpha}_{c}$, the situation mirrors that of $\alpha ={\alpha}_{c}$. Once $\alpha \ge {\alpha}_{c}$, the order parameter turns positive, and the phase transition becomes discontinuous.

Figure 3 shows the results of the numerical studies in the limit $\tau \to \infty $. We sampled a trajectory of $Z(m,t)$ and $Z(t)$ for $1\le t\le {10}^{9}$ with $N={10}^{2}$ and $\epsilon =0.01$. In the left figure, we present the distribution of $Z(m,{t}_{0})$ for two different values of ${t}_{0}$, namely, ${t}_{0}\in \{{10}^{3},{10}^{6}\}$. The mean value of $Z(m,{t}_{0})$ is approximately $1/2+1/2N$, which agrees with the theoretical prediction. The variance of $Z(m,{t}_{0})$ is given by $1/4{t}_{0}$, so the variance for ${t}_{0}={10}^{3}$ is about ${10}^{3}$ times larger than that for ${t}_{0}={10}^{6}$. Consequently, if we choose ${t}_{0}={10}^{3}$, a significant proportion of $m\in \{1,\cdots ,N\}$ will have $Z(m,{t}_{0})<{z}_{u}\simeq 0.5$, resulting in a high probability that $Z(m,t)$ converges to ${z}_{s}^{\prime}<1/2$ for $\alpha =2.0$. In such cases, $Z(t)$ cannot reach 1 due to the convergence of $Z(m,t)$ to ${z}_{s}^{\prime}$. On the other hand, if we set ${t}_{0}={10}^{5}$, the ratio of $m\in \{1,\cdots ,N\}$ with $Z(m,{t}_{0})<{z}_{u}\simeq 0.5$ is zero, ensuring that $Z(m,t)$ always converges to ${z}_{s}>1/2$. As a result, $Z(t)$ monotonically increases towards 1 for $\alpha =2.0$. For $\alpha =1.0<{\alpha}_{c}$, where only one stable state ${z}_{s}\simeq 1$ exists, $Z(m,t)$ consistently converges to ${z}_{s}$. It is evident that $Z(t)$ monotonically approaches ${z}_{s}$ with time for both the ${t}_{0}={10}^{3}$ and ${t}_{0}={10}^{5}$ cases within the range $t\le {10}^{9}$. In the case of $\alpha =0.5$, where ${z}_{s}\simeq 0.5$, $Z(t)$ experiences relatively little change.
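The dependence on ${t}_{0}$ can be made concrete with the normal approximation $Z(m,{t}_{0})\sim \mathrm{N}(1/2+1/2N,1/4{t}_{0})$. The sketch below computes the expected fraction of questions that start below $1/2$; it takes ${z}_{u}\simeq 0.5$ as in the text, and the function name is illustrative.

```python
import math

def frac_below_half(N, t0):
    """P(Z(m, t0) < 1/2) under the normal approximation N(1/2 + 1/(2N), 1/(4 t0))."""
    x = math.sqrt(t0) / N        # (mean - 1/2) / standard deviation
    return 0.5 * math.erfc(x / math.sqrt(2))
```

For $N=100$ this gives roughly 0.38 at ${t}_{0}={10}^{3}$ and less than ${10}^{-3}$ at ${t}_{0}={10}^{5}$, so with 100 questions one expects dozens of questions to start on the wrong side in the first case and typically none in the second.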

In the case where $\tau $ is finite, we make the assumption that $t\gg \tau \gg 1$ and replace $D(t+1)$ with $N\tau $ in eq.(5) and eq.(7). This leads to the following equations:
$$\begin{array}{ccc}\hfill dZ(m,t)& =& \frac{\overline{f\left(Z\right(i,t\left)\right)}}{\tau Z(t+1)}\left(\right)open="("\; close=")">\left(\right)open="("\; close=")">1+\frac{(1-f(Z(m,t)\left)\right)}{N\overline{f\left(Z\right(i,t\left)\right)}}f\left(Z(m,t)\right)-Z(m,t)\hfill & dt\end{array}& +& \left(\right)open="("\; close=")">\frac{\overline{f\left(Z\right(i,t\left)\right)}}{\tau Z(t+1)}\sqrt{f\left(Z\right(m,t\left)\right)(1-f(Z(m,t))}dW\left(t\right),m=1,\cdots ,N\hfill $$
The initial conditions for $Z\left(t\right)$ and $Z(m,t)$ at $t={t}_{0}\gg \tau $ are given as follows:
$$\begin{array}{rcl} Z(t_{0}) & \sim & \mathrm{N}\left(\dfrac{1}{2},\dfrac{1}{8N\tau}\right), \\ Z(m,t_{0}) & \sim & \mathrm{N}\left(\dfrac{1}{2}+\dfrac{1}{2N},\dfrac{1}{8\tau}\right). \end{array}$$
The dynamics of $\{Z(m,t)\}_{m=1,\cdots ,N}$ are coupled through $Z(t)$ and $\overline{f(Z(i,t))}$. To simplify and analyze this coupled system, we focus on the stationary state of $Z(m,t)$ in the limit $t\to \infty$.

We anticipate that $Z(m,t)$, $m=1,\cdots ,N$, will fluctuate around the stable fixed points of $f(z)$ in the stationary state. As we observed earlier in the case of $\tau \to \infty$, for $\alpha <\alpha_{c}$ there is only one stable fixed point, and for $\alpha \ge \alpha_{c}$ two stable fixed points exist, one of which is near 1. The stationary distribution is unimodal for $\alpha <\alpha_{c}$ and bimodal for $\alpha \ge \alpha_{c}$. We denote the stationary distribution and the mean value of $Z(m,t)$ as $P_{st}(z)$ and $\mu_{st}$, respectively. As $f(Z(m,t))$ is the probability for $X(m,t+1)=1$, we can assume $\overline{f(Z(i,t))}=Z(t)=\mu_{st}$ in the stationary state. The SDEs in eq.(10) can be simplified as follows when replacing $\overline{f(Z(i,t))}$ and $Z(t)$ with $\mu_{st}$:
$$\begin{array}{rcl} dZ(m,t) & = & \dfrac{1}{\tau}\left(\left(1+\dfrac{1-f(Z(m,t))}{N\mu_{st}}\right)f(Z(m,t))-Z(m,t)\right)dt+\dfrac{1}{\tau}\sqrt{f(Z(m,t))\left(1-f(Z(m,t))\right)}\,dW(t) \\ & = & A(Z(m,t))\,dt+B(Z(m,t))\,dW(t), \\ A(z) & = & \dfrac{1}{\tau}\left(\left(1+\dfrac{1-f(z)}{N\mu_{st}}\right)f(z)-z\right), \\ B(z) & = & \dfrac{1}{\tau}\sqrt{f(z)\left(1-f(z)\right)}. \end{array}$$
The stationary state with reflecting boundary conditions is determined by a potential solution [26], which can be expressed as:
$$P_{st}(z)\propto \frac{1}{B(z)^{2}}\exp\left(\int_{1/2}^{z}\frac{2A(y)}{B(y)^{2}}\,dy\right).$$
Since $B(z)^{2}=f(z)(1-f(z))/\tau^{2}$, the integrand is $\frac{2A(y)}{B(y)^{2}}=\frac{2\tau \left(f(y)-y\right)}{f(y)(1-f(y))}+\frac{2\tau}{N\mu_{st}}$. The second term, $\frac{2\tau}{N\mu_{st}}$, arises from the $Z_{2}$ symmetry-breaking field and causes a shift in the stationary distribution in the positive direction.

Figure 4 shows $P_{st}(z)$ in eq.(13) for $\epsilon=0.01$, $N=10^{2}$. $\mu_{st}$ is chosen so that the mean value of $P_{st}(z)$ coincides with $\mu_{st}$:
$$\mu_{st}=\int_{0}^{1}P_{st}(z)\,z\,dz.$$
The parameters $(\alpha ,\mu_{st})$ are $(0.0,0.50)$, $(0.5,0.51)$, $(0.9,0.54)$, $(0.99,0.67)$, $(1/0.99,0.74)$ and $(2.0,0.54)$. As $\alpha$ increases, the peak position shifts in the positive direction, as expected from the dependence of the stable solution $z_{s}$ on $\alpha$ in Figure 2. If $\alpha =1/(1-\epsilon)$, the peak appears at $z=1$, since there is only one stable fixed point near 1 in Figure 1. When $\alpha =2$, there are two stable fixed points and the stationary distribution is bimodal.
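The self-consistent evaluation of $P_{st}(z)$ and $\mu_{st}$ described above can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: the decision function $f(z)=(1-\epsilon)z^{\alpha}/(z^{\alpha}+(1-z)^{\alpha})+\epsilon/2$ is assumed (it is defined outside this section; the assumed form has slope $(1-\epsilon)\alpha$ at $z=1/2$), $\epsilon>0$ is required so that $f(z)(1-f(z))$ never vanishes, $\tau=100$ is borrowed from Figure 5, and all function names are ours.

```python
import math

def decision_f(z: float, alpha: float, eps: float) -> float:
    """ASSUMED decision function; the paper defines f(z) outside this section."""
    za, wa = z ** alpha, (1.0 - z) ** alpha
    return (1.0 - eps) * za / (za + wa) + 0.5 * eps

def stationary_density(alpha, eps, tau, n_questions, mu, n_grid=2001):
    """Potential solution P_st(z) on a uniform grid of [0, 1] for a given mu_st."""
    zs = [i / (n_grid - 1) for i in range(n_grid)]
    h = zs[1] - zs[0]
    fs = [decision_f(z, alpha, eps) for z in zs]
    # Integrand 2A/B^2 = 2 tau (f - z) / (f (1 - f)) + 2 tau / (N mu).
    g = [2.0 * tau * ((f - z) / (f * (1.0 - f)) + 1.0 / (n_questions * mu))
         for z, f in zip(zs, fs)]
    # Cumulative trapezoid rule; the lower limit only affects the normalisation.
    phi = [0.0]
    for i in range(1, n_grid):
        phi.append(phi[-1] + 0.5 * h * (g[i - 1] + g[i]))
    shift = max(phi)  # subtract the maximum so that exp() cannot overflow
    # 1/B^2 = tau^2 / (f (1 - f)); the constant tau^2 cancels on normalisation.
    w = [math.exp(p - shift) / (f * (1.0 - f)) for p, f in zip(phi, fs)]
    norm = h * (sum(w) - 0.5 * (w[0] + w[-1]))
    return zs, [x / norm for x in w]

def mean_var(zs, ps):
    """Mean and variance of a density tabulated on a uniform grid (trapezoid rule)."""
    h = zs[1] - zs[0]
    def trapz(vals):
        return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    m = trapz([z * p for z, p in zip(zs, ps)])
    v = trapz([(z - m) ** 2 * p for z, p in zip(zs, ps)])
    return m, v

def self_consistent_mu(alpha, eps, tau, n_questions, mu0=0.5, tol=1e-10, max_iter=200):
    """Fixed-point iteration mu <- mean of P_st(z; mu)."""
    mu = mu0
    for _ in range(max_iter):
        zs, ps = stationary_density(alpha, eps, tau, n_questions, mu)
        new_mu, _ = mean_var(zs, ps)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 0.9):
        mu = self_consistent_mu(alpha, eps=0.01, tau=100.0, n_questions=100)
        print(f"alpha = {alpha}: mu_st ~ {mu:.3f}")
```

A plain fixed-point iteration is used because the mean depends only weakly on $\mu_{st}$ for small $\alpha$. In the bimodal regime the iteration may depend on the starting value `mu0`, so the result should be checked there.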

In order to derive the dependence of $\mu_{st}$ and the variance of $P_{st}(z)$ on $\alpha$, we assume that $Z(m,t)$ fluctuates around $\mu_{st}\simeq \frac{1}{2}$ for $\alpha <\alpha_{c}$. We linearize $f(z)$ in the vicinity of $z=1/2$ as
$$f(z)=\frac{1}{2}+(1-\epsilon)\alpha \left(z-\frac{1}{2}\right).$$
We also approximate the factor $f(z)(1-f(z))$ in $B(z)^{2}$ by $\mu_{st}(1-\mu_{st})\simeq 1/4$. Then $P_{st}(z)$ becomes Gaussian,
$$P_{st}(z)\propto \exp\left(-\frac{(z-\mu_{st})^{2}}{2\sigma_{st}^{2}}\right),\quad \mu_{st}=\frac{1}{2}+\frac{1}{4N\mu_{st}\left(1-(1-\epsilon)\alpha\right)},\quad \sigma_{st}^{2}=\frac{1}{8\tau \left(1-(1-\epsilon)\alpha\right)}.$$

In the case $\alpha =0$, the ants do not observe the pheromone information and decide by themselves. The expected value and the variance are consistent with the results for the initial state in eq.(11). The expected value and the variance increase with $\alpha$ for $0\le \alpha <1/(1-\epsilon)$.
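The $\alpha$ dependence can be checked with a short calculation. With the linearized $f(z)$ and $f(1-f)\simeq 1/4$, the integrand of the potential solution becomes $2A/B^{2}\simeq -8\tau (1-(1-\epsilon)\alpha)(z-1/2)+2\tau /(N\mu_{st})$, so the mean satisfies $\mu_{st}=1/2+1/(4N\mu_{st}(1-(1-\epsilon)\alpha))$ and the variance is $1/(8\tau (1-(1-\epsilon)\alpha))$. The sketch below solves the implicit mean equation as a quadratic; this closed form is our own reading of the approximation, the helper name is hypothetical, and $\tau =100$ is taken from Figure 5.

```python
import math

def linearized_moments(alpha: float, eps: float, tau: float, n_questions: int):
    """Mean and variance of the Gaussian approximation for alpha < 1/(1 - eps)."""
    gap = 1.0 - (1.0 - eps) * alpha  # 1 - f'(1/2); must stay positive
    if gap <= 0.0:
        raise ValueError("the linearization requires alpha < 1 / (1 - eps)")
    c = 1.0 / (4.0 * n_questions * gap)
    # mu = 1/2 + c / mu  <=>  mu^2 - mu / 2 - c = 0; keep the positive root.
    mu = 0.25 + math.sqrt(0.0625 + c)
    var = 1.0 / (8.0 * tau * gap)
    return mu, var

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 0.9):
        mu, var = linearized_moments(alpha, eps=0.01, tau=100.0, n_questions=100)
        print(f"alpha = {alpha}: mean ~ {mu:.3f}, variance ~ {var:.5f}")
```

For $\epsilon =0.01$ and $N=10^{2}$ this gives means of about 0.505, 0.510 and 0.542 for $\alpha =0$, $0.5$ and $0.9$, close to the values of $\mu_{st}$ quoted for Figure 4. At $\alpha =0$ the variance reduces to $1/(8\tau )$.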

The shape of the stationary distribution changes from monomodal for $\alpha <\alpha_{c}$ to bimodal for $\alpha \ge \alpha_{c}$. Figure 5 shows the stationary distribution of $Z(m,t)$ for $\alpha \in \{0.0,0.5,0.9,0.99,1/0.99,2.0\}$, $\epsilon=0.01$, $\tau =100$ and $N=10^{2}$. We also plot $P_{st}(z)$ in eq.(13) as solid curves. Except for the $\alpha =2.0$ case, the numerical results agree with the theoretical ones. As $\alpha$ increases from 0 to $1/0.99$, the mean value and the variance of $Z(m,t)$ increase. For $\alpha =1/0.99$, the distribution of $Z(m,t)$ has a peak at $z=1$. For $\alpha =2.0$, $P_{st}(z)$ is bimodal with two peaks near $z=0$ and $z=1$, and the equilibration time to reach the stationary state becomes extremely long. We think this is the reason for the discrepancy between the numerical and theoretical results.

We have studied a simple model of ACO and the convergence properties of its solutions. Ants answer many two-choice quizzes in sequence and deposit pheromone as they choose. As the amount of pheromone is the number of correct answers, the following ants receive information, or hints, about the correct choices. We have shown that the model reduces to a multivariate nonlinear Pólya urn process and that the pheromones break the $Z_{2}$ symmetry of the process. As the exponent $\alpha$ of the ants' decision function varies, a phase transition occurs in the convergence of the probability of choosing the correct answer for each question in the limit $\tau \to \infty$. For $\tau <\infty$, the stationary distribution changes between the monomodal and the bimodal shape as $\alpha$ varies.

Previous studies have adopted values of $\alpha =1$ or smaller in solving real problems such as the travelling salesman problem (TSP). In $\alpha$-annealing, $\alpha$ is increased gradually, as shown in previous research [17]. In our study, we have shown that the period with $\alpha =0$ should be long enough to ensure that the initial value of $Z(m,t)$ lies in the basin of attraction of the good stable state ($Z(m,t)>z_{u}$) in the case $\tau =\infty$. Subsequently, with $\alpha \ge 1$ in effect, $Z(t)$ converges to a value close to 1. In the case of $\tau <\infty$, the timescale for pheromone evaporation, $\tau$, should be sufficiently long to maintain the same initial conditions. However, $Z(m,t)$ does not converge to a specific value; instead, it follows a stationary distribution that is bimodal or monomodal depending on the value of $\alpha$. To achieve a distribution of $Z(t)$ with a prominent peak near $z=1$, the $\alpha$-annealing process is an effective strategy. Both the stable solution $z_{s}$ and the distribution of $Z(m,t)$ suggest that, after a lengthy period with $\alpha =0$, it is advantageous to gradually increase $\alpha$ from 1. However, the efficiency of the annealing process depends on the specific problem being addressed. Future studies should clarify what constitutes an efficient $\alpha$-annealing schedule.

Conceptualization, S.M. and M.H.; methodology, S.M.; software, S.N.; validation, S.M. and S.N.; formal analysis, S.M. and K.N.; investigation, S.N.; resources, S.M.; data curation, S.N.; writing—original draft preparation, S.M.; writing—review and editing, S.M., S.N., K.N. and M.H.; visualization, S.M.; supervision, S.M.; project administration, S.M.; funding acquisition, S.M. All authors have read and agreed to the published version of the manuscript.

This research was funded by JSPS KAKENHI [Grant No. 22K03445].

We performed numerical simulations using Julia 1.7.3. The code is available on GitHub [27].

This work was supported by JSPS KAKENHI [Grant No. 22K03445].

The authors declare no conflict of interest.

The following abbreviations are used in this manuscript:

SDE | stochastic differential equation |

ACO | ant colony optimization |

iid | independent and identically distributed |

We assume that $X(i,s)$ are iid Bernoulli random variables with $P(X(i,s)=1)=1/2$ for $i=1,\cdots ,N$ and $s\le t_{0}$. As $E(X(i,s))=1/2$ and $V(X(i,s))=1/4$, we have
$$\begin{array}{rcl} E[S(t)] & = & \sum_{i=1}^{N}\sum_{s=1}^{t}E[X(i,s)]\,e^{-(t-s)/\tau}=\dfrac{1}{2}D(t), \\ V(S(t)) & = & \sum_{i=1}^{N}\sum_{s=1}^{t}V\left(X(i,s)\,e^{-(t-s)/\tau}\right)=\dfrac{1}{4}N\sum_{s=1}^{t}e^{-2(t-s)/\tau}=\dfrac{1}{4}D_{h}(t). \end{array}$$
Here, we define $D_{h}(t)$ as
$$D_{h}(t)=N\sum_{s=1}^{t}e^{-2(t-s)/\tau}.$$
Applying the central limit theorem, we can conclude that $Z(t)=S(t)/D(t)$ is approximately normally distributed,
$$Z(t)\sim \mathrm{N}\left(\frac{1}{2},\frac{D_{h}(t)}{4D(t)^{2}}\right).$$
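The variance $D_{h}(t)/(4D(t)^{2})$ can be evaluated directly to see how it reduces to the limiting forms used in the main text. The sketch below assumes $D(t)=N\sum_{s=1}^{t}e^{-(t-s)/\tau}$, which is the definition implied by $E[S(t)]=D(t)/2$ above; the function names are ours.

```python
import math

def discounted_sums(t: int, tau: float, n_questions: int):
    """Return (D(t), D_h(t)); D(t) = N * sum_s exp(-(t-s)/tau) is an assumed definition."""
    d = n_questions * sum(math.exp(-(t - s) / tau) for s in range(1, t + 1))
    d_h = n_questions * sum(math.exp(-2.0 * (t - s) / tau) for s in range(1, t + 1))
    return d, d_h

def variance_of_z(t: int, tau: float, n_questions: int) -> float:
    """Variance D_h(t) / (4 D(t)^2) of Z(t) in the normal approximation."""
    d, d_h = discounted_sums(t, tau, n_questions)
    return d_h / (4.0 * d * d)

if __name__ == "__main__":
    n, tau = 100, 100.0
    # For t >> tau >> 1: D ~ N tau and D_h ~ N tau / 2, so the variance is ~ 1/(8 N tau).
    print(variance_of_z(5000, tau, n), 1.0 / (8.0 * n * tau))
    # Without evaporation (tau -> infinity): D = D_h = N t, so the variance is 1/(4 N t).
    print(variance_of_z(1000, math.inf, n), 1.0 / (4.0 * n * 1000))
```

Passing `math.inf` for `tau` makes every discount factor equal to 1, which reproduces the case without evaporation.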

$S(m,t)$ in eq.(2) is rewritten as
$$S(m,t)=\sum_{s=1}^{t}X(m,s)\left(1+\sum_{i\ne m}X(i,s)\right)e^{-(t-s)/\tau}.$$
The conditional and unconditional expected values of $S(m,t)$ are
$$\begin{array}{rcl} E\left[S(m,t)\,|\,\{X(m,s)\}_{s=1,\cdots ,t}\right] & = & \sum_{s=1}^{t}X(m,s)\left(\dfrac{1}{2}(N+1)\right)e^{-(t-s)/\tau}, \\ E[S(m,t)] & = & E\left[E\left[S(m,t)\,|\,\{X(m,s)\}_{s=1,\cdots ,t}\right]\right]=\left(\dfrac{1}{4}+\dfrac{1}{4N}\right)D(t). \end{array}$$
The conditional variance of $S(m,t)$ is
$$V\left(S(m,t)\,|\,\{X(m,s)\}_{s=1,\cdots ,t}\right)=\sum_{s=1}^{t}X(m,s)\,\frac{1}{4}(N-1)\,e^{-2(t-s)/\tau}.$$
The unconditional variance is
$$\begin{array}{rcl} V(S(m,t)) & = & E\left[V\left(S(m,t)\,|\,\{X(m,s)\}_{s=1,\cdots ,t}\right)\right]+V\left(E\left[S(m,t)\,|\,\{X(m,s)\}_{s=1,\cdots ,t}\right]\right) \\ & = & \left(\dfrac{1}{16}N+\dfrac{1}{4}-\dfrac{1}{16N}\right)D_{h}(t)\simeq \dfrac{1}{16}N\cdot D_{h}(t). \end{array}$$
We estimate the variance of $Z(m,t)=S(m,t)/S(t)$ by neglecting the fluctuation of $S(t)$ as
$$V(Z(m,t))\simeq \frac{V(S(m,t))}{E[S(t)]^{2}}=\frac{N D_{h}(t)}{4D(t)^{2}}.$$
By the central limit theorem, $Z(m,t)=S(m,t)/S(t)$ behaves as
$$Z(m,t)\sim \mathrm{N}\left(\frac{1}{2}+\frac{1}{2N},\frac{N D_{h}(t)}{4D(t)^{2}}\right).$$
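The normal approximation for $Z(m,t)$ can be compared with a direct simulation of the iid Bernoulli phase. The following Monte Carlo sketch is only an illustration: it assumes $S(t)=\sum_{s}\sum_{i}X(i,s)e^{-(t-s)/\tau}$ and $D(t)=N\sum_{s}e^{-(t-s)/\tau}$, uses Python rather than the authors' Julia code, and all names are ours. Because the estimate above neglects the fluctuation of $S(t)$ and terms of relative order $1/N$, agreement is expected only to within several percent for moderate $N$.

```python
import math
import random

def predicted_moments(n_questions: int, t: int, tau: float):
    """Mean 1/2 + 1/(2N) and variance N D_h / (4 D^2) from the formulas above."""
    d = n_questions * sum(math.exp(-(t - s) / tau) for s in range(1, t + 1))
    d_h = n_questions * sum(math.exp(-2.0 * (t - s) / tau) for s in range(1, t + 1))
    return 0.5 + 0.5 / n_questions, n_questions * d_h / (4.0 * d * d)

def sample_z_m(n_questions: int, t: int, tau: float, rng: random.Random) -> float:
    """One draw of Z(m,t) = S(m,t)/S(t) with all X(i,s) iid Bernoulli(1/2); needs N >= 2."""
    decay = math.exp(-1.0 / tau)  # one-step evaporation factor
    s_m = 0.0    # running value of S(m,t)
    s_tot = 0.0  # running value of S(t)
    for _ in range(t):
        x_m = rng.getrandbits(1)
        # Correct answers among the other N - 1 questions: Binomial(N - 1, 1/2).
        others = bin(rng.getrandbits(n_questions - 1)).count("1")
        s_m = s_m * decay + x_m * (1 + others)
        s_tot = s_tot * decay + (x_m + others)
    return s_m / s_tot

def monte_carlo_moments(n_questions, t, tau, trials, seed=0):
    """Sample mean and unbiased sample variance of Z(m,t) over independent runs."""
    rng = random.Random(seed)
    draws = [sample_z_m(n_questions, t, tau, rng) for _ in range(trials)]
    mean = sum(draws) / trials
    var = sum((z - mean) ** 2 for z in draws) / (trials - 1)
    return mean, var

if __name__ == "__main__":
    print("theory    :", predicted_moments(20, 50, math.inf))
    print("simulation:", monte_carlo_moments(20, 50, math.inf, trials=4000))
```

The binomial count is drawn from the bits of a single random integer, which keeps the pure-Python loop fast without changing the distribution.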

1. Galam, S. Sociophysics: A review of Galam models. Int. J. Mod. Phys. C **2008**, 19, 409–440.
2. Galam, S. A physicist’s modeling of psycho-political phenomena; Springer: New York, 2012.
3. Galam, S. Majority rule, hierarchical structures and democratic totalitarism: a statistical approach. J. of Math. Psychology **1986**, 30, 426–434.
4. Arthur, W.B. Competing Technologies, Increasing Returns, and Lock-In by Historical Events. Econ. Jour. **1989**, 99, 116–131.
5. Bikhchandani, S.; Hirshleifer, D.; Welch, I. A Theory of Fads, Fashion, Custom, and Cultural Changes as Informational Cascades. J. Polit. Econ. **1992**, 100, 992–1026.
6. Mori, S.; Hisakado, M.; Takahashi, T. Phase transition to two-peaks phase in an information cascade voting experiment. Phys. Rev. E **2012**, 86, 026109–026118.
7. Nakayama, K.; Hisakado, M.; Mori, S. Nash Equilibrium of Social-Learning Agents in a Restless Multiarmed Bandit Game. Sci. Rep. **2017**, 7, 1937.
8. Galam, S.; Cheon, T. Asymmetric contrarians in opinion dynamics. Entropy **2020**, 22(1), 25.
9. Kirman, A. Ants, rationality and recruitment. Q. J. Econ. **1993**, 108, 137–156.
10. Hisakado, M.; Mori, S. Information cascade, Kirman’s ant colony model, and kinetic Ising model. Physica A **2015**, 417, 63–75.
11. Pasteels, J.; Deneubourg, J.; Detrain, C. Information processing in social insects; Birkhauser Verlag: Basel, 2007.
12. Camazine, S.; Deneubourg, J. Self-organization in biological systems; Princeton University Press: NJ, 2001.
13. Dorigo, M. Optimization, learning and Natural algorithms. PhD thesis, Politecnico di Milano, 1992.
14. Dorigo, M.; Caro, G.D. The ant colony meta-heuristic. In Proceedings of the New Ideas in Optimization; Corne, D., Dorigo, M., Glover, F., Eds.; McGraw Hill: London, 1999; pp. 11–32.
15. Cordon, O.; Herrera, F.; Stutzle, T. A review on the ant colony optimization metaheuristic. Mathware and Soft Computing **2002**, 9, 141–175.
16. Dorigo, M.; Gambardella, L. Ant Colonies for the Travelling Salesman Problem. BioSystems **1997**, 43, 73–81.
17. Mayer, B. On the convergence behaviour of ant colony search. Complexity **2005**, 12, 73–81.
18. Gutjahr, W. ACO algorithms with guaranteed convergence to the optimal solution. Information Processing Letters **2002**, 82, 145–153.
19. Nakamichi, Y.; Arita, T. Diversity control in ant colony optimization. Artificial Life and Robotics **2001**, 7, 198–204.
20. Randall, M.; Tonkes, E. Intensification and diversification strategies in ant colony system. Complexity International **2002**, 9, 1–7.
21. Hisakado, M.; Hino, M. Between Ant Colony Optimization and Genetic Algorithm. IPSJ TOM **2016**, 9(3), 8–14.
22. Mori, S.; Hisakado, M. Correlation function for generalized Pólya urns: Finite-size scaling analysis. Phys. Rev. E **2015**, 92, 052112–052121.
23. Nakayama, K.; Mori, S. Universal function of the non-equilibrium phase transition of nonlinear Pólya urn. Phys. Rev. E **2021**, 104, 014109–014118.
24. Hill, B.; Lane, D.; Sudderth, W. A strong law for some generalized urn processes. Ann. Probab. **1980**, 8, 214–226.
25. Pemantle, R. When are touchpoints limits for generalized Pólya urns? Proc. Amer. Math. Soc. **1991**, 113, 235–243.
26. Gardiner, C. Stochastic Methods: A handbook for the Natural and Social Science, 4th ed.; Springer: Berlin, 2009.
27. Sampling programs for Phase transition in ACO. Available online: https://github.com/LABO-M/ACO.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Phase Transition in Ant Colony Optimization. Shintaro Mori et al., 2023.
