Preprint

Article

This version is not peer-reviewed

Discretization in time and in the state space of a system makes it necessary to solve parameter identification problems for dynamic systems in limited time (on a finite time interval), using observations corrupted by unknown-but-bounded noise. Finding a solution in this case is more difficult than in the traditional identification setting, which assumes random, independent, zero-mean noise. For the parameter identification problem under unknown-but-bounded noise, the paper presents a randomized stochastic optimization algorithm and derives estimates for the mean-square values of the residuals over a finite observation interval. An example of applying the method to the problem of tuning the parameters of a multi-mirror telescope is considered.

Keywords:

Subject: Computer Science and Mathematics - Mathematics

An exact solution to a problem is possible only when the problem is formulated accurately, but connections and relationships in the real world are so complex and diverse that many phenomena are almost impossible to describe strictly in mathematical language. A typical approach in theory is to select a mathematical model close to the real processes and to include various noises which, on the one hand, account for the roughness of the mathematical model and, on the other hand, characterize uncontrolled external disturbances affecting the considered object or system. For all mathematical models, the result of an experiment is a mathematical object: a number, a set of numbers, a curve, etc. From a mathematical point of view, a significant range of applied problems aims to restore the characteristics (parameters) of the object from experimental data. At the same time, real systems are rarely described thoroughly by limited mathematical models. When choosing a model to solve a real problem, it is common to consider the so-called systematic error (model error), which can be quantified by the distance from the real operator to the selected model. Another type of error that an experimenter may encounter is associated with measurement errors. Such errors are called statistical errors (random errors). The process of selecting the characteristics (parameters) of a model from a given class of models so as to describe the results in the best way is one of the general definitions of estimation. In practice, the estimation process can often be related to some quantitative characteristic of the quality of estimation, and when choosing the estimates it is natural to try to minimize the negative impact of errors, both statistical and, if possible, systematic [2]. Examples include:

- formation of multiscale vortex structures in turbulent fluid flows and plastic flow of solids under pulsed load;
- clustering in a stream of concentrated dispersive mixtures;
- propagation of the shock wave front inside the substance;
- transition layers near interphase boundaries;
- processes of protein formation in cells;
- hierarchy of structures in living systems;
- processes of fission of heavy elements nuclei;
- thermonuclear fusion;
- as well as the behavior of groups of people.

Among new directions in distributed systems research, one can point to the connections between distributed systems theory, on the one hand, and canonical problems in turbulence and statistical mechanics, on the other. In one class of problems, spatio-temporal dynamical analysis clarifies old and complex questions in the theory of shear flow turbulence. In another class of problems, structured, distributed control design exhibits dimensionality dependence and phase transition phenomena similar to those in statistical mechanics.

Assume that (unidirectional) time t is introduced, and consider non-isolated systems consisting of elements. Evolution of each system over time is determined by the current states of both the system itself and other elements of the system. The evolution is also affected by external disturbing influences W (the absence of influence can be interpreted as zero impact). External influences W can be formally included in the general set of system states, although including W in the system state can significantly complicate the descriptive model. External influences W naturally fall into two groups: controlled ones u (or simply, control) and uncontrolled ones w:

$$W=\left(\begin{array}{c}u\\ w\end{array}\right),$$

It is usually assumed that at time t the system state $X\left(t\right)$ is finite-dimensional, and the dynamics of the system are described by a system of differential equations

$${\dot{x}}_{i}={g}_{i}(X,W),\quad X=\left\{{x}_{i}\right\},\quad i\in \mathbb{M}=\{1,2,\dots ,m\}$$

with some functions ${g}_{i}(\cdot)$ and external disturbances W. In models of complex systems of this type, consisting of a huge number of components, it is customary to assume a large dimension n of the state space. But, on the one hand, the choice of the threshold for n significantly limits the “upper bound” of complexity and, on the other hand, does not allow the possible “flexibility” of the system to be taken into account as the system changes. We will assume that the system consists of a continuum of elements $X=\left\{{x}_{\gamma}\right\}$, parameterized by $\gamma \in [0,1]$, and that the evolution of each element is described by the equation

$${\dot{x}}_{\gamma}={g}_{\gamma}(X,W),\quad X=\left\{{x}_{\gamma}\right\},\quad \gamma \in \mathbb{M}=[0,1].$$

Such setups can occur, for instance, in stochastic games with many players [3] and mean-field games with almost infinitely many players [4]. Additionally, we assume that the external influence W, for all its arbitrariness, has at each moment of time k a structure ${s}_{k}$ of finite order. After some transition process, the finite structure of the external influence causes discretization of spatial elements (clustering) in the considered complex system:

$${\mathcal{X}}_{{s}_{k}}=\{{X}_{1},{X}_{2},\dots ,{X}_{{m}_{{s}_{k}}}\}:\quad X={\cup}_{i=1,2,\dots ,{m}_{{s}_{k}}}{X}_{i},\quad {X}_{i}\subset X.$$

Discretization occurs due to the self-organization of groups of elements and their synchronization. For clusters $i=1,\dots ,{m}_{{s}_{k}}$, a set of ${m}_{{s}_{k}}\in \mathbb{N}$ variables ${\overline{x}}_{i}$ averaged over cluster ${X}_{i}$ is naturally introduced. The set of ${\overline{x}}_{i}$, $i=1,\dots ,{m}_{{s}_{k}}$, can be generalized as a set of some integrals over the clusters. Such an approach is usually applied to simplify physical models (dimension reduction): general integral characteristics of clusters of elements with similar properties are introduced, and dynamic models in reduced state spaces are considered. Experiments show that such simplifications are justified and often give good results.

Typically, the process of clustering (self-organization) in a system is not “one-shot”, but is constantly reproduced due to changes in external influences and critical changes in internal states. At the same time, we will assume that the structure of external influences does not change continuously, but only at certain time instants ${T}_{0},{T}_{1},{T}_{2},\dots $. This makes it necessary to consider dynamic processes under the condition that the structure of the state space changes over time.

So, when the structure of external influence changes, the discretization of spatial elements may change. Let us assume that this transient process lasts no longer than some $\delta \ge 0$ (see Figure 2). We will assume that $\delta $ is many times smaller than the intervals between successive changes in the structure of external influences:

$$\delta \ll \zeta =\underset{k}{\min}|{T}_{k+1}-{T}_{k}|.$$

In addition to discretizing the spatial variables, we obtain time sampling by neglecting the duration of the transient intervals. After such discretization, in many practical applications the dynamics of the original complex system over the time interval from ${T}_{k}+\delta $ to ${T}_{k+1}$ are described by the system of differential equations

$${\dot{\overline{x}}}_{i}={\overline{g}}_{i}(\overline{X},u,w,{\theta}_{{s}_{k}}),\quad i=1,2,\dots ,{m}_{{s}_{k}},$$

where ${\overline{x}}_{i}$ is the aggregated state of $\left\{{x}_{i}\right\}$ or $\left\{{x}_{\gamma}\right\}$ from cluster ${X}_{i}$, $\overline{X}=\mathrm{col}({\overline{x}}_{1},{\overline{x}}_{2},\dots ,{\overline{x}}_{{m}_{{s}_{k}}})$, and ${\theta}_{{s}_{k}}$ is a finite set of current parameters on the time interval $[{T}_{k}+\delta ,{T}_{k+1})$.

Consider a control problem of choosing the control strategy that minimizes a cost function composed of local cost functions computed at different parts of the system:

$$L\left(\left\{u\right\}\right)=\sum _{i\in \mathbb{M}}l({x}_{i},u)\to \underset{u}{\min}$$

or, in case the system consists of a continuum of elements (3):

$$L\left(\left\{u\right\}\right)={\int}_{\mathbb{M}}l({x}_{\gamma},u)\,d\gamma \to \underset{u}{\min}.$$

Previously it was assumed that the perturbation W has a “finite structure” at each moment and that the structure ${s}_{k}$ changes at times ${T}_{0},{T}_{1},{T}_{2},\dots $, causing clustering of the state space:

$${\mathcal{X}}_{{s}_{k}}=\{{X}_{1},{X}_{2},\dots ,{X}_{{m}_{{s}_{k}}}\}:\quad X={\cup}_{i=1,2,\dots ,{m}_{{s}_{k}}}{X}_{i},\quad {X}_{i}\subset X,$$

where ${\overline{x}}_{i}$ is the aggregated state of $\left\{{x}_{i}\right\}$ or $\left\{{x}_{\gamma}\right\}$ of cluster ${X}_{i}$, $\overline{X}=\mathrm{col}({\overline{x}}_{1},{\overline{x}}_{2},\dots ,{\overline{x}}_{{m}_{{s}_{k}}})$, and ${\theta}_{{s}_{k}}$ is a finite set of current parameters. We assume that for any k the dimension of ${\theta}_{{s}_{k}}$ is bounded by d. For $t\in [{T}_{k}+\delta ,{T}_{k+1})$, due to the additive property of the integral (sum), the loss function can be replaced by

$$L={\int}_{\mathbb{M}}l({x}_{\gamma},u)\,d\gamma \approx \sum _{i=1}^{{m}_{{s}_{k}}}{\overline{l}}_{k}({\overline{x}}_{i},u,{\theta}_{{s}_{k}})={\overline{L}}_{k}(\overline{X},u,{\theta}_{{s}_{k}}).$$

Assume that the control strategy u is piecewise constant and changes at the end of each time interval of length h:

$$u\left(t\right)={u}_{n},\quad t\in [(n-1)h,nh).$$

The feedback u will be computed on the basis of noisy observations of the loss function $\tilde{L}$. After sampling in time and space, we obtain an observation model for the loss function:

$${y}_{n}={\tilde{L}}_{k}({\overline{X}}_{n},{u}_{n},{\theta}_{{s}_{k}})+{\xi}_{n},$$

where ${t}_{n}\in [{T}_{k}+\delta ,{T}_{k+1})$, ${\tilde{L}}_{k}(\cdot)$ are functions of ${\overline{X}}_{n}=\overline{X}\left({t}_{n}\right)$, ${u}_{n}=u\left({t}_{n}\right)$ and ${\theta}_{{s}_{k}}$, and ${\xi}_{n}={\xi}_{n}^{\prime}+{\xi}_{n}^{\prime \prime}\left({s}_{k}\right)$ is the discrepancy (error) composed of some random noise ${\xi}_{n}^{\prime}$ independent of the current system structure ${s}_{k}$ and a systematic error ${\xi}_{n}^{\prime \prime}\left({s}_{k}\right)$ which, in general, is some function of the current system state.
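The sampled control and observation model can be illustrated with a minimal Python sketch. It implements the hold rule $u(t)={u}_{n}$ for $t\in [(n-1)h,nh)$ and a noisy observation of a loss value. The function names and the particular bounded, non-random noise sequence are illustrative assumptions and are not taken from the paper.

```python
import math


def held_control(u_values, h, t):
    """Return u(t) = u_n for t in [(n-1)h, nh), with n = 1, 2, ..."""
    if t < 0:
        raise ValueError("t must be non-negative")
    n = math.floor(t / h) + 1  # index of the interval that contains t
    if n > len(u_values):
        raise ValueError("t lies beyond the last control interval")
    return u_values[n - 1]


def observe(loss_value, n, noise_bound):
    """Return y_n = loss + xi_n with |xi_n| <= noise_bound.

    The disturbance is a deterministic bounded sequence (a hypothetical
    choice), so it is neither random nor zero-mean.
    """
    xi = noise_bound * math.sin(n)
    return loss_value + xi
```

For example, with `u_values = [1.0, 2.0, 3.0]` and `h = 0.5`, the control equals the first value on $[0,0.5)$ and switches to the second value exactly at $t=0.5$.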

Estimation of system parameter values can be formulated as an optimization problem. The discrepancy between the estimated and real parameter values can be expressed in terms of the value of some loss function, and thus, to solve the system identification problem, one has to minimize the given loss function. To identify the system structure, it is required to choose a control strategy $\left\{u\right\}$ that minimizes some loss function.

Suppose that for a given parameter vector ${\theta}_{{s}_{k}}\in {\mathbb{R}}^{d}$ the optimal control strategy

$$\left\{{u}_{n}\right\}=\mathcal{U}\left({\theta}_{{s}_{k}}\right)$$

is known. After substituting it in (8), we get the problem of minimizing the function

$${f}_{{s}_{k}}\left(\theta \right)={\tilde{L}}_{k}({\overline{X}}_{n},\mathcal{U}\left(\theta \right),{\theta}_{{s}_{k}})$$

from observations of its values under the influence of noise ${\xi}_{n}$. Under the assumptions made, the minimum of the function ${f}_{{s}_{k}}\left(\theta \right)$ is reached when $\theta ={\theta}_{{s}_{k}}$. To solve the formulated problem, the method from [5] can be used.

The distributed optimization task can be formulated in terms of finding a method of constructing cluster (meso-) control, in which the same control action is applied to all elements of the cluster. In this case, the discretized loss function can be represented as a distributed functional

$${f}_{{s}_{k}}\left(\theta \right)=\sum _{i=1}^{{m}_{{s}_{k}}}{\tilde{l}}_{k}({\overline{X}}_{n}^{i},{\mathcal{U}}^{i}\left(\theta \right),{\theta}_{{s}_{k}}).$$

This problem can be solved by applying the stochastic optimization type method from [15].

The considered problem setting is a particular case of the more general problem of minimizing a non-stationary differentiable function with respect to $\theta $. Let ${\mathcal{F}}_{n-1}$ be the $\sigma $-algebra of all probabilistic events which happened before the start of time interval n. Hereinafter ${\mathbb{E}}_{{\mathcal{F}}_{n-1}}$ denotes the conditional mathematical expectation with respect to the $\sigma $-algebra ${\mathcal{F}}_{n-1}$, and $\mathbb{E}$ denotes the mathematical expectation. The minimum point ${\theta}_{{s}_{k}}$ of the function

$${F}_{n}\left(\theta \right)={\mathbb{E}}_{{\mathcal{F}}_{n-1}}{f}_{{s}_{k}}\left(\theta \right)\to \underset{\theta}{\min}$$

needs to be estimated.

More precisely, using the observations ${y}_{1},{y}_{2},\dots ,{y}_{n}$ and inputs ${\theta}_{1},{\theta}_{2},\dots ,{\theta}_{n}$, construct an estimate ${\widehat{\theta}}_{n}$ of an unknown vector ${\theta}_{{s}_{k}}$ minimizing the time-varying mean-risk functional.

Let us formulate assumptions about the disturbances and the functions ${f}_{{s}_{k}}\left(\theta \right),{F}_{n}\left(\theta \right)$.

1) For $n=1,2,\dots $, the successive differences ${\overline{\xi}}_{n}={\xi}_{n}^{+}-{\xi}_{n}^{-}$ of the observation noise are bounded: $|{\overline{\xi}}_{n}|\le {c}_{\xi}<\infty $, or $\mathbb{E}{\overline{\xi}}_{n}^{2}\le {c}_{\xi}^{2}$ if the sequence $\left\{{\xi}_{n}\right\}$ is random, where ${\xi}_{n}^{+}$, ${\xi}_{n}^{-}$ are observation noises that occur during the same time interval n but at different time instants.
2) The functions ${F}_{n}(\cdot)$ have unique minimum points ${\theta}_{{s}_{k}}$, and $\forall \theta \;\langle \theta -{\theta}_{{s}_{k}},{\mathbb{E}}_{{\mathcal{F}}_{n-1}}\nabla {f}_{{s}_{k}}\left(\theta \right)\rangle \ge \mu \|\theta -{\theta}_{{s}_{k}}\|^{2}$ with a constant $\mu >0$. Here and below, $\langle \cdot,\cdot\rangle $ is the scalar product of two vectors.
3) The gradient $\nabla {f}_{{s}_{k}}$ is uniformly bounded in the mean-square sense at the minimum points ${\theta}_{t}$: $\mathbb{E}\|\nabla {f}_{{s}_{k}}\left({\theta}_{t}\right)\|^{2}\le {g}^{2}$.
4) $\forall {s}_{k}\in S$ the gradient $\nabla {f}_{{s}_{k}}\left(\theta \right)$ satisfies the Lipschitz condition: $\forall {\theta}^{\prime},{\theta}^{\prime \prime}$ $\|\nabla {f}_{{s}_{k}}\left({\theta}^{\prime}\right)-\nabla {f}_{{s}_{k}}\left({\theta}^{\prime \prime}\right)\|\le M\|{\theta}^{\prime}-{\theta}^{\prime \prime}\|$.
5) $\forall n\ge 1$ the random vector ${\Delta}_{n}$ does not depend on ${\overline{w}}_{n}$, and the random vectors ${\overline{w}}_{n},{\Delta}_{n}$ do not depend on ${\overline{w}}_{1},\dots ,{\overline{w}}_{n-1}$; if $\left\{{\overline{v}}_{n}\right\}$ are random variables, then ${\overline{w}}_{n},{\Delta}_{n}$ also do not depend on ${\overline{v}}_{1},\dots ,{\overline{v}}_{n}$.

Using available observations, it is necessary to construct a sequence of estimates ${\widehat{\theta}}_{n}$ of the unknown vector ${\theta}_{{s}_{k}}$ that minimizes the function ${f}_{{s}_{k}}\left(\theta \right)$. To solve the problem, we will use an iterative algorithm with two measurements.

Let the trial simultaneous disturbance ${\Delta}_{n},\;n=1,2,\dots $, be an observable (set or user-controlled) sequence of independent random vectors with known distribution functions ${\mathrm{P}}_{n}(\cdot)$, and let the specified vector functions ${\mathcal{K}}_{n}(\cdot):{\mathbb{R}}^{d}\times {\mathbb{R}}^{d}\to {\mathbb{R}}^{1}$ satisfy the conditions

$$\int {\mathcal{K}}_{n}\left(x\right){\mathrm{P}}_{n}\left(dx\right)=0;\int {\mathcal{K}}_{n}\left(x\right){x}^{\mathrm{T}}{\mathrm{P}}_{n}\left(dx\right)=\mathrm{I},$$

$$\underset{n}{sup}\int {\parallel {\mathcal{K}}_{n}\left(x\right)\parallel}^{2}{\mathrm{P}}_{n}\left(dx\right)<\infty ,n=1,2,\dots .$$
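A simple case in which the kernel conditions can be checked by hand is the choice ${\mathcal{K}}_{n}(x)=x$ with ${\Delta}_{n}$ uniformly distributed on $\{-1,+1\}^{d}$. The Python sketch below enumerates all sign vectors and computes the three moments exactly. Both the kernel choice and the function name are illustrative assumptions; the paper allows more general kernels and distributions.

```python
from itertools import product


def kernel_moments(d):
    """Exact moments of K(x) = x under the uniform law on {-1, +1}^d.

    Returns (mean of K, matrix of E[K(x) x^T], E[||K(x)||^2]).
    """
    points = list(product((-1, 1), repeat=d))
    p = 1.0 / len(points)  # every sign vector is equally likely
    mean = [sum(p * x[i] for x in points) for i in range(d)]
    second = [
        [sum(p * x[i] * x[j] for x in points) for j in range(d)]
        for i in range(d)
    ]
    sq_norm = sum(p * sum(c * c for c in x) for x in points)
    return mean, second, sq_norm


mean, second, sq_norm = kernel_moments(3)
```

For $d=3$ the mean is the zero vector, the second-moment matrix is the identity, and the squared norm equals $d$, so all three conditions hold for this choice.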

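Let us choose an arbitrary initial estimate vector ${\widehat{\theta}}_{0}\in {\mathbb{R}}^{d}$ and scalar parameters $\alpha $, $\beta $ for the iterative algorithm

$$\left\{\begin{array}{l}{\theta}_{n}^{\pm}={\widehat{\theta}}_{n-1}\pm \beta {\Delta}_{n},\quad {y}_{n}^{\pm}={f}_{{s}_{k}}\left({\theta}_{n}^{\pm}\right)+{\xi}_{n}^{\pm},\\ {\widehat{\theta}}_{n}={\widehat{\theta}}_{n-1}-\frac{\alpha}{2\beta}{\mathcal{K}}_{n}\left({\Delta}_{n}\right)({y}_{n}^{+}-{y}_{n}^{-}).\end{array}\right.$$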
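A two-measurement randomized iteration of this kind can be sketched in Python as follows. The sketch assumes the simplest kernel ${\mathcal{K}}_{n}(\Delta)=\Delta$ with $\pm 1$ components, a quadratic test function, and a deterministic bounded noise sequence. All of these, as well as the function names and parameter values, are illustrative assumptions rather than the authors' implementation.

```python
import math
import random


def make_noisy_quadratic(target, noise_bound):
    """Return f(theta) = ||theta - target||^2 + xi with |xi| <= noise_bound.

    The noise is a deterministic bounded sequence (hypothetical choice),
    so it is neither random nor zero-mean.
    """
    calls = [0]

    def f_noisy(theta):
        calls[0] += 1
        xi = noise_bound * math.sin(calls[0])
        return sum((t - c) ** 2 for t, c in zip(theta, target)) + xi

    return f_noisy


def randomized_sa(f_noisy, theta0, alpha, beta, n_iter, rng):
    """Two measurements per iteration, kernel K(Delta) = Delta."""
    theta = list(theta0)
    for _ in range(n_iter):
        # Trial perturbation with independent +/-1 components.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        y_plus = f_noisy([t + beta * d for t, d in zip(theta, delta)])
        y_minus = f_noisy([t - beta * d for t, d in zip(theta, delta)])
        gain = alpha * (y_plus - y_minus) / (2.0 * beta)
        theta = [t - gain * d for t, d in zip(theta, delta)]
    return theta


target = [1.0, -2.0]
estimate = randomized_sa(
    make_noisy_quadratic(target, 0.05),
    [3.0, 1.0],
    alpha=0.1,
    beta=0.5,
    n_iter=500,
    rng=random.Random(0),
)
error = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, target)))
```

With a constant step size the estimate does not converge exactly; it settles into a small neighborhood of the minimizer whose size is governed by the noise bound and by $\alpha $ and $\beta $.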

Let us introduce the following notation:

$$\nu =\alpha \left(2\mu -1-\frac{L\alpha {C}_{\tau}}{2}\right),\quad \psi =\frac{\varphi}{\nu}.$$

Here ${C}_{\tau}$, ${C}_{\sigma}$ can be chosen to satisfy the inequality $\mathbb{E}\|\frac{1}{2\beta}({f}_{n}(\theta +\beta {\Delta}_{n})+{\xi}_{n}^{+}-{f}_{n}(\theta -\beta {\Delta}_{n})-{\xi}_{n}^{-})\|^{2}\le {C}_{\sigma}+{C}_{\tau}\|\theta -{\theta}_{{s}_{k}}\|^{2}$, ${C}_{1}\ge \int \|{\mathcal{K}}_{n}\left(x\right)\|M\|x\|^{2}{\mathrm{P}}_{n}\left(dx\right)$, and $\alpha $ is chosen to satisfy the following conditions:

$$0\le \alpha \le \frac{4\mu -2}{L{C}_{\tau}},\qquad \alpha \le \frac{2\mu -1-\sqrt{{(2\mu -1)}^{2}-2L{C}_{\tau}}}{L{C}_{\tau}}\quad \text{or}\quad \alpha \ge \frac{2\mu -1+\sqrt{{(2\mu -1)}^{2}-2L{C}_{\tau}}}{L{C}_{\tau}}.$$
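The conditions on $\alpha $ can be checked numerically. The Python sketch below encodes the two bounds and, as an assumption, takes $\nu =\alpha (2\mu -1-L\alpha {C}_{\tau}/2)$, the form for which these bounds are equivalent to $0\le \nu \le 1$. It also assumes that the root condition is vacuous when the discriminant is negative. The function names and the sample constants are hypothetical.

```python
import math


def contraction_rate(alpha, mu, L, c_tau):
    """Assumed form nu = alpha * (2*mu - 1 - L*alpha*c_tau/2)."""
    return alpha * (2.0 * mu - 1.0 - L * alpha * c_tau / 2.0)


def alpha_is_admissible(alpha, mu, L, c_tau):
    """Check 0 <= alpha <= (4mu-2)/(L c_tau) and the root condition."""
    if mu <= 0.5:
        return False
    upper = (4.0 * mu - 2.0) / (L * c_tau)
    if not 0.0 <= alpha <= upper:
        return False
    disc = (2.0 * mu - 1.0) ** 2 - 2.0 * L * c_tau
    if disc < 0.0:
        # Assumption: without real roots the second condition always holds.
        return True
    root = math.sqrt(disc)
    low = (2.0 * mu - 1.0 - root) / (L * c_tau)
    high = (2.0 * mu - 1.0 + root) / (L * c_tau)
    return alpha <= low or alpha >= high
```

For example, with $\mu =1$, $L=1$, ${C}_{\tau}=0.4$ the step $\alpha =0.5$ is admissible and gives $\nu =0.45$, while $\alpha =2$ lies between the two roots and gives $\nu =1.2>1$.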

Let Assumptions 1–5 and the conditions (9)–(10) on the kernels $\mathcal{K}$ and (12) on $\alpha $ be satisfied. Set ${\widehat{\theta}}_{0}$ and choose the interval size parameter k. Then the estimates generated by algorithm (11) satisfy

$$\mathbb{E}\{\|{\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\|^{2}\}\le \mathbb{E}\{\|{\widehat{\theta}}_{0}-{\theta}_{{s}_{k}}\|^{2}\}{(1-{\nu}_{i})}^{n}+\psi (1-{(1-{\nu}_{i})}^{n}).$$
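The finite-time bound is easy to evaluate for given constants. The Python sketch below treats the contraction rate as a single constant $\nu \in (0,1]$ and takes the initial mean-square error and $\psi $ as inputs. The function name and the sample values are illustrative assumptions.

```python
def mean_square_bound(v0, nu, psi, n):
    """Evaluate v0 * (1 - nu)**n + psi * (1 - (1 - nu)**n)."""
    decay = (1.0 - nu) ** n
    return v0 * decay + psi * (1.0 - decay)


# Bound after 0, 1, ..., 10 iterations for hypothetical constants.
trajectory = [mean_square_bound(4.0, 0.5, 0.2, n) for n in range(11)]
```

The bound starts at the initial error for $n=0$ and decreases geometrically towards $\psi $, which plays the role of the residual level that the estimates cannot be guaranteed to go below on a finite interval.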

The proof uses the Lyapunov function

$$V\left({\widehat{\theta}}_{n}\right)=\frac{1}{2}\|{\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\|^{2}$$

and the notation

$${Y}_{n}=\frac{1}{2\beta}{\mathcal{K}}_{n}\left({\Delta}_{n}\right)({y}_{n}^{+}-{y}_{n}^{-}).$$

For algorithm (11) we have

$$\mathbb{E}\left\{{Y}_{n}\right\}=\mathbb{E}\left\{{\mathcal{K}}_{n}\left({\Delta}_{n}\right)\frac{{y}_{n}^{+}-{y}_{n}^{-}}{2\beta}\,\middle|\,{\mathcal{F}}_{n-1}\right\},$$

where the right-hand side depends only on ${\widehat{\theta}}_{n}$ and n, in the sense that ${\Delta}_{n}$ does not depend on any other random variables. The proposition is true.

$$\parallel \nabla V\left(x\right)-\nabla V\left(\theta \right)\parallel \le L\parallel x-\theta \parallel \phantom{\rule{1.em}{0ex}}\forall x,\theta \in {\mathbb{R}}^{d}.$$

The proposition is valid due to the choice of Lyapunov function (14).

$$\langle \nabla V({\widehat{\theta}}_{n}),\mathbb{E}\left\{{Y}_{n}\right\}\rangle \ge {\delta}_{n}V\left({\widehat{\theta}}_{n}\right)-{\gamma}_{n},\phantom{\rule{0.277778em}{0ex}}{\delta}_{n}>0,\phantom{\rule{0.277778em}{0ex}}{\gamma}_{n}\ge 0.$$

First consider $\mathbb{E}\left\{{Y}_{n}\right\}$. Due to (11), the pseudo-gradient (15), after using Assumption 5, becomes

$$\mathbb{E}\left\{{Y}_{n}\right\}=\mathbb{E}\left\{{\mathcal{K}}_{n}\left({\Delta}_{n}\right)\frac{1}{2\beta}\left({f}_{{s}_{k}}\left({\theta}_{n}^{+}\right)-{f}_{{s}_{k}}\left({\theta}_{n}^{-}\right)\right)\,\middle|\,{\mathcal{F}}_{n-1}\right\}.$$

Consider the expression under the mathematical expectation. Using the Taylor series representation, it can be written as

$$\frac{1}{2}\nabla {F}_{n}\left({\widehat{\theta}}_{n}\right)+\frac{1}{2\beta}\int {\mathcal{K}}_{n}\left(x\right){x}^{T}{\int}_{0}^{1}\left(\nabla {f}_{{s}_{k}}({\widehat{\theta}}_{n}+t\beta x)-\nabla {f}_{{s}_{k}}\left({\widehat{\theta}}_{n}\right)\right)dt\,{P}_{n}\left(dx\right)+\dots $$

Estimate the absolute value of the sum of the integral terms in the obtained expression. Using (9) and Assumption 4, we get

$$\left|\int (\cdot){P}_{n}\left(dx\right)\right|+\left|\int (\cdot){P}_{n}\left(dx\right)\right|\le \frac{2{\beta}^{2}}{2\beta}\int \|{\mathcal{K}}_{n}\left(x\right)\|\,\|x\|\,M\|x\|\,{P}_{n}\left(dx\right)\le {C}_{1}\beta .$$

Substituting this estimate, the terms containing gradients of ${F}_{n}$, and (14) into (16), and using the relation $\int (\cdot){P}_{n}\left(dx\right)\ge -|\int (\cdot){P}_{n}\left(dx\right)|$ and the Cauchy–Bunyakovsky–Schwarz inequality, we obtain

$$\langle {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}},\mathbb{E}\left\{{Y}_{n}\right\}\rangle \ge \langle {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}},\nabla {F}_{n}\left({\widehat{\theta}}_{n}\right)\rangle -\|{\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\|{C}_{1}\beta .$$

Applying Assumption 2 and the estimate $\|{\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\|{C}_{1}\beta \le \frac{1}{2}(\|{\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\|^{2}+{\left({C}_{1}\beta \right)}^{2})$, we get

$$\langle {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}},\mathbb{E}\left\{{Y}_{n}\right\}\rangle \ge (2\mu -1)\frac{1}{2}\|{\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\|^{2}-\frac{1}{2}{C}_{1}^{2}{\beta}^{2}.$$

The proposition is true for $\mu >1/2$.

$$\mathbb{E}\{\parallel {Y}_{n}{\parallel}^{2}\}\le {\sigma}_{n}^{2}+\tau V\left(x\right),\phantom{\rule{0.277778em}{0ex}}{\sigma}_{n}\ge 0,\phantom{\rule{0.277778em}{0ex}}{\tau}_{n}\ge 0.$$

Using (10), the Cauchy–Bunyakovsky–Schwarz inequality, and Assumptions 1 and 3, it can be shown that

$$\begin{array}{rl}\mathbb{E}\{\|{Y}_{n}\|^{2}\}& \le \frac{1}{2}\underset{x}{\sup}\,{\mathcal{K}}_{n}{\left(x\right)}^{2}\,\mathbb{E}\{{\left({\xi}_{n}^{+}\right)}^{2}+{\left({\xi}_{n}^{-}\right)}^{2}\,|\,{\mathcal{F}}_{n-1}\}+\int {\left({f}_{{s}_{k}}\right)}^{2}\|{\mathcal{K}}_{n}\left(x\right)\|^{2}{\mathrm{P}}_{n}\left(dx\right)\\ & \le {C}_{2}{\beta}^{2}\|{\widehat{\theta}}_{n-1}-{\theta}_{{s}_{k}}\|^{2}+{C}_{3}{\beta}^{4}+{C}_{4}{\xi}_{n}^{2}.\end{array}$$

The proposition holds true with $\tau =2{C}_{2}{\beta}^{2}$, ${\sigma}_{n}^{2}={C}_{3}{\beta}^{4}+{C}_{4}{\xi}_{n}^{2}$.

The proposition is valid due to the arbitrariness of the choice of the initial approximation ${\widehat{\theta}}_{0}$ and the assumption regarding the finite order ${s}_{k}$ of the external disturbance W affecting the system, and thus the finite order of the system state vector.

The first inequality can be satisfied by the choice of α, and the second one holds since ν is constant. The proposition is true.

The fulfillment of the given propositions allows the theorem to be proved using the result in [10]. □

After the algorithm converges, the parameter estimates ${\widehat{\theta}}_{n}$ continue to fluctuate around the true parameter value ${\theta}_{{s}_{k}}$ until the next change of the system structure and of the value of ${\theta}_{{s}_{k}}$.

The most common problem setup for unknown-but-bounded disturbances in existing works is formulated as follows. It is required to minimize the value of the objective function $f\left(\theta \right)$ observed with some adversarial deterministic noise $\xi \left(\theta \right)$ such that $\left|\xi \left(\theta \right)\right|\le \xi $ and $\xi >0$:

$$\tilde{f}\left(\theta \right)=f\left(\theta \right)+\xi \left(\theta \right).$$

The considered setup implies that the external noise depends on the system state ${s}_{k}$, which is not a direct function of θ but rather is affected by the values of ${\theta}_{t}$ during some time period $[t-d,t)$:

$$\tilde{f}\left(\theta \right)=f\left(\theta \right)+\xi \left({s}_{k}\right).$$

The value ${s}_{k}$ is formed on the basis of previous values of θ.

For astronomers, an urgent task was to obtain images of objects that cannot be recorded using optical telescopes situated on Earth or in space. This problem was largely solved using radio telescopes [11]. The main tasks of the telescope are to collect the radiation that falls on the mirror system with minimal losses and to obtain the most accurate image of the object [6]. There are various ways to solve this problem; see, for example, [8]. One of them is improving the quality of the device that collects radiation for obtaining images [12]. Another is combining radio telescopes into systems [9]. If the radiation is collected with significant errors, then the image will have disturbances [27].

An important and time-consuming part of image acquisition is the precise tuning of the radio telescope antenna (or systems consisting of such antennas) [13]. The quality of the image obtained on a radio telescope directly depends on the quality of the construction of a reflecting system of mirrors that focuses the radiation coming from outside. To improve the image quality, it is necessary to focus the radiation of the device in such a way that it works as accurately as possible, especially if it is located in space [14]. Traditional antenna tuning algorithms are sufficient for the task. However, they lose their effectiveness under uncontrolled unpredictable external influences. We consider the case when these are deformations of the radio telescope shields that arise due to environmental influences such as temperature changes, wind and other influences.

In practice, radiation is subject to various distorting influences, and as a result the quality of the observed image decreases, despite the presence of various stabilizers and filters [11]. One of the ways to solve such a problem is using randomized stochastic optimization algorithms [15]. A method for improving image quality by improving the tuning characteristics of the radio telescope mirror system is considered in [25].

In an ideal antenna system, the signal is reflected from different mirror plates and assembled at one point. Deformations of radio telescope structures, external temperature, wind, mechanical influences lead to deviation of the optical path lengths of the rays from the required ones. As a result, the focus point on the plane of the receiver shifts [12]. To improve reflection accuracy on the surface of radio telescope mirrors, the following methods are used: autocollimation [22], telescope calibration by spectral density radiation flux, synchronous calibration method [21], laser geodetic measurements [6], improvement of the kinematics of antenna elements [19], radio holography physical method [24]. The goal is to develop a stochastic optimization type algorithm to improve the quality of the settings of the radio telescope mirror system model which could be used in a system similar to Radioastron [20,26]. The main criterion for efficiency is the recording power of the desired signal and the time required for adjusting the parameters.

The antenna segments can be set to the optimal position to improve the quality of image recording. Consider an irradiator (radiation generator), a receiver, and the mirror system of a radio telescope consisting of identical plates that reflect the incoming signal ($i=1,\dots ,N$; $N=895$ in the RATAN-600 installation). Radiation is created in the irradiator, falls on the plates, and is focused in the receiver. Let us divide the time axis into intervals of duration $\delta $, starting from some moment ${t}_{0}$, where ${T}_{k}$ is the k-th time interval and k is the index of the time interval. Assume that we know:

1) the position (orientation) of each i-th plate, which is specified by the vector of parameters ${({a}_{i},{b}_{i},{c}_{i})}^{T}$, where ${a}_{i}$ is the rotation angle of the i-th reflective element horizontally; ${b}_{i}$ is the vertical rotation angle, ${c}_{i}$ is the forward horizontal displacement of the i-th reflecting element. Let ${\theta}_{k}$ be a vector that contains all parameters of the mirror system in a given time interval k, ${\theta}_{k}={({a}_{1},{a}_{2},...,{a}_{N},{b}_{1},{b}_{2},...,{b}_{N},{c}_{1},{c}_{2},...,{c}_{N})}^{T},$

2) radiation coming from each mirror (${z}_{i}$),

3) the common signal coming from all mirrors to the receiver $Z\left({\theta}_{k}\right)={\sum}_{i=1,\dots ,N}\left({z}_{i}\right)$,

4) characteristics of the signal in the generator.

The front of a signal is the sum of harmonics that comes to us with different phases from a certain direction. The perfect placement of the plates brings all radiation from the objects into focus. We obtain the signal as a sum of sines with phases. Different ${z}_{i}$ arrive at different times with different phases. Let us evaluate the difference between the signals from an ideal antenna and from a real one (with deformations). Signals reflected from ideal mirror segments will have the same phase ${\varphi}_{i}$ (${\varphi}_{1}={\varphi}_{2}=\cdots ={\varphi}_{N}$). The signals from segments with deformations will look like this:

$${z}_{1}=\sin(\omega t+{\varphi}_{1}),\quad {z}_{2}=\sin(\omega t+{\varphi}_{2}),\quad \dots ,\quad {z}_{i}=\sin(\omega t+{\varphi}_{i}).$$

The objective function of the problem ($F\left(\theta \right)$ is the signal power) is defined as follows:

$$F\left(\theta \right)=\overline{\underset{k}{\lim}}{\int}_{t\in {T}_{k}}{\left|Z(\theta ,t)\right|}^{2}dt.$$

It is required to maximize the objective function:

$$F\left(\theta \right)\to \underset{\theta}{\max}.$$
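The effect of phase errors on the received power can be illustrated with a small Python sketch. It sums the signals $\sin(\omega t+{\varphi}_{i})$ and averages ${|Z|}^{2}$ over one period, which is a simplification of the objective above: the averaging window, the function names, and the sample phases are all illustrative assumptions. A closed form based on the identity $\overline{{Z}^{2}}=\frac{1}{2}{|\sum_{i}{e}^{j{\varphi}_{i}}|}^{2}$ is included for comparison.

```python
import math


def mean_power(phases, omega=1.0, samples=1000):
    """Average of |Z(t)|^2 over one period, Z(t) = sum_i sin(omega t + phi_i)."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for k in range(samples):
        t = period * k / samples
        z = sum(math.sin(omega * t + phi) for phi in phases)
        total += z * z
    return total / samples


def mean_power_closed_form(phases):
    """Closed form 0.5 * |sum_i exp(j * phi_i)|^2 of the same average."""
    re = sum(math.cos(phi) for phi in phases)
    im = sum(math.sin(phi) for phi in phases)
    return 0.5 * (re * re + im * im)


ideal = mean_power([0.3] * 4)                  # all segments in phase
deformed = mean_power([0.3, 0.9, -0.4, 1.7])   # hypothetical phase errors
```

With N segments in phase the average power is ${N}^{2}/2$; any spread of the phases lowers it, which is why the tuning problem is posed as maximization of the power.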

We consider the problem of optimizing the position of mirrors in the limit over time (not over a specific time interval).

The reflective elements of the antenna are made exactly the same, so they provide equivalent observations in all directions.

At the same time, if one moves along an ideal reflective surface, its local characteristics change. Therefore, a real reflective surface composed of identical elements will repeat deviations from the ideal surface from element to element. These deviations will be greater, if the shape of the surface of the element differs from the shape of that portion of the ideal surface which this element should represent. If the size of the reflective elements increases, then the deviations naturally increase. These deviations are an error distributed over the reflector of a variable profile antenna, which, at large values, creates unacceptable distortions reducing the efficiency of the antenna. We will call ${v}_{k}$ the signal power measurement errors arising due to unknown and uncontrolled deformations in the reflective elements of the antenna caused by weather, wind, and temperature changes.

After conventional procedures for adjusting the inclination angles and positions of the radio telescope mirrors, we obtain an initial approximation ${\widehat{\theta}}_{0}$ to the optimal tuning values. There is then still the possibility of “tuning” within a certain neighborhood $\mathcal{T}$ containing ${\widehat{\theta}}_{0}$ and the optimal value ${\theta}^{\star}$ corresponding to the maximum power of the received signal. We use a stochastic optimization algorithm with two measurements per iteration, which allows us to reduce the negative impact of various disturbances on the power of the recorded signal [7]:

Select ${\theta}_{0}.$

Iteration step $n-1\to n$:

We sequentially generate vectors ${\Delta}_{n}$ whose components take the values $\pm 1$ with equal probability.

We measure the power values for two positions of the antenna system:

$({\theta}_{n}+\beta {\Delta}_{n})$ and $({\theta}_{n}-\beta {\Delta}_{n}).$

Measurements are obtained with noise ${v}_{2n}$ and ${v}_{2n-1}$ (denoted ${\xi}_{2n}$ and ${\xi}_{2n-1}$ in the formulas):

$${y}_{2n}={P}_{2n}({\theta}_{2n}+\beta {\Delta}_{n})+{\xi}_{2n};$$

$${y}_{2n-1}={P}_{2n-1}({\theta}_{2n-1}-\beta {\Delta}_{n})+{\xi}_{2n-1}.$$

Next, we form the following estimate ${\theta}_{2n}$ according to the rule:

$${\theta}_{2n}={\mathcal{P}}_{\mathcal{T}}\left({\theta}_{2n-1}+\frac{\alpha {K}_{n}\left({\Delta}_{n}\right)}{2\beta}({y}_{2n}-{y}_{2n-1})\right),$$

where $\alpha ,\beta $ are the parameters of the algorithm, and ${\mathcal{P}}_{\mathcal{T}}$ is the projection onto the set $\mathcal{T}$.
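The tuning iteration can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the concave power model `make_power`, the box standing in for the set $\mathcal{T}$, the bounded non-random disturbance, the kernel choice $K_n(\Delta)=\Delta$ and all numerical values are hypothetical.

```python
import numpy as np

def project_box(theta, lo, hi):
    """Projection onto a box, used here as a stand-in for the set T (assumption)."""
    return np.clip(theta, lo, hi)

def tune(power, theta0, alpha, beta, lo, hi, n_iter, rng):
    """Two-measurement randomized ascent on a noisy power signal."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        # Trial perturbation with independent +/-1 components.
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        y_plus = power(theta + beta * delta)
        y_minus = power(theta - beta * delta)
        # Ascent step (power is maximized), kernel K(delta) = delta (assumption).
        step = alpha * delta * (y_plus - y_minus) / (2.0 * beta)
        theta = project_box(theta + step, lo, hi)
    return theta

def make_power(theta_star, noise_bound):
    """Hypothetical power model: concave in theta, plus a bounded non-random disturbance."""
    state = {"k": 0}
    def power(theta):
        state["k"] += 1
        v = noise_bound * np.sin(0.7 * state["k"])  # |v| <= noise_bound, not random
        return 10.0 - float(np.sum((theta - theta_star) ** 2)) + v
    return power

rng = np.random.default_rng(0)
theta_star = np.array([0.3, -0.2, 0.1])
theta_hat = tune(make_power(theta_star, 0.05), np.zeros(3),
                 alpha=0.1, beta=0.1, lo=-1.0, hi=1.0, n_iter=2000, rng=rng)
error = float(np.linalg.norm(theta_hat - theta_star))
print("estimate:", theta_hat, "error:", error)
```

In this sketch the estimate does not converge exactly; it settles in a neighborhood of the maximizer whose size depends on the disturbance bound and on the ratio $\alpha/\beta$.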

To construct the kernels ${K}_{0}(\cdot)$ and ${K}_{1}(\cdot)$ on the interval $[-1/2,1/2]$, orthogonal Legendre polynomials can be used. In this case, for the initial values $\ell =1,2$ (i.e. $2\le \gamma \le 3$) the kernels have the form

$${K}_{0}\left(q\right)=12q,\quad {K}_{1}\left(q\right)=1,\quad |q|\le 1/2,$$

for $\ell =3,4$ (i.e. $3<\gamma \le 5$):

$${K}_{0}\left(q\right)=5q(15-84{q}^{2}),\quad {K}_{1}\left(q\right)=9/4-15{q}^{2},\quad |q|\le 1/2,$$

and for $\left|q\right|>1/2$ both functions are equal to zero.
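As a quick sanity check, the low-order moments $\int q^{p}K(q)\,dq$ over $[-1/2,1/2]$ of the kernels listed above can be computed exactly. The sketch below assumes that $q$ is uniformly distributed on that interval (so the integrals are taken with unit weight); the helper `moment` and the dictionary names are illustrative only.

```python
from numpy.polynomial import Polynomial

def moment(kernel, power, half_width=0.5):
    """Exact integral of kernel(q) * q**power over [-half_width, half_width]."""
    monomial = Polynomial([0.0] * power + [1.0])      # q**power
    antiderivative = (kernel * monomial).integ()
    return antiderivative(half_width) - antiderivative(-half_width)

# Kernels written as polynomials in q (coefficients in increasing degree).
kernels = {
    "K0, l=1,2": Polynomial([0.0, 12.0]),               # 12 q
    "K1, l=1,2": Polynomial([1.0]),                     # 1
    "K0, l=3,4": Polynomial([0.0, 75.0, 0.0, -420.0]),  # 5 q (15 - 84 q^2)
    "K1, l=3,4": Polynomial([2.25, 0.0, -15.0]),        # 9/4 - 15 q^2
}

# Moments of order 0..3 for every kernel.
moments = {name: [moment(k, p) for p in range(4)] for name, k in kernels.items()}
for name, values in moments.items():
    print(name, [round(v, 12) for v in values])
```

Under this assumption both ${K}_{0}$ kernels have zero mean and unit first moment, both ${K}_{1}$ kernels integrate to one, and the higher-order kernels additionally annihilate the second and third powers of $q$.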

The test disturbance is formed in such a way that, $\forall n\ge 1$, the random vector ${\Delta}_{n}$ does not depend on ${\overline{v}}_{1},\dots ,{\overline{v}}_{k}$, and $\mathrm{E}\{{({v}_{2n}-{v}_{2n-1})}^{2}/2\}\le {\sigma}_{2}^{2}$ $(\mathrm{E}\left\{{v}_{n}^{2}\right\}\le {\sigma}_{1}^{2}).$

The main element that reflects the incoming signal is a segment of the mirror system. It is important to configure these shields so that the signal comes into focus. The system allows the required dimensions of the reflective shield and its curvature to be customized. Stochastic optimization algorithm (11) can be applied to tune a system consisting of a large number of telescopes in space under conditions of interference, including when individual elements of the system are deformed. About 3000 parameters of the mirror surfaces are adjusted during focusing of the mirror system. Faster setup of the mirrors is very important for the accuracy of observations. For acceleration, it is proposed to use the Nesterov acceleration method in distributed form [16,17,18]. Figure 3 shows the dependence of the signal power on the number of iterations of the algorithm.

Conceptualization, O.G.; methodology, O.G.; software, K.D.; formal analysis, Y.I.; writing—original draft preparation, K.D. and Y.I.; writing—review and editing, O.G, and Y.I.; visualization, K.D.; All authors have read and agreed to the published version of the manuscript.

The work was supported by the IPME RAS by Russian Science Foundation (project no. 21-19-00516).

The authors declare no conflict of interest.

- Granichin, O.; Uzhva, D.; Volkovich, Z. Cluster Flows and Multiagent Technology. Mathematics **2021**, 9, 22.
- Granichin, O. What is an Actual Structure of Complex Information-Control Systems? Stochastic optimization in informatics. 2016. Is. 12, No 1.
- Parilina, E.; Petrosyan, L. On a Simplified Method of Defining Characteristic Function in Stochastic Games. Mathematics **2020**, 8, 1135.
- Achdou, Y., Cardaliaguet, P., Delarue, F., Porretta, A. and Santambrogio, F., 2021. Mean Field Games: Cetraro, Italy 2019 (Vol. 2281). Springer Nature.
- Granichin, O. and Amelina, N., 2014. Simultaneous perturbation stochastic approximation for tracking under unknown but bounded disturbances. IEEE Transactions on Automatic Control, 60(6), pp. 1653-1658.
- Ermakov, A.N., Kovalev Yu.A. Project “RadioAstron”. Calibration of a space telescope in flight - automation of measurement processing of the ASC FIAN, Moscow, Russia, Proceedings of the Institute of Applied Astronomy of the Russian Academy of Sciences, vol. 54. 2020.
- Granichin, O.N., Polyak B.T. Randomized estimation and optimization algorithms under almost arbitrary noise. M.: Nauka, 2003. 291 p.
- Droszcz, A., Jędrzejewski, K., Kłos, J., Kulpa, K., Pozoga M. Beamforming of LOFAR Radio-Telescope for Passive Radiolocation Purposes. Remote Sens. **2021**, 13, 810.
- Gurvits, L.I. Space VLBI: from first ideas to operational missions. Advances in Space Research, Volume 65, Issue 2, 15 January 2020, pp. 868-876.
- Polyak, B.T. Convergence and rate of convergence in iterative stochastic processes. I. The general case. 1976. Avtomatika i telemekhanika, No. 12, pp. 83-94.
- Kardashev, N.S. and others. “RADIOASTRON”: results of the implementation of the scientific research program for 5 years of flight // Bulletin of NPO im. S.A. Lavochkina. 2016. No. 3. P. 4-24.
- Dubarenko, V.V., Kuchmin A.Yu., Artemenko Yu.N., Shishlakov V.F. Millimeter-wave radio telescopes with adjustable mirror surfaces: monograph. St. Petersburg: GUAP, 2019.
- Monakhova, U.V., Ivanov D.S. Formation of a swarm of nanosatellites using decentralized aerodynamic control taking into account communication restrictions // Preprints of the Institute for Problems of Materials Science. M.V.Keldysh. 2018. No. 151. 32 p.
- Bentum, M.J., Verma M.K., Rajan R.T., Boonstra A.J., Verhoeven C.J.M., Gill E.K.A., van der Veen A.J., Falcke H., Klein Wolt M., Monna B., Engelen S., Rotteveel J., Gurvits L.I. A Roadmap towards a Space-based Radio Telescope for Ultra-Low Frequency Radio Astronomy.
- Granichin, O.N., Erofeeva V.A., Ivanskiy Y.V., Jiang Y. Simultaneous Perturbation Stochastic Approximation-Based Consensus for Tracking Under Unknown-But-Bounded Disturbances. IEEE Transactions on Automatic Control 2021, Vol. 66, Is. 8, pp. 3710–3717.
- Rogozin, A., Yarmoshik D., Kopylova K., Gasnikov A. Decentralized Strongly-Convex Optimization with Affine Constraints: Primal and Dual Approaches. 2022.
- Nesterov, Y. A method of solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady, 27(2):372–376, 1983.
- Nesterov, Y. Introductory Lectures on Convex Optimization: a basic course. Kluwer Academic Publishers, Massachusetts, 2004.
- Zharov, V.I., Sotnikova Yu.V. Methodology for determining the kinematic characteristics of the elements of the main mirror of the RATAN-600 radio telescope using modern laser measuring systems. Astrophysical Bulletin, 2017. Vol. 72, No. 4, pp. 520–526.
- Moisheev, A.A. Creation of space segments of astrophysical observatories. Bulletin 2.
- Sotnikova, Yu.V., Kovalev Yu.A., Erkenov A.K. Method of synchronous calibration of RATAN-600 using 2 of its sectors. Astrophysical Bulletin, 2019. Vol. 74, No. 4, pp. 535–543.
- Khaikin, V.B., Bursov N.N. Autocollimation automatic adjustment and control of the efficiency of elements of the RATAN-600 radio telescope. Journal of Radioelectronics, 2016, pp. 1684-1719.
- Mingaliev, M.G. RATAN-600 - current state and prospects. Abstracts of the report. Conf. RT-2002, in Pushchino, 2002. P. 80.
- Khaikin, V.B., Lebedev M.K., Ripak A.M. A method for radio-holographic monitoring of the surface of the main mirror of the RATAN-600 radio telescope with radial movement of the support element. Journal of Radioelectronics, 2016, pp. 1684-1719.
- Kopylova, K.D., Granichin O.N. Minimizing the systematic error of a radio astronomy telescope using a randomized stochastic optimization algorithm. Proceedings of the 13th multi-conference on management issues, 2020.
- Kovalev, Yu.A., Sotnikova Yu.V., Erkenov A.K., Popkov A.V., Volvach L.N., Vasilkov V.I., Lisakov M.M., Semenova T.A., Tsybulev P.G. Features of calibration of the space radio telescope "RadioAstron" and the radio telescope RATAN-600 // Proceedings of the IPA RAS. St. Petersburg: IPA RAS, 2018. Issue 47, pp. 38-42.
- Minchenko, B.S. Synthesis of radio images on the RATAN-600 radio telescope. Izv. universities Radiophysics. 1983. T. 26, No. 11. P. 1463–1471.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Submitted:

06 December 2023

Posted:

07 December 2023



An exact solution to any problem is possible with an accurate formulation of the problem, but connections and relationships in the real world are so complex and diverse that it is almost impossible to describe many phenomena strictly in mathematical language. A typical approach in theory is to select a mathematical model close to real processes and include various noises related, on the one hand, to the roughness of the mathematical model and, on the other hand, characterizing uncontrolled external disturbances affecting the considered object or system. For all mathematical models, the result of the experiment is a mathematical object: a number, a set of numbers, a curve, etc. From a mathematical point of view, a significant range of applied problems aims to restore characteristics from experimental data (parameters) of the object. At the same time, real systems are rarely described thoroughly by limited mathematical models. When choosing a model to solve a real problem, it is common to consider so-called systematic error (model error), which can be quantified by the distance from the real operator to the selected model. Another type of error that an experimenter may encounter is associated with measurement errors. Such errors are called statistical errors (random errors). The process of selecting characteristics (parameters) of a model from a given class of models to describe the results in a best way is one of the general definitions of estimation. In practice the estimation process can often be related to some quantitative characteristic of the quality of estimation and, it is natural while choosing the estimates, to try to minimize the negative impact of errors, both statistical and, if possible, systematic [2]. Examples include:

- formation of multiscale vortex structures in turbulent fluid flows and plastic flow of solids under pulsed load;
- clustering in a stream of concentrated dispersive mixtures;
- propagation of the shock wave front inside the substance;
- transition layers near interphase boundaries;
- processes of protein formation in cells;
- hierarchy of structures in living systems;
- processes of fission of heavy elements nuclei;
- thermonuclear fusion;
- as well as the behavior of groups of people.

Among new directions in distributed systems research, one could suggest exploring connections between distributed systems theory, on the one hand, and canonical problems in turbulence and statistical mechanics, on the other. In one class of problems, spatio-temporal dynamical analysis clarifies old and complex questions in the theory of shear flow turbulence. In another class of problems, structured, distributed control design exhibits dimensionality-dependence and phase transition phenomena similar to those in statistical mechanics.

Assume that (unidirectional) time t is introduced, and consider non-isolated systems consisting of elements. Evolution of each system over time is determined by the current states of both the system itself and other elements of the system. The evolution is also affected by external disturbing influences W (the absence of influence can be interpreted as zero impact). External influences W can be formally included in the general set of system states, but including W in the system state can significantly complicate the descriptive model. External influences W naturally fall into two groups, controlled ones u (or simply, control) and uncontrolled ones w:

$$W=\left(\begin{array}{c}u\\ w\end{array}\right).$$

It is usually assumed that at time t the system state $X\left(t\right)$ is finite-dimensional, and the dynamics of the system is described by a system of differential equations
$${\dot{x}}_{i}={g}_{i}(X,W),\quad X=\left\{{x}_{i}\right\},\quad i\in \mathbb{M}=\{1,2,\dots ,m\}$$
with some functions ${g}_{i}(\cdot)$ and external disturbances W. In models of complex systems of this type, consisting of a huge number of components, it is customary to assume a large dimension n of the state space. But, on the one hand, the choice of the threshold for n significantly limits the “upper bound” of complexity, and, on the other hand, it does not allow the possible “flexibility” of the system to be taken into account as the system changes. We will assume that the system consists of a continuum of elements $X=\left\{{x}_{\gamma}\right\}$, parameterized by $\gamma \in [0,1]$, and the evolution of each of the elements is described by the equation
$${\dot{x}}_{\gamma}={g}_{\gamma}(X,W),\quad X=\left\{{x}_{\gamma}\right\},\quad \gamma \in \mathbb{M}=[0,1].$$
Such setups can occur, for instance, in stochastic games with many players [3] and mean-field games with almost infinitely many players [4]. Additionally, we assume that the external influence W, for all its arbitrariness, has at each moment of time k a structure ${s}_{k}$ of finite order. After some transition process, the finite structure of the external influence causes discretization of spatial elements (clustering) in the considered complex system:
$${\mathcal{X}}_{{s}_{k}}=\{{X}_{1},{X}_{2},\dots ,{X}_{{m}_{{s}_{k}}}\}:\quad X=\bigcup_{i=1,2,\dots ,{m}_{{s}_{k}}}{X}_{i},\quad {X}_{i}\subset X.$$
Discretization occurs due to the self-organization of groups of elements and their synchronization. For clusters $i=1,\dots ,{m}_{{s}_{k}}$, a set of ${m}_{{s}_{k}}\in \mathbb{N}$ variables ${\overline{x}}_{i}$ averaged over cluster ${X}_{i}$ is naturally introduced. The set of ${\overline{x}}_{i}$, $i=1,\dots ,{m}_{{s}_{k}}$, can be generalized to a set of integrals over the clusters. Such an approach is usually applied to simplify physical models (dimension reduction): general integral characteristics of clusters of elements with similar properties are introduced, and dynamic models in reduced state spaces are considered. Experiments show that such simplifications are justified and often give good results.
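The aggregation step described above can be illustrated with a few lines of Python. This is only a sketch: the element states, the cluster labels (assumed to be already known) and the function name `cluster_averages` are hypothetical, and the plain mean is just one possible integral characteristic of a cluster.

```python
import numpy as np

def cluster_averages(x, labels):
    """Return the mean state of each cluster, ordered by cluster label."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels)
    return np.array([x[labels == c].mean(axis=0) for c in cluster_ids])

# Six element states that have self-organized into three groups (illustrative data).
x = np.array([0.9, 1.1, 1.0, 4.8, 5.2, 9.0])
labels = np.array([0, 0, 0, 1, 1, 2])

x_bar = cluster_averages(x, labels)  # three aggregated variables instead of six
print(x_bar)
```

The reduced model then evolves the three averaged variables rather than the six original states.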

Typically, the process of clustering (self-organization) in a system is not “one-shot”, but is constantly reproduced due to changes in external influences and critical changes in internal states. At the same time, we will assume that the structure of external influences does not change continuously, but only at certain time instants ${T}_{0},{T}_{1},{T}_{2},\dots .$ This makes it necessary to consider dynamic processes in which the structure of the state space changes over time.

So, when the structure of external influence changes, the discretization of spatial elements may change. Let us assume that this transient process lasts no longer than some $\delta \ge 0$ (see Figure 2). We will assume that $\delta $ is many times smaller than the intervals between successive changes in the structure of external influences:
$$\delta \ll \zeta =\underset{k}{\min}|{T}_{k+1}-{T}_{k}|.$$
In addition to discretizing the spatial variables, we obtain time sampling by neglecting the duration of the transient intervals. After such discretization, in many practical applications a system of differential equations describes the dynamics of the original complex system over the time interval from ${T}_{k}+\delta $ to ${T}_{k+1}$:
$$\dot{\overline{{x}_{i}}}={\overline{g}}_{i}(\overline{X},u,w,{\theta}_{{s}_{k}}),\quad i=1,2,\dots ,{m}_{{s}_{k}},$$
where ${\overline{x}}_{i}$ is the aggregated state $\left\{{x}_{i}\right\}$ or $\left\{{x}_{\gamma}\right\}$ from cluster ${X}_{i}$, $\overline{X}=col({\overline{x}}_{1},{\overline{x}}_{2},\dots ,{\overline{x}}_{{m}_{{s}_{k}}})$, and ${\theta}_{{s}_{k}}$ is a finite set of current parameters on the time interval $[{T}_{k}+\delta ,{T}_{k+1})$.

Consider the control problem of choosing a control strategy that minimizes a cost function composed of local cost functions computed at different parts of the system:
$$L\left(\left\{u\right\}\right)=\sum _{i\in \mathbb{M}}l({x}_{i},u)\to \underset{u}{\min}$$
or, in case the system consists of a continuum of elements (3):
$$L\left(\left\{u\right\}\right)={\int}_{\mathbb{M}}l({x}_{\gamma},u)d\gamma \to \underset{u}{\min}.$$

Previously it was assumed that the perturbation W has a “finite structure” at each moment, and that the structure ${s}_{k}$ changes at times ${T}_{0},{T}_{1},{T}_{2},\dots $, causing clustering of the state space:
$${\mathcal{X}}_{{s}_{k}}=\{{X}_{1},{X}_{2},\dots ,{X}_{{m}_{{s}_{k}}}\}:\quad X=\bigcup_{i=1,2,\dots ,{m}_{{s}_{k}}}{X}_{i},\quad {X}_{i}\subset X,$$
where ${\overline{x}}_{i}$ is the aggregated state $\left\{{x}_{i}\right\}$ or $\left\{{x}_{\gamma}\right\}$ of cluster ${X}_{i}$, $\overline{X}=col({\overline{x}}_{1},{\overline{x}}_{2},\dots ,{\overline{x}}_{{m}_{{s}_{k}}})$, and ${\theta}_{{s}_{k}}$ is a finite set of current parameters. We assume that for any k the dimension of ${\theta}_{{s}_{k}}$ is bounded by d. For $t\in [{T}_{k}+\delta ,{T}_{k+1})$, due to the additivity of the integral (sum), the loss function can be replaced by
$$L={\int}_{\mathbb{M}}l({x}_{\gamma},u)d\gamma \approx \sum _{i=1}^{{m}_{{s}_{k}}}{\overline{l}}_{k}({\overline{x}}_{i},u,{\theta}_{{s}_{k}})={\overline{L}}_{k}(\overline{X},u,{\theta}_{{s}_{k}}).$$
Assume that the control strategy u is piecewise constant and changes at the end of each time interval of length h:
$$u\left(t\right)={u}_{n},\quad t\in [(n-1)h,nh).$$
The feedback u will be computed on the basis of noisy observations of the loss function $\tilde{L}$. After sampling in time and space, we obtain an observation model for the loss function:
$${y}_{n}={\tilde{L}}_{k}({\overline{X}}_{n},{u}_{n},{\theta}_{{s}_{k}})+{\xi}_{n},$$
where ${t}_{n}\in [{T}_{k}+\delta ,{T}_{k+1})$, ${\tilde{L}}_{k}(\cdot)$ are functions of ${\overline{X}}_{n}=\overline{X}\left({t}_{n}\right)$, ${u}_{n}=u\left({t}_{n}\right)$ and ${\theta}_{{s}_{k}}$, and ${\xi}_{n}={\xi}_{n}^{\prime}+{\xi}_{n}^{\prime \prime}\left({s}_{k}\right)$ is the discrepancy (error) composed of a random noise ${\xi}_{n}^{\prime}$ independent of the current system structure ${s}_{k}$ and a systematic error ${\xi}_{n}^{\prime \prime}\left({s}_{k}\right)$ which is, in general, some function of the current system state.

Estimation of system parameter values can be formulated as an optimization problem. The discrepancy between the estimated and the real parameter values can be expressed in terms of the value of some loss function, so solving the system identification problem amounts to minimizing this loss function. To identify the system structure, it is required to choose a control strategy $\left\{u\right\}$ that minimizes some loss function.

Suppose that for a given parameter vector ${\theta}_{{s}_{k}}\in {\mathbb{R}}^{d}$ the optimal control strategy
$$\left\{{u}_{n}\right\}=\mathcal{U}\left({\theta}_{{s}_{k}}\right)$$
is known. After substituting it in (8), we get the problem of minimizing the function
$${f}_{{s}_{k}}\left(\theta \right)={\tilde{L}}_{k}({\overline{X}}_{n},\mathcal{U}\left(\theta \right),{\theta}_{{s}_{k}})$$
when its values are observed under the influence of noise ${\xi}_{n}$. Under the assumptions made, the minimum of the function ${f}_{{s}_{k}}\left(\theta \right)$ is reached when $\theta ={\theta}_{{s}_{k}}$. To solve the formulated problem, the method from [5] can be used.

The distributed optimization task can be formulated in terms of finding the method of constructing cluster (meso-) control, in which the same control action is applied to all elements of the cluster. In this case, the discretized loss function can be represented as a distributed functional
$${f}_{{s}_{k}}\left(\theta \right)=\sum _{i=1}^{{m}_{{s}_{k}}}{\tilde{l}}_{k}({\overline{X}}_{n}^{i},{\mathcal{U}}^{i}\left(\theta \right),{\theta}_{{s}_{k}}).$$
This problem can be solved by applying the stochastic optimization type method from [15].

The considered problem setting is a particular case of more general problem of minimizing a non-stationary differentiable function with respect to $\theta $. Let ${\mathcal{F}}_{n-1}$ be the $\sigma $-algebra of all probabilistic events which happened during time interval $n-1$ before start of time interval n. Hereinafter ${\mathbb{E}}_{{\mathcal{F}}_{n-1}}$ is a symbol of the conditional mathematical expectation with respect to the $\sigma $-algebra ${\mathcal{F}}_{n-1}$, $\mathbb{E}$ is a symbol of the mathematical expectation. The minimum point ${\theta}_{{s}_{k}}$ of function
$${F}_{n}\left(\theta \right)={\mathbb{E}}_{{\mathcal{F}}_{n-1}}{f}_{{s}_{k}}\left(\theta \right)\to \underset{\theta}{min}$$
needs to be estimated.

More precisely, using the observations ${y}_{1},{y}_{2},\dots ,{y}_{n}$ and inputs ${\theta}_{1},{\theta}_{2},\dots ,{\theta}_{n}$, construct an estimate ${\widehat{\theta}}_{n}$ of an unknown vector ${\theta}_{{s}_{k}}$ minimizing the time-varying mean-risk functional.

Let us formulate assumptions about the disturbances and the functions ${f}_{{s}_{k}}\left(\theta \right),{F}_{n}\left(\theta \right)$.

- For $n=1,2,\dots $, the successive differences ${\overline{\xi}}_{n}={\xi}_{n}^{+}-{\xi}_{n}^{-}$ of observation noise are bounded: $|{\overline{\xi}}_{n}|\le {c}_{\xi}<\infty $, or $\mathbb{E}{\overline{\xi}}_{n}^{2}\le {c}_{\xi}^{2}$ if the sequence $\left\{{\xi}_{n}\right\}$ is random, where ${\xi}_{n}^{+}$, ${\xi}_{n}^{-}$ are observation noises that occurred during the same time interval n but at different time instants.
- Functions ${F}_{n}(\cdot)$ have unique minimum points ${\theta}_{{s}_{k}}$ and $\forall \theta $: $\langle \theta -{\theta}_{{s}_{k}},{\mathbb{E}}_{{\mathcal{F}}_{n-1}}\nabla {f}_{{s}_{k}}\left(\theta \right)\rangle \ge \mu {\parallel \theta -{\theta}_{{s}_{k}}\parallel}^{2}$ with a constant $\mu >0$. Here and below $\langle \cdot,\cdot\rangle $ is the scalar product of two vectors.
- The gradient $\nabla {f}_{{s}_{k}}$ is uniformly bounded in the mean-squared sense at the minimum points ${\theta}_{t}$: $\mathbb{E}{\parallel \nabla {f}_{{s}_{k}}\left({\theta}_{t}\right)\parallel}^{2}\le {g}^{2}$.
- $\forall {s}_{k}\in S$ the gradient $\nabla {f}_{{s}_{k}}\left(\theta \right)$ satisfies the Lipschitz condition: $\forall {\theta}^{\prime},{\theta}^{\prime \prime}$: $\parallel \nabla {f}_{{s}_{k}}\left({\theta}^{\prime}\right)-\nabla {f}_{{s}_{k}}\left({\theta}^{\prime \prime}\right)\parallel \le M\parallel {\theta}^{\prime}-{\theta}^{\prime \prime}\parallel $.
- $\forall n\ge 1$ the random vector ${\Delta}_{n}$ does not depend on ${\overline{w}}_{n}$, and the random vectors ${\overline{w}}_{n},{\Delta}_{n}$ do not depend on ${\overline{w}}_{1},\dots ,{\overline{w}}_{n-1}$; if $\left\{{\overline{v}}_{n}\right\}$ are random variables, then ${\overline{w}}_{n},{\Delta}_{n}$ also do not depend on ${\overline{v}}_{1},\dots ,{\overline{v}}_{n}$.

Using available observations, it is necessary to construct a sequence of estimates ${\widehat{\theta}}_{n}$ of the unknown vector ${\theta}_{{s}_{k}}$ that minimizes the function ${f}_{{s}_{k}}\left(\theta \right)$. To solve the problem, we will use an iterative algorithm with two measurements.

Let the trial simultaneous disturbance ${\Delta}_{n},\ n=1,2,\dots $, be an observable (set or user-controlled) sequence of independent random vectors with known distribution functions ${\mathrm{P}}_{n}(\cdot)$, and let the specified vector functions ${\mathcal{K}}_{n}(\cdot):{\mathbb{R}}^{d}\to {\mathbb{R}}^{d}$ satisfy the conditions
$$\int {\mathcal{K}}_{n}\left(x\right){\mathrm{P}}_{n}\left(dx\right)=0,\quad \int {\mathcal{K}}_{n}\left(x\right){x}^{\mathrm{T}}{\mathrm{P}}_{n}\left(dx\right)=\mathrm{I},$$
$$\underset{n}{\sup}\int {\parallel {\mathcal{K}}_{n}\left(x\right)\parallel}^{2}{\mathrm{P}}_{n}\left(dx\right)<\infty ,\quad n=1,2,\dots .$$
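For the simplest case these conditions can be checked by direct enumeration. The sketch below assumes a Bernoulli trial disturbance with independent $\pm 1$ components and the identity kernel $\mathcal{K}(x)=x$; both are illustrative choices rather than the only admissible ones, and the variable names are hypothetical.

```python
import itertools
import numpy as np

d = 3
# All 2**d sign vectors, each taken with equal probability.
points = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))
weights = np.full(len(points), 1.0 / len(points))

def kernel(x):
    """Identity kernel K(x) = x (assumption for this example)."""
    return x

K = np.array([kernel(p) for p in points])

first_moment = weights @ K                            # integral of K(x) P(dx)
cross_moment = K.T @ (weights[:, None] * points)      # integral of K(x) x^T P(dx)
second_moment = float(weights @ np.sum(K ** 2, axis=1))  # integral of ||K(x)||^2 P(dx)

print(first_moment, cross_moment, second_moment, sep="\n")
```

With these choices the first integral is the zero vector, the second is the identity matrix, and the third equals the dimension $d$, so it is finite.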

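Let us choose an arbitrary initial estimate vector ${\widehat{\theta}}_{0}\in {\mathbb{R}}^{d}$ and scalar parameters $\alpha $, $\beta $ for the iterative algorithm

$$\left\{\begin{array}{l}{\theta}_{n}^{\pm}={\widehat{\theta}}_{n-1}\pm \beta {\Delta}_{n},\quad {y}_{n}^{\pm}={f}_{{s}_{k}}\left({\theta}_{n}^{\pm}\right)+{\xi}_{n}^{\pm},\\ {\widehat{\theta}}_{n}={\widehat{\theta}}_{n-1}-\frac{\alpha}{2\beta}{\mathcal{K}}_{n}\left({\Delta}_{n}\right)({y}_{n}^{+}-{y}_{n}^{-}).\end{array}\right.$$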

Let us introduce the following notation:
$$\nu =\alpha \left(0.5-\mu -\frac{L\alpha {C}_{\tau}}{2}\right),\quad \psi =\frac{\varphi}{\nu}.$$
Here ${C}_{\tau}$, ${C}_{\sigma}$ can be chosen to satisfy the inequality $\mathbb{E}{\parallel \frac{1}{2\beta}({f}_{n}(\theta +\beta {\Delta}_{n})+{\xi}_{n}^{+}-{f}_{n}(\theta -\beta {\Delta}_{n})-{\xi}_{n}^{-})\parallel}^{2}\le {C}_{\sigma}+{C}_{\tau}{\parallel \theta -{\theta}_{{s}_{k}}\parallel}^{2}$, ${C}_{1}\ge \int \parallel {\mathcal{K}}_{n}\left(x\right)\parallel M{\parallel x\parallel}^{2}{\mathrm{P}}_{n}\left(dx\right)$, and $\alpha $ is chosen to satisfy the following conditions:
$$\begin{array}{l}0\le \alpha \le \frac{4\mu -2}{L{C}_{\tau}},\quad \alpha \le \frac{2\mu -1-\sqrt{{(2\mu -1)}^{2}-2L{C}_{\tau}}}{L{C}_{\tau}},\\ \alpha \ge \frac{2\mu -1+\sqrt{{(2\mu -1)}^{2}-2L{C}_{\tau}}}{L{C}_{\tau}}.\end{array}$$

Let Assumptions 1–5, the conditions (9)–(10) on the kernels $\mathcal{K}$, and the conditions (12) on $\alpha $ be satisfied. Then, for a given ${\widehat{\theta}}_{0}$ and a chosen interval size parameter k,
$$\mathbb{E}\{\parallel {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}{\parallel}^{2}\}\le \mathbb{E}\{\parallel {\widehat{\theta}}_{0}-{\theta}_{{s}_{k}}{\parallel}^{2}\}{(1-{\nu}_{i})}^{n}+\psi (1-{(1-{\nu}_{i})}^{n}).$$
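To see how the right-hand side of this bound behaves over a finite observation interval, it can be evaluated directly. The sketch below treats the contraction rate as a single constant `nu` in $(0,1)$ (an assumption that identifies ${\nu}_{i}$ with $\nu$) and uses made-up values for the initial mean-square error and for $\psi$; the function name `mse_bound` is hypothetical.

```python
def mse_bound(v0, nu, psi, n):
    """Right-hand side of the bound: v0 * (1 - nu)**n + psi * (1 - (1 - nu)**n)."""
    decay = (1.0 - nu) ** n
    return v0 * decay + psi * (1.0 - decay)

# Illustrative values only: initial mean-square error 4.0, rate 0.1, asymptotic level 0.05.
horizons = (0, 10, 100, 1000)
bounds = [mse_bound(4.0, 0.1, 0.05, n) for n in horizons]
for n, b in zip(horizons, bounds):
    print(n, b)
```

The bound starts at the initial mean-square error and decreases geometrically toward $\psi$, which therefore plays the role of the asymptotic accuracy level.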

$${Y}_{n}=\frac{1}{2\beta}\mathcal{K}\left({\Delta}_{n}\right)({y}_{n}^{+}-{y}_{n}^{-})$$

For algorithm (11) we have
$$\mathbb{E}\left\{{Y}_{n}\right\}=\mathbb{E}\left\{{\mathcal{K}}_{n}\left({\Delta}_{n}\right)\frac{{y}_{n}^{+}-{y}_{n}^{-}}{2\beta}\,\middle|\,{\mathcal{F}}_{n-1}\right\},$$
where the right-hand side depends only on ${\widehat{\theta}}_{n}$ and n, in the sense that ${\Delta}_{n}$ does not depend on any other random variables. The proposition is true.

The proposition is valid due to the choice of Lyapunov function (14).

First consider $\mathbb{E}\left\{{Y}_{n}\right\}$. Due to (11), the pseudo-gradient (15), after using Assumption 5, becomes
$$\mathbb{E}\left\{{Y}_{n}\right\}=\mathbb{E}\left\{{\mathcal{K}}_{n}\left({\Delta}_{n}\right)\frac{1}{2\beta}\left({f}_{{s}_{k}}\left({\theta}_{n}^{+}\right)-{f}_{{s}_{k}}\left({\theta}_{n}^{-}\right)\right)\,\middle|\,{\mathcal{F}}_{n-1}\right\}.$$
Consider the expression under the mathematical expectation. Using the Taylor series representation, it can be written as
$$\frac{1}{2}\nabla {F}_{n}\left({\widehat{\theta}}_{n}\right)+\frac{1}{2\beta}\int {\mathcal{K}}_{n}\left(x\right){x}^{T}{\int}_{0}^{1}\left(\nabla {f}_{{s}_{k}}({\widehat{\theta}}_{n}+t\beta x)-\nabla {f}_{{s}_{k}}\left({\widehat{\theta}}_{n}\right)\right)dt\,{P}_{n}\left(dx\right)+\dots $$
Estimate the absolute value of the sum of the integral terms in the obtained expression. Using (9) and Assumption 4, we get
$$\left|\int (\cdot){P}_{n}\left(dx\right)\right|+\left|\int (\cdot){P}_{n}\left(dx\right)\right|\le \frac{2{\beta}^{2}}{2\beta}\int \parallel {\mathcal{K}}_{n}\left(x\right)\parallel \parallel x\parallel M\parallel x\parallel {P}_{n}\left(dx\right)\le {C}_{1}\beta .$$
Substitute this estimate, the terms containing gradients of ${F}_{n}$, and (14) into (16), and take into account the relation $\int (\cdot){P}_{n}\left(dx\right)\ge -|\int (\cdot){P}_{n}\left(dx\right)|$ and the Cauchy–Bunyakovsky–Schwarz inequality:
$$\langle {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}},\mathbb{E}\left\{{Y}_{n}\right\}\rangle \ge \langle {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}},\nabla {F}_{n}\left({\widehat{\theta}}_{n}\right)\rangle -\parallel {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\parallel {C}_{1}\beta .$$
Apply Assumption 2 and estimate $\parallel {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\parallel {C}_{1}\beta \le \frac{1}{2}(\parallel {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}{\parallel}^{2}+{\left({C}_{1}\beta \right)}^{2}):$
$$\langle {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}},\mathbb{E}\left\{{Y}_{n}\right\}\rangle \ge (2\mu -1)\frac{1}{2}{\parallel {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\parallel}^{2}-\frac{1}{2}{C}_{1}^{2}{\beta}^{2}$$
Proposition is true for $\mu >1/2$.
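The last step can be spelled out as follows. Writing $a=\parallel {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\parallel $ and $b={C}_{1}\beta $, and assuming that Assumption 2 is applied with $\nabla {F}_{n}$ in place of ${\mathbb{E}}_{{\mathcal{F}}_{n-1}}\nabla {f}_{{s}_{k}}$, the elementary inequality $ab\le \frac{1}{2}({a}^{2}+{b}^{2})$ gives

$$\langle {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}},\mathbb{E}\left\{{Y}_{n}\right\}\rangle \ge \mu {a}^{2}-ab\ge \mu {a}^{2}-\frac{1}{2}{a}^{2}-\frac{1}{2}{b}^{2}=\frac{2\mu -1}{2}{\parallel {\widehat{\theta}}_{n}-{\theta}_{{s}_{k}}\parallel}^{2}-\frac{1}{2}{C}_{1}^{2}{\beta}^{2}.$$

The coefficient of the quadratic term is positive exactly when $\mu >1/2$, which is where this restriction comes from.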

Using (10), the Cauchy–Bunyakovsky–Schwarz inequality, and Assumptions 1 and 3, it can be shown that
$$\begin{array}{rl}\mathbb{E}\{\parallel {Y}_{n}{\parallel}^{2}\}& \le \frac{1}{2}\underset{x}{\sup}\,{\mathcal{K}}_{n}{\left(x\right)}^{2}\,\mathbb{E}\{{\left({\xi}_{n}^{+}\right)}^{2}+{\left({\xi}_{n}^{-}\right)}^{2}|{\mathcal{F}}_{n-1}\}+\\ & +\int {\left(\cdot\right)}^{2}{\parallel {\mathcal{K}}_{n}\left(x\right)\parallel}^{2}{\mathrm{P}}_{n}\left(dx\right)\le {C}_{2}{\beta}^{2}\parallel {\widehat{\theta}}_{n-1}-{\theta}_{{s}_{k}}{\parallel}^{2}+{C}_{3}{\beta}^{4}+{C}_{4}{\xi}_{n}^{2}.\end{array}$$
Proposition holds true with $\tau =2{C}_{2}{\beta}^{2}$, ${\sigma}_{n}^{2}={C}_{3}{\beta}^{4}+{C}_{4}{\xi}_{n}^{2}$.

The proposition is valid due to the arbitrary choice of the initial approximation ${\widehat{\theta}}_{0}$ and the assumption of a finite order ${s}_{k}$ of the external disturbance W affecting the system, and thus a finite order of the system state vector.

The first inequality could be met by choice of α and the second one is true since ν is constant. Proposition is true.

The fulfillment of the given propositions allows the theorem to be proved using the result in [10]. □

After the algorithm converges the parameter estimates ${\widehat{\theta}}_{n}$ continue to fluctuate around the true parameter value ${\theta}_{{s}_{k}}$ until the new change of system structure and value of ${\theta}_{{s}_{k}}$.
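This behavior can be illustrated by a small simulation. It is a sketch under stated assumptions, not the experiment from the paper: the quadratic loss, the two parameter values, the switching instant `switch`, the bounded non-random noise and the step parameters are all hypothetical, and the kernel is taken as $\mathcal{K}(\Delta)=\Delta$.

```python
import numpy as np

def spsa_track(target_at, noise_at, theta0, alpha, beta, n_iter, rng):
    """Two-measurement randomized minimization; returns the error norm per step."""
    theta = np.asarray(theta0, dtype=float)
    errors = np.empty(n_iter)
    for n in range(n_iter):
        target = target_at(n)  # current minimizer (changes with the structure)
        loss = lambda th: float(np.sum((th - target) ** 2))
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        y_plus = loss(theta + beta * delta) + noise_at(2 * n)
        y_minus = loss(theta - beta * delta) + noise_at(2 * n + 1)
        theta = theta - alpha * delta * (y_plus - y_minus) / (2.0 * beta)
        errors[n] = np.linalg.norm(theta - target)
    return errors

rng = np.random.default_rng(1)
switch = 500  # hypothetical instant at which the structure changes
targets = (np.array([1.0, -1.0]), np.array([-0.5, 2.0]))

errors = spsa_track(
    target_at=lambda n: targets[0] if n < switch else targets[1],
    noise_at=lambda k: 0.1 * np.sign(np.sin(k)),  # bounded by 0.1, deterministic
    theta0=np.zeros(2), alpha=0.1, beta=0.2, n_iter=1000, rng=rng,
)
print(errors[switch - 1], errors[switch], errors[-1])
```

In this sketch the error stays small before the switch, jumps when the minimizer changes, and then decays again to a level determined by the noise bound.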

The most common problem setup for unknown-but-bounded disturbances in existing works is formulated as follows. It is required to minimize the value of the objective function $f\left(\theta \right)$ observed with some adversarial deterministic noise $\xi \left(\theta \right)$ such that $\left|\xi \left(\theta \right)\right|\le \xi $, $\xi >0$:
$$\tilde{f}\left(\theta \right)=f\left(\theta \right)+\xi \left(\theta \right).$$
The setup considered here implies that the external noise depends on the system state ${s}_{k}$, which is not a direct function of θ but is rather affected by the values of ${\theta}_{t}$ during some time period $[t-d,t)$:
$$\tilde{f}\left(\theta \right)=f\left(\theta \right)+\xi \left({s}_{k}\right).$$
The value ${s}_{k}$ is formed on the basis of previous values of θ.

For astronomers, an urgent task was to obtain images of objects that cannot be recorded using optical telescopes situated on Earth or in space. This problem was largely solved using radio telescopes [11]. The main tasks of the telescope are to collect the radiation that falls on the mirror system with minimal losses and to obtain the most accurate image of the object [6]. There are various ways to solve this problem; see, for example, [8]. One of them is improving the quality of the device that collects radiation for obtaining images [12]. Another is combining radio telescopes into systems [9]. If the radiation is collected with significant errors, then the image will have disturbances [27].

An important and time-consuming part of image acquisition is the precise tuning of the radio telescope antenna (or systems consisting of such antennas) [13]. The quality of the image obtained on a radio telescope directly depends on the quality of the construction of a reflecting system of mirrors that focuses the radiation coming from outside. To improve the image quality, it is necessary to focus the radiation of the device in such a way that it works as accurately as possible, especially if it is located in space [14]. Traditional antenna tuning algorithms are sufficient for the task. However, they lose their effectiveness under uncontrolled unpredictable external influences. We consider the case when these are deformations of the radio telescope shields that arise due to environmental influences such as temperature changes, wind and other influences.

In practice, radiation is subject to various distorting influences, and as a result the quality of the observed image decreases, despite the presence of various stabilizers and filters [11]. One of the ways to solve such a problem is using randomized stochastic optimization algorithms [15]. A method for improving image quality by improving the tuning characteristics of the radio telescope mirror system is considered in [25].

In an ideal antenna system, the signal is reflected from different mirror plates and assembled at one point. Deformations of radio telescope structures, external temperature, wind, mechanical influences lead to deviation of the optical path lengths of the rays from the required ones. As a result, the focus point on the plane of the receiver shifts [12]. To improve reflection accuracy on the surface of radio telescope mirrors, the following methods are used: autocollimation [22], telescope calibration by spectral density radiation flux, synchronous calibration method [21], laser geodetic measurements [6], improvement of the kinematics of antenna elements [19], radio holography physical method [24]. The goal is to develop a stochastic optimization type algorithm to improve the quality of the settings of the radio telescope mirror system model which could be used in a system similar to Radioastron [20,26]. The main criterion for efficiency is the recording power of the desired signal and the time required for adjusting the parameters.

The antenna segments can be set to the optimal position to improve the quality of image recording. Consider an irradiator (radiation generator), a receiver and a mirror system of a radio telescope consisting of identical plates that reflect the incoming signal ($i=1,\dots ,N$; $N=895$ in the RATAN-600 installation). Radiation is created in the irradiator, falls on the plates and is focused in the receiver. Let us divide the time axis into intervals of duration $\delta $, starting from some moment ${t}_{0}$, where ${T}_{k}$ is the k-th time interval and k is the index of the time interval. Assume that we know:

1) the position (orientation) of each i-th plate, which is specified by the vector of parameters ${({a}_{i},{b}_{i},{c}_{i})}^{T}$, where ${a}_{i}$ is the rotation angle of the i-th reflective element horizontally; ${b}_{i}$ is the vertical rotation angle, ${c}_{i}$ is the forward horizontal displacement of the i-th reflecting element. Let ${\theta}_{k}$ be a vector that contains all parameters of the mirror system in a given time interval k, ${\theta}_{k}={({a}_{1},{a}_{2},...,{a}_{N},{b}_{1},{b}_{2},...,{b}_{N},{c}_{1},{c}_{2},...,{c}_{N})}^{T},$

2) radiation coming from each mirror (${z}_{i}$),

3) the common signal coming from all mirrors to the receiver $Z\left({\theta}_{k}\right)={\sum}_{i=1,\dots ,N}\left({z}_{i}\right)$,

4) characteristics of the signal in the generator.

The front of a signal is the sum of harmonics that comes to us with different phases from a certain direction. The perfect placement of the plates brings all radiation from the objects into focus. We obtain the signal as a sum of sines with phases. Different ${z}_{i}$ arrive at different times with different phases. Let us evaluate the difference between the signals from an ideal antenna and from a real one (with deformations). Signals reflected from ideal mirror segments will have the same phase ${\varphi}_{i}$ (${\varphi}_{1}={\varphi}_{2}=\cdots ={\varphi}_{N}$). The signals from segments with deformations will look like this:
$${z}_{1}=\sin (\omega t+{\varphi}_{1}),\quad {z}_{2}=\sin (\omega t+{\varphi}_{2}),\quad \dots ,\quad {z}_{i}=\sin (\omega t+{\varphi}_{i}).$$
The objective function of the problem ($F\left(\theta \right)$ is the signal power) is defined as follows:
$$F\left(\theta \right)=\overline{\lim_{k}}{\int}_{t\in {T}_{k}}{\left|Z(\theta ,t)\right|}^{2}dt.$$
It is required to maximize the objective function:

$$F\left(\theta \right)\to \underset{\theta}{max}$$
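The effect of phase mismatch on the received power can be illustrated numerically. The following is a minimal sketch, not the authors' model: it assumes unit-amplitude harmonics, a single interval $T_k$ whose length $\delta$ equals one period of the carrier, and hypothetical helper names (`received_signal`, `signal_power`, `closed_form_power`). It integrates ${\left|Z\right|}^{2}$ over the interval and compares the result with the closed form $\frac{\delta}{2}{\left|{\sum}_{i}{e}^{\mathrm{i}{\varphi}_{i}}\right|}^{2}$, which holds when $\delta$ is a whole number of periods.

```python
import cmath
import math


def received_signal(phases, omega, t):
    """Z(t): sum of unit-amplitude harmonics z_i = sin(omega*t + phi_i)."""
    return sum(math.sin(omega * t + phi) for phi in phases)


def signal_power(phases, omega, t_start, delta, n_steps=2000):
    """Midpoint-rule approximation of the integral of |Z|^2 over one interval."""
    h = delta / n_steps
    total = 0.0
    for j in range(n_steps):
        t = t_start + (j + 0.5) * h
        total += received_signal(phases, omega, t) ** 2
    return total * h


def closed_form_power(phases, delta):
    """Exact value when delta is a whole number of carrier periods."""
    s = sum(cmath.exp(1j * phi) for phi in phases)
    return 0.5 * delta * abs(s) ** 2


# Illustrative values only (assumptions, not taken from the paper).
OMEGA = 2.0 * math.pi   # carrier frequency, so one period is 1.0
DELTA = 1.0             # interval length: exactly one period
ideal = [0.3] * 8                              # all segments in phase
deformed = [0.3 + 0.4 * i for i in range(8)]   # phases spread by deformations

power_ideal = signal_power(ideal, OMEGA, 0.0, DELTA)
power_deformed = signal_power(deformed, OMEGA, 0.0, DELTA)
print(power_ideal, power_deformed)
```

With equal phases the harmonics add coherently and the power reaches its largest value for the given number of segments, while any spread of phases lowers it. This is the quantity the tuning procedure tries to increase.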

We consider the problem of optimizing the position of mirrors in the limit over time (not over a specific time interval).

The reflective elements of the antenna are made exactly the same, so they provide equivalent observations in all directions.

At the same time, if one moves along an ideal reflective surface, its local characteristics change. Therefore, a real reflective surface composed of identical elements will repeat deviations from the ideal surface from element to element. These deviations will be greater, if the shape of the surface of the element differs from the shape of that portion of the ideal surface which this element should represent. If the size of the reflective elements increases, then the deviations naturally increase. These deviations are an error distributed over the reflector of a variable profile antenna, which, at large values, creates unacceptable distortions reducing the efficiency of the antenna. We will call ${v}_{k}$ the signal power measurement errors arising due to unknown and uncontrolled deformations in the reflective elements of the antenna caused by weather, wind, and temperature changes.

After conventional procedures for adjusting the inclination angles and positions of the radio telescope mirrors, we obtain an initial approximation ${\widehat{\theta}}_{0}$ to the optimal tuning values. There is still the possibility of “tuning” in a certain neighborhood $\mathcal{T}$ containing ${\widehat{\theta}}_{0}$ and the optimal value ${\theta}^{\star}$ corresponding to the maximum power of the received signal. We use a stochastic optimization algorithm with two measurements per iteration, which allows us to reduce the negative impact of various disturbances on the power of the recorded signal [7]:

1) Select ${\theta}_{0}$.

2) $n-1\to n$.

3) Generate the vector ${\Delta}_{n}$ with components equal to $\pm 1$, which are chosen with equal probability.

4) Measure the power values for two positions of the antenna system, $({\theta}_{n}+\beta {\Delta}_{n})$ and $({\theta}_{n}-\beta {\Delta}_{n})$. Measurements are obtained with noise ${v}_{2n}$ and ${v}_{2n-1}$:

$${y}_{2n}={P}_{2n}({\theta}_{2n}+\beta {\Delta}_{n})+{v}_{2n};$$

$${y}_{2n-1}={P}_{2n-1}({\theta}_{2n-1}-\beta {\Delta}_{n})+{v}_{2n-1}.$$

5) Form the next estimate ${\theta}_{2n}$ according to the rule:

$${\theta}_{2n}={\mathcal{P}}_{\mathcal{T}}\left({\theta}_{2n-1}+\frac{\alpha {K}_{n}\left({\Delta}_{n}\right)}{2\beta}({y}_{2n}-{y}_{2n-1})\right),$$

where $\alpha ,\beta $ are the parameters of the algorithm and ${\mathcal{P}}_{\mathcal{T}}$ is the projection onto the set $\mathcal{T}$.
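The two-measurement procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the kernel is taken as $K_n(\Delta_n)=\Delta_n$, both measurements are made around the same current estimate, the set $\mathcal{T}$ is a box, the power function is a hypothetical concave quadratic `toy_power`, and the unknown-but-bounded noise is imitated by a bounded pseudo-random sequence. All function and constant names are invented for the example.

```python
import math
import random


def project_box(theta, lo, hi):
    """Projection onto the box [lo, hi]^d (stand-in for the set T)."""
    return [min(max(t, lo), hi) for t in theta]


def two_measurement_search(power, theta0, alpha, beta, n_iter,
                           lo, hi, noise_bound, seed=0):
    """Randomized search that uses two noisy power measurements per iteration."""
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(n_iter):
        # Test perturbation: components +1 or -1 with equal probability.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + beta * d for t, d in zip(theta, delta)]
        minus = [t - beta * d for t, d in zip(theta, delta)]
        # Bounded measurement noise (assumption: uniform on [-bound, bound]).
        y_plus = power(plus) + rng.uniform(-noise_bound, noise_bound)
        y_minus = power(minus) + rng.uniform(-noise_bound, noise_bound)
        # Ascent step along delta (assumed kernel K(delta) = delta), then projection.
        gain = alpha * (y_plus - y_minus) / (2.0 * beta)
        theta = project_box([t + gain * d for t, d in zip(theta, delta)], lo, hi)
    return theta


# Hypothetical toy problem: power is maximal at THETA_STAR.
THETA_STAR = [0.1, 0.2, -0.1]


def toy_power(theta):
    return 1.0 - sum((t - s) ** 2 for t, s in zip(theta, THETA_STAR))


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


THETA_0 = [0.5, -0.4, 0.3]
theta_hat = two_measurement_search(toy_power, THETA_0, alpha=0.05, beta=0.1,
                                   n_iter=1000, lo=-1.0, hi=1.0,
                                   noise_bound=0.01)
print(distance(THETA_0, THETA_STAR), distance(theta_hat, THETA_STAR))
```

With constant $\alpha$ and $\beta$ the estimate does not settle exactly at the maximizer: it stays in a small neighborhood whose size depends on the noise bound and on the ratio $\alpha /\beta$.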

To construct the kernels ${K}_{0}(\cdot )$ and ${K}_{1}(\cdot )$ on the interval $[-1/2,1/2]$, orthogonal Legendre polynomials could be used. In this case, for initial values $\ell =1,2$ (i.e. $2\le \gamma \le 3$) the kernels are as follows:

$${K}_{0}\left(q\right)=12q,\quad {K}_{1}\left(q\right)=1,\quad \mid q\mid \le 1/2,$$

for $\ell =3,4$ (i.e. $3<\gamma \le 5$):

$${K}_{0}\left(q\right)=5q(15-84{q}^{2}),\quad {K}_{1}\left(q\right)=9/4-15{q}^{2},\quad \mid q\mid \le 1/2,$$

and for $\left|q\right|>1/2$ both functions are equal to zero.
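The kernels listed above can be checked with exact rational arithmetic. The sketch below evaluates the moments ${\int}_{-1/2}^{1/2}{q}^{m}K\left(q\right)dq$ of each polynomial. The conditions being checked (first moment of ${K}_{0}$ equal to one, zeroth moment of ${K}_{1}$ equal to one, and vanishing higher moments for the $\ell =3,4$ pair) are the normalization and orthogonality properties usually required of such kernels; they are not stated in this section and are an assumption here. The names `KERNELS` and `moment` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Polynomial coefficients in increasing powers of q, valid for |q| <= 1/2.
KERNELS = {
    "l=1,2": {"K0": [0, 12], "K1": [1]},
    # 5q(15 - 84 q^2) = 75 q - 420 q^3
    "l=3,4": {"K0": [0, 75, 0, -420], "K1": [Fraction(9, 4), 0, -15]},
}


def moment(coeffs, power):
    """Exact integral of q**power * K(q) over [-1/2, 1/2]."""
    total = Fraction(0)
    for degree, c in enumerate(coeffs):
        m = degree + power + 1  # exponent of the antiderivative term
        total += Fraction(c) * (HALF ** m - (-HALF) ** m) / m
    return total


for name, pair in KERNELS.items():
    print(name,
          "first moment of K0:", moment(pair["K0"], 1),
          "zeroth moment of K1:", moment(pair["K1"], 0))
```

Using `Fraction` avoids rounding, so the cancellations are verified exactly rather than up to floating-point tolerance.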

The test disturbance is formed in such a way that for all $n\ge 1$ the random vector ${\Delta}_{n}$ does not depend on ${\overline{v}}_{1},\dots ,{\overline{v}}_{k}$ and $\mathrm{E}\{{({v}_{2n}-{v}_{2n-1})}^{2}/2\}\le {\sigma}_{2}^{2}$ $(\mathrm{E}\left\{{v}_{n}^{2}\right\}\le {\sigma}_{1}^{2})$.

The main element that reflects the incoming signal is the segment of the mirror system. It is important to configure these shields so that the signal comes into focus. The system makes it possible to customize the required dimensions of the reflective shield and its curvature. Stochastic optimization algorithm (11) can be applied to tuning a system that consists of a large number of telescopes in space under conditions of interference and also when individual elements of the system are deformed. About 3000 parameters of the mirror surfaces are adjusted during focusing of the mirror system. Faster setup of the mirrors is very important for the accuracy of observations. For acceleration, it is proposed to use the Nesterov acceleration method in distributed form [16,17,18]. Figure 3 shows the dependence of the signal power on the number of iterations of the algorithm.

Conceptualization, O.G.; methodology, O.G.; software, K.D.; formal analysis, Y.I.; writing—original draft preparation, K.D. and Y.I.; writing—review and editing, O.G. and Y.I.; visualization, K.D. All authors have read and agreed to the published version of the manuscript.

The work was supported by the IPME RAS by Russian Science Foundation (project no. 21-19-00516).

The authors declare no conflict of interest.

- Granichin, O.; Uzhva, D.; Volkovich, Z. Cluster Flows and Multiagent Technology. Mathematics **2021**, 9, 22.
- Granichin, O. What is an Actual Structure of Complex Information-Control Systems? Stochastic optimization in informatics. 2016. Is. 12, No 1.
- Parilina, E.; Petrosyan, L. On a Simplified Method of Defining Characteristic Function in Stochastic Games. Mathematics **2020**, 8, 1135.
- Achdou, Y., Cardaliaguet, P., Delarue, F., Porretta, A. and Santambrogio, F., 2021. Mean Field Games: Cetraro, Italy 2019 (Vol. 2281). Springer Nature.
- Granichin, O. and Amelina, N., 2014. Simultaneous perturbation stochastic approximation for tracking under unknown but bounded disturbances. IEEE Transactions on Automatic Control, 60(6), pp.1653-1658.
- Ermakov A.N., Kovalev Yu.A. Project “RadioAstron”. Calibration of a space telescope in flight - automation of measurement processing of the ASC FIAN, Moscow, Russia, Proceedings of the Institute of Applied Astronomy of the Russian Academy of Sciences, vol. 54. 2020.
- Granichin, O.N., Polyak B.T. Randomized estimation and optimization algorithms under almost arbitrary noise. M.: Nauka, 2003. 291 p.
- Droszcz, A., Jędrzejewski, K., Kłos, J., Kulpa, K., Pozoga M. Beamforming of LOFAR Radio-Telescope for Passive Radiolocation Purposes. Remote Sens. **2021**, 13, 810.
- Gurvits, L.I. Space VLBI: from first ideas to operational missions. Advances in Space Research, Volume 65, Issue 2, 15 January 2020, Pages 868-876.
- Polyak, B.T. Convergence and rate of convergence in iterative stochastic processes. I. The general case. 1976. Avtomatika i telemekhanika, No. 12, pp.83-94.
- Kardashev, N.S. and others. “RADIOASTRON”: results of the implementation of the scientific research program for 5 years of flight // Bulletin of NPO im. S.A. Lavochkina. 2016. No. 3. P. 4-24.
- Dubarenko, V.V., Kuchmin A.Yu., Artemenko Yu.N., Shishlakov V.F. Millimeter-wave radio telescopes with adjustable mirror surfaces: monograph. – St. Petersburg: GUAP, 2019.
- Monakhova, U.V., Ivanov D.S. Formation of a swarm of nanosatellites using decentralized aerodynamic control taking into account communication restrictions // Preprints of the Institute for Problems of Materials Science. M.V.Keldysh. 2018. No. 151. 32 p.
- Bentum, M.J., Verma M.K., Rajan R.T., Boonstra A.J., Verhoeven C.J.M., Gill E.K.A., van der Veen A.J., Falcke H., Klein Wolt M., Monna B., Engelen S., Rotteveel J., Gurvits L.I. A Roadmap towards a Space-based Radio Telescope for Ultra-Low Frequency Radio Astronomy.
- Granichin, O.N., Erofeeva V.A., Ivanskiy Y.V., Jiang Y. Simultaneous Perturbation Stochastic Approximation-Based Consensus for Tracking Under Unknown-But-Bounded Disturbances. IEEE Transactions on Automatic Control 2021, Vol. 66, Is. 8, pp. 3710–3717.
- Rogozin, A., Yarmoshik D., Kopylova K. Gasnikov A. P: Decentralized Strongly-Convex Optimization with Affine Constraints: Primal and Dual Approaches 2022.
- Nesterov, Y. A method of solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady, 27(2):372–376, 1983.
- Nesterov, Y. Introductory Lectures on Convex Optimization: a basic course. Kluwer Academic Publishers, Massachusetts, 2004.
- Zharov, V.I., Sotnikova Yu.V. Methodology for determining the kinematic characteristics of the elements of the main mirror of the RATAN-600 radio telescope using modern laser measuring systems. Astrophysical Bulletin, 2017. Vol. 72, No. 4, pp. 520–526.
- Moisheev, A.A. Creation of space segments of astrophysical observatories. Bulletin 2.
- Sotnikova, Yu.V., Kovalev Yu.A., Erkenov A.K. Method of synchronous calibration of RATAN-600 using 2 of its sectors. Astrophysical Bulletin, 2019. Vol. 74, No. 4, pp. 535–543.
- Khaikin, V.B., Bursov N.N. Autocollimation automatic adjustment and control of the efficiency of elements of the RATAN-600 radio telescope. Journal of Radioelectronics, 2016, pp. 1684-1719.
- Mingaliev, M.G. RATAN-600 - current state and prospects. Abstracts of the report. Conf. RT-2002, in Pushchino, 2002. P. 80.
- Khaikin, V.B., Lebedev M.K., Ripak A.M. A method for radio-holographic monitoring of the surface of the main mirror of the RATAN-600 radio telescope with radial movement of the support element. Journal of Radioelectronics, 2016, pp. 1684-1719.
- Kopylova, K.D., Granichin O.N. Minimizing the systematic error of a radio astronomy telescope using a randomized stochastic optimization algorithm. Proceedings of the 13th multi-conference on management issues, 2020.
- Kovalev, Yu.A., Sotnikova Yu.V., Erkenov A.K., Popkov A.V., Volvach L.N., Vasilkov V.I., Lisakov M.M., Semenova T.A., Tsybulev P.G. Features of calibration of the space radio telescope "RadioAstron" and the radio telescope RATAN-600. // Proceedings of the IPA RAS. – St. Petersburg: IPA RAS, 2018. Issue 47, pp. 38-42.
- Minchenko, B.S. Synthesis of radio images on the RATAN-600 radio telescope. Izv. universities Radiophysics. 1983. T. 26, No. 11. P. 1463–1471.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Dynamic System Identification via Randomized Stochastic Optimization Under Unknown-but-Bounded Noise. Oleg Granichin et al., 2023.
