Statistical Models in Comparative Optinalysis through Induced Systematic Skewization Mechanisms

The effect of the sensitivity points (sequence order and position of every element) of sequences on comparative optinalysis was studied under two proposed mechanisms, paranodic and synodic skewization, to develop a modeling approach to the comparative optinalysis of sequences. The results show that the outcomes of comparative optinalysis (similarity measurement) in a set of paranodically skewed sequences can be modeled deterministically by suitable regression line functions. The sensitivity points (nodes) of a sequence display two important and distinct zones, the K-zones and the B-zones. Continuous paranodic skewization within these zones (KB-zones) operates within the probability space, but at the hyperskewization level, and in the K-zones only, the outcomes of comparative optinalysis operate outside the probability space. Moreover, the outcomes of comparative optinalysis under synodic skewization can be modeled deterministically by some regression line functions, but a general regression function cannot be identified. Within a certain limit of the skewization value space, following paranodic and synodic skewization, the outcomes of comparative optinalysis for the left-sided sequence form a pattern similar to that of the right-sided sequence.


Introduction
Statistical functions, such as power, exponential, linear, polynomial, and logarithmic functions, play a significant role in modeling and predicting real-life natural laws and phenomena. They have been used in real-life applications in biology, astronomy, language, economics, demography, computing, information theory, and other fields (Newcomb, 1881; Shestopaloff, 2010; Pinto et al., 2012).
Comparative Optinalysis is a computational algorithm that intermetrically (between corresponding elements) measures a level of similarity between two sequences as a mirror-like reflection of each other (optics-like manner). Comparative optinalysis may involve preserving the inherent sequence order of all elements of a data set, or re-designing/shaping the sequence order based on empirical assumptions (Abdullahi, 2019).
The magnitudes of the elements of a sequence can be altered (skewed) uniformly across all elements of the sequence, or at one or more selected elements, by certain magnitudes called skewization values. These approaches to skewness introduce a certain degree of dissimilarity relative to the reference sequence and thus yield different outcomes of comparative optinalysis. Systematic skewness induced by just a unit value across a sequence can lead to a large range of possible outcomes of comparative optinalysis with the original sequence. It is therefore important to develop mechanisms that simplify the multiple approaches to comparative computation and to model the pattern of relationships among the variables and parameters through statistical modeling. Generating all the possible alterations (skewness) of a given or defined sequence, together with the modeled outcomes of their comparisons, is the central goal of induced systematic skewization mechanisms.
In this article, two mechanisms, paranodic and synodic skewization, are proposed to predict the relative contribution of each sensitivity point in comparative optinalysis, the possible outcomes, and the regression line pattern they form under different assumptions. The outcomes of comparative optinalysis under paranodic and synodic skewization were studied, and appropriate models were established.

Contextual Definitions of Terms Used
2.1.1. Skewization: the process of deliberate or induced alteration(s) in the distribution of sequence elements or variables. It is also defined as a mechanism for moving (changing) a modeled relationship among a set of organized variables or elements and parameters of a sequence from one regression function (model) to another. The induced alteration (skewization) can be paranodic or synodic.
2.1.2. Paranodic skewization: refers to a position-specific skewization (deviation or alteration) in a sequence. It occurs in a systematic and progressive manner at any individual, specific node's element (sequence element) of a given sequence. Each sequence derived from paranodic skewization is called a successive generation. Below is a further clarification.
ii. Additive: a certain magnitude of skewization value is added to the specific variable or element.
iii. Subtractive: a certain magnitude of skewization value is subtracted from the specific variable or element.
iv. Insertive: another variable or element is inserted immediately before or after the specific variable or element.
v. Deletive: the specific variable or element is completely deleted from the sequence.
2.1.3. Synodic skewization: refers to a skewization of the elements of all existing nodes (sequence elements) of a given sequence; as such, it is not position-specific. Below is a further clarification. Suppose we have a uniform (similar) sequence of elements as:
Sequence G0 = 10 1 2 , 10 2 2 , 10 3 2 , 10 4 2 , 10 5 2 , 10 6 2 , 10 7 2 … … … … . . 10
A synodic skewization of sequence G0 is defined as:
Sequence G1 = 10 1 , 10 2 , 10 3 , 10 4 , 10 5 , 10 6 , 10 7 … … … … . . 10
Where 10 = all positive or negative natural numbers.
As with paranodic skewization, synodic skewization can be substitutive, additive, subtractive, insertive, or deletive.
2.1.4. Skewization factor: the skewization factor tells us how close or far a given sequence is skewed as compared to its reflective pair in an optinalytic manner. It can range from extremely low, below the reference sequence, to extremely high, above the reference sequence. It tells us the magnitude and direction of the imbalance (skewness) of the orientation of a given sequence's elements relative to its reflective (i.e., comparing) sequence. It is expressed as the ratio of the sum of the unpaired scalements of the reflector-sided sequence to the sum of the unpaired scalements of the query-sided (mirror) sequence. In this case, each sequence (query or reflector) has its own separately calculated scalements, but not in their paired strand. It is expressed by the equation as follows:
Skewization factor = (sum of unpaired scalements of the reflector-sided sequence) / (sum of unpaired scalements of the query-sided sequence)
Example: Let ( ) and ( ) be the elements of the reflector sided and query sided sequence QS-Unit ( )
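The additive forms of paranodic and synodic skewization defined in 2.1.2 and 2.1.3 can be illustrated with a minimal Python sketch. The function names, the 0-based node index, and the rule that generation g adds g times a fixed step are assumptions made for illustration only; they are not taken from the supplementary files.

```python
def paranodic_additive(sequence, node, value):
    """Add a skewization value at one node only (position-specific).

    `node` is a 0-based index; this indexing convention is an assumption.
    """
    skewed = list(sequence)
    skewed[node] += value
    return skewed


def synodic_additive(sequence, value):
    """Add the same skewization value at every node (not position-specific)."""
    return [element + value for element in sequence]


def paranodic_generations(sequence, node, step, count):
    """Build `count` generations, each skewed progressively at the same node.

    Assumption: generation g (starting at 1) adds g * step to the chosen node.
    """
    return [paranodic_additive(sequence, node, step * g)
            for g in range(1, count + 1)]


# Hypothetical uniform query sequence and its skewed reflector sequences.
query = [10, 10, 10]
print(paranodic_additive(query, 0, 5))
print(synodic_additive(query, 5))
print(paranodic_generations(query, 1, 2, 3))
```

Each returned list would play the role of a reflector sequence to be compared against the unchanged query sequence.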

2.1.5. Skewization value: the actual magnitude used to induce alteration paranodically or synodically in a given sequence. It is expressed as the difference between the magnitude of the sequence element(s) after skewization (i.e., the reflector value) and the magnitude of the sequence element(s) before skewization (i.e., the query value).
2.1.6. Skewization space: refers to the considered range (minimal and maximal) of the skewization value or factor.
2.1.7. Query (mirror) value: refers to the actual value of the query element or variable of a query sequence. A set of query values forms a query sequence.
2.1.8. Reflector value: refers to the actual value of the reflector element or variable of a reflector sequence. A set of reflector values forms a reflector sequence.
2.1.9. Probability space: comparative optinalysis is said to be within a probability space if the calculated Kabirian coefficients of similarity (Kc) can be accurately translated into valid probabilities and percentages. Values of Kc from 0.666667 to 2 (0.666667 ≤ Kc ≤ 2) translate into positive probabilities or percentages. The outcomes of the similarity function (Kc) within a probability space construct probability models.
2.1.10. Non-probability space: comparative optinalysis is said to be within a non-probability space if the calculated Kabirian coefficients of similarity cannot be translated into valid probabilities (percentages). Values of Kc < 0.666667 or Kc > 2 do not translate into valid probabilities or percentages. The outcomes of the similarity function (Kc) within a non-probability space construct non-probability models.
2.1.11. Sensitivity points: according to Abdullahi (2019a), a sensitivity point is any node that, when considered a variable, can exert a certain degree of imbalance in the distribution of elements (variables) about a dividing line or plane. Each node has its own characteristic sensitivity, which increases away from the central node(s) and decreases towards the central node(s).
The sensitivity of a point generally decreases as the number of sequence elements increases. Figure 14 is an illustrative example.
The nodes with components R1D3C1 and R7D3C1 are the most sensitive points of the upper and lower stems, respectively. The node with components R4D0C1 is the central node.
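Two of the definitions above reduce to simple arithmetic and can be sketched in Python: the skewization value (2.1.5) as a difference, and the probability-space test (2.1.9 and 2.1.10) as a bounds check on Kc. The function and constant names are hypothetical; only the bounds 0.666667 and 2 come from the definitions.

```python
# Bounds of the probability space for the Kabirian coefficient (Kc),
# as stated in definition 2.1.9.
KC_LOWER = 0.666667
KC_UPPER = 2.0


def skewization_value(reflector_value, query_value):
    """Magnitude after skewization minus magnitude before skewization."""
    return reflector_value - query_value


def in_probability_space(kc):
    """True if Kc lies within the probability space, bounds included."""
    return KC_LOWER <= kc <= KC_UPPER


# Hypothetical values: an element skewed from 10 to 15, and two coefficients.
print(skewization_value(15, 10))
print(in_probability_space(1.0))
print(in_probability_space(2.5))
```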

Comparative optinalysis
Comparative optinalysis (Abdullahi, 2019a and 2019b) was performed in Excel sheets (see supplementary files 1, 2 and 3). Following comparative optinalysis, the sequence N0 (used as the query or mirror sequence) and the sequence M1 (used as the reflector sequence) from the synodic skewization mechanism were analyzed under the following arguments:
i. Let the reference sequence (N0) optinalytically reflect head-to-head (H-H) with the reflector sequence (M1), with a normalization of zero unit, such that the elements of sequence N0 are intermetrically similar to the elements of sequence M1, with a resultant Kabirian coefficient Kc. These results were considered the outcomes from the left-sided sequence (of the reflector sequence).
ii. Conversely, let the reference sequence (N0) optinalytically reflect head-to-head (H-H) with the reflector sequence (M1), with a normalization of zero unit, such that the elements of sequence M1 are intermetrically similar to the elements of sequence N0, with a resultant Kabirian coefficient Kc. These results were considered the outcomes from the right-sided sequence (of the reflector sequence).
The detailed numerical calculations are presented in the supplementary files attached to this article. Comparative optinalysis of the sequences generated from paranodic and synodic skewization is presented in supplementary files 1, 2 and 3.

Regression and Correlation Analysis
Regression analysis was performed using Excel sheet functions. The outcomes of comparative optinalysis (Kabirian coefficients) per unit skewization value from the paranodic skewization mechanism were plotted against their generation number, and the outcomes of comparative optinalysis (Kabirian coefficients) per unit skewization value from the synodic skewization mechanism were plotted against their reflector values. The regression line patterns, equations, and correlations were recorded. Five (5) regression functions were considered: linear, polynomial, exponential, logarithmic, and power. The results of the paranodic skewization mechanism are presented in Tables 5 and 6. The results of the synodic skewization mechanism are presented in Tables 3 and 4.
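The study itself used Excel; as a rough equivalent, the sketch below fits the five regression functions by least squares in plain Python and ranks them by the correlation between observed and fitted values. It rests on several assumptions: the polynomial is taken as degree 2, the exponential and power fits use log-linearization (so x and y must be positive), and the data are hypothetical rather than values from the supplementary files. All function names are illustrative.

```python
import math


def _ols(u, v):
    """Least-squares slope and intercept for v = slope * u + intercept."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    sxx = sum((a - mu) ** 2 for a in u)
    sxy = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    slope = sxy / sxx
    return slope, mv - slope * mu


def _pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    sxy = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    sxx = sum((a - mu) ** 2 for a in u)
    syy = sum((b - mv) ** 2 for b in v)
    return sxy / math.sqrt(sxx * syy)


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _quadratic(x, y):
    """Least-squares c2, c1, c0 for y = c2*x^2 + c1*x + c0 (Cramer's rule)."""
    s = [sum(a ** k for a in x) for k in range(5)]
    t = [sum(a ** k * b for a, b in zip(x, y)) for k in range(3)]
    m = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = _det3(m)
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(_det3(mc) / d)
    return coeffs


def correlations(x, y):
    """Correlation between observed y and the fitted values of each model."""
    ln_x = [math.log(a) for a in x]
    ln_y = [math.log(b) for b in y]
    fitted = {}
    s, i = _ols(x, y)
    fitted["linear"] = [s * a + i for a in x]
    c2, c1, c0 = _quadratic(x, y)
    fitted["polynomial"] = [c2 * a * a + c1 * a + c0 for a in x]
    s, i = _ols(x, ln_y)            # ln y = i + s * x
    fitted["exponential"] = [math.exp(i + s * a) for a in x]
    s, i = _ols(ln_x, y)            # y = i + s * ln x
    fitted["logarithmic"] = [s * la + i for la in ln_x]
    s, i = _ols(ln_x, ln_y)         # ln y = i + s * ln x
    fitted["power"] = [math.exp(i + s * la) for la in ln_x]
    return {name: _pearson(y, f) for name, f in fitted.items()}


def best_fit(x, y):
    """Return the model name with the highest correlation, plus all scores."""
    scores = correlations(x, y)
    return max(scores, key=scores.get), scores
```

Calling `best_fit(generations, coefficients)` on a list of generation numbers and the matching Kabirian coefficients returns the best-ranked function name and the full score table. When the data are exactly linear, the linear and degree-2 polynomial fits tie, so the ranking alone cannot separate them.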

Best Fit Line Selection
The best regression functions (best fit lines) were selected by the highest correlation coefficient. Tables A1 and A2 of the Appendix present the modeled patterns of the outcomes of the Kabirian coefficient of similarity in the different sided sequences and KB-zones, by some well-known regression lines. The results from the left-sided sequence indicate that the function with perfect correlation starts as linear/simple-polynomial/exponential, continues as simple-polynomial, shifts to logarithmic, and finally stagnates at power, over the skewization value spaces [10^-12 ± 100] to [99 ± 100], [10^3 ± 100] to [10^4 ± 100], [10^5 ± 100], and [10^6 ± 100] to [10^12 ± 100], respectively. The logarithmic pattern is, however, observed to serve as an intermediate function that separates the non-probability model (power) from the probability models (exponential, linear and polynomial).

Paranodic Skewization
From the right-sided sequence, the results indicate that the function with perfect correlation starts as linear/simple-polynomial/exponential, continues as simple-polynomial, and finally stagnates at complex-polynomial, over the skewization value spaces [10^-12 ± 100] to [99 ± 100], [10^3 ± 100] to [10^4 ± 100], and [10^5 ± 100] to [10^12 ± 100], respectively. Therefore, within the studied probability space, the linear, simple-polynomial, and exponential models are the only models constructed. The complex-polynomial, logarithmic, and power models belong neither wholly to the probability space nor wholly to the non-probability space, because both spaces are partly involved in constructing these models. Moreover, four distinct quaternary divisions called the KB-zones were observed; each zone formed a distinct perfect function and thus showed a distinct sensitivity to skewness.

Sensitivity Zones (KB-Zones)
Sensitivity zones are organized sequences of nodes that share common characteristics, such as resistance or tolerance to extreme imbalances (hyperskewness). In any distribution, paranodic skewization produces sensitivity zones (strata) called the K-zones and B-zones (KB-zones), which are quaternary divisions of functions across the distribution nodes, with the exception of the central node(s) (see Figure 2). At a certain skewization space, usually above [10^4 ± 100] of skewization value (see Tables A1 and A2, Excel file 1 of the supplementary materials), each KB-zone generates a distinct function (expressed by a regression line). The KB-zones are joined by, and rotate around, the pericentral node. Regression lines that fit within the probability space are described as probability models, while regression lines that form outside the probability space are described as non-probability models.

The K-Zones:
i. The zones cover the first and the fourth quarter regions of the distribution nodes.
ii. The nodes are more sensitive to skewness than those of the B-zones.
iii. Within these zones, the outcome of the similarity function (i.e., the similarity coefficient) under paranodic skewization is determined by the perfect regression line formed, which changes with changing skewization value.
iv. The outcome of the similarity function is constrained within and outside the probability space.
v. They are the zones of probability and non-probability models.
vi. The probability models shift to non-probability models with hyperskewness (i.e., the probability models are insensitive to hyperskewness, while the non-probability models are sensitive to hyperskewness).

The B-Zones:
i. The zones cover the second and the third quarter regions of the distribution nodes.
ii. The nodes are less sensitive to skewness than those of the K-zones.
iii. Within these zones, the outcome of the similarity function (i.e., the similarity coefficient) under paranodic skewization is determined by the perfect regression line formed, which changes with changing skewization value and factor.
iv. The outcome of the similarity function is constrained within the probability space.
v. They are the zones of probability models.
vi. The probability models are very sensitive and resistant to hyperskewness.
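The quarter-based layout of the KB-zones described above can be sketched as a labeling of node positions. This is only one plausible reading: the code assumes that an odd-length sequence has one central node and an even-length sequence has two, that each stem is split into an outer half (K-zone) and an inner half (B-zone), and that when a stem has an odd number of nodes its middle node is counted with the K-zone. None of these tie-breaking rules is stated in the text, and the function name is hypothetical.

```python
def kb_zones(n):
    """Label each of n nodes as "K", "B", or "C" (central node).

    Assumptions: 1 central node if n is odd, 2 if n is even; the first and
    fourth quarters of the non-central nodes are K-zones, the second and
    third quarters are B-zones; a stem's middle node (odd stem) goes to K.
    """
    if n < 3:
        raise ValueError("need at least 3 nodes")
    n_central = 1 if n % 2 else 2
    stem = (n - n_central) // 2          # nodes in each stem
    labels = []
    for i in range(n):
        if stem <= i < stem + n_central:
            labels.append("C")
            continue
        # Distance of the node from the nearest end of the sequence.
        d = i if i < stem else n - 1 - i
        labels.append("K" if d < stem / 2 else "B")
    return labels


# Hypothetical 9-node sequence: outer nodes fall in K, inner nodes in B.
print(kb_zones(9))
```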

Synodic Skewization
The results presented in Table 1, Table A3, and Figures 3-9 show the modeled outcomes of the similarity function against their reflector values within the produced skewization space. The results show that the outcomes of comparative optinalysis through synodic skewization fall within the probability space and cannot be systematically and perfectly determined and correlated by the five (5) considered regression functions. In addition, some patterns cannot be defined by a simple, single pattern. Nevertheless, a one-pair representation of the two synodically skewed sequences can be used in the function of the Kabirian coefficient of similarity to draw a general solution to this deterministic regression. Moreover, it is observed that, for synodic skewization, the comparative optinalysis of a uniform sequence of elements can be simplified to the representation of any corresponding pair of elements of the sequences. Thus, the sequence length or nodality does not affect the outcomes, unlike in paranodic skewization.

Relationship between Left-Sided and Right-Sided Sequence
Following paranodic skewization, the outcomes of comparative optinalysis (Kabirian coefficient of similarity) at a particular skewization value from the right-sided sequence form a visually similar and reciprocal relationship with those of the corresponding left-sided sequence at the same skewization value (see Figure 10). This similarity mainly exists at skewization values below [10^4 - 10^2].
However, following synodic skewization, the outcomes of comparative optinalysis (Kabirian coefficient of similarity) of positive sequence elements from the right-sided sequence look visually symmetrical to the outcomes of the left-sided sequence (see Figure 11 and Table A3 of the Appendix).

Discussions
The change in the regression line pattern (function) with increasing skewization value following paranodic skewization, for both the left-sided and the right-sided sequences, gives a clear and meaningful transition of models. This indicates that the paranodic skewization mechanism is well suited to dynamic modeling within a limited skewization space. In application, for a uniform sequence of elements, once the skewization space is known, the regression pattern (model) can be identified and the outcomes of comparative optinalysis can be predicted.
The similarity coefficients that lie within the probability space following paranodic or synodic skewization of the left-sided sequence are a reciprocal function of those of the right-sided sequence, but the corresponding probability and percentage outcomes generally translate to the same magnitude. This reciprocity property of comparative optinalysis is a valid reason to recognize it as a statistical paradigm in its own right, because the similarity measurement defines similarity in a similar pattern.
One further advantage of comparative optinalysis models is that one or more possible changes in the magnitude and position of a sequence can be calculated or predicted at a time. In contrast, probability estimation is insensitive to position-specific variations, and only one chance (a change in sequence magnitude) can be calculated at a time.
Some natural phenomena have been shown to follow unique regression patterns. Power laws, also known as heavy-tail distributions, Pareto-like laws, or Zipf-like laws, have been widely reported in the modeling of distinct real-life phenomena in biology, astronomy, language, economics, demography, computing, information theory, and other fields (Pinto et al., 2012). However, the significant-digit law of statistical folklore is the empirical observation that, in many naturally occurring tables of numerical data, the leading significant digits are not uniformly distributed, but instead follow a particular logarithmic distribution (Newcomb, 1881).
It can be concluded that some natural variations conform to the position-specific variations of the sensitivity points of a uniform sequence, and that these could be a hidden force or element that establishes natural variations among natural phenomena.

Conclusion
The mechanisms (paranodic and synodic) of induced skewization in sequences are a suitable approach to statistical modeling in the comparative optinalysis of sequences. Continuous paranodic skewization is a mechanism that moves (changes) a deterministic model in comparative optinalysis from one function (regression line) to another, while continuous synodic skewization is a mechanism that moves (changes) a deterministic model in comparative optinalysis from one probability outcome to another. The role of the sensitivity points of a sequence allows the modeling of a univariate, multi-clustered, or multivariate sequence of variables or elements.