Computational Simulations of Probabilistic Distributions Similar to the Binomial and Poisson Distributions

This study presents a Matlab application, developed within the statistical models project (SMp), for simulating probabilistic distributions that are similar to the binomial and Poisson distributions, which were themselves created by mathematical procedures. The simulated distributions are compared graphically with these well-known distributions. The application can generate many probabilistic distributions and shows the trend (τ) for n trials with success probability p, i.e., the value with the maximum probability, τ=np. While the Poisson distribution PD(x;μ) is a unique probabilistic distribution, with PD=0 at x=+∞, the application simulates many SMp(x;μ,Xmax) distributions, where μ is the Poisson parameter and generally the value of x with the maximum probability, and Xmax is the upper limit of x with SMp(x;μ,Xmax) ≥ 0 and the limit of the stochastic region of the discrete random variable X. It is shown that simulation can yield more, and better, probabilistic distributions than the mathematical procedure.


Introduction
Nowadays, the probabilities of some stochastic processes/effects are calculated from experimental/observational data, by means of complex phenomenological/mechanistic models, or with mathematical expressions, such as the binomial and Poisson distributions, which are the results of mathematical procedures. Probabilities can also be calculated using computational simulators, which could become the most important way of calculating them.
The binomial theorem in its most general case, i.e., the binomial series of [1], states that for any positive integer n, the nth power of the sum of two numbers a and b may be expressed as follows:
$$(a+b)^{n} = \sum_{k=0}^{n} \binom{n}{k} a^{k} b^{n-k} \qquad (1)$$
The binomial distribution (BD) is used to calculate the probability of k successes in n trials with a parameter p (success probability in each trial) of a random variable X=x. The BD(k;n,p) expression was obtained from Eq. (1) with a=p and b=1-p, i.e., $BD(k;n,p) = \binom{n}{k} p^{k} (1-p)^{n-k}$.
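The link between Eq. (1) and the BD can be checked numerically. The following Python sketch is not part of the SimPD application; the function name `binomial_pmf` and the values n=10 and p=0.3 are illustrative assumptions. It evaluates each term of the expansion with a=p and b=1-p and confirms that the terms sum to (p+(1-p))^n = 1.

```python
from math import comb

def binomial_pmf(k, n, p):
    """One term of Eq. (1) with a=p, b=1-p: probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative parameters (assumed for this sketch, not taken from the paper).
n, p = 10, 0.3

probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(probabilities)  # equals (p + (1 - p))**n = 1 up to rounding
most_probable_k = max(range(n + 1), key=lambda k: probabilities[k])

print(total, most_probable_k)
```

For these values the most probable k coincides with n*p = 3.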
"The Poisson distribution (PD) is a probability distribution of a discrete random variable that stands for the number (count) of statistically independent events, occurring within a unit of time or space" [2].
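The PD is the limiting case of the BD for n→∞ with p=µ/n, and was determined as follows:
$$\lim_{n\to\infty} \binom{n}{x} \left(\frac{\mu}{n}\right)^{x} \left(1-\frac{\mu}{n}\right)^{n-x} = \frac{\mu^{x} e^{-\mu}}{x!}$$
and is used in P(X=x)=p(x) as PD(x;µ).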
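The limiting relationship between the BD and the PD can be illustrated numerically. This is a minimal Python sketch under assumed values (µ=4, x=2, and the listed n are illustrative, as are the function names): with p=µ/n, the binomial probability approaches the Poisson probability as n grows.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """BD(k;n,p): probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(x, mu):
    """PD(x;mu): Poisson probability of the count x."""
    return mu**x * exp(-mu) / factorial(x)

# Illustrative values (assumed for this sketch).
mu, x = 4.0, 2

for n in (10, 100, 10_000):
    # The success probability shrinks as p = mu / n while n grows.
    print(n, binomial_pmf(x, n, mu / n), poisson_pmf(x, mu))

gap_small_n = abs(binomial_pmf(x, 10, mu / 10) - poisson_pmf(x, mu))
gap_large_n = abs(binomial_pmf(x, 10_000, mu / 10_000) - poisson_pmf(x, mu))
```

The gap between the two probabilities shrinks as n increases, which is the sense in which the PD is the limiting case of the BD.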
The BD and PD have been overestimated and considered almost exclusive for treating their respective stochastic processes; they have even been used irrationally in the ionizing radiation field, such as in the formulation of the tumor control probability (TCP) models and of radiation interactions with living tissues.
The statistical models project (SMp) calculates probabilities of many stochastic processes/effects by means of its probabilistic-mechanistic models and simulators. These models are based on the SMp function of [3]. The radiobiological simulator developed in [4] is a new and interesting way of calculating probabilities.
The mathematical origin and limitations of the popular BD and PD were clearly described in [3]. Although it was shown there that the SMp function can play the role of the BD, this was done only to show the versatility of the SMp. The same work explained the little or no importance of the BD, which was created more than 300 years ago and is widely used in different scientific fields.
In this study, the SMp proposes a simulator of probabilistic distributions that are similar to the BD and PD, and shows that it is more interesting to generate discrete probabilistic distributions via computational simulations than by mathematical procedures.

The "Binomial" module
To simulate probabilistic distributions that are similar to the binomial, the Matlab function rand is used to generate n random positive numbers ≤ 1, which are compared with the binomial parameter p. If a generated number is < p, there is a success; otherwise, there is a failure. This determines the number of successes k (kSs). The n generations, i.e., "the n trials", are repeated rptrial times, and the simulated probability SMp(k;n,p) is calculated for each value of k. The quality of the simulations is checked using the sum of the simulated distribution, i.e., $\sum_{k=0}^{n} SMp(k;n,p)$, which should be around 1 for a good simulation.
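The procedure of this module can be sketched as follows. This is a Python illustration, not the Binomial.m code: `random.random()` stands in for the Matlab function rand, the function name `simulate_smp_binomial` is hypothetical, and the probability of each k is assumed here to be the relative frequency of repetitions that gave k successes, which the text does not state explicitly.

```python
import random
from collections import Counter

def simulate_smp_binomial(n, p, rptrial, seed=None):
    """Simulate a distribution of k successes in n trials, repeated rptrial times.

    Assumption of this sketch: SMp(k;n,p) is estimated as the fraction of
    repetitions in which exactly k successes were counted.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(rptrial):
        # One repetition: n random numbers, each compared with p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        counts[successes] += 1
    return [counts[k] / rptrial for k in range(n + 1)]

# Illustrative inputs (assumed, not taken from the paper).
smp = simulate_smp_binomial(n=10, p=0.3, rptrial=20_000, seed=1)

quality = sum(smp)                                   # quality check: should be around 1
mean_k = sum(k * prob for k, prob in enumerate(smp)) # should be close to n*p
print(quality, mean_k)
```

Changing the seed, or omitting it, yields a different simulated distribution on each run, while the quality check and the trend near n*p remain.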

The "Poisson" module
To simulate probabilistic distributions that are similar to the Poisson distribution, the Matlab function rand is used to generate Xmax random positive numbers ≤ 1, which are compared with the success probability p, calculated as p=µ/Xmax. If a generated number is < p, there is a success; otherwise, there is a failure. This determines the number of successes k (kSs). The Xmax generations, i.e., "the Xmax trials", are repeated rptrial times, and the simulated probability SMp(x;µ,Xmax) is determined for each value of x, where µ is the Poisson parameter and generally the value of x with the maximum probability, and Xmax is the upper limit of x with SMp(x;µ,Xmax) ≥ 0 and the limit of the stochastic region of the discrete random variable X.
The quality of the simulations is checked using the sum of the simulated distribution, i.e., $\sum_{x=0}^{X_{max}} SMp(x;\mu,X_{max})$, which should be around 1 for a good simulation.
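The procedure of this module can be sketched in the same way. This is a Python illustration, not the Poisson.m code: `random.random()` stands in for the Matlab function rand, the function name `simulate_smp_poisson` is hypothetical, and the probability of each x is assumed to be the relative frequency of repetitions that gave x successes. The inputs µ=4 and Xmax=7 are chosen as an example.

```python
import random

def simulate_smp_poisson(mu, xmax, rptrial, seed=None):
    """Simulate a Poisson-like distribution on x = 0..xmax.

    Assumption of this sketch: SMp(x;mu,Xmax) is estimated as the fraction of
    repetitions in which exactly x successes were counted.
    """
    p = mu / xmax  # success probability of each of the Xmax trials
    rng = random.Random(seed)
    counts = [0] * (xmax + 1)
    for _ in range(rptrial):
        # One repetition: Xmax random numbers, each compared with p.
        x = sum(1 for _ in range(xmax) if rng.random() < p)
        counts[x] += 1
    return [c / rptrial for c in counts]

smp = simulate_smp_poisson(mu=4, xmax=7, rptrial=20_000, seed=1)

quality = sum(smp)                                    # should be around 1
mean_x = sum(x * prob for x, prob in enumerate(smp))  # should be close to mu
x_of_max = max(range(len(smp)), key=lambda x: smp[x])
print(quality, mean_x, x_of_max)
```

By construction the simulated distribution has no probability beyond x = Xmax, unlike the PD, whose support is unbounded.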

Description of the codes
The MatLab application SimPD developed in this study is available at https://gitlab.com/tfrometa/simpd. This application simulates the process of k successes in n trials of a stochastic process/effect (SP/E) with a success probability p, and obtains SMp(k;n,p) distributions. It also obtains probabilistic distributions for a discrete random variable X with maximum probability at x=µ or x≈µ, and equal to zero for x ≥ Xmax, where µ is the Poisson parameter and Xmax is the upper limit of x with probability ≥ 0. The simulated probabilistic distributions are compared graphically with the binomial and Poisson distributions.
The MatLab application is composed of the SimPD.m, Binomial.m and Poisson.m functions. The first is the main code, and its execution gives access to the two others through their respective buttons, "P(k;n,p) distributions" and "P(x;µ,Xmax) distributions".

The "Binomial" module
In the application, the input values (IVs) p and n appear in yellow, while the outcome of the sum of simulated probabilities appears in green. One should press the "Enter" key in each field to introduce the IVs into the application.
The steps for executing this module are: a) enter the binomial parameters p and n; b) once all input values have been entered, press the "Simulate" button.

The "Poisson" module
The input values (IVs) µ and Xmax appear in yellow, while the outcome of the sum of simulated probabilities appears in green. One should press the "Enter" key in each field to introduce the IVs into the application.
The steps for executing this module are: a) enter the Poisson parameter µ and the value Xmax; b) once all input values have been entered, press the "Simulate" button.

Discussion
There are many possible distributions for k successes in n trials of a stochastic process/effect (SP/E) with success probability p, and their trend, calculated as np, is always the value of k with the maximum probability.
While the binomial distribution B(k;n,p) was derived from a mathematical theorem, the application developed in this study yields several distributions SMp(k;n,p). The B(k;n,p) is only one of the possible distributions P(k;n,p). This can be seen when one repeats the simulations for given values of p and n. The binomial is the result of a power expression of the sum of p and q=1-p, whose total is always 1. The most important point is that each trial is performed with a success probability p and a failure probability 1-p.
Generally, the SP/Es are characterized by a success probability p obtained from N previous experiments or observations with K successes, where p=K/N. For this reason, the trend (τ) for n further trials is τ=n*p.
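As a worked example of the trend, the following Python sketch uses assumed counts (K=30 successes in N=100 observations, and n=20 further trials; none of these values come from the paper) to compute p and τ, and compares τ with the most probable k of the BD for the same n and p.

```python
from math import comb

# Assumed illustrative data: K successes observed in N previous experiments.
K, N = 30, 100
p = K / N          # estimated success probability
n = 20             # number of further trials
tau = n * p        # trend for the n further trials

# Most probable k of the binomial distribution for the same n and p.
bd = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
most_probable_k = max(range(n + 1), key=lambda k: bd[k])

print(tau, most_probable_k)
```

For these values the trend and the most probable k of the BD coincide.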
If an SP/E is characterized by a success probability p, it is not necessary to create an additional second probability BD(k;n,p), which was the result of a mathematical theorem that satisfies the properties of only one of the P(k;n,p) distributions.
The creation of new probabilities depending on p, like BD(k;n,p) with its complex mathematical expression, has little importance and has generated confusion, as shown in [5], where the tumour control probability (TCP) model is irrationally defined as a binomial function.
For the Poisson distribution (PD) with parameter µ=4, PD(3;4)=PD(4;4), which means that the probability does not have a single maximum at x=µ=4. Figure 1 shows a simulated distribution SMp(x;4,7) and PD(x;4). The former has its maximum at x=4. According to [6], the use of Poisson statistics (PS) in TCP models has led to a negative exponential expression. Cell survival (S) has also been described with PS. The TCP exponential expression does not have a strong radiobiological-probabilistic foundation, and describing S with PS is probabilistically very complicated. Indeed, S is the complement of the cell kill (K), i.e., probabilistically K=1-S, and K can be modelled with the SMp function of [3] as a stochastic effect of type P1.
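The tie between the two largest PD values for µ=4 can be verified directly from the Poisson expression. In this minimal Python sketch the function name `poisson_pmf` is illustrative and the argument order follows PD(x;µ).

```python
from math import exp, factorial, isclose

def poisson_pmf(x, mu):
    """PD(x;mu) = mu**x * exp(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)

mu = 4
pd_values = [poisson_pmf(x, mu) for x in range(11)]

pd_3 = pd_values[3]   # 4**3 / 3! = 64/6, times exp(-4)
pd_4 = pd_values[4]   # 4**4 / 4! = 256/24, times exp(-4)
tie = isclose(pd_3, pd_4, rel_tol=1e-12)

print(pd_3, pd_4, tie)
```

Since 64/6 = 256/24, the values at x=3 and x=4 are equal, so the PD with µ=4 has two equally probable values rather than a single maximum at x=µ.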

Conclusion
The simulation of the binomial distribution (BD) is only a computational exercise that shows the expected result for n trials of a stochastic process with success probability p: a trend equal to n*p. The BD is only a mathematical exercise without probabilistic importance.
The simulator developed in this work probabilistically generates many distributions that are similar to the BD, which was mathematically generated. In addition to each trial being performed with a success probability p and a failure probability 1-p, the simulations show that the trend (τ) for n further trials is τ=n*p.
The application generates many probabilistic distributions SMp(x;µ,Xmax) that compare favorably with the PD, which, due to its mathematical origin, is unique and has some limitations. The simulator also makes it possible to choose an upper limit value of x, where SMp(x;µ,Xmax) ≥ 0, while PD=0 for x=+∞. Xmax represents the limit of the stochastic region of a discrete random variable X.
The computational tool of this study can be used for any of the current applications of the PD. The generated Poisson-like probabilistic distributions can be used for describing possible distributions of the normal tissue effects of radiation in any stochastic activity, such as radiation therapies and other radiation activities.
Nowadays, the BD and PD have been irrationally used for modelling some stochastic processes/effects because it has not been considered that these distributions are associated with probabilities of a discrete random variable X, functions of k and x respectively, nor has their mathematical relationship been taken into account.