Computer Science and Mathematics


Article
Computer Science and Mathematics
Probability and Statistics

Adrian Velasco

Abstract: Reliable crop statistics are foundational to food-security planning, yet the literature often treats forecasting accuracy and data-quality assessment as separate tasks. This paper develops an integrated evidence synthesis around three Philippine studies that together illuminate both problems for rice and corn. The first compared Seasonal Autoregressive Integrated Moving Average and Holt-Winters models for quarterly rice and corn production and found that Holt-Winters with additive seasonality yielded lower forecast errors. The second extended the forecasting problem to machine-learning models and reported that Random Forest produced the strongest predictive performance among the tested algorithms, while performance varied across other nonlinear approaches. The third applied the Newcomb-Benford law to official crop production statistics and identified deviations in rice and corn digit patterns that warrant further validation. Drawing on official Philippine Statistics Authority documentation and broader methodological literature on forecast evaluation, survey reliability, and crop-yield prediction, the paper argues that forecastability and statistical integrity should be studied together rather than in isolation. A series can be forecastable yet still contain reporting irregularities, while a numerically plausible series can remain difficult to forecast because of structural breaks, weather shocks, or shifting production conditions. For agricultural planning, the strongest evidence base comes from combining temporal modeling with routine statistical-quality screening, transparent revision practices, and follow-up diagnostics when anomalies appear. The paper concludes by proposing a practical framework for Philippine agricultural analytics in which data integrity checks precede and accompany forecasting, thereby improving the credibility of crop outlooks used for procurement, import planning, early warning, and resource allocation.
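
The Newcomb-Benford screening step mentioned above is straightforward to sketch. The following is a minimal illustration (not the paper's code): it compares leading-digit frequencies against the Benford proportions log10(1 + 1/d) with a chi-square test, using synthetic lognormal values as a stand-in for the PSA production series.

```python
import numpy as np
from scipy.stats import chisquare

def benford_screen(values):
    """Compare leading-digit frequencies against the Newcomb-Benford law.

    values : array of positive production figures (e.g., quarterly tonnage).
    Returns the chi-square statistic and p-value for the first-digit test.
    """
    values = np.asarray(values, dtype=float)
    values = values[values > 0]
    # Leading digit: shift each value into [1, 10) and take the integer part.
    first_digits = (values / 10 ** np.floor(np.log10(values))).astype(int)
    observed = np.bincount(first_digits, minlength=10)[1:10]
    expected = np.log10(1 + 1 / np.arange(1, 10)) * observed.sum()
    return chisquare(observed, expected)

# Toy usage with synthetic lognormal "production" figures, which tend to
# conform to Benford; real PSA series would be substituted here.
rng = np.random.default_rng(0)
stat, p = benford_screen(rng.lognormal(mean=8, sigma=1.5, size=2000))
print(f"chi2 = {stat:.2f}, p = {p:.3f}")
```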

Article
Computer Science and Mathematics
Probability and Statistics

Felix Reichel

Abstract: Skyjo is a simple stochastic card game with partial information, local replacement decisions, and score-reducing column removal events. This paper develops a formal mathematical model of the game, derives expected-score rules for turn-level actions, proves several dominance and threshold results, and evaluates a family of heuristic strategies through Monte Carlo simulation. The focus here lies on local optimality under explicit belief assumptions rather than a full equilibrium solution of the multiplayer game. Finally, simulation code is provided for reproducibility.
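
As a flavor of the turn-level expected-score reasoning the abstract describes, here is a minimal sketch (not the paper's model or code). It assumes the standard Skyjo deck composition (five -2s, ten -1s, fifteen 0s, ten each of 1 through 12) and evaluates a simple threshold rule by Monte Carlo, drawing cards i.i.d. and ignoring deck depletion and column-removal effects.

```python
import random

# Standard Skyjo deck composition (assumption: five -2s, ten -1s,
# fifteen 0s, ten each of 1..12; 150 cards total).
DECK = [-2] * 5 + [-1] * 10 + [0] * 15 + [v for v in range(1, 13) for _ in range(10)]
EV_UNKNOWN = sum(DECK) / len(DECK)  # expected value of a face-down card, ~5.07

def replace_if_below(drawn, threshold=EV_UNKNOWN):
    """Toy turn-level rule: swap a drawn card for a face-down card only if
    the drawn card scores below the expected value of the unknown card."""
    return drawn < threshold

def simulate(n_trials=100_000, seed=1):
    """Monte Carlo estimate of the per-turn score saving under the rule,
    relative to always keeping the face-down card (i.i.d. draws; deck
    depletion and column removals are ignored in this toy)."""
    rng = random.Random(seed)
    gain = 0.0
    for _ in range(n_trials):
        drawn, hidden = rng.choice(DECK), rng.choice(DECK)
        if replace_if_below(drawn):
            gain += hidden - drawn  # points avoided by swapping
    return gain / n_trials

print(f"E[unknown card] = {EV_UNKNOWN:.2f}")
print(f"mean per-turn saving under threshold rule: {simulate():.2f} points")
```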

Article
Computer Science and Mathematics
Probability and Statistics

Ntebogang Dinah Moroke

Abstract: The attribution of systemic financial stress to specific market sectors requires metrics that are simultaneously faithful to the model’s internal computations, statistically consistent as the sample size grows, and connected to a physically meaningful measure of directed information flow. This paper addresses all three requirements through the lens of information geometry. We present and empirically verify the Entropy-Saliency Equivalence Theorem: the Metabolic Saliency \( S_{\mathrm{ms}}(i, t) \) introduced in the companion paper (Paper 1 of this series) is an asymptotically unbiased estimator of the local Kullback-Leibler divergence \( \mathrm{KL}(q_t^{(i)} \,\|\, q_0^{(i)}) \) between the stressed and resting sector-level return distributions, where the convergence is governed by the Fisher information matrix of the Power Mapping Network (PMNet) output distribution. We also derive the finite-sample bias-variance decomposition of the Kraskov-Stögbauer-Grassberger (KSG) transfer entropy estimator used to construct the saliency weights, establishing a minimax-optimal convergence rate of \( O(T^{-2/(d+2)}) \) for a d-dimensional density support. A novel evaluation metric, the Spatio-Temporal Information Flux (STIF), is proposed to quantify the directed flow of stress-relevant information between JSE sectors in bits per trading day, providing a sector-level causal audit trail that satisfies the interpretability requirements of the South African Financial Sector Regulation Act (FSRA, 2017) and MiFID II. Empirical validation on the JSE canonical panel (N = 87 securities, T = 2,731 trading days, January 2015 to December 2025) with Eskom load-shedding stages as exogenous stress injectors confirms that \( S_{\mathrm{ms}} \) tracks \( \mathrm{KL}(q_t \| q_0) \) with a Pearson correlation of \( \hat{\rho} = 0.81 \) (p < 0.001) and that the STIF metric identifies the energy sector as the primary information source during Stage 4+ events, with information flux to the financial sector peaking at 0.43 bits/day—a 3.1× increase above the resting baseline of 0.14 bits/day. These results complete the information-theoretic glass-box characterisation of the GWS-STNet architecture and bridge topological stability theory with a fully information-theoretic characterisation of financial stress attribution.
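
The saliency constructs above (\( S_{\mathrm{ms}} \), PMNet, STIF, the KSG estimator) are specific to this paper series and are not reproduced here. As a generic illustration of the target quantity only, the sketch below computes a plug-in histogram estimate of \( \mathrm{KL}(q_t \| q_0) \) between a "stressed" and a "resting" sample of returns, using invented Gaussian regimes.

```python
import numpy as np

def plugin_kl(stressed, resting, bins=30):
    """Plug-in estimate of KL(q_t || q_0) from two return samples.

    Histograms share a common support; a small epsilon avoids log(0).
    This is the generic divergence the saliency is said to track, not
    the paper's S_ms estimator itself.
    """
    lo = min(stressed.min(), resting.min())
    hi = max(stressed.max(), resting.max())
    edges = np.linspace(lo, hi, bins + 1)
    eps = 1e-12
    q_t, _ = np.histogram(stressed, bins=edges)
    q_0, _ = np.histogram(resting, bins=edges)
    p = q_t / q_t.sum() + eps
    q = q_0 / q_0.sum() + eps
    return float(np.sum(p * np.log(p / q)))

# Toy check: a variance shock (a crude stand-in for a stress episode)
# produces a visibly positive divergence.
rng = np.random.default_rng(42)
resting = rng.normal(0.0, 0.01, size=2000)      # calm daily returns
stressed = rng.normal(-0.002, 0.03, size=2000)  # stressed regime
print(f"KL(q_t || q_0) ≈ {plugin_kl(stressed, resting):.3f} nats")
```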

Article
Computer Science and Mathematics
Probability and Statistics

Aris Spanos

Abstract: The Two-Envelope Problem (TEP) is revisited to argue that the standard evaluation of expected returns relies on spurious probabilities arising from a misuse of formal probability theory. The source of the problem is the ex post framing of two identical envelopes, X and Y, one containing twice as much money as the other, after one envelope, say X, has been selected and its content X=x observed. The value x is then used to define Y in terms of the values y=x/2 and y=2x, each assigned probability 0.5, with an analogous derivation when Y is selected. This renders X and Y ill-defined random variables because the relevant probabilistic framing must instead be based on the original experimental setup, prior to any selection or observation, where the envelope contents are unknown, say $θ and $2θ. Framing the original setup using axiomatic probability, the dependence between X and Y is accounted for when x=θ, y=2θ, and when x=2θ, y=θ. The ensuing joint distribution of X and Y determines that the expected returns imply indifference between keeping the chosen envelope and switching, explaining away the ‘paradox’ as a misapplication of probability theory.
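
The joint-distribution argument summarized above can be written compactly. With the envelope contents fixed at $θ and $2θ before selection, the correct framing is

\[ P(X=\theta,\, Y=2\theta) \;=\; P(X=2\theta,\, Y=\theta) \;=\; \tfrac{1}{2}, \]

so that

\[ E[X] \;=\; \tfrac{1}{2}\,\theta + \tfrac{1}{2}\,(2\theta) \;=\; \tfrac{3\theta}{2} \;=\; E[Y], \]

and the expected gain from switching is \( E[Y - X] = 0 \): indifference, with no paradox.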

Article
Computer Science and Mathematics
Probability and Statistics

Moriba Kemessia Jah

Abstract: Probability is often treated as the default representation of uncertainty in statistical inference and machine learning. This paper asks a more fundamental question: under what conditions is a probability distribution a valid representation of uncertainty, and what is the information cost of assuming one when those conditions are not met? We show that inference is governed by two joint constraints: maximizing information capacity by preserving the geometric degrees of freedom through which contrast can register, and minimizing false information by asserting nothing the evidence has not forced. These constraints, expressed through Jaynes’s principle of maximum entropy and Popper’s criterion of falsification, determine the structure of inference without remainder. Bayesian inference emerges not as a competing framework, but as the limiting geometry obtained when epistemic width has contracted sufficiently to justify probabilistic closure. In this sense, probability is not assumed—it is earned. We trace the origin of these ideas through two decades of operational experience in spacecraft navigation, space situational awareness, and orbit determination, where standard probabilistic filters performed well in nominal regimes but failed systematically when uncertainty was driven by genuine ignorance rather than statistical variability. Across problems including debris tracking, attitude estimation, and multi-target inference, the consistent failure mode was premature probabilistic commitment in regimes where observation geometry could not support distinguishability. The central result is that information exists only in the presence of contrast, and that structure destroyed without evidence justification is information permanently lost. We formalize this principle through an epistemic geometry of inference and show that probabilistic representations are valid only when distinguishability, parameterization, and likelihood structure are all earned by the data. When these conditions fail, probabilistic closure incurs a measurable and avoidable information capacity cost.

Article
Computer Science and Mathematics
Probability and Statistics

Joseph Njuki, Thomas Gilbert

Abstract: In this article, we develop a goodness-of-fit test for the Kumaraswamy distribution based on energy statistics. Due to the availability of its quantile (inverse) function in closed form, the Kumaraswamy distribution has been shown to be a preferred alternative to the beta distribution, since both have bounded support on the (0,1) interval. The proposed test procedure is simple and powerful against general alternatives. Under different settings, simulations show that the proposed test maintains any given significance (nominal) level. In terms of power comparisons, the proposed test outperforms other existing methods in different settings. We then apply the proposed test to real datasets (underground economy index, food expenditure, and Shasta water reservoir) to demonstrate its competitiveness and usefulness.
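
A minimal Monte Carlo sketch of an energy-distance goodness-of-fit check of this kind is shown below (the paper's exact test statistic, critical values, and parameter estimation are not reproduced; parameters are treated as known here, which simplifies the null distribution). It uses the closed-form Kumaraswamy quantile Q(u) = (1 − (1−u)^(1/b))^(1/a) for sampling and scipy's one-dimensional energy distance.

```python
import numpy as np
from scipy.stats import energy_distance

def kumaraswamy_rvs(a, b, size, rng):
    """Sample via the closed-form quantile Q(u) = (1 - (1-u)^(1/b))^(1/a)."""
    u = rng.uniform(size=size)
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)

def energy_gof_pvalue(x, a, b, n_boot=500, mc_size=2000, seed=0):
    """Monte Carlo p-value: how often a genuine Kumaraswamy(a, b) sample
    lies at least as far (in energy distance) from the reference sample
    as the observed data do. Parameters are taken as given here; the
    paper estimates them, which changes the null distribution."""
    rng = np.random.default_rng(seed)
    ref = kumaraswamy_rvs(a, b, mc_size, rng)
    observed = energy_distance(x, ref)
    null = [energy_distance(kumaraswamy_rvs(a, b, len(x), rng), ref)
            for _ in range(n_boot)]
    return float(np.mean(np.asarray(null) >= observed))

rng = np.random.default_rng(1)
x_good = kumaraswamy_rvs(2.0, 3.0, 200, rng)  # data from the null
x_bad = rng.beta(0.5, 0.5, size=200)          # a poor fit
print("p (null true):   ", energy_gof_pvalue(x_good, 2.0, 3.0))
print("p (misspecified):", energy_gof_pvalue(x_bad, 2.0, 3.0))
```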

Article
Computer Science and Mathematics
Probability and Statistics

Muddassiru Abubakar, Umar Usman

Abstract: Road traffic accidents remain a critical public safety challenge in rapidly urbanizing regions of sub-Saharan Africa, where heterogeneous road infrastructure and high population density exacerbate risk. This study applies Kernel Density Estimation (KDE) and Geographically Weighted Regression (GWR) to analyze spatial patterns of road traffic accidents across Jega Local Government Area, Kebbi State, Nigeria, using fifty georeferenced primary data points collected through Global Positioning System surveys and manual traffic counts. The KDE analysis identified an optimal bandwidth of 175 meters with a Prediction Accuracy Index (PAI) of 3.50 at the 85th percentile threshold, indicating strong spatial clustering of accidents. Spatial autocorrelation analysis revealed significant clustering (Moran's I = 0.312, p < 0.05). The GWR model demonstrated strong explanatory power with a global R² of 0.72 and AICc of 420.35. Local R² values exhibited substantial spatial variation (range: 0.20–0.95), highlighting the importance of localized analysis. Cross-validation results (RMSE = 3.45, MAE = 2.12, R² = 0.65) confirmed predictive robustness. The integrated geospatial framework identified distinct high-risk corridors, with Gada (8 accidents), Garkar Ando (5 accidents), and Gobirawa (5 accidents) emerging as critical hotspots requiring immediate intervention. This research provides a validated geostatistical framework for micro-scale road safety planning in Nigerian cities.
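
The hotspot-scoring step can be sketched in a few lines. The following toy example (synthetic coordinates, not the Jega survey data) computes a KDE surface and the Prediction Accuracy Index at the 85th-percentile threshold, where PAI is the share of accidents inside the hotspot divided by the share of the study area it covers; scipy's default bandwidth is used rather than the paper's tuned 175 m.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Synthetic accident coordinates (metres): two clusters plus scatter,
# standing in for the fifty GPS-surveyed points.
pts = np.vstack([rng.normal([500, 500], 120, (20, 2)),
                 rng.normal([1500, 800], 150, (15, 2)),
                 rng.uniform(0, 2000, (15, 2))])

kde = gaussian_kde(pts.T)  # scipy picks its own bandwidth; the paper tunes it
xx, yy = np.meshgrid(np.linspace(0, 2000, 100), np.linspace(0, 2000, 100))
dens = kde(np.vstack([xx.ravel(), yy.ravel()]))

# Hotspot = cells above the 85th-percentile density, as in the abstract.
thresh = np.percentile(dens, 85)
area_share = np.mean(dens >= thresh)       # fraction of the study area
hit_share = np.mean(kde(pts.T) >= thresh)  # fraction of accidents captured
pai = hit_share / area_share
print(f"PAI at 85th percentile: {pai:.2f}")
```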

Article
Computer Science and Mathematics
Probability and Statistics

Xingwei Hu, Caihong Hu, Cheng-Kuang Wu

Abstract: This paper derives closed-form expressions for the asymptotic covariance matrices of unrotated factor loading and uniqueness estimators for several widely used non-maximum-likelihood factor extraction methods. These include least squares, principal factor, iterative principal component, alpha factor, and image factor analysis. By expressing these results explicitly in terms of the asymptotic covariance of the sample covariance or correlation matrix, the proposed formulas facilitate straightforward computation of standard errors. When combined with the delta method applied to rotation criteria, they further yield analytically tractable standard errors for rotated factor loadings. Monte Carlo simulations demonstrate accurate finite-sample performance, and an empirical application illustrates practical implementation of the proposed approach.
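
The closed-form covariance expressions are the paper's contribution and are not reproduced here, but the delta-method mechanics they feed into can be sketched generically: given the asymptotic covariance of vech(S) (written below under a normality assumption), the standard error of any smooth scalar functional of the covariance matrix follows from a numerical Jacobian. The largest-eigenvalue functional is an illustrative choice, not one of the paper's estimators.

```python
import numpy as np

def avar_vech_S(Sigma, n):
    """Asymptotic covariance of vech(S) under normality:
    Cov(s_ij, s_kl) = (sigma_ik*sigma_jl + sigma_il*sigma_jk) / n."""
    p = Sigma.shape[0]
    idx = [(i, j) for i in range(p) for j in range(i, p)]
    V = np.empty((len(idx), len(idx)))
    for a, (i, j) in enumerate(idx):
        for b, (k, l) in enumerate(idx):
            V[a, b] = (Sigma[i, k] * Sigma[j, l] + Sigma[i, l] * Sigma[j, k]) / n
    return V, idx

def delta_se(g, Sigma, n, h=1e-6):
    """SE of a smooth scalar functional g(S) via a numerical Jacobian,
    perturbing each vech(S) element symmetrically."""
    V, idx = avar_vech_S(Sigma, n)
    J = np.empty(len(idx))
    for a, (i, j) in enumerate(idx):
        E = np.zeros_like(Sigma)
        E[i, j] = E[j, i] = h
        J[a] = (g(Sigma + E) - g(Sigma - E)) / (2 * h)
    return float(np.sqrt(J @ V @ J))

# Example functional: the largest eigenvalue of S (a building block of
# principal-factor-type extractions).
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])
print("SE of lambda_1:",
      delta_se(lambda S: np.linalg.eigvalsh(S)[-1], Sigma, n=500))
```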

Article
Computer Science and Mathematics
Probability and Statistics

Yiwen Yuan, Junfeng Shang, Chao Gu

Abstract: Sequential linear models can be adopted to describe data where the response variable depends on lagged outcomes and fixed-effects variables. For estimation, variable selection, and prediction of the response variable with high accuracy, we propose a penalized method based on the Smoothly Clipped Absolute Deviation (SCAD) penalty in sequential linear models. We conduct simulations in which the SCAD-penalized method is compared with other methods, including ordinary least squares (OLS), the Lasso, and the adaptive Lasso, in sequential linear models. The simulation results demonstrate that the SCAD-penalized method excels in estimation, with better accuracy and precision, and in variable selection, with better prediction. We apply the proposed method to two real data sets to further illustrate the performance of the SCAD-penalized method in sequential linear modeling.
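
For readers unfamiliar with SCAD, the penalty and its univariate thresholding rule (Fan and Li, 2001, with the conventional a = 3.7) are sketched below; the paper's sequential-model estimator builds on this penalty but is not reproduced here.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty of Fan and Li (2001); a = 3.7 is the usual default."""
    t = np.abs(t)
    return np.where(
        t <= lam, lam * t,
        np.where(t <= a * lam,
                 (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                 lam**2 * (a + 1) / 2))

def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD thresholding rule (orthonormal-design solution):
    soft-threshold near zero, blended shrinkage in the middle band, and
    no shrinkage for large |z| -- the near-unbiasedness property."""
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)
    mid = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    return np.where(az <= 2 * lam, soft, np.where(az <= a * lam, mid, z))

z = np.linspace(-6, 6, 7)
print(scad_threshold(z, lam=1.0))  # large inputs pass through unshrunk
```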

Article
Computer Science and Mathematics
Probability and Statistics

Alexander Robitzsch

Abstract: Item response theory (IRT) models are widely used in the social sciences to analyze multivariate discrete data that include cognitive test items. In many applications, the performance of two groups is compared using IRT modeling. The assessment of differential item functioning (DIF) plays a central role in this context, as it evaluates whether specific items function differently across groups; that is, whether their item parameters differ between groups. DIF detection is commonly based on statistical inference using item fit statistics. The mean deviation (MD) and root mean square deviation (RMSD) statistics are two widely used item fit measures. However, in the literature and in empirical research, these statistics are typically treated only as effect size measures (i.e., point estimates), and formal statistical inference for them is largely lacking. To address this gap, this article proposes confidence interval (CI) estimation for the MD and RMSD statistics based on asymptotic theory and a computationally efficient parametric bootstrap method. A simulation study was conducted to evaluate the proposed CI estimation approaches and demonstrated their validity. Across both item fit statistics, for DIF and non-DIF items, and across all simulation conditions, the results indicate that CI estimation based on the parametric bootstrap using empirical percentiles performed best and outperformed both the parametric bootstrap with normal distribution-based CIs and the asymptotic theory-based approach. It is therefore recommended that CI estimation for MD and RMSD statistics be routinely reported in addition to point estimates in empirical research.
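
The parametric-bootstrap percentile interval recommended above follows a generic recipe: simulation from the fitted model, recomputation of the statistic, and empirical percentiles. A minimal sketch with a placeholder statistic (a group difference in proportion-correct standing in for MD; the actual MD/RMSD computations require a fitted IRT model) is:

```python
import numpy as np

def percentile_ci(stat_fn, simulate_fn, theta_hat, n_boot=2000,
                  level=0.95, seed=0):
    """Parametric-bootstrap percentile CI for a scalar statistic.

    simulate_fn(theta_hat, rng) draws one dataset from the fitted model;
    stat_fn(data) recomputes the statistic. In the article the statistic
    is MD or RMSD under a fitted IRT model; here it is a placeholder.
    """
    rng = np.random.default_rng(seed)
    boot = np.array([stat_fn(simulate_fn(theta_hat, rng))
                     for _ in range(n_boot)])
    alpha = 1 - level
    return np.quantile(boot, [alpha / 2, 1 - alpha / 2])

# Toy stand-in: "MD" = difference in an item's proportion-correct between
# a focal and a reference group, each of size 300.
n = 300
def simulate(p, rng):
    return rng.binomial(1, p[0], n), rng.binomial(1, p[1], n)
def md(data):
    return data[0].mean() - data[1].mean()

p_hat = (0.62, 0.55)  # fitted group-level success probabilities
lo, hi = percentile_ci(md, simulate, p_hat)
print(f"95% percentile CI for MD: [{lo:.3f}, {hi:.3f}]")
```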

Article
Computer Science and Mathematics
Probability and Statistics

Bissilimou Rachidatou Orounla, Ouanan Nicolas Tuo, Kolawolé Valère Salako, Justice Moses K. Aheto, Romain Glèlè Kakaï

Abstract: The COVID-19 pandemic spread rapidly across the world and caused several economic, social, and demographic impacts, even though there were strong geographical disparities. This study aims to assess the effect of socio-demographic factors and the use of non-conventional medicines on COVID-19 risk perception in West Africa using a Structural Equation Modeling (SEM) approach. A quantitative survey was conducted in four countries (Benin, Togo, Ghana and Côte d’Ivoire). Data were collected on demographic characteristics, COVID-19 risk perception (risk feeling and risk analysis), affective attitude, trust predictors and non-conventional medicine. Nominal polychotomous logistic regression, binary logistic regression and partial least squares were used for the data analysis. Among the respondents, 59.11% came from the in-person survey; 28.08% were from Benin, 32.84% from Côte d’Ivoire, 24.96% from Togo and 14.12% from Ghana. The results showed a very high level of risk perception within the countries. Participants aged between 18 and 40 used less non-conventional medicine. Also, people with a low level of education or no formal education often perceive a higher risk associated with COVID-19 and use more non-conventional medicine than others. The PLS-SEM model’s loadings were higher than those of the Consistent PLS (PLSc-SEM), but the Consistent PLS showed robust values in the structural model, with a lower RMSE than the linear model. Our results also indicated that non-conventional medicine has a positive relationship with COVID-19 risk perception. For decision-makers and health workers, this research underscores the importance of non-conventional medicine and the emotional state of local populations in managing epidemics.

Article
Computer Science and Mathematics
Probability and Statistics

Peter Gács

Abstract: In the context of the dynamical systems of classical mechanics, we introduce two new notions called “algorithmic fine-grain and coarse-grain entropy”. The fine-grain algorithmic entropy is, on the one hand, a simple variant of the randomness tests of Martin-Löf (and others) and is, on the other hand, a connecting link between description (Kolmogorov) complexity, Gibbs entropy and Boltzmann entropy. The coarse-grain entropy is a slight correction to Boltzmann’s coarse-grain entropy. Its main advantage is that it is less partition-dependent, because algorithmic entropies for different coarse-grainings are approximations of one and the same fine-grain entropy. It has the desirable properties of Boltzmann entropy in a wider range of systems, including those of interest in the “thermodynamics of computation”. It also helps explain the behavior of some unusual spin systems arising from cellular automata.

Article
Computer Science and Mathematics
Probability and Statistics

Rui Gonçalves

Abstract: The Box–Cox transformation is widely used to induce approximate normality and linearity in statistical modelling. Within the Power Normal framework, it embeds non-Gaussian variables into a latent Gaussian structure where conditional relationships become linear. However, the inverse transformation does not generally preserve these functional relationships when returning to the original scale. In this paper, we formally analyze the discrepancy between the inverse image of the linear regression function in the transformed domain and the true conditional expectation in the original scale. We derive an explicit second-order decomposition showing that the conditional mean in the original scale consists of the inverse-transformed linear predictor plus a curvature-induced correction term proportional to the conditional variance. This distortion term depends explicitly on the transformation parameter and the local geometry of the inverse Box–Cox function. The analysis reveals that the loss of structural preservation under inversion is an intrinsic consequence of the nonlinear transformation and can be interpreted as a second-order Jensen-type correction. Numerical illustrations based on simulated bivariate Power Normal models confirm the theoretical findings. These results clarify a structural limitation of transformation-based Gaussian modelling and provide insight into its implications for statistical inference and applied modelling.
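
The curvature correction described above can be made explicit. Writing \( g_\lambda(y) = (y^\lambda - 1)/\lambda \) for the Box–Cox transform and assuming \( Z = g_\lambda(Y) \mid X = x \) has conditional mean \( m(x) \) and variance \( \sigma^2(x) \), a second-order expansion of the inverse \( g_\lambda^{-1}(z) = (1 + \lambda z)^{1/\lambda} \) gives

\[ E[Y \mid X = x] \;\approx\; g_\lambda^{-1}(m(x)) \;+\; \tfrac{1}{2}\,(1-\lambda)\,\bigl(1 + \lambda\, m(x)\bigr)^{\frac{1}{\lambda}-2}\,\sigma^2(x), \]

since \( (g_\lambda^{-1})''(z) = (1-\lambda)(1+\lambda z)^{1/\lambda - 2} \). This is one standard way to write the Jensen-type correction (the paper's exact decomposition may differ in detail): it vanishes at \( \lambda = 1 \), where the inverse is linear, and grows with the conditional variance.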

Article
Computer Science and Mathematics
Probability and Statistics

Kazuharu Misawa

Abstract: Accurate evaluation of extremely small Gaussian tail probabilities is essential in statistical meta-analyses, in which large z-scores (often exceeding 8 or 9) must be converted into p-values. However, direct numerical evaluation of the complementary error function erfc(a) suffers from severe underflow in floating-point arithmetic. In this paper, a simple and robust approximation scheme for log[erfc(a)] is proposed based on a geometric tangent construction. This approach yields explicit lower and upper bounds, closed-form asymptotic expansions up to order \( a^{-8} \), and numerically stable formulas suitable for implementation in statistical software. Numerical comparisons demonstrate that the lower and upper bounds become extremely tight for \( a \ge 6 \), making the proposed method practical for large-scale meta-analytic computations.
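
The paper's tangent construction is not reproduced here, but the classical asymptotic expansion of log[erfc(a)] through order \( a^{-8} \) can be checked directly against an exact, underflow-free reference via erfc(a) = 2Φ(−a√2), computed with scipy.special.log_ndtr:

```python
import numpy as np
from scipy.special import log_ndtr

def log_erfc_asymptotic(a):
    """Classical tail expansion: erfc(a) ~ exp(-a^2)/(a*sqrt(pi)) *
    (1 - 1/(2a^2) + 3/(4a^4) - 15/(8a^6) + 105/(16a^8)).
    This is the standard expansion, not the article's tangent bounds."""
    s = (1 - 1 / (2 * a**2) + 3 / (4 * a**4)
         - 15 / (8 * a**6) + 105 / (16 * a**8))
    return -a**2 - np.log(a * np.sqrt(np.pi)) + np.log(s)

def log_erfc_exact(a):
    """Stable reference: erfc(a) = 2*Phi(-a*sqrt(2))."""
    return np.log(2.0) + log_ndtr(-a * np.sqrt(2.0))

for a in [6.0, 8.0, 10.0]:
    approx, exact = log_erfc_asymptotic(a), log_erfc_exact(a)
    print(f"a={a:4.1f}  approx={approx:.10f}  exact={exact:.10f}  "
          f"abs err={abs(approx - exact):.2e}")
```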

Article
Computer Science and Mathematics
Probability and Statistics

Gultac Eroglu Inan

Abstract: The geometric process (GP) is one of the important and widely used stochastic models in reliability theory. Although it is used in various areas of application, it has some limitations that cause difficulties. The doubly geometric process (DGP) has been proposed to overcome these limitations. The parameter estimation problem plays an important role for both the GP and the DGP. In this study, the parameter estimation problem for the DGP is considered when the distribution of the first interarrival time is assumed to be a gamma distribution with parameters α and β. First, the maximum likelihood (ML) method is used to estimate the model parameters. The asymptotic joint distribution of the estimators and their asymptotic unbiasedness and consistency properties are obtained. Then the small-sample performance of the estimators is evaluated by a simulation study. Finally, the applicability of the method is illustrated using two real-life data examples. It is shown that these data sets can be modeled by the DGP. Additionally, the nonparametric estimators, called modified moment (MM) estimators, are compared with the ML estimators. The results indicate that the ML estimators are more efficient than the MM estimators.
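
Only the gamma ML step is easy to sketch generically. Assuming, for illustration, a plain geometric process with known ratio a (so that \( a^{k-1} X_k \) are i.i.d. Gamma(α, β); the DGP adds a further positional deformation whose form follows the paper and is not reproduced), the likelihood can be maximized numerically:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gamma

def neg_loglik(params, x):
    """Negative log-likelihood for i.i.d. Gamma(alpha, scale=beta) data
    (log-parameterized to keep the optimizer in the positive orthant)."""
    alpha, beta = np.exp(params)
    return -np.sum(gamma.logpdf(x, a=alpha, scale=beta))

rng = np.random.default_rng(0)
a_ratio, alpha_true, beta_true = 1.05, 2.0, 3.0
k = np.arange(1, 101)
# GP data: X_k = Y_k / a^(k-1) with Y_k ~ Gamma; a DGP would insert a
# further positional deformation here (see the article).
x = gamma.rvs(alpha_true, scale=beta_true, size=k.size,
              random_state=rng) / a_ratio ** (k - 1)

y = x * a_ratio ** (k - 1)  # back-transform; in practice a is estimated too
res = minimize(neg_loglik, x0=np.log([1.0, 1.0]), args=(y,),
               method="Nelder-Mead")
alpha_hat, beta_hat = np.exp(res.x)
print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
```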

Article
Computer Science and Mathematics
Probability and Statistics

Gurami Tsitsiashvili

Abstract: In this paper, we construct a probabilistic model of a sliding mode. This model is based on the moment at which a random walk with positive jumps crosses a certain critical level. It is assumed that the jump magnitude has a geometric distribution. If the initial state is negative and the critical level is zero, then after crossing this level, a random walk begins in the opposite direction until it crosses zero again. As a result, motion orthogonal to the slip line is defined as a regenerative process, in which the moments of regeneration are the moments of zero crossings from right to left. An estimate in the Ky Fan metric of the maximum deviation of this random walk over a certain time interval is constructed under the assumption that the time and magnitude of the jumps are reduced by a factor of m. This estimate is found to be of the order of ln m/m as m→∞ and characterizes the deviation of a random trajectory orthogonal to the slip line. In the model of motion along a slip line, its velocity is assumed to take fixed values when the trajectory of motion orthogonal to the slip line is above or below zero. Using the central limit theorem for the integral of a regenerative process, an estimate of the non-uniformity of motion of a random trajectory along the slip line is constructed. It is found that the characteristic magnitude of this non-uniformity is of the order of 1/m as m→∞. This indicates that the accumulation of random errors during motion along the slip line is significantly faster than during motion orthogonal to the slip line.
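
A toy rendering of the orthogonal motion is easy to simulate (an illustrative approximation, not the paper's construction, and no Ky Fan metric is computed): a walk with jumps Geometric(p)/m, one step per 1/m time units, that reverses direction at each zero crossing. The observed maximum deviation can then be compared with the ln m/m order reported above.

```python
import numpy as np

def max_deviation(m, horizon=1.0, p=0.5, x0=-0.5, seed=0):
    """Walk with jumps Geometric(p)/m, one step per 1/m time unit,
    reversing direction after each zero crossing; returns the maximum
    |deviation| over [0, horizon]. A toy rendering, not the article's
    regenerative-process construction."""
    rng = np.random.default_rng(seed)
    n_steps = int(horizon * m)
    x, direction, peak = x0, +1, abs(x0)
    for _ in range(n_steps):
        x += direction * rng.geometric(p) / m
        peak = max(peak, abs(x))
        if (direction > 0 and x >= 0) or (direction < 0 and x <= 0):
            direction = -direction  # crossed zero: reverse
    return peak

for m in [10, 100, 1000, 5000]:
    devs = [max_deviation(m, x0=-2.0 / m, seed=s) for s in range(100)]
    print(f"m={m:5d}  mean max deviation = {np.mean(devs):.4f}  "
          f"ln(m)/m = {np.log(m) / m:.4f}")
```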

Article
Computer Science and Mathematics
Probability and Statistics

Gonçalo Melo de Magalhães

Abstract: Machine learning's dominant paradigm—whether model-centric or data-centric—treats intelligence as the extraction of statistical patterns from behavioral records. This approach has delivered remarkable engineering feats. Yet something foundational is missing. Data is not reality: it is a finite record of trajectories through reality. A photograph of a river is not the river's law. This paper argues that the data paradigm conflates measurement with mechanism, capturing where systems have been rather than why they go there. We propose an alternative grounded in the Architecture of Freedom Intelligence (AFI), which identifies navigability—the structural availability of paths—as the primary organizing principle of all complex systems. The Law of Freedom, F = P/D, states that navigational capacity equals differentiation capacity (Perception, P) divided by structural resistance (Distortion, D). Under this framework, intelligence is not pattern memorization but distortion navigation: all systems move according to dx/dt = −P(x)·∇D(x), following gradients of resistance scaled by perceptual capacity. We demonstrate that this gradient law is structurally identical to Fick's diffusion, Berg–Brown chemotaxis, Ohm's law, and gradient descent—revealing a deep structural unity that the data paradigm treats as coincidental analogy. Nature does not train on labeled datasets: ants, neurons, immune cells, and ecological populations navigate through calibrated heuristics on Perception and Distortion fields, not through backpropagation over historical trajectories. This observation motivates a fundamental reconceptualization of what training should accomplish. We propose Freedom Intelligence Training (FIT): a learning paradigm oriented toward learning P and D fields directly, rather than fitting statistical correlations over behavioral snapshots. FIT rests on five predictions: (i) models trained on P–D fields require exponentially less data than pattern-extraction models; (ii) generalization improves because P–D fields encode causal structure; (iii) out-of-distribution performance improves because navigability laws transfer across domains; (iv) interpretability is natural since every prediction decomposes into ΔP and ΔD contributions; (v) the exploration–exploitation transition is quantifiable as the coefficient of variation of the Freedom field crossing 1.0. We provide ten falsification criteria and position FIT within the emerging landscape of world models, physics-informed learning, and causal inference. This is a theoretical proposal; a complete experimental roadmap is provided.
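
The stated gradient law is concrete enough to integrate numerically. The sketch below applies Euler steps to dx/dt = −P(x)·∇D(x) on invented toy fields P and D (illustrative choices only, not fields prescribed by AFI):

```python
import numpy as np

TARGET = np.array([2.0, -1.0])

def D(x):
    """Toy distortion field: a quadratic bowl, minimal at TARGET."""
    return 0.5 * np.sum((x - TARGET) ** 2)

def grad_D(x):
    return x - TARGET

def P(x):
    """Toy perception field: differentiation capacity decays with
    distance from the origin (an illustrative choice, not AFI's)."""
    return 1.0 / (1.0 + 0.1 * np.dot(x, x))

def navigate(x0, dt=0.05, n_steps=400):
    """Euler integration of dx/dt = -P(x) * grad D(x)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x -= dt * P(x) * grad_D(x)
    return x

print("terminal state:", navigate([-4.0, 3.0]))  # converges near TARGET
```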

Article
Computer Science and Mathematics
Probability and Statistics

Bojan Baškot, Andrej Ševa, Vesna Lešević, Bogdan Ubiparipović

Abstract: Structural Equation Modeling (SEM) is a key framework for analyzing complex economic relationships involving latent variables, mediation effects, and endogeneity, yet the choice between frequentist and Bayesian estimation remains theoretically and practically contested, especially in settings with non-stationary data and small samples. This study provides a formal comparison of the two approaches by formulating SEM as a probabilistic graphical model and deriving the corresponding estimation procedures, identifiability conditions, and uncertainty measures. We examine asymptotic properties of frequentist estimators and posterior consistency in Bayesian SEM, with particular attention to integrated time-series SEM applications such as shadow economy estimation. The analysis shows that while both approaches converge under large-sample conditions, important differences arise in finite samples. Bayesian methods exhibit more stable inference through coherent uncertainty quantification and greater robustness to model misspecification, especially when prior theoretical information is available. In contrast, frequentist estimators rely more heavily on asymptotic assumptions that may be violated in typical economic datasets. These findings suggest that Bayesian SEM offers practical advantages for empirical economic modeling under realistic data constraints, without rejecting the theoretical validity of frequentist methods in large-sample settings.

Article
Computer Science and Mathematics
Probability and Statistics

Zlatko Pangarić

Abstract: This paper presents a new methodological approach to the analysis of numerical sequences that are commonly considered random. This includes the decimal expansion of the number π, stock market indices (e.g., Belex15), pseudorandom numbers (PRNG), cryptographically secure pseudorandom numbers (CSPRNG), physical random number generators (RNG), and quantum random numbers (QRNG). The core method is based on hierarchical computation of higher-order differences and symbolic transformation of signs, enabling structural encoding of each sequence into a symbolic space. The primary objective is to determine whether the decimal expansion of π and related sequences exhibit the same distribution of symbolic patterns as the theoretical model of variations with repetition. The analysis is extended to sequences of 4 and 5 digits, including higher-order differences such as third and fourth order. The results show that empirical distributions of these multilayer structures in the digits of π closely correspond to theoretical distributions derived from all possible variations with repetition. This method opens new possibilities for applications in number theory, cryptography, statistics, and classification of algorithmically generated sequences.
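
The core symbolization step is easy to reproduce in miniature. The sketch below (an illustration, not the paper's pipeline) takes decimal digits of π via mpmath, forms higher-order differences, maps their signs to the symbols {+, 0, −}, and tallies fixed-width patterns; the paper compares such empirical counts against the distribution implied by all variations with repetition.

```python
from collections import Counter
import numpy as np
from mpmath import mp

mp.dps = 10_002
digits = np.array([int(c) for c in mp.nstr(mp.pi, 10_001)[2:]])  # drop "3."

order = 2  # order of the difference hierarchy
width = 4  # symbolic pattern length

d = np.diff(digits, n=order)  # higher-order differences
signs = np.sign(d)            # symbols: -1, 0, +1
patterns = ["".join("+0-"[1 - s] for s in signs[i:i + width])
            for i in range(len(signs) - width + 1)]

counts = Counter(patterns)
total = sum(counts.values())
# 3**width is only the naive pattern-space size; the article's theoretical
# baseline weights patterns by the exact digit-difference probabilities.
print(f"{len(counts)} distinct patterns observed "
      f"(3^{width} = {3**width} possible with repetition)")
for pat, c in counts.most_common(5):
    print(f"{pat}: observed frequency {c / total:.4f}")
```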

Article
Computer Science and Mathematics
Probability and Statistics

Syafi’ Bariq’ Syihabuddin Hidayatullah, Muhammad Ahsan, Wibawati Wibawati

Abstract: One of the main tools in Statistical Process Control (SPC) for monitoring quality is the control chart. Simultaneous multivariate control charts are widely used to monitor shifts in the process mean and variability at the same time. One Shewhart-type simultaneous multivariate chart is the Max-Half-M chart, which can detect both small and large shifts in the mean and variability. However, outliers can distort the estimation of process parameters used to set control limits. In addition, outliers can cause two related problems, namely the masking effect and the swamping effect. Recent studies have highlighted the importance of cellwise outliers. Previous studies have shown that cellwise contamination can trigger outlier propagation. Therefore, casewise-based robust estimators become less relevant under such conditions. CellMCD is a robust method for estimating location and covariance by integrating cellwise outlier detection into a single objective function. This study aims to develop a robust Max-Half-M chart based on cellMCD. Based on simulation studies under different correlation levels and contamination proportions, the proposed chart shows more stable performance than the conventional chart and the robust Fast-MCD–based version, as indicated by higher AUC values and lower FN rates. The ARL analysis also suggests that the cellMCD-based chart tends to detect small to moderate shifts faster. In the real-data application, the cellMCD-based chart successfully detects seven out-of-control signals, which is more than the comparison charts.
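
CellMCD itself is implemented in the R package cellWise and is not reproduced here. As a minimal sketch of the chart mechanics, the code below builds the casewise Fast-MCD comparator mentioned in the abstract (via scikit-learn's MinCovDet) and flags points with a robust Hotelling-type T² statistic against a chi-square limit, under cellwise contamination of the kind the paper targets.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
p, n = 3, 200
clean = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
# Cellwise contamination: corrupt scattered single cells, the setting in
# which casewise MCD loses relevance (the article's motivation).
X = clean.copy()
mask = rng.random(X.shape) < 0.05
X[mask] += 8.0

mcd = MinCovDet(random_state=0).fit(X)  # casewise Fast-MCD comparator
center, cov_inv = mcd.location_, np.linalg.inv(mcd.covariance_)

# Robust Hotelling-type T^2 with a chi-square control limit
# (alpha = 0.0027, the usual 3-sigma-equivalent false-alarm rate).
t2 = np.einsum("ij,jk,ik->i", X - center, cov_inv, X - center)
ucl = chi2.ppf(1 - 0.0027, df=p)
print(f"UCL = {ucl:.2f}; out-of-control points: {(t2 > ucl).sum()} of {n}")
```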
