Preprint
Article

This version is not peer-reviewed.

Skewness and Kurtosis of mRNA Distributions in Stochastic Gene Transcription with Promoter Switching

Submitted:

14 January 2026

Posted:

15 January 2026

You are already at the latest version

Abstract
Gene expression is inherently stochastic, and promoter switching–induced transcriptional bursting generates substantial cell-to-cell variability in mRNA abundance. Such variability is commonly characterized by the mean and variance; however, these low-order statistics fail to capture the geometric features of mRNA copy number distributions and may obscure mechanistic differences in promoter dynamics. In this work, we analyze a two-state stochastic gene transcription model and derive explicit analytical expressions for higher-order moments of mRNA abundance. We show that skewness and kurtosis provide mechanistically informative signatures of transcriptional bursting, explicitly depending on promoter switching kinetics and burst size. In particular, positive skewness increases with slower promoter switching and larger burst sizes, even when the mean expression level is fixed, while elevated kurtosis distinguishes burst-dominated, low-expression regimes from Gaussian-like high-expression regimes. Our results demonstrate that distinct promoter dynamics can produce identical mean expression levels and variances while exhibiting markedly different skewness and kurtosis. Incorporating higher-order statistics, therefore, extends conventional mean–variance analyses and enables improved discrimination between competing stochastic gene expression mechanisms in single-cell data.
Keywords: 
;  ;  ;  ;  

1. Introduction

Gene expression is an inherently stochastic process, arising from the small copy numbers of DNA templates, transcription factors, and mRNA molecules involved in transcriptional regulation. Early theoretical and experimental studies have established that intrinsic molecular fluctuations can lead to pronounced cell-to-cell variability, even in genetically identical populations under homogeneous environmental conditions [1,2,3,4]. With the advent of single-cell and single-molecule experimental techniques, such variability has been directly observed at the mRNA level, revealing substantial heterogeneity in transcriptional output across individual cells [5,6,7]. Quantifying and interpreting this variability has therefore become a central problem in systems biology and mathematical biology.
Variance and the Fano factor are among the most widely used statistical measures for characterizing transcriptional noise. These quantities effectively describe the overall dispersion of gene expression levels and have played an important role in elucidating the relationship between mean expression and noise [8,9,10,11]. However, variance-based measures provide only limited information about the geometric structure of the underlying mRNA copy number distributions. In particular, they are insensitive to distributional asymmetry and tail behavior, which naturally arise from transcriptional bursting driven by stochastic promoter switching dynamics [12,13,14,15]. A detailed understanding of the shape of mRNA distributions is therefore essential for uncovering the molecular mechanisms governing gene regulation and for accurately interpreting experimental data.
In principle, the full probability mass function of mRNA copy numbers contains complete information about distributional geometry. For simple stochastic gene expression models, exact or approximate stationary distributions can sometimes be obtained [15,16]. However, for biologically realistic models with explicit promoter switching, feedback regulation, or time-dependent synthesis rates, analytical expressions for the full distribution are often unavailable or unwieldy [17,18]. Even when closed-form solutions exist, they do not necessarily yield intuitive or easily comparable summaries for experimental analysis. This motivates the introduction of reduced statistical descriptors that capture essential geometric features of the distribution while remaining analytically tractable and experimentally accessible.
In this work, we focus on two such higher-order statistical measures: skewness and kurtosis. Skewness quantifies the degree of asymmetry of the mRNA distribution and indicates whether transcriptional fluctuations are biased toward rare, high-expression events or toward low-expression states. Kurtosis characterizes the sharpness of the distribution peak and the heaviness of its tails relative to a Gaussian distribution, thereby reflecting the prevalence of extreme transcriptional events. Recent studies have demonstrated that these higher-order moments provide sensitive indicators of transcriptional bursting and gene expression heterogeneity, particularly in large single-cell datasets [6,8].
The skewness is defined as the third standardized moment
Skew [ X ] = E X μ σ 3 ,
where μ and σ denote the mean and standard deviation of the random variable X, respectively. The kurtosis is defined as the fourth standardized moment
Kurt [ X ] = E X μ σ 4 3 .
By construction, these quantities are dimensionless and therefore enable direct comparison across genes, experimental conditions, and modeling frameworks.
The remainder of this article is organized as follows. In Section 2, we introduce a two-state stochastic gene transcription model with explicit promoter switching and present its mathematical formulation. Section 3 is devoted to the analytical derivation of skewness and kurtosis for the model. In Section 4, we perform numerical simulations and apply our theoretical results to open-source experimental data to investigate how skewness, kurtosis, and the coefficient of variation depend on key kinetic parameters. Finally, concluding remarks and biological implications are discussed in Section 5.

2. The Model and Expressions of Skewness and Kurtosis

2.1. Two-State Gene Transcription Model

We consider a stochastic two-state gene transcription model, which has been widely used to characterize promoter-mediated transcriptional bursting and has also been adopted in our previous studies. A schematic representation of the model is shown in Figure 1.
In this framework, the promoter of a gene stochastically switches between two discrete states: an inactive (OFF) state and an active (ON) state. Transitions from the OFF state to the ON state occur at a constant rate λ , whereas transitions from the ON state back to the OFF state occur at a rate γ . As a result, the residence times in both promoter states are exponentially distributed. When the promoter is in the ON state, mRNA molecules are synthesized at a constant transcription rate ν , while no transcription occurs when the promoter is in the OFF state. Independent of the promoter dynamics, each mRNA molecule is degraded at a constant rate δ . This minimal model captures the essential features of transcriptional bursting induced by stochastic promoter switching.
Let M ( t ) denote the total number of mRNA transcripts present in a cell at time t, where M ( t ) takes values in the set of nonnegative integers. We introduce an indicator variable I ( t ) to represent the promoter state at time t, with I ( t ) = 0 corresponding to the gene OFF state and I ( t ) = 1 corresponding to the gene ON state. The joint probability distributions are defined as
P 0 ( m , t ) = Prob M ( t ) = m , I ( t ) = 0 , P 1 ( m , t ) = Prob M ( t ) = m , I ( t ) = 1 .
The time evolution of the joint probabilities P 0 ( m , t ) and P 1 ( m , t ) is governed by the following system of chemical master equations:
d P 0 ( m , t ) d t = γ P 1 ( m , t ) λ P 0 ( m , t ) + δ ( m + 1 ) P 0 ( m + 1 , t ) m P 0 ( m , t ) , d P 1 ( m , t ) d t = λ P 0 ( m , t ) γ P 1 ( m , t ) + δ ( m + 1 ) P 1 ( m + 1 , t ) m P 1 ( m , t )
+ ν P 1 ( m 1 , t ) P 1 ( m , t ) ,
for m 0 , with the convention that P 1 ( 1 , t ) = 0 .
The marginal probability mass function of the mRNA copy number is then given by
P ( m , t ) = P 0 ( m , t ) + P 1 ( m , t ) ,
which represents the probability that m mRNA molecules are present in the cell at time t.
The terms in Eqs. (2.1)–(2.2) account, respectively, for stochastic switching of the promoter state, mRNA synthesis in the active state, and first-order degradation of mRNA molecules. This coupled system of master equations fully characterizes the stochastic dynamics of the two-state transcription process.
In principle, the probability mass function P ( m , t ) provides a complete description of the distribution of mRNA copy numbers across a cell population. However, for transcription models of even moderate complexity, obtaining explicit analytical expressions for P ( m , t ) is often intractable. Even for the two-state model considered here, closed-form solutions are available only under restrictive assumptions and are typically expressed in terms of integrals or special functions. Consequently, rather than relying exclusively on the full probability mass function, we focus on higher-order statistical descriptors—such as skewness and kurtosis—to characterize the geometric features of the mRNA copy-number distribution. These quantities provide tractable and informative measures of transcriptional variability when explicit distributional forms are unavailable.

2.2. Moments, Skewness, and Kurtosis of Transcript Distributions

Let μ k ( t ) denote the kth moment of the mRNA transcript number at time t. By definition,
μ k ( t ) = E M k ( t ) = m = 0 m k P ( m , t ) ,
where P ( m , t ) is the marginal probability mass function of the mRNA copy number.
The first moment μ 1 ( t ) corresponds to the mean mRNA copy number and characterizes the average transcription level across the cell population. The second central moment defines the variance,
σ 2 ( t ) = E ( M ( t ) μ 1 ( t ) ) 2 = μ 2 ( t ) μ 1 2 ( t ) ,
which quantifies the magnitude of transcriptional fluctuations around the mean.
To further characterize the geometric properties of the transcript number distribution, we consider higher-order standardized moments. The skewness is defined as
Skew ( t ) = E ( M ( t ) μ 1 ( t ) ) 3 σ 3 ( t ) ,
which measures the degree of asymmetry of the distribution. The excess kurtosis is defined by
Kurt ( t ) = E ( M ( t ) μ 1 ( t ) ) 4 σ 4 ( t ) 3 ,
and quantifies the heaviness of the distribution tails relative to a Gaussian distribution.
Equations (2.5) and (2.6) show that both skewness and kurtosis are fully determined by the first four moments of the transcript number distribution. Consequently, analyzing the dynamics of these moments provides direct insight into the temporal evolution of the shape of mRNA distributions arising from stochastic gene transcription.

2.3. Moment Expressions and Higher-Order Statistics

We now derive explicit expressions for the moments of the transcript number distribution. In the following analysis, we count only newly synthesized mRNA molecules and assume that the promoter initially resides in the gene OFF state at time t = 0 . Accordingly, the initial condition is
M ( 0 ) = 0 , I ( 0 ) = 0 .
Condition (2.7) implies
P 0 ( 0 , 0 ) = 1 , P 0 ( m , 0 ) = 0 for m 1 , P 1 ( m , 0 ) = 0 for m 0 .
Consequently, all moments vanish at the initial time,
μ k ( 0 ) = 0 , k 1 .
Using calculations analogous to those in previous studies, the time-dependent mean mRNA copy number is obtained as
μ 1 ( t ) = ν λ δ ( λ + γ ) ν λ δ ( λ + γ δ ) e δ t + ν λ ( λ + γ ) ( λ + γ δ ) e ( λ + γ ) t .
The last two terms in (2.10) decay to zero exponentially as t . Hence, as t , the mean mRNA copy number converges to the steady-state value
μ 1 = ν λ δ ( λ + γ ) .
Similarly, the second moment of the transcript number distribution at steady state is given by
μ 2 = μ 1 1 + ν ( δ + λ ) δ ( δ + λ + γ ) .
The variance of the transcript number distribution then follows as
σ 2 = μ 2 μ 1 2 = μ 1 1 + ν γ ( λ + γ ) ( δ + λ + γ ) ,
where σ denotes the standard deviation.
The squared coefficient of variation (noise) and the Fano factor are therefore given by
CV 2 = σ 2 μ 1 2 = 1 μ 1 + δ γ λ ( δ + λ + γ ) , Φ = σ 2 μ 1 = 1 + ν γ ( λ + γ ) ( δ + λ + γ ) .
These quantities characterize the dispersion of transcript numbers across single cells but do not fully describe the shape of the underlying distribution.
To further elucidate the distributional geometry of mRNA copy numbers, we consider the skewness and kurtosis. By definition,
Skew = μ 3 3 μ 1 μ 2 + 2 μ 1 3 σ 3 ,
Kurt = μ 4 4 μ 1 μ 3 + 6 μ 1 2 μ 2 3 μ 1 4 σ 4 3 ,
where μ 3 and μ 4 denote the third and fourth moments of the transcript number distribution, respectively. Explicit expressions for these higher-order moments are given by
μ 3 = μ 1 1 + 3 ν ( δ + λ ) δ ( δ + λ + γ ) + ν 2 ( δ + λ ) ( 2 δ + λ ) δ 2 ( δ + λ + γ ) ( 2 δ + λ + γ ) , μ 4 = μ 1 1 + 7 ν ( δ + λ ) δ ( δ + λ + γ ) + 6 ν 2 ( δ + λ ) ( 2 δ + λ ) δ 2 ( δ + λ + γ ) ( 2 δ + λ + γ ) + ν 3 ( δ + λ ) ( 2 δ + λ ) ( 3 δ + λ ) δ 3 ( δ + λ + γ ) ( 2 δ + λ + γ ) ( 3 δ + λ + γ ) .
These expressions explicitly reveal how skewness and kurtosis depend on transcriptional kinetics, promoter switching dynamics, and mRNA degradation rates.
Biological interpretation of skewness and kurtosis. Skewness and kurtosis provide biologically meaningful information about the shape of the mRNA copy number distribution beyond what is captured by the mean and variance. A positive value of Skew indicates a pronounced right tail, corresponding to rare cells exhibiting exceptionally high transcript abundance. Such asymmetry is a characteristic feature of stochastic gene expression and is commonly associated with transcriptional bursting.
Kurtosis quantifies the weight of the distribution tails relative to a Gaussian distribution. A large positive value of Kurt implies an elevated probability of extreme transcriptional events, reflecting strong cell-to-cell heterogeneity. Together, skewness and kurtosis offer intuitive and quantitative measures of how transcriptional noise manifests at the population level and how strongly the mRNA distribution deviates from symmetric, light-tailed behavior.

3. Discussion

The explicit moment expressions derived above enable a systematic asymptotic analysis of the skewness and kurtosis in biologically relevant limits. In this section, we summarize the leading-order behavior of these higher-order statistics in three canonical parameter regimes that are commonly encountered in stochastic gene expression models. All kinetic parameters are nondimensionalized by the mRNA degradation rate δ . Under this rescaling, both the numerical values and qualitative features of the moments remain unchanged. Therefore, without loss of generality, we set δ = 1 in the following analysis.
The asymptotic structure of the skewness and kurtosis is primarily governed by the relative timescales of promoter switching and mRNA degradation. Depending on whether promoter switching is fast or slow compared to transcript turnover, qualitatively distinct transcriptional behaviors emerge, each associated with characteristic distributional signatures [19,20,21,22]. Accordingly, we classify the asymptotic regimes into fast promoter switching, slow promoter switching, and a bursty transcription limit. For each regime, we derive the leading-order asymptotic expressions for skewness and kurtosis and interpret their implications for transcriptional variability and distributional shape.
Fast promoter switching regime.      Assume
λ + γ 1 ,
so that promoter switching is much faster than mRNA degradation [23]. In this limit, the promoter rapidly equilibrates on the timescale of mRNA dynamics, and the system effectively reduces to a single birth–death process with mean
μ 1 = ν λ λ + γ .
Expanding the higher-order moments in powers of ( λ + γ ) 1 yields, to leading order,
μ 2 μ 1 2 + μ 1 , μ 3 μ 1 3 + 3 μ 1 2 + μ 1 , μ 4 μ 1 4 + 6 μ 1 3 + 7 μ 1 2 + μ 1 .
Consequently, the skewness and excess kurtosis scale asymptotically as
Skew 1 μ 1 , Kurt 1 μ 1 .
Thus, both skewness and excess kurtosis decrease as the mean expression level increases, and the mRNA copy number distribution approaches a Poisson-like form with weak asymmetry and light tails.
Slow promoter switching regime.      Assume
λ + γ 1 ,
so that mRNA molecules degrade rapidly relative to promoter switching [24,25]. In this regime, transcription occurs in well-separated episodes, leading to pronounced cell-to-cell heterogeneity. Retaining the leading-order terms in ( λ + γ ) , we obtain the asymptotic expressions
μ 2 μ 1 2 1 + γ λ , μ 3 μ 1 3 1 + 3 γ λ + 2 γ 2 λ 2 , μ 4 μ 1 4 1 + 6 γ λ + 11 γ 2 λ 2 + 6 γ 3 λ 3 .
Consequently, the skewness and excess kurtosis scale as
Skew 2 γ λ , Kurt 6 γ λ .
Both quantities increase with the ratio γ / λ , reflecting strong distributional asymmetry and heavy-tailed behavior driven by long inactive periods and rare promoter activation events. Although the distribution may exhibit bimodality in this regime, the large excess kurtosis originates from intermittent transcription bursts and long OFF periods, which generate strong deviations from Poisson-like statistics.
Bursty transcription regime.      Assume
λ γ , ν 1 ,
while keeping the burst size
B = ν γ
finite. In this limit, transcription occurs in rare but intense bursts. The mean mRNA copy number is
μ 1 = λ B .
The higher-order moments are asymptotically given by
μ 2 μ 1 2 ( 1 + B ) , μ 3 μ 1 3 ( 1 + 3 B + 2 B 2 ) , μ 4 μ 1 4 ( 1 + 6 B + 11 B 2 + 6 B 3 ) .
From these, the skewness and excess kurtosis scale as
Skew 2 + B μ 1 , Kurt 6 + 6 B + B 2 μ 1 .
These expressions show that increasing the burst size amplifies both skewness and excess kurtosis, even when the mean expression level is fixed, providing a natural mechanism for generating highly asymmetric, heavy-tailed mRNA distributions.
The asymptotic behavior of the mean, variance, skewness, and kurtosis in different promoter switching and transcription regimes is summarized in Table 1.
Table 1. Asymptotic expressions for steady-state mRNA statistics in different promoter switching and transcription regimes, including the squared coefficient of variation (CV2).
Table 1. Asymptotic expressions for steady-state mRNA statistics in different promoter switching and transcription regimes, including the squared coefficient of variation (CV2).
Regime Mean μ 1 Variance σ 2 CV2 Skewness Kurtosis
Fast switching ν λ λ + γ μ 1 μ 1 1 μ 1 1 / 2 μ 1 1
Slow switching ν λ λ + γ μ 1 2 1 + γ λ 1 + γ λ 2 γ λ 6 γ λ
Bursty transcription λ B μ 1 ( 1 + B ) 1 + B μ 1 2 + B μ 1 6 + 6 B + B 2 μ 1
Connection to negative binomial distributions. In bursty transcription limits, the steady-state mRNA copy number distribution converges to a negative binomial distribution with mean μ 1 and variance
σ 2 = μ 1 ( 1 + B ) ,
where B denotes the burst size. The negative binomial probability mass function (PMF) can be written as
P ( m ) = m + r 1 m p r ( 1 p ) m ,
with parameters
r = μ 1 2 σ 2 μ 1 = μ 1 B , p = μ 1 σ 2 = 1 1 + B .
This negative binomial distribution reproduces the asymptotic expressions for skewness and excess kurtosis derived above:
Skew = 2 + B μ 1 , Kurt = 6 + 6 B + B 2 μ 1 .
This explicit correspondence provides a theoretical justification for the widespread use of negative binomial models in fitting single-cell RNA-sequencing data, where transcriptional bursting and promoter switching naturally generate transcript count distributions that are overdispersed relative to a Poisson process. The relative timescale of promoter switching and mRNA degradation critically determines transcriptional variability and distributional shape. In the fast-switching regime, the promoter rapidly equilibrates between active and inactive states, so that transcription effectively proceeds at an averaged rate [20,26]. Consequently, transcriptional bursts are weak and the mRNA copy number distribution approaches a Poisson-like form with reduced skewness and kurtosis. In contrast, slow promoter switching gives rise to long inactive periods punctuated by rare activation events that generate intense transcriptional bursts [7,14,27]. This temporal clustering leads to highly asymmetric and heavy-tailed mRNA distributions, characterized by large positive skewness and elevated kurtosis. Importantly, these higher-order features persist even when mean expression levels are fixed, demonstrating that slow promoter switching encodes mechanistic information beyond that captured by mean–variance statistics alone.

4. Numerical Simulations

In this section, all parameters are nondimensionalized by the mRNA degradation rate. We define
ν = 200 min 1 , λ = 0.5 min 1 , γ = 0.5 min 1 , δ = 1 min 1 .
This normalization does not affect the values or qualitative properties of the mean, skewness, or kurtosis of the mRNA distribution.
To identify the dominant regulatory factors underlying the skewness and kurtosis of mRNA copy number distributions, we examined their joint dependence on the promoter activation rate λ and the mRNA degradation rate γ using three-dimensional representations. In Figure 2, the three-dimensional surfaces display the values of the skewness (A) and kurtosis (B) of the steady-state mRNA distribution as functions of λ and γ , both varying within the range [ 0.05 , 5 ] . Red dots mark the locations of the minimum values on each surface. Slow promoter switching, corresponding to small λ or γ , gives rise to large skewness and kurtosis due to pronounced transcriptional bursting, whereas faster switching reduces higher-order variability. Interestingly, when λ and γ are comparable in magnitude and both are small, a global minimum of both skewness and kurtosis is observed. These surfaces provide a clear visualization of how promoter kinetics shape mRNA variability across a biologically relevant parameter space.
To further assess the impact of promoter activation dynamics, we compare two representative regimes: a slow-switching promoter ( λ = 0.5 ) and a fast-switching promoter ( λ = 5 ), while keeping all other parameters fixed. Figure 3 shows the steady-state distributions of mRNA copy numbers obtained from exact Gillespie simulations of the two-state gene transcription model.
For slow promoter activation ( λ = 0.5 , Figure 3A), the gene spends long periods in the inactive state, interrupted by rare but extended activation events. During each active period, mRNA molecules are produced rapidly, giving rise to pronounced transcriptional bursts. As a consequence, the distribution is strongly right-skewed. Moreover, the coexistence of low- and high-expression subpopulations leads to a bimodal structure, in which probability mass is distributed between two modes rather than concentrated in extreme tails, resulting in a reduced kurtosis compared to burst-dominated, heavy-tailed distributions.
In contrast, for fast promoter activation ( λ = 5 , Figure 3B), frequent switching between the active and inactive states effectively averages transcriptional activity over time. This suppresses burstiness and yields a narrower and more symmetric mRNA distribution with reduced skewness and kurtosis.
In both regimes, the vertical lines denote the mean mRNA copy number. Analytical predictions derived from the exact expressions of the first four moments are in quantitative agreement with the simulation results, validating the theoretical analysis of mean expression levels and higher-order fluctuations.

5. Conclusions

Stochastic gene expression is most often characterized using low-order statistical descriptors such as the mean expression level, variance, or related noise measures including the Fano factor. While these quantities effectively quantify the overall magnitude of transcriptional variability, it has long been recognized that they are insufficient to capture the full geometric structure of mRNA copy number distributions generated by promoter switching and transcriptional bursting [3,11]. In this work, we demonstrate that skewness provides essential information about distributional asymmetry that cannot be inferred from variance alone.
Positive skewness emerges naturally from bursty transcription dynamics, in which rare promoter activation events give rise to intense transcriptional episodes [7,14,27]. Our analytical results show that the skewness Skew depends explicitly on promoter switching kinetics and burst size, increasing with the ratio γ / λ and with the burst size B = ν / γ , even when the mean expression level is held fixed. This establishes skewness as a mechanistically informative statistic that reflects the underlying regulatory architecture rather than merely the overall noise magnitude. From a biological perspective, skewness quantifies the contribution of rare, high-expression cells within a population. Such cells can play a disproportionate role in downstream regulatory processes, including threshold-dependent responses, stress adaptation, and phenotypic diversification [25,28]. Variance-based measures treat fluctuations symmetrically around the mean and therefore obscure the statistical weight of these rare transcriptional events, whereas skewness directly captures the asymmetry induced by transcriptional bursting and slow promoter switching.
Beyond skewness, kurtosis further characterizes the sharpness and tail heaviness of mRNA distributions. Our results indicate that low mRNA abundance regimes are associated with elevated skewness and kurtosis, reflecting highly asymmetric and heavy-tailed distributions dominated by rare transcriptional bursts. As mRNA abundance increases, both skewness and kurtosis decrease, and the distribution approaches a Gaussian-like form. This behavior is consistent with effective averaging over many transcription events and aligns with previous theoretical analyses of noise–mean relationships in gene expression models [8,29].
Finally, our analysis provides a theoretical foundation for extending negative binomial–based analyses of single-cell transcriptomic data beyond conventional mean–variance fitting [6]. Distinct promoter dynamics may generate identical mean expression levels and variances while exhibiting markedly different skewness and kurtosis, a limitation that has been highlighted in recent studies on model identifiability and model reduction [26,30]. Incorporating higher-order statistics therefore offers a principled route toward improved discrimination between competing transcriptional mechanisms and establishes a tighter quantitative link between statistical inference and the underlying molecular regulation of gene expression.
Limitations and future directions. The present study focuses on a minimal two-state promoter model with constant kinetic parameters, which, while analytically tractable, represents an idealized description of transcriptional regulation. In particular, real promoters often exhibit multiple inactive or active states, transcription factor–dependent switching rates, and memory effects that can give rise to non-exponential waiting times between transcriptional events. Extending the analytical framework developed here to multi-state promoter architectures or non-Markovian switching processes would provide a more comprehensive characterization of transcriptional heterogeneity.
Moreover, our analysis is restricted to intrinsic noise arising from stochastic transcription and promoter switching and does not explicitly account for extrinsic sources of variability such as cell-to-cell differences in transcription factor abundance, cell cycle stage, or global physiological state. Incorporating extrinsic fluctuations, either through additional stochastic variables or through hierarchical modeling approaches, represents an important direction for future work.
Finally, while the present results establish skewness and kurtosis as mechanistically informative statistics at the mRNA level, an important open question concerns how these higher-order features propagate through translation and post-transcriptional regulation to shape protein-level variability. Systematic analysis of higher-order moments in coupled mRNA–protein models, together with direct comparisons to single-cell transcriptomic and proteomic data, will be essential for further bridging stochastic gene expression theory with experimental inference.

Author Contributions

ST and QS conceived the study, ST carried out the analysis, and ST and QS wrote the draft, all authors revised the manuscript critically and approved it for publishing.

Funding

This work was supported by the National Natural Science Foundation of China (No. 12171113) and the China Scholarship Council (No. 202409940012).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no competing interests.

Appendix A Detailed Derivation of Moments and Asymptotic Formulas

In this appendix, we give a step-by-step derivation of the mean, variance, skewness, and excess kurtosis of mRNA distributions in the two-state promoter model. The derivation proceeds from the chemical master equations (CME) (2.1)–(2.2) for the OFF/ON states.

Appendix A.1. Promoter State Probabilities

Summing over all transcript numbers m, define
P 0 ( t ) = m = 0 P 0 ( m , t ) , P 1 ( t ) = m = 0 P 1 ( m , t ) ,
which give the probability that the promoter resides in the OFF and ON states, respectively. Since these form a complete set of promoter states,
P 0 ( t ) + P 1 ( t ) = 1 , t 0 .
Differentiating and using the CME, we obtain
P 0 ( t ) = γ P 1 ( t ) λ P 0 ( t ) , P 1 ( t ) = λ P 0 ( t ) γ P 1 ( t ) .
Solving with initial conditions P 0 ( 0 ) = 1 , P 1 ( 0 ) = 0 , we get
P 0 ( t ) = γ λ + γ + λ λ + γ e ( λ + γ ) t , P 1 ( t ) = λ λ + γ λ λ + γ e ( λ + γ ) t .
The steady-state probabilities are
P 0 = γ λ + γ , P 1 = λ λ + γ .

Appendix A.2. First Moment (Mean)

Define the k-th moment of transcript numbers as
μ k ( t ) = m = 0 m k P ( m , t ) , with P ( m , t ) = P 0 ( m , t ) + P 1 ( m , t ) .
Decomposing by promoter state,
μ k ( t ) = μ k 0 ( t ) + μ k 1 ( t ) , μ k 0 ( t ) = m = 0 m k P 0 ( m , t ) , μ k 1 ( t ) = m = 0 m k P 1 ( m , t ) .
For k = 1 (the mean),
μ 10 ( t ) = γ μ 11 ( t ) ( λ + δ ) μ 10 ( t ) ,
μ 11 ( t ) = ν P 1 ( t ) + λ μ 10 ( t ) ( γ + δ ) μ 11 ( t ) .
Step 1: Steady-state solution.
Setting derivatives to zero, the equilibrium moments are
μ 10 = ν λ γ δ ( λ + γ ) ( δ + λ + γ ) , μ 11 = ν λ ( δ + λ ) δ ( λ + γ ) ( δ + λ + γ ) .
The total mean is
μ 1 = μ 10 + μ 11 = ν λ δ ( λ + γ ) .
Step 2: Time-dependent solution.
Solving (A1)–(A2) with initial condition μ 10 ( 0 ) = μ 11 ( 0 ) = 0 yields
μ 10 ( t ) = μ 10 A 1 e δ t A 2 e ( λ + γ ) t A 3 e ( δ + λ + γ ) t , μ 11 ( t ) = μ 11 B 1 e δ t B 2 e ( λ + γ ) t + B 3 e ( δ + λ + γ ) t ,
with constant coefficients A i , B i . The sum gives the time-dependent mean μ 1 ( t ) .

Appendix A.3. Second Moment (Variance)

Similarly, for k = 2 ,
μ 20 ( t ) = δ μ 10 ( t ) + γ μ 21 ( t ) ( 2 δ + λ ) μ 20 ( t ) , μ 21 ( t ) = ν P 1 ( t ) + 2 ν μ 11 ( t ) + δ μ 11 ( t ) + λ μ 20 ( t ) ( 2 δ + γ ) μ 21 ( t ) .
Step 1: Steady-state solution.
Setting derivatives to zero,
μ 20 = ν λ γ [ ν ( δ + λ ) + δ ( 2 δ + λ + γ ) ] δ 2 ( λ + γ ) ( δ + λ + γ ) ( 2 δ + λ + γ ) , μ 21 = ν λ ( δ + λ ) [ ν ( 2 δ + λ ) + δ ( 2 δ + λ + γ ) ] δ 2 ( λ + γ ) ( δ + λ + γ ) ( 2 δ + λ + γ ) .
The total second moment is μ 2 = μ 20 + μ 21 , and the variance is
σ 2 = μ 2 μ 1 2 .

Appendix A.4. Third and Fourth Moments

Third moment:
μ 30 ( t ) = γ μ 31 ( t ) δ μ 10 ( t ) + 3 δ μ 20 ( t ) ( 3 δ + λ ) μ 30 ( t ) , μ 31 ( t ) = ν [ P 1 ( t ) + 3 μ 11 ( t ) + 3 μ 21 ( t ) ] + λ μ 30 ( t ) δ μ 11 ( t ) + 3 δ μ 21 ( t ) ( 3 δ + γ ) μ 31 ( t ) .
At steady state,
μ 3 = μ 30 + μ 31 .
Fourth moment:
μ 4 ( t ) = ν P 1 ( t ) + 4 μ 11 ( t ) + 6 μ 21 ( t ) + 4 μ 31 ( t ) + δ μ 1 ( t ) 4 μ 2 ( t ) + 6 μ 3 ( t ) 4 μ 4 ( t ) .
Steady-state solution μ 4 is obtained by setting derivative to zero and solving algebraically.

Appendix B Derivation of Asymptotic Formulas

In this appendix, we present the detailed derivation of the asymptotic expressions for the moments, skewness, and excess kurtosis in different transcription regimes. We start from the exact expressions for the first four moments derived in Appendix A.

Appendix B.1. Fast-Switching Regime

In the fast-switching limit, the promoter switching rates are much larger than mRNA degradation,
λ + γ 1 .
The exact moments contain terms proportional to ( λ + γ ) 1 and transient exponential terms e ( λ + γ ) t . Since these decay rapidly, we can neglect them at steady state. Expanding the exact expressions in powers of ( λ + γ ) 1 , we obtain
μ 1 = ν λ λ + γ , μ 2 = μ 1 2 + μ 1 + O ( ( λ + γ ) 1 ) , μ 3 = μ 1 3 + 3 μ 1 2 + μ 1 + O ( ( λ + γ ) 1 ) , μ 4 = μ 1 4 + 6 μ 1 3 + 7 μ 1 2 + μ 1 + O ( ( λ + γ ) 1 ) .
From the definitions of skewness and excess kurtosis,
Skew = μ 3 3 μ 1 μ 2 + 2 μ 1 3 σ 3 , Kurt = μ 4 4 μ 1 μ 3 + 6 μ 1 2 μ 2 3 μ 1 4 σ 4 ,
we obtain to leading order
Skew 1 μ 1 , Kurt 1 μ 1 .
Thus, in the fast-switching regime, the distribution is close to Poisson, with weak asymmetry and light tails.

Appendix B.2. Slow-Switching Regime

In the slow-switching limit, mRNA degradation is fast relative to promoter switching,
λ + γ 1 .
In the exact expressions for μ 2 , μ 3 , μ 4 , the terms 1 + λ + γ 1 and 2 + λ + γ 2 , etc. Retaining leading-order terms in λ + γ yields
μ 2 μ 1 2 1 + γ λ , μ 3 μ 1 3 1 + 3 γ λ + 2 γ 2 λ 2 , μ 4 μ 1 4 1 + 6 γ λ + 11 γ 2 λ 2 + 6 γ 3 λ 3 .
Plugging into the skewness and excess kurtosis formulas gives
Skew 2 γ λ , Kurt 6 γ λ .
Derivation step:
1. Start from the exact moments (see Appendix A), e.g.,
μ 2 = μ 1 1 + ν ( 1 + λ ) 1 + λ + γ μ 1 2 1 + γ λ ,
using 1 λ + γ . 2. Similarly, expand μ 3 and μ 4 in powers of λ + γ , keeping the leading terms. 3. Compute central moments to obtain skewness and excess kurtosis.

Appendix B.3. Bursty Transcription Regime

The bursty limit corresponds to rare but intense transcription events,
λ γ , ν δ ,
while keeping the burst size
B = ν γ = finite .
Step 1: Mean and variance
In this limit, the mean and variance reduce to
μ 1 = λ B δ , σ 2 = μ 1 ( 1 + B ) ,
which coincide with a negative binomial distribution.
Step 2: Higher-order moments
Expanding the exact expressions of μ 3 , μ 4 in powers of B and keeping leading-order terms,
μ 3 μ 1 ( 1 + 3 B + 2 B 2 ) , μ 4 μ 1 2 ( 1 + 6 B + 11 B 2 + 6 B 3 ) ,
which after standardization yield
Skew 2 + B μ 1 , Kurt 6 + 6 B + B 2 μ 1 .
Step 3: Connection to negative binomial
The negative binomial PMF with parameters
r = μ 1 2 σ 2 μ 1 , p = μ 1 σ 2
reproduces the above skewness and excess kurtosis, confirming that bursty transcription naturally leads to overdispersed transcript count distributions.
Taken together, the slow-switching and bursty limits illustrate how promoter kinetics and burst size control the higher-order statistics of mRNA distributions, including skewness and excess kurtosis.

References

  1. Elowitz, M; Levine, A; Siggia, E; et al. Stochastic gene expression in a single cell. Science 2002, 297(5584), 1183–1186. [Google Scholar] [CrossRef]
  2. Hansen, A; Zechner, C. Promoters adopt distinct dynamic manifestations depending on transcription factor context. Molecular Systems Biology 2021, 17, MSB20209821. [Google Scholar] [CrossRef] [PubMed]
  3. Paulsson, J. Models of stochastic gene expression. Physics of Life Reviews 2004, 1(3), 157–175. [Google Scholar] [CrossRef]
  4. Swain, P; Elowitz, M; Siggia, E. Intrinsic and extrinsic contributions to stochasticity in gene expression. Proceedings of the National Academy of Sciences 2002, 99(20), 12795–12800. [Google Scholar] [CrossRef]
  5. Aitken, S; Robert, M; Alexander, R; et al. Processivity and coupling in messenger RNA transcription. PLoS One 2010, 5, e8845. [Google Scholar] [CrossRef]
  6. Church, B; Williams, H; Mar, J. Investigating skewness to understand gene expression heterogeneity in large patient cohorts. BMC Bioinformatics 2019, 20, 668. [Google Scholar] [CrossRef] [PubMed]
  7. Golding, I; Paulsson, J; Zawilski, S; et al. Real-time kinetics of gene activity in individual bacteria. Cell 2005, 123(6), 1025–1036. [Google Scholar] [CrossRef]
  8. Kuang, J; Tang, M; Yu, J. The mean and noise of protein numbers in stochastic gene expression. Journal of Mathematical Biology 2013, 67, 261–291. [Google Scholar] [CrossRef]
  9. Sun, Q; Tang, M; Yu, J. Temporal profile of gene transcription noise modulated by cross-talking signal transduction pathways. Bulletin of Mathematical Biology 2012, 74, 375–398. [Google Scholar] [CrossRef]
  10. Sun, Q; Jiao, F; Yu, J. The dynamics of gene transcription with a periodic synthesis rate. Nonlinear Dynamics 2021, 104, 4477–4492. [Google Scholar] [CrossRef]
  11. Tang, M. The mean and noise of stochastic gene transcription. Journal of Theoretical Biology 2008, 253, 271–280. [Google Scholar] [CrossRef]
  12. Blake, W; Kærn, M; Cantor, C. Noise in eukaryotic gene expression. Nature 2003, 422(6932), 633–637. [Google Scholar] [CrossRef]
  13. Lin, G; Yu, J; Zhou, Z; et al. Fluctuations of mRNA distributions in multiple pathway activated transcription. Discrete and Continuous Dynamical Systems - B 2019, 24(4), 1543–1568. [Google Scholar] [CrossRef]
  14. Liu, F; Wang, W; Ni, T; et al. Gene transcription in bursting: a unified mode for realizing accuracy and stochasticity. Biological Reviews 2019, 94(1), 248–258. [Google Scholar]
  15. Peccoud, J; Ycart, B. Markovian modeling of gene product synthesis. Theoretical Population Biology 1995, 48(2), 138–164. [Google Scholar] [CrossRef]
  16. Mackey, M; Tyran-Kamińska, M; Yvinec, R. Dynamic behavior of stochastic gene expression models in the presence of bursting. SIAM Journal on Applied Mathematics 2013, 73, 1830–1852. [Google Scholar] [CrossRef]
  17. Bokes, P. Stationary and time-dependent molecular distributions in slow-fast feedback Circuits. SIAM Journal on Applied Dynamical Systems 2022, 21(2), 903–931. [Google Scholar] [CrossRef]
  18. Jia, C; Li, Y. Analytical Time-Dependent Distributions for Gene Expression Models With Complex Promoter Switching Mechanisms. SIAM Journal on Applied Mathematics 2023, 83(4), 1572–1602. [Google Scholar] [CrossRef]
  19. da Silva, L; Yvinec, R; Prata, G; et al. Two-state stochastic model of in vivo observations of transcriptional bursts. Brazilian Journal of Physics 2025, 55, 150. [Google Scholar] [CrossRef]
  20. Jia, C; Grima, R. Dynamical phase diagram of an auto-regulating gene in fast switching conditions. The Journal of Chemical Physics 2020, 152, 174110. [Google Scholar] [CrossRef]
  21. Pedraza, J; Paulsson, J. Effects of molecular memory and bursting on fluctuations in gene expression. Science 2008, 319(5867), 1056–1060. [Google Scholar] [CrossRef] [PubMed]
  22. Zhang, Q; Cao, W; Wang, J; et al. Transcriptional bursting dynamics in gene expression. Frontiers Genetics 2024, 15, 1451461. [Google Scholar] [CrossRef]
  23. Sepúlveda, L; Xu, H; Zhang, J; et al. Measurement of gene regulation in individual cells reveals rapid switching between promoter states. Science 2016, 351, 1218–1222. [Google Scholar] [CrossRef] [PubMed]
  24. Lin, Y; Buchler, N. Efficient analysis of stochastic gene dynamics in the non-adiabatic regime using piecewise deterministic Markov processes. Journal of The Royal Society Interface 2018, 15(138), 20170804. [Google Scholar] [CrossRef]
  25. Thomas, P; Popović, N; Grima, R. Phenotypic switching in gene regulatory networks. Proceedings of the National Academy of Sciences 2014, 111(19), 6994–6999. [Google Scholar] [CrossRef]
  26. Holehouse, J; Grima, R. Revisiting the reduction of stochastic models of genetic feedback loops with fast promoter switching. Biophysical Journal 2019, 117(7), 1311–1330. [Google Scholar] [CrossRef] [PubMed]
  27. Hebenstreit, D; Karmakar, P. Transcriptional bursting: from fundamentals to novel insights. Biochemical Society Transactions 2024, 52(4), 1695–1702. [Google Scholar] [CrossRef]
  28. Fraser, D; Kærn, M. A chance at survival: gene expression noise and phenotypic diversification strategies. Molecular Microbiology 2009, 71(6), 1333–1340. [Google Scholar] [CrossRef]
  29. Hornung, G; Bar-Ziv, R; Rosin, D; et al. Noise–mean relationship in mutated promoters. Genome Research 2012, 22(12), 2409–2417. [Google Scholar] [CrossRef]
  30. Jiao, F; Li, J; Liu, T; et al. What can we learn when fitting a simple telegraph model to a complex gene expression model? PLoS Computational Biology 2024, 20(5), e1012118. [Google Scholar] [CrossRef]
Figure 1. Schematic of the two-state telegraph model for mRNA bursty transcription. The gene promoter stochastically switches between an inactive state and an active state. No mRNA is produced when the promoter is inactive, whereas during the active state, mRNA molecules are synthesized in bursts, with burst initiation events occurring as a Poisson process at rate ν . Each mRNA molecule degrades independently at rate δ .
Figure 1. Schematic of the two-state telegraph model for mRNA bursty transcription. The gene promoter stochastically switches between an inactive state and an active state. No mRNA is produced when the promoter is inactive, whereas during the active state, mRNA molecules are synthesized in bursts, with burst initiation events occurring as a Poisson process at rate ν . Each mRNA molecule degrades independently at rate δ .
Preprints 194248 g001
Figure 2. Skewness and kurtosis of the steady-state mRNA distribution as functions of the promoter switching rates λ and γ. Red dots indicate the minimum values on each surface.
Figure 2. Skewness and kurtosis of the steady-state mRNA distribution as functions of the promoter switching rates λ and γ. Red dots indicate the minimum values on each surface.
Preprints 194248 g002
Figure 3. Steady-state mRNA distributions for different promoter activation rates. (A) Slow promoter activation ( λ = 0.5 ). (B) Fast promoter activation ( λ = 5 ).
Figure 3. Steady-state mRNA distributions for different promoter activation rates. (A) Slow promoter activation ( λ = 0.5 ). (B) Fast promoter activation ( λ = 5 ).
Preprints 194248 g003
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2026 MDPI (Basel, Switzerland) unless otherwise stated