Introduction
Adverse drug reactions (ADRs) represent a significant source of morbidity and mortality. Therefore, these medication-based complications negatively impact both the cost and quality of health care [1, 2].
The Zipf-Mandelbrot (ZM) law has been applied to mathematically model a wide range of phenomena including linguistics [3], insurance risk [4], scientific citations [5], web hits [6], economics [7], and urban population [8]. Furthermore, it is frequently referred to as the Pareto-Zipf distribution [9].
This paper examines the applicability of the ZM law to mathematically model US FDA reported ADRs for the following medications: fentanyl, propofol, albumin, succinylcholine, ketamine, and isoflurane. Although these six commonly used medications are pharmacologically dissimilar, their ADRs are collectively represented using this model with medication-specific coefficients. The ZM law is an appropriate choice for modeling these data because ADRs are highly skewed and inherently rank-based. Consequently, a small number of common ADRs account for the majority of reports, while a multitude of less frequent reactions constitute the long tail of the distribution.
For the purposes of this research, the ZM law will be represented as:
$$f(r) = \frac{C}{(r + b)^{a}}$$
where $f(r)$ represents the percentage occurrence of each FDA adverse event associated with a sequential rank, $r$. Moreover, $r$ is a dimensionless natural integer ranging from 1 to $N$, where $N$ represents the total number of unique medication-specific ADRs.
Note that the maximum value for $f(r)$ always occurs at $r = 1$, which is the most frequently reported adverse reaction, whereas $f(N)$ represents the percentage occurrence of the least reported adverse reaction, which occurs at $r = N$.
Furthermore, $a$, $b$, and $C$ are all positive real numbers. It should be noted that small positive values for $a$ which are less than one do not generate sufficient skewness to adequately represent these clinical data.
In addition, coefficients $a$ and $b$ are both dimensionless and are determined using curve-fitting. When $b = 0$ the ZM law reduces to a basic power law distribution. Furthermore, the inclusion of $b$ improves the fit for the initial-ranked observations compared to a basic power law. Notably, coefficient $b$ is referred to as the Mandelbrot shift parameter.
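As a minimal sketch of these definitions, the following Python snippet evaluates the ZM law in the assumed form $C/(r + b)^{a}$ and checks that a shift parameter of zero recovers a basic power law. The function names and the coefficient values are hypothetical and chosen only for illustration; they are not fitted values from this study.

```python
def zipf_mandelbrot(rank, a, b, c):
    """Percentage occurrence predicted at `rank` by the form c / (rank + b) ** a."""
    if rank < 1:
        raise ValueError("rank must be an integer >= 1")
    return c / (rank + b) ** a


def power_law(rank, a, c):
    """Basic power law c / rank ** a, i.e. the ZM form with no shift."""
    return c / rank ** a


# Hypothetical coefficients, for illustration only (not fitted to FDA data).
A, B, C = 1.5, 2.0, 60.0

# Predicted percentages for the first ten ranks; the largest value is at rank 1.
curve = [zipf_mandelbrot(r, A, B, C) for r in range(1, 11)]
for r, value in enumerate(curve, start=1):
    print(f"rank {r:2d}: {value:6.2f}")

# With b = 0 the two forms coincide at every rank.
same = all(
    abs(zipf_mandelbrot(r, A, 0.0, C) - power_law(r, A, C)) < 1e-12
    for r in range(1, 11)
)
print("b = 0 reduces to the power law:", same)
```

A positive shift lowers the predicted value at the first few ranks relative to the unshifted power law with the same exponent and scale, which is where the two forms differ most.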
The following limits are therefore straightforward and are fundamental in understanding the properties of the ZM law:
and:
Specifically, these limits reinforce the behavior of the “long tail” of the ZM law where less common and rare ADRs occur. In addition, the behavior of the ZM law, within its initial ranks, is extremely important given that the most common and clinically important ADRs occur within this region.
By definition, the sum of the total occurrences of ADRs expressed in percent form is:
$$\sum_{r=1}^{N} f(r) = 100\%$$
However, the reader should be cognizant that the original US FDA data are supplied or downloaded in decimal form:
$$\sum_{r=1}^{N} \frac{f(r)}{100} = 1$$
For this study, the non-linear ZM law is statistically evaluated by comparing it to the medication-specific reported data:
$$f_{\text{reported}}(r) \approx \frac{C}{(r + b)^{a}}$$
Therefore, the reported adverse events for each medication’s ADRs, $f_{\text{reported}}(r)$, will be compared to the adverse events predicted by the ZM law, with coefficients $a$ and $b$ determined by curve fitting.
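The comparison of reported and predicted values can be illustrated with a deliberately simple fitting sketch. The fitting procedure used in this study is not specified here, so everything below is an assumption: the ZM form $C/(r + b)^{a}$, a scale $C$ fixed by requiring the predicted percentages to sum to 100, a coarse grid search over the exponent and shift, and a sum-of-squared-errors criterion. The "reported" values are synthetic, generated from the model itself, and are not FDA data.

```python
def zm_percentages(a, b, n_ranks):
    """ZM percentages for ranks 1..n_ranks, scaled so that they sum to 100.

    Assumption: the scale is set by normalization rather than fitted freely.
    """
    weights = [(r + b) ** -a for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [100.0 * w / total for w in weights]


def sum_squared_error(reported, predicted):
    """Sum of squared differences between reported and predicted percentages."""
    return sum((y - p) ** 2 for y, p in zip(reported, predicted))


def grid_search_fit(reported, a_grid, b_grid):
    """Return (sse, a, b) for the grid point with the smallest squared error."""
    n_ranks = len(reported)
    best = None
    for a in a_grid:
        for b in b_grid:
            sse = sum_squared_error(reported, zm_percentages(a, b, n_ranks))
            if best is None or sse < best[0]:
                best = (sse, a, b)
    return best


# Synthetic "reported" percentages from known, hypothetical coefficients.
reported = zm_percentages(1.5, 2.0, 40)

a_grid = [i / 10 for i in range(10, 31)]  # exponent candidates 1.0 .. 3.0
b_grid = [i / 10 for i in range(0, 51)]   # shift candidates 0.0 .. 5.0

sse, a_hat, b_hat = grid_search_fit(reported, a_grid, b_grid)
print(f"recovered a = {a_hat}, b = {b_hat}, SSE = {sse:.3g}")
```

A grid search is used only because it needs no external libraries; a non-linear least-squares routine would normally replace it, and real reports would not be recovered exactly as the noiseless synthetic values are here.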
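Additionally, the ZM law can be differentiated with respect to rank:
$$\frac{df(r)}{dr} = -\frac{a\,C}{(r + b)^{a + 1}}$$
The above interrelationship has dimensionless units of percent per rank and readily explains the slope of $f(r)$ at each rank. Note that $df(r)/dr$ is consistently negative because $a$, $C$, and $r + b$ are all positive.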
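The derivative can be checked numerically. The sketch below assumes the ZM form $C/(r + b)^{a}$, whose derivative with respect to rank is $-aC/(r + b)^{a+1}$, and compares that analytic slope with a central finite difference. The coefficient values and function names are hypothetical.

```python
def zm(rank, a, b, c):
    """Assumed ZM form: c / (rank + b) ** a."""
    return c / (rank + b) ** a


def zm_slope(rank, a, b, c):
    """Analytic derivative of c / (rank + b) ** a with respect to rank."""
    return -a * c / (rank + b) ** (a + 1)


def central_difference(rank, a, b, c, step=1e-5):
    """Numerical slope from a symmetric finite difference around `rank`."""
    return (zm(rank + step, a, b, c) - zm(rank - step, a, b, c)) / (2 * step)


# Hypothetical coefficients, for illustration only.
A, B, C = 1.5, 2.0, 60.0

slopes = [zm_slope(r, A, B, C) for r in range(1, 21)]
for r in (1, 5, 10, 20):
    analytic = zm_slope(r, A, B, C)
    numeric = central_difference(r, A, B, C)
    print(f"rank {r:2d}: analytic {analytic:9.5f}, numeric {numeric:9.5f}")
```

For these coefficients every slope is negative and its magnitude shrinks as the rank grows, which is the flattening expected in the tail.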
Inspection of the above derivative also demonstrates that, for a given value of $r$, increasing values of $a$ and/or $b$ yield an overall diminution in the magnitude of the slope of $f(r)$. Furthermore, for given values of both $a$ and $b$, an increasing value of $r$ will also generate an overall diminution in the magnitude of the slope of $f(r)$:
$$\lim_{r \to \infty} \frac{df(r)}{dr} = 0$$
To compare different medications and their distributions of ADRs, it is also helpful to quantitate a medication-specific average slope within a specific range of ranks,
Note that:
and
Note that the area under a curve is approximately proportional to the summation of the points along a curve. Therefore, the definite integral of the ZM law, $I(r_1, r_2)$, is an analogous function to the summation of the points along the $f(r)$ curve:
$$I(r_1, r_2) = \int_{r_1}^{r_2} \frac{C}{(r + b)^{a}}\,dr = \frac{C}{a - 1}\left[(r_1 + b)^{1 - a} - (r_2 + b)^{1 - a}\right]$$
Thus, $I(r_1, r_2)$ illustrates how coefficients $a$ and $b$ interact with the summation process. The above equation also has the following clinical constraints: $a \neq 1$, and $1 \leq r_1 < r_2 \leq N$. Inspection of $I(r_1, r_2)$ demonstrates that, for a given value of $b$, decreasing values of $a$ will lead to a greater value of $I(r_1, r_2)$ and therefore a greater summation over the specified range of $r_1$ to $r_2$. The value of $I(r_1, r_2)$ and the range-based summation will also increase with decreasing values of $b$ for a given value of $a$.
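The relationship between the definite integral and the rank-by-rank summation can be sketched numerically. The snippet assumes the ZM form $C/(r + b)^{a}$ with a fixed scale $C$ and uses the standard closed-form antiderivative; the logarithmic branch for an exponent of exactly one is included only for completeness. All names and coefficient values are hypothetical.

```python
import math


def zm(rank, a, b, c):
    """Assumed ZM form: c / (rank + b) ** a."""
    return c / (rank + b) ** a


def zm_integral(r1, r2, a, b, c):
    """Closed-form integral of c / (r + b) ** a from r1 to r2."""
    if a == 1:
        return c * math.log((r2 + b) / (r1 + b))
    return c / (a - 1) * ((r1 + b) ** (1 - a) - (r2 + b) ** (1 - a))


def zm_sum(r1, r2, a, b, c):
    """Sum of the ZM values at the integer ranks r1..r2 inclusive."""
    return sum(zm(r, a, b, c) for r in range(r1, r2 + 1))


# Hypothetical scale and (exponent, shift) pairs, for illustration only.
C = 60.0
for a, b in [(1.5, 2.0), (1.2, 2.0), (1.5, 1.0)]:
    area = zm_integral(1, 10, a, b, C)
    total = zm_sum(1, 10, a, b, C)
    print(f"a = {a}, b = {b}: integral {area:7.3f}, sum {total:7.3f}")
```

Because the curve is decreasing, the integral over a range always lies between the sum without its first term and the sum without its last term, so the two quantities move together when the exponent or the shift changes.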
In addition to
, the summation of sequential points along a specific ZM distribution,
, provides useful information when comparing different medications with different distributions of ADRs. Note again that:
and
An equivalent expression is:
Examination of Figure 1 demonstrates that the medication-specific values for $f(r)$ tend to coalesce at approximately $r = 10$. Therefore, the use of both the sum and average derivative, based upon the first ten ranks, is particularly beneficial when comparing the various medications’ different ADR distributions (see Results). Furthermore, the majority of clinically significant ADRs occur within this initial region.
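A comparison based on the first ten ranks might be sketched as follows. The snippet assumes the ZM form $C/(r + b)^{a}$ and, as a further assumption, takes the average slope to be the mean of the analytic derivative evaluated at each integer rank in the range; the definition used in this study may differ. The two coefficient sets are hypothetical and do not correspond to any of the six medications.

```python
def zm(rank, a, b, c):
    """Assumed ZM form: c / (rank + b) ** a."""
    return c / (rank + b) ** a


def zm_slope(rank, a, b, c):
    """Analytic derivative of the assumed ZM form with respect to rank."""
    return -a * c / (rank + b) ** (a + 1)


def range_sum(a, b, c, r1=1, r2=10):
    """Sum of predicted percentages over ranks r1..r2 inclusive."""
    return sum(zm(r, a, b, c) for r in range(r1, r2 + 1))


def average_slope(a, b, c, r1=1, r2=10):
    """Mean of the analytic slope over ranks r1..r2 (assumed definition)."""
    slopes = [zm_slope(r, a, b, c) for r in range(r1, r2 + 1)]
    return sum(slopes) / len(slopes)


# Hypothetical (a, b, C) triples, for illustration only.
profiles = {
    "medication X": (1.2, 1.0, 40.0),
    "medication Y": (2.0, 3.0, 400.0),
}

for name, (a, b, c) in profiles.items():
    print(
        f"{name}: sum over ranks 1-10 = {range_sum(a, b, c):6.2f}, "
        f"average slope = {average_slope(a, b, c):7.3f}"
    )
```

Reporting the two numbers side by side separates how much of the distribution sits in the first ten ranks from how steeply it falls across them.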
A natural logarithmic transformation has also been utilized for additional statistical analysis:
Inspection of the above equations demonstrates that the natural logarithm function operates on the unitless decimal form of both the reported data and the predicted quantities. This transformation also results in an approximate linearization of the predicted vs. the reported values.
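The effect of the logarithmic transformation can be illustrated with a short sketch. It assumes the ZM form $C/(r + b)^{a}$ in percent, converts each value to decimal form before taking the natural logarithm, and then measures the local slope of the transformed values against the natural logarithm of the rank. The coefficients and function names are hypothetical.

```python
import math


def zm(rank, a, b, c):
    """Assumed ZM form in percent: c / (rank + b) ** a."""
    return c / (rank + b) ** a


def log_decimal(percent):
    """Natural log of a percentage after converting it to decimal form."""
    return math.log(percent / 100.0)


def local_loglog_slope(rank, a, b, c):
    """Slope of ln(f / 100) versus ln(rank) between `rank` and `rank + 1`."""
    rise = log_decimal(zm(rank + 1, a, b, c)) - log_decimal(zm(rank, a, b, c))
    run = math.log(rank + 1) - math.log(rank)
    return rise / run


# Hypothetical coefficients, for illustration only.
A, B, C = 1.5, 2.0, 60.0

# The transformed ZM value is linear in ln(rank + b) with slope -a.
lhs = log_decimal(zm(3, A, B, C))
rhs = math.log(C / 100.0) - A * math.log(3 + B)
print(f"ln(f(3)/100) = {lhs:.6f}, ln(C/100) - a*ln(3 + b) = {rhs:.6f}")

for r in (1, 10, 100, 1000):
    print(f"rank {r:4d}: local log-log slope {local_loglog_slope(r, A, B, C):.4f}")
```

At low ranks the shift flattens the local slope, while at high ranks the slope settles near the negative of the exponent.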
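Thus, with increasing rank, $r \gg b$, therefore:
$$\ln(r + b) \approx \ln(r)$$
Consequently, the natural logarithm of a reported ADR and the natural logarithm of its rank will become approximately linearly related, with slope $-a$, as $r$ increases:
$$\ln\left(\frac{f(r)}{100}\right) \approx \ln\left(\frac{C}{100}\right) - a\,\ln(r)$$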
Moreover, the ZM law can also be represented using an exponential equation. This helps to explain its fundamental “exponential like” mathematical properties:
$$f(r) = C\,e^{-a \ln(r + b)}$$