ACE-1 I/D polymorphism associated with COVID-19 incidence and mortality: an ecological study

A literature review was conducted to summarize the frequency of the D-allele of the angiotensin-converting enzyme-1 in all countries with available data. Using an ecological study design limited to high income countries, we found that the country-level frequency of the D-allele was associated with increased COVID-19 incidence and mortality.


Background
Despite the SARS-CoV-2 virus first emerging in Asian countries, European countries, and particularly Southern European countries, appear to have experienced higher incidence and mortality rates [1][2][3]. The reasons underpinning this differential spread and mortality are incompletely understood [4]. A recent ecological study found a positive association within Europe between COVID-19 incidence and a 287-bp Alu repeat sequence deletion (D) in the angiotensin-converting enzyme 1 (ACE1) gene [5]. A previous systematic review found the D-allele was less prevalent in East Asian populations [6]. This Insertion/Deletion (I/D) polymorphism has been shown to account for around 50% of the observed variance in ACE1 expression, with DD homozygotes having 65% more, and ID heterozygotes 31% more, ACE1 than II homozygotes [7]. The D-allele has been shown to be associated with an increased risk for hypertension, heart failure, cerebrovascular disease, diabetic nephropathy, severe hypoglycaemia in diabetes, gastric cancer, asthma and acute respiratory distress syndrome (ARDS) [7,8]. Hypertension, diabetes, cancer and heart failure are well-established risk factors for progression to severe COVID-19 disease, which typically includes ARDS, and death [9]. Although case-control studies did not find an association between the D-allele and incident SARS-CoV-1 infection [10,11], one study found it to be associated with progression to ARDS in those infected [11]. Since the D-allele has been found to be a risk factor for the development of ARDS irrespective of aetiology [8], it is possible that it may also have this effect in those infected with SARS-CoV-2.
These considerations provided the motivation for this study, in which we assessed the association between the prevalence of the D-allele and the mortality and incidence of COVID-19 in high-income countries.

Methods
On 8 April 2020, we used PubMed to conduct a literature review to obtain country-level ACE1 D-allele frequency estimates. Over 5000 publications were identified.
These included a large number of systematic reviews of case-control studies assessing the associations of various diseases with the I/D polymorphism. These reviews typically tabulated the prevalence of the II, ID and DD genotypes in the control groups of all the studies included. To expedite the review process, we limited our search to systematic reviews that provided such tables, and we extracted the prevalence of the D-allele in the control groups only. These values were used as estimates of country-level D-allele frequency and to compute the median country-level D-allele frequency, which was used as the exposure variable in all analyses. All duplicates were excluded, as were studies with sample sizes smaller than 50. Full details of the search strategy are provided in the online supplementary file.
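The extraction step described above can be illustrated with a minimal sketch. It assumes that each study reports control-group genotype counts (II, ID, DD), that the 50-subject threshold applies to the size of the control group, and that the allele frequency is obtained by simple allele counting; the function names and the example data are hypothetical and are not taken from the supplementary file.

```python
from collections import defaultdict
from statistics import median


def d_allele_frequency(n_ii, n_id, n_dd):
    """D-allele frequency from genotype counts (each subject carries 2 alleles)."""
    total_alleles = 2 * (n_ii + n_id + n_dd)
    return (2 * n_dd + n_id) / total_alleles


def country_median_frequency(studies, min_sample_size=50):
    """Median D-allele frequency per country, skipping small control groups."""
    by_country = defaultdict(list)
    for country, n_ii, n_id, n_dd in studies:
        if n_ii + n_id + n_dd < min_sample_size:
            continue  # assumed exclusion rule: fewer than 50 controls
        by_country[country].append(d_allele_frequency(n_ii, n_id, n_dd))
    return {country: median(freqs) for country, freqs in by_country.items()}


# Hypothetical control-group genotype counts: (country, II, ID, DD)
studies = [
    ("Country A", 20, 50, 30),  # 100 controls, D frequency 0.55
    ("Country A", 25, 50, 25),  # 100 controls, D frequency 0.50
    ("Country A", 10, 20, 10),  # 40 controls, excluded
    ("Country B", 40, 40, 20),  # 100 controls, D frequency 0.40
]
medians = country_median_frequency(studies)
print(medians)
```

Using the median rather than the mean limits the influence of a single outlying study when a country contributes several estimates.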
Multiple linear regression was used to evaluate the country-level association between both COVID-19 incidence (cumulative cases/100 000 inhabitants as of 8 April 2020) and mortality (cumulative mortality/100 000 as of 8 April 2020) and the prevalence of the D-allele, controlling for the age of each country's COVID-19 epidemic and COVID-19 testing intensity (tests/100 000; see supplementary online file for data sources and definitions of these variables). Initially, the countries most affected by COVID-19 were high-income countries in East Asia, Europe and North America. Assessing SARS-CoV-2's subsequent spread to the rest of the world is complicated by widespread lockdowns and lower testing capabilities in lower-income countries [4,12,13]. For these reasons the analyses were limited to high-income countries and, in sensitivity analyses, to high- and upper-middle-income countries. The COVID-19 incidence and mortality data were log transformed to more closely approximate normal distributions for linear regression.
Differences in gene allele frequency by region were assessed via the Wilcoxon rank-sum test. The analyses were performed in STATA version 16 (Stata Corp, College Station, TX).
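For readers who wish to reproduce this type of model, the following is a minimal sketch of an ordinary least squares fit of a log-transformed outcome on D-allele frequency, epidemic age and testing intensity. It is not the Stata code used in the study: the natural logarithm, the variable units, the function names and the synthetic data are all assumptions made for illustration, and only the Python standard library is used so that the example is self-contained.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_log_linear(outcome_per_100k, d_freq, epidemic_age, testing):
    """OLS of log(outcome) on D-allele frequency, epidemic age and testing."""
    y = [math.log(v) for v in outcome_per_100k]  # assumed natural log
    rows = [[1.0, d, a, t] for d, a, t in zip(d_freq, epidemic_age, testing)]
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    names = ["intercept", "d_allele", "epidemic_age", "testing"]
    return dict(zip(names, solve_linear_system(xtx, xty)))


# Synthetic, noise-free data for six hypothetical countries
d_freq = [40, 45, 50, 55, 60, 52]          # D-allele frequency (%)
epidemic_age = [30, 45, 25, 50, 40, 35]    # days since epidemic onset
testing = [1.0, 2.5, 0.8, 3.0, 1.5, 2.0]   # thousands of tests per 100 000
deaths = [
    math.exp(-3 + 0.1 * d + 0.04 * a + 0.2 * t)
    for d, a, t in zip(d_freq, epidemic_age, testing)
]
fit = fit_log_linear(deaths, d_freq, epidemic_age, testing)
print(fit)
```

Because the synthetic outcome is generated without noise, the fit recovers the coefficients used to build it, which serves as a check on the arithmetic. Confidence intervals, as reported in Table 1, would additionally require the residual variance and are omitted here.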
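The regional comparison can be sketched as follows. This is an illustrative implementation of the Wilcoxon rank-sum test using the normal approximation with a tie-corrected variance and no continuity correction; exact small-sample P-values are not computed, and the output is not guaranteed to match Stata's to every decimal. The function name and the regional frequencies are hypothetical.

```python
import math


def rank_sum_test(group_a, group_b):
    """Wilcoxon rank-sum test; returns (rank sum of group_a, z, two-sided p)."""
    pooled = sorted(
        (value, label)
        for label, values in enumerate((group_a, group_b))
        for value in values
    )
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        tied = j - i
        mid_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_a += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += tied ** 3 - tied
        i = j
    expected = n_a * (n + 1) / 2
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (rank_sum_a - expected) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return rank_sum_a, z, p_two_sided


# Hypothetical country-level D-allele frequencies for two regions
asia = [0.30, 0.35, 0.38, 0.40]
europe = [0.50, 0.52, 0.55, 0.58, 0.60]
w, z, p = rank_sum_test(asia, europe)
print(f"W = {w}, z = {z:.3f}, p = {p:.4f}")
```

A negative z here indicates that the first group tends to have lower values than the second. With groups this small, an exact test would normally be preferred over the normal approximation.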

Results
We identified 21 systematic reviews, with 223 studies providing unique ACE1 D-allele frequency estimates from 50 countries, including 31 high-income countries (Tables S1 & S2). The frequency of the D-allele was found to be lower in Asia than Europe (P<0.005; Figure S2), and higher in Southern Europe compared to other European regions (P<0.05; Figure S2). The prevalence of the D-allele was positively associated with the cumulative number of COVID-19 cases/100 000 (Table 1; Fig. S1) as well as the cumulative number of COVID-19 deaths/100 000 (coef. 0.141, 95% CI 0.063-0.220; Table 1, Fig. S1). Repeating the analyses including upper-middle-income countries did not substantively affect the results (Table S3).
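Because the outcomes were log transformed, a regression coefficient can be read as a multiplicative effect once exponentiated. The sketch below applies this to the reported mortality coefficient. It assumes that the natural logarithm was used and that the coefficient refers to a one-unit increase in D-allele frequency as coded in the analysis; neither the log base nor the unit is stated in this section, so the result is illustrative only. The function name is hypothetical.

```python
import math


def multiplicative_effect(coef):
    """Ratio of outcomes per one-unit increase in the exposure,
    assuming the outcome was transformed with the natural log."""
    return math.exp(coef)


# Reported coefficient and 95% CI for cumulative deaths/100 000
coef, ci_low, ci_high = 0.141, 0.063, 0.220

ratio = multiplicative_effect(coef)
ratio_ci = (multiplicative_effect(ci_low), multiplicative_effect(ci_high))
print(f"ratio = {ratio:.3f}, 95% CI {ratio_ci[0]:.3f}-{ratio_ci[1]:.3f}")
```

Under these assumptions, each one-unit increase in D-allele frequency would correspond to roughly 15% more cumulative deaths/100 000, with the other covariates held constant.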

Discussion
Controlling for the age of the national epidemics and testing intensity, we found that the prevalence of the ACE1 D-allele was positively associated with the cumulative incidence of COVID-19 cases and deaths. This finding provides a possible explanation for some of the differential spread of SARS-CoV-2 and attributable mortality between countries.
The study is purely ecological and therefore susceptible to the ecological inference fallacy. The fact that the D-allele has been found to be a risk factor for some of the diseases that are themselves risk factors for severe COVID-19 disease does, however, provide some biological plausibility for the association [9]. Individual-level case-control-type studies that characterize patients' ACE1, ACE2 and other relevant genes in association with COVID-19 status/severity will be crucial to verify or refute this association [14]. We did not control for a number of factors that likely affect the spread of SARS-CoV-2, such as the speed and intensity of contact tracing, isolation and quarantine [4,12,13]. Whilst these likely explain part of the differential spread of SARS-CoV-2, it is possible that they interact with differential host susceptibility such that controlling spread requires more intensive measures in populations more susceptible to SARS-CoV-2. If this were found to be true, it would be of considerable relevance in deciding how intensively each specific population would need to intervene to control the initial and subsequent waves of SARS-CoV-2 infection.

Authors' contributions
CK conceptualized the study, was responsible for the acquisition, analysis and interpretation of data and wrote the analysis up as a manuscript.

Conflict of interest
The author declares that he/she has no competing interests.

Ethical approval
The analysis involved a secondary analysis of public access ecological level data. As a result, no ethics approval was necessary.

Informed consent
Not applicable.

Table 1. Multiple linear regression assessing the country-level association between COVID-19 cumulative incidence (log number of cases/100 000 inhabitants)/COVID-19 cumulative mortality (log number of deaths/100 000) and the prevalence of the D-allele of the ACE1 gene, the age of the COVID-19 epidemic and testing intensity (tests conducted/100 000).

Columns: Cumulative Cases; Cumulative Deaths (Coef.)