Human SLC11A1 gene polymorphism has the propensity to confer susceptibility to M. africanum TB disease in Ghana

Human tuberculosis (TB), caused mainly by Mycobacterium tuberculosis (MTB) and M. africanum (MAF), remains a major global health threat. The varying responses of different hosts to contact with TB bacteria indicate the importance of host genetics in susceptibility to TB disease. We explored the association between selected human host genomic variants and disease caused by the two causative pathogens in Ghana through a case-control study. MTBC isolates (323) recovered from pulmonary TB patients recruited between 2016 and 2018 were genotyped using spoligotyping. A selection of 29 SNPs from MTB-related genes with high frequency among African populations were genotyped using a TaqMan® SNP Genotyping Assay and an iPLEX Gold Sequenom Mass Genotyping Array. Associations between MTBC lineages and host variables were assessed using univariate and multivariate logistic regression. The prevalence of MTB and MAF among the participants was 79% and 21%, respectively. Association analysis between the controls and MAF cases showed that the rs2695342 variant in the SLC11A1 gene has the propensity to confer susceptibility to MAF infection (P = 0.093, OR = 8.35, 95% CI = 0.70–99.24), whilst rs17048476 (P = 0.088, OR = 1.57, 95% CI = 0.93–2.63) and rs1482868 (P = 0.095, OR = 0.60, 95% CI = 0.33–1.09) were observed to be only suggestive. Our findings implicate SLC11A1 as a potential susceptibility gene of substantial interest for TB caused by MAF, which is an important pathogen in West Africa, and highlight the need for in-depth host-pathogen studies in West Africa.


1. INTRODUCTION
Pulmonary tuberculosis (TB) is a significant public health burden worldwide, with 10 million cases and an estimated 1.5 million deaths in 2018 (1). Because TB is transmitted by the inhalation of aerosolized droplets [1], it would be expected that all susceptible individuals who come into contact with aerosols containing viable bacteria would be equally infected. In about 90% of infected individuals, between 3 and 8 weeks after MTB contained in inhaled aerosols becomes implanted in the alveoli (1), the host immune system, comprising both the innate and adaptive arms, walls off the site of infection in a granuloma (Ghon complex); such individuals are asymptomatic and have latent TB infection (LTBI) [2][3][4]. Only 5% of infected immunocompetent individuals will potentially develop active TB disease within 2-3 years of infection, while the remaining 5% develop TB later in life [5][6][7][8]. This difference potentially depends on the interplay between the environment, the bacteria and, most importantly, the host genetic factors associated with TB pathogenesis.
For many years, several studies have demonstrated the diverse impact on susceptibility or resistance to TB of important genes such as those of the major histocompatibility complex (MHC) and non-HLA genes like killer immunoglobulin-like receptor (KIR), toll-like receptors (TLRs), cytokines/chemokines and their receptors, vitamin D receptor (VDR), SLC11A1 and C-type lectins (8)(9)(10). susceptibility to mycobacterial disease, and candidate gene studies performed in the 20th century leading to the realization that susceptibility to TB disease has a substantial host genetic component [2][3][4][5][6][7][8]. Although anatomically modern humans evolved in Africa, very few genetic studies of host susceptibility to TB have been conducted in ancestral Africans. Therefore, our knowledge of even the most fundamental information on the genetic basis of susceptibility or resistance to TB disease in Africa is quite limited. Moreover, only a few studies have taken into consideration the potential influence of different genotypes of the TB pathogen on such interactions. Human TB is mainly caused by Mycobacterium africanum (MAF) and M. tuberculosis sensu stricto (MTB), both members of the M. tuberculosis complex (MTBC), which also includes several sub-species adapted to a variety of wild and domestic animals [2-3]. Of special interest to Africa is MAF, which, unlike the globally ubiquitous MTB, is restricted to West Africa and may cause up to 50% of human TB in some of the countries. In Ghana, for instance, MAF causes about 20% of all TB cases [2]. Thus, host genetics and susceptibility to distinct MTBC lineages cannot be overlooked. Findings from two independent molecular epidemiological studies by our group showed a strong association between MAF (driven by lineage 5) and a native West African ethnic group [11][12].
This current study therefore explored potential host-pathogen interactions in MAF and MTB infections to enhance our understanding of TB pathogenesis.

To confirm the initial diagnosis at the health facility and to identify the infecting mycobacterial species, a sputum specimen was collected from each TB study participant, following the National Tuberculosis Control Program guidelines. Samples were taken only after a detailed explanation of the study and after written or thumb-printed consent had been obtained for participation. Clinical characteristics (previous history of TB, HIV) as well as demographic and epidemiological data, including age, sex and ethnic origin, were obtained from each participant. In addition, each patient was screened for diabetes mellitus (DM) using random blood glucose level. Based on the American Diabetes Association (ADA) criteria, a glucometer (ACCU-CHEK Active Glucose Monitoring System; Roche Diabetes Care Limited, Burgess Hill, UK), which uses a finger-prick test with OneTouch Ultra test strips, was used to screen all patients for DM irrespective of known DM status. If the blood glucose level was less than 7 mmol/L, no further action was taken; however, if the blood glucose level was above 13 mmol/L, the patient was confirmed as having diabetes. 5 ml of blood was collected from each TB patient for host genetics analysis.

Non-TB patients control group (NTB) and chest X-ray screening.
The screening of eligible control individuals was done in a stepwise manner. First, adults (>18 years) presenting at any of the outreach centers were verbally screened for fever, diabetes and hypertension. Individuals with no evidence of fever, diabetes or hypertension were then taken through chest X-ray (CXR) screening using CAD4TB (version 3.07, Diagnostic Image Analysis Group, The Netherlands) for abnormalities suggestive of pulmonary TB. The software has two abnormality detection systems, a textural abnormality system and a shape abnormality system, which analyze abnormalities in the unobscured lung fields that have been segmented automatically [14]. A higher score is suggestive of TB. A CAD4TB threshold score of 60, determined using previously collected CXR data from a similar population, was used for this population. Sputum samples were collected from all individuals with high CAD4TB scores (60 or greater) and transported to the laboratory for further analysis. Individuals with high CAD4TB scores were referred to the hospitals for further clinical evaluation. 5 ml of blood was collected from each TB patient for host genetics analysis.
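The two screening rules described above amount to simple threshold checks. The following is a minimal sketch rather than the study's own software: the function names are hypothetical, and because the text does not state how random glucose readings from 7 to 13 mmol/L were handled, that range is returned as "unspecified" instead of being guessed.

```python
CAD4TB_THRESHOLD = 60  # threshold score stated in the text for this population


def needs_sputum_and_referral(cad4tb_score: float) -> bool:
    """Return True when the CAD4TB score is 60 or greater (hypothetical helper)."""
    return cad4tb_score >= CAD4TB_THRESHOLD


def classify_random_glucose(mmol_per_l: float) -> str:
    """Apply the two glucose cut-offs given in the text (hypothetical helper)."""
    if mmol_per_l < 7:
        return "no further action"
    if mmol_per_l > 13:
        return "diabetes"
    # Assumption: the text gives no rule for readings from 7 to 13 mmol/L.
    return "unspecified"
```

Keeping the unspecified range explicit avoids silently assigning intermediate readings to either group.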

Laboratory analysis
Isolation and characterization of Mycobacterium spp.
Sputum samples obtained were decontaminated using 5% oxalic acid [15-16] and inoculated on two pairs of Löwenstein-Jensen (LJ) slants: one supplemented with 0.4% sodium pyruvate to enhance the isolation of MAF and M. bovis, and the other with glycerol for the growth of MTB. The cultures were incubated at 37 °C and observed weekly for growth for a maximum of 16 weeks. MTBC strains were identified by PCR detection of insertion sequence IS6110 as previously described [16]. Colonies from positive cultures were purified and stored at -80 °C in 2 ml Middlebrook 7H9 supplemented with ADC enrichment media until use. Pure bacterial DNA was extracted for genotyping using a modified protocol [17] and stored at -20 °C until further use.
All MTBC isolates were further typed by spoligotyping [18]. This was performed according to the manufacturer's instructions, using commercially available kits (Isogen Bioscience BV Maarssen, The Netherlands). Briefly, the direct repeat (DR)-containing region was amplified by PCR using primers DRa and DRb (GGTTTTGGGTCTGACGAC and CCGAGGGGACGGAAAC). The amplified products were hybridized to a set of 43 oligonucleotides, each corresponding to one spacer, immobilized on a nylon membrane. Hybridization was detected using chemiluminescent ECL (Amersham) liquid followed by X-ray exposure. The spoligotyping patterns obtained were defined according to the SITVITWEB database (http://www.pasteur-guadeloupe.fr:8081/SITVIT_ONLINE). SITVITWEB-assigned shared type numbers were used whenever a spoligotyping pattern was found in the database, while families and subfamilies were assigned based on the MIRU-VNTRplus database (http://www.miru-vntrplus.org). Shared types were defined as patterns common to at least two isolates. All patterns that could not be assigned were considered orphan spoligotypes.
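The shared-type and orphan definitions above can be illustrated with a short sketch. It is one possible reading of the text, not the authors' procedure: each spoligotype is assumed to be encoded as a 43-character string of 0s and 1s (one per spacer), `known_shared_types` is a hypothetical stand-in for a database lookup, and a pattern absent from that lookup but seen in at least two study isolates is assumed to count as a shared type without a number.

```python
from collections import Counter

N_SPACERS = 43  # number of spacer oligonucleotides stated in the text


def classify_patterns(isolate_patterns, known_shared_types):
    """Label each isolate's spoligotype as 'shared' or 'orphan'.

    isolate_patterns: dict mapping isolate ID -> 43-character '0'/'1' string.
    known_shared_types: hypothetical dict mapping pattern -> shared type number.
    Returns a dict mapping isolate ID -> (label, shared type number or None).
    """
    for isolate, pattern in isolate_patterns.items():
        if len(pattern) != N_SPACERS or set(pattern) - {"0", "1"}:
            raise ValueError(f"invalid spoligotype pattern for {isolate}")

    counts = Counter(isolate_patterns.values())
    labels = {}
    for isolate, pattern in isolate_patterns.items():
        if pattern in known_shared_types:
            labels[isolate] = ("shared", known_shared_types[pattern])
        elif counts[pattern] >= 2:
            # Assumption: common to two or more study isolates, no database number.
            labels[isolate] = ("shared", None)
        else:
            labels[isolate] = ("orphan", None)
    return labels
```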
(Table 1). SNPs (Table 1)

Data Analysis
Epidemiological data for this study were double entered using Microsoft © Access and validated to remove duplicates. To analyze the population structure, we computed principal components (PCs) using PLINK version 1.9. Ten (10) principal components were computed using PLINK 1.9's --pca command, and the top two principal components (PC1 vs PC2) were plotted using R. A generalized linear model (glm) was also fitted using the R statistical package to evaluate correlation between the PCs. The quality of the variants was assessed, and association analysis was run with 28 high-quality variants using PLINK version 1.9 for the controls against all the cases put together. A total of 333 participants (113 cases; 181 controls), of whom 161 were males and 172 were females, were included in the association analysis.
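As an illustration of the step between computing the PCs and plotting them, the sketch below parses PLINK 1.9 `--pca` output. It assumes a command along the lines of `plink --bfile study --pca 10 --out study`, where `study` is a hypothetical file prefix, and the default `.eigenvec` layout without a header row (family ID, individual ID, then one column per PC). The study itself plotted in R; Python is used here only for illustration.

```python
def parse_eigenvec(text):
    """Parse whitespace-delimited .eigenvec content.

    Returns a dict mapping (FID, IID) -> list of PC values as floats.
    """
    samples = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        fid, iid = fields[0], fields[1]
        samples[(fid, iid)] = [float(value) for value in fields[2:]]
    return samples


def top_two_components(samples):
    """Return (PC1, PC2) pairs in input order, ready for a scatter plot."""
    return [(pcs[0], pcs[1]) for pcs in samples.values()]
```

The resulting pairs can then be coloured by case-control status or ethnicity, as in the PCA plots described in the results.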

Host genotype population structure and its association with Pathogen Lineages
The SNPs investigated in our current study had similar allele frequencies compared to the global and African populations (Table 1). The principal component analysis (PCA) plots coloured by case-control status and by ethnicity show that the samples are homogeneous based on the typed SNPs (Figures 1 and 2). Furthermore, the generalized linear model (glm) showed no significant correlation between the PCs and the case-control phenotype (Table 3). However, a glm for case-control status against age, weight, and height revealed that the variables were significantly correlated with the phenotype based on the typed SNPs (Table 5).

The variables that were correlated with the case-control phenotype, as mentioned earlier, were included as covariates when analysing the association between the controls and all cases (MTBss and MAF). Two variants (rs17048476 and rs1482868) emerged after adjusting for covariates. The variant rs17048476 is an intron variant on the AK124857 gene that has been previously found to be

The association analysis between the controls and all MTB cases identified the rs1482868 variant with a suggestive association with MTBss (P = 0.0681). This suggestive association was, however, absent when the two species were analyzed together (Table 4). It appears that possession of this variant confers protection against infection with MTBss (OR 0.55).

[Association results table, additive model (ADD); rows incomplete in the source]
T  C  C  ADD  283  1.31  0.29  0.75  2.30  0.94  0.3464
16  7445185  rs2346943  T  G  G  ADD  293  1.04  0.33  0.55  1.96  0,
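The odds ratios, confidence intervals and P values reported for the suggestive variants can be cross-checked against one another. The sketch below assumes that the intervals are Wald intervals on the log-odds scale and that the P values come from a two-sided normal test; both are assumptions about the analysis rather than details stated in the text, and the function names are hypothetical.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Recover the standard error of log(OR) from a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)


def wald_ci(odds_ratio, se):
    """95% Wald interval for an odds ratio, given the SE of log(OR)."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - Z95 * se), math.exp(log_or + Z95 * se)


def two_sided_p(odds_ratio, se):
    """Two-sided P value for the Wald z statistic log(OR) / SE."""
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Values as reported in the abstract: (OR, lower 95% limit, upper 95% limit).
reported = {
    "rs2695342": (8.35, 0.70, 99.24),
    "rs17048476": (1.57, 0.93, 2.63),
    "rs1482868": (0.60, 0.33, 1.09),
}
p_values = {
    snp: two_sided_p(odds_ratio, se_from_ci(lower, upper))
    for snp, (odds_ratio, lower, upper) in reported.items()
}
```

Under these assumptions the recomputed P values come out at roughly 0.09 for all three variants, in line with the reported 0.093, 0.088 and 0.095. All three intervals include 1, which is why the associations are described as suggestive.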

4.0. Discussion
Host factors are increasingly being recognized as critical for TB control, considering the diversity of outcomes of the interaction between the MTBC and distinct human host populations. This study sought to explore potential host genetic factors that may confer susceptibility or protection to distinct MTBC lineages, towards understanding crucial mechanisms of host-pathogen interaction. The main findings from this study were: 1) SNP rs2695342 in the SLC11A1 gene has the propensity to confer susceptibility to MAF; 2) 7.1% of the 393 adult TB patients studied had DM, three-fold higher than the general population average of 2.0%; and 3) patients younger than 35 years and patients older than 65 years are associated with active TB in Ghana.
In our study, rs2695342 (G/A) in the SLC11A1 gene had the propensity to confer susceptibility to TB caused by MAF. This variant lies in an intron of the SLC11A1 gene (solute carrier family 11 member A1, previously known as natural resistance-associated macrophage protein 1, Nramp1). This gene is a member of a family of metal ion-transport proteins whose cellular expression is restricted to phagocytic cells. SLC11A1, located on chromosome 2q35, encodes a divalent cation antiporter that delivers metal cations from the cytosol into acidic endosomal and lysosomal compartments, where the Fenton and Haber-Weiss reactions generate toxic radicals with direct antimicrobial activity against infectious microorganisms such as mycobacteria [19][20][21]. Essentially, SLC11A1 may influence the survival of the TB pathogen after phagocytosis. We therefore suspect that the intronic position of this polymorphism might affect post-transcriptional modification of the gene, hence potentially affecting the resulting SLC11A1 protein. Previous SLC11A1 studies of TB in West Africans have primarily focused on four or five polymorphisms distributed across the gene: a (GT)n repeat in the 5′ promoter region, a four base-pair (TGTG) insertion/deletion (rs17235416) in the 3′ untranslated region (UTR), and two single nucleotide polymorphisms (SNPs) in intron 4 (rs3731865) and exon 15 (rs17235409, D543N). These mutations were found to be significantly associated with pulmonary TB. This association has been replicated in studies from Guinea-Conakry [21] and Gambia [21]. Our analysis shows that rs2695342 might actually be a promoter/repressor mutation which, even though it is synonymous, could lead to significant phenotypic consequences. Mutations in this gene might eventually make the phagocytic cells less toxic, thus making the patients more prone to infection by MAF.
Associations between particular MTBC lineages and human ethnicities have been observed before. Moreover, this latter study also found that MAF bound human recombinant MBL more efficiently, perhaps leading to an improved uptake of MAF by macrophages and selection of deficient MBL variants among human populations exposed to MAF. Although our study did not find any significant association between ethnicity and MTBC lineages, it suggests that host genetics play an important role in TB pathogenesis, hence the need for newer approaches to TB therapy such as host-directed immunotherapy, which has the potential to shorten TB treatment and prevent resistance by promoting autophagy.

M. africanum is an important cause of human TB in West Africa, causing nearly 50% of all TB cases reported in West Africa. Our finding that MAF causes 21% of human TB in Ghana confirms our previous report from Ghana [27] and is in agreement with existing findings in sub-Saharan Africa, where MAF was previously found to cause 39% of human TB cases in Benin, 18% in Burkina Faso, 56% in Cameroon, 39% in The Gambia, 47% in Guinea Bissau, 55% in Ivory Coast, 8% in Nigeria, 20% in Senegal and 24% in Sierra Leone. Indeed, one potential reason for the stability of MAF in these countries is that the bacteria might have adapted to (some) human populations along the Gulf of Guinea. In addition, findings from our comparative genomics analysis of MAF from Ghana suggested potential adaptation of MAF L5 to a definitive host, whereas MAF L6 exhibited traits of a pathogen with a wide host range [28]. Therefore, the observed association of MAF with these ethnic groups, which is mainly driven by L5, seems to suggest that L5 has indeed adapted to causing TB among the said ethnic groups. The Ga and the Ewe speaking ethnic groups