A Family-Based Whole Exome Sequence Study to Identify Modifier Genes for Phenotype Heterogeneity Between Severe and Non-Severe Thalassemia Patients

* Corresponding author: Prof. Anupam Basu, Department of Zoology, The University of Burdwan, PurboBarddhaman, West Bengal- 713104, India, Email- abasu@zoo.buruniv.ac.in
a & b contributed equally

A whole expanded exome sequencing study of two families, each comprising father, mother and index case (trio), was undertaken for two E/beta thalassemia subjects with the same HBB genotype. Approximately 200 ng of DNA was taken from each individual and sheared into 300-400 bp fragments. The sheared fragments were then end-repaired, and Klenow fragment was used for adapter addition. After adapter ligation, 10 cycles of PCR amplification were performed for each sample. The target exome was captured with the Agilent SureSelect XT Human All Exon V6+UTR kit as per the manufacturer's protocol. Each captured library was then amplified for 10 cycles with an 8 bp index sequence, and the indexed capture libraries were pooled. Paired-end sequencing of the pooled library was performed on an Illumina HiSeq 2500 using the Illumina HiSeq SBS kit. Finally, the genes carrying inherited and de novo variants in the two subjects were functionally annotated separately with the DAVID online tools 6.8. Functional annotation showed that 6 KEGG pathways were involved in subject-1: Adherens junction, Protein digestion and absorption, Inflammatory bowel disease, Amoebiasis, PPAR signaling pathway and Glycolysis/Gluconeogenesis. Interestingly, only 2 KEGG pathways were found in subject-2: Thyroid hormone synthesis and Carbon metabolism.

Beta thalassemia is a monogenic disease caused by mutations in the beta globin (HBB) gene. More than 400 disease-causing mutations are responsible for beta thalassemia worldwide (hbvar ref). HbE-beta thalassemia results from compound heterozygous mutations of the HBB gene and is found in South East Asia. In Eastern India, almost 17 HBB mutations are responsible for the disease. The most common mutations responsible for HbE-beta thalassemia are IVS 1-5(G>C) [HBB:c.92+5G>C] and CD26(G>A) [HBB:c.79G>A] (1). The clinical severity of HbE-beta thalassemia varies from very severe to very mild. Some subjects present at a very early age and need regular transfusion for survival, whereas others present late and require no regular transfusion or only occasional transfusion (2,3). Several studies have sought the factors responsible for the clinical severity of HbE-beta thalassemia. These studies point to modifiers such as the type of beta mutation, the fetal hemoglobin level, genetic polymorphisms acting as HbF inducers, such as -158(C>T) in the Gγ globin gene (4), the HBS1L-MYB intergenic polymorphism and BCL11A gene polymorphisms (5,6,7), and alpha globin gene deletions. Some of these associations are statistically significant, but they are not sufficient to explain such clinical diversity. We therefore hypothesized that several genetic loci outside the globin cluster may modify disease severity through mechanisms other than modulation of globin production.
To identify genetic modifiers responsible for the disease severity of HbE-beta thalassemia subjects with the same genotype [IVS1-5(G>C)/CD26(G>A)], we performed a whole expanded exome sequencing study of two families (father, mother and affected patient).

Materials and Methods:
Subjects Information: We selected two HbE-beta thalassemia subjects with the same HBB genotype, IVS1-5(G>C) and CD26(G>A), but different phenotypes: one is transfusion dependent (TD) and the other is non-transfusion dependent (NTD).
To check whether any de novo mutations act as modifier loci, the parents of both subjects were also included in this study. Clinicopathological information for both subjects is presented in Table 1.
Subject-1, who has TDT, presented with the disease at 10 months of age and has been receiving regular transfusions at 2-month intervals for the last 13 years. Her steady-state haemoglobin level was 6.1 g/dl. Subject-2, who has NTDT, presented with the disease at 14 years of age. She has not received any transfusion in the last 16 years. Her steady-state haemoglobin level was 7.8 g/dl. She has splenomegaly of 9 cm from the left costal margin.
A total of 6 subjects were included in this study, comprising the two index cases and both parents of each. The study was approved by the Institutional Clinical Ethics Committee, The University of Burdwan.
Sample Collection and DNA extraction: A pre-transfusion peripheral blood sample of 3 ml was collected in EDTA from each subject following proper phlebotomy procedure. DNA was extracted using a commercial DNA extraction kit.

Whole Exome sequencing:
Approximately 200 ng of DNA was taken from each individual and sheared into 300-400 bp fragments. The sheared fragments were then end-repaired, and Klenow fragment was used for adapter addition. After adapter ligation, 10 cycles of PCR amplification were performed for each sample. The target exome was captured with the Agilent SureSelect XT Human All Exon V6+UTR kit as per the manufacturer's protocol. Each captured library was then amplified for 10 cycles with an 8 bp index sequence, and the indexed capture libraries were pooled. Paired-end sequencing of the pooled library was performed on an Illumina HiSeq 2500 using the Illumina HiSeq SBS kit V4 as per the manufacturer's protocol. The generated FASTQ files were used for variant calling and in-house data analysis.

WES Data Analysis:
Variant calling: All raw reads were checked and aligned to the hg19 human reference genome using the Burrows-Wheeler Aligner (BWA), and variants were called with the Genome Analysis Toolkit (GATK). The resulting VCF files were annotated using the VarAFT tools.

Hunting of responsible variants and genes for clinical significance:
De novo variants: The variants of each index case were compared with those of the father and mother, and inherited variants were filtered out.
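The trio comparison described above can be illustrated with a minimal sketch. It assumes that each variant is reduced to a (chromosome, position, reference allele, alternate allele) key, and it treats any variant seen in the index case but in neither parent as de novo. The `Variant` type, the function name and the toy records are hypothetical and are not taken from the in-house pipeline; a real analysis would also need to consider genotype quality and read depth in the parents.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Minimal variant key (hypothetical; not the study's data model)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def split_trio_variants(
    child: Iterable[Variant],
    father: Iterable[Variant],
    mother: Iterable[Variant],
) -> Tuple[Set[Variant], Set[Variant]]:
    """Return (inherited, de_novo) variant sets for the index case."""
    child_set = set(child)
    parental = set(father) | set(mother)
    inherited = child_set & parental  # seen in at least one parent
    de_novo = child_set - parental    # seen in neither parent
    return inherited, de_novo


# Toy records for illustration only (not real study data).
child = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 200, "C", "T"),
    Variant("chr3", 300, "G", "A"),
]
father = [Variant("chr1", 100, "A", "G")]
mother = [Variant("chr2", 200, "C", "T"), Variant("chr4", 400, "T", "C")]

inherited, de_novo = split_trio_variants(child, father, mother)
print(len(inherited), "inherited;", len(de_novo), "de novo")
```

Using plain set operations keeps the inherited and de novo groups disjoint by construction, so their sizes always add up to the number of distinct variants in the index case.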
Inherited variants: After annotating the VCF files with the VarAFT tools, we filtered out all upstream, downstream, intergenic, UTR, noncoding and non-splicing-related intronic variants; only exonic and splicing variants were considered for further analysis. Two broad schemes were applied to hunt for the responsible inherited variants. In the first scheme, only rare (minor allele) variants were targeted. In the second scheme, homozygous pathogenic variants were targeted, as follows:

Scheme 1:
We considered only nonsynonymous exonic variants and splicing variants with a frequency of <0.01 (1%) in the 1000 Genomes database for the next step of the analysis. We compared each index case's variants with those of the father and mother to distinguish inherited from de novo variants. The variants were then filtered on the basis of their SIFT and PolyPhen2 scores; variants predicted to be pathogenic by either tool were retained as the final variants. These variants were then mapped to genes, and the genes were functionally annotated with the DAVID online tools 6.8 to identify the biological pathways and KEGG pathways in which they are involved.
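The Scheme 1 filter can be sketched as a single predicate applied to each annotated variant. This is only an illustration under stated assumptions: the `AnnotatedVariant` fields are hypothetical names rather than VarAFT column names, the gene labels are placeholders, and treating a variant that is absent from the frequency database as rare is our assumption, since the text does not say how missing frequencies were handled.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_ALLELE_FREQ = 0.01  # 1% threshold in the 1000 Genomes database


@dataclass
class AnnotatedVariant:
    """Hypothetical annotated record; field names are illustrative."""
    gene: str
    effect: str                # "nonsynonymous", "synonymous", "splicing", ...
    af_1000g: Optional[float]  # None when absent from the database
    sift_deleterious: bool
    polyphen_damaging: bool


def passes_scheme1(v: AnnotatedVariant) -> bool:
    """Rare nonsynonymous/splicing variant flagged by SIFT or PolyPhen2."""
    if v.effect not in {"nonsynonymous", "splicing"}:
        return False
    # Assumption: a variant with no database frequency is treated as rare.
    if v.af_1000g is not None and v.af_1000g >= MAX_ALLELE_FREQ:
        return False
    # Either predictor is enough in this scheme.
    return v.sift_deleterious or v.polyphen_damaging


# Toy records for illustration only (not real study data).
variants: List[AnnotatedVariant] = [
    AnnotatedVariant("GENE_A", "nonsynonymous", 0.002, True, False),
    AnnotatedVariant("GENE_B", "nonsynonymous", 0.15, True, True),
    AnnotatedVariant("GENE_C", "synonymous", 0.001, False, False),
    AnnotatedVariant("GENE_D", "splicing", None, False, True),
    AnnotatedVariant("GENE_E", "nonsynonymous", 0.005, False, False),
]

scheme1_genes = sorted({v.gene for v in variants if passes_scheme1(v)})
print(scheme1_genes)
```

The resulting gene list is what would be submitted for functional annotation in the final step.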

Scheme 2:
All synonymous, unknown and heterozygous variants were filtered out. To predict the effect of each mutation, SIFT and PolyPhen2 scoring was applied to all homozygous exonic and splice variants, and only variants predicted to be deleterious by these two tools were considered. The genes containing these variants were then functionally clustered by KEGG pathway using the DAVID online tools 6.8.
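A minimal sketch of the Scheme 2 filter follows, grouping the surviving variants by gene so that the gene list can be passed on to pathway clustering. The dictionary keys, zygosity labels and gene names are hypothetical, and requiring agreement between both predictors is our reading of the text rather than a confirmed detail of the analysis.

```python
from collections import defaultdict
from typing import Dict, List


def scheme2_genes(variants: List[dict]) -> Dict[str, List[dict]]:
    """Group homozygous, predicted-deleterious variants by gene."""
    kept: Dict[str, List[dict]] = defaultdict(list)
    for v in variants:
        if v["zygosity"] != "hom":
            continue  # drop heterozygous variants
        if v["region"] not in {"exonic", "splicing"}:
            continue  # keep only exonic and splice variants
        if v["effect"] in {"synonymous", "unknown"}:
            continue  # drop synonymous and unknown effects
        # Assumption: both tools must call the variant deleterious.
        if not (v["sift_deleterious"] and v["polyphen_damaging"]):
            continue
        kept[v["gene"]].append(v)
    return dict(kept)


def make(gene, zygosity, region, effect, sift, polyphen):
    """Build one toy record (illustration only, not real study data)."""
    return {
        "gene": gene, "zygosity": zygosity, "region": region,
        "effect": effect, "sift_deleterious": sift,
        "polyphen_damaging": polyphen,
    }


toy = [
    make("GENE_A", "hom", "exonic", "nonsynonymous", True, True),
    make("GENE_B", "het", "exonic", "nonsynonymous", True, True),
    make("GENE_C", "hom", "exonic", "synonymous", True, True),
    make("GENE_D", "hom", "splicing", "splicing", True, False),
    make("GENE_A", "hom", "splicing", "splicing", True, True),
]

by_gene = scheme2_genes(toy)
print({gene: len(hits) for gene, hits in by_gene.items()})
```

Compared with Scheme 1, this filter drops the allele-frequency threshold and instead requires homozygosity and agreement between the predictors, so it is expected to return a shorter gene list.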

Results
A total of 91194 variants were identified in the TDT subject, and 91196 and 90530 variants were identified in her father and mother, respectively. In the NTDT subject, a total of 90226 variants were identified, with 89396 and 88275 variants in her father and mother, respectively.

De novo Variants
After filtering, we obtained 520, 551 and 503 nonsynonymous exonic and splicing variants in subject-1, her father and her mother, respectively. Of the 520 variants in the TDT subject, 514 are inherited and only 6 are de novo. In the NTDT subject, her father and her mother, 538, 530 and 541 variants were found, respectively. Of the 538 variants in this index case, 534 are inherited and 4 are de novo (Table 2).

Inherited variants and functional annotation:
As described in the Methods section, functionally and clinically relevant variants and genes were sought using the minor allele approach (Fig 1) and the homozygous variant approach (Fig 2); the resulting genes and variants are listed in Tables 3-6.

Discussion
We performed a trio-based study of 2 beta haemoglobinopathy patients of the same age and sex, harbouring the same primary mutation; however, one was transfusion-dependent (TDT) and the other non-transfusion-dependent (NTDT). We focussed on single-nucleotide variations and short indels in a whole-exome sequencing approach, and then tried to explain the differing clinical phenotypes through the differing mutations expected to have the most severe impact. After associating the mutated gene products with the metabolic pathways in which they play the most significant roles, we found that the mutations and their related individual phenotypes belonged to one of 2 categories: those related to general defects of growth and development, and those related to anaemia, either through reduced/ineffective erythropoiesis or through increased RBC haemolysis.
In our cases of interest, we found that mutations of genes associated with growth and development were present in both patients, while genes possibly involved in erythrocyte membrane or even general cytoskeletal integrity were affected only in the TDT individual. Also of interest, the NTDT patient displayed a much lower mutational load than the TDT patient (about one third of the TDT patient's mutational load). Patient 1 (the TDT patient) was diagnosed at 10 months of age and was on a transfusion regimen of once every 2-3 months from then on. She maintained an average pre-transfusion haemoglobin level of 6.05 gm/dl. Her height and weight for age were both below the 3rd centile for age- and sex-matched subjects. Both the liver and the spleen were palpably enlarged, to 4.5 and 7 cm below the right and left costal margins, respectively. The levels of lactate dehydrogenase, serum haptoglobin and unconjugated bilirubin remained significantly above the normal range during this period (8). With the increase in these parameters provisionally attributed to excessive ineffective erythropoiesis, she was put on regular monthly transfusion from 7 years of age, with the aim of maintaining an average pre-transfusion haemoglobin level above 9 gm/dl using 15 ml/kg of 60% packed cell transfusion. The biochemical parameters of ineffective erythropoiesis improved but never reached baseline, hinting at intravascular haemolysis as a probable cause (9). On the new transfusion regimen, minimal improvement in growth was noted. The average pre-transfusion haemoglobin level remained around 8 gm/dl on a 3-week transfusion interval. Iron overload increased at a rate greater than expected. On follow-up until 18 years of age, she was noted to have a number of comorbidities that are not unusual for thalassaemics but that appeared at an earlier age.
She was found to be biochemically hypothyroid, with a free triiodothyronine level of 0.8 ng/ml and a thyroid stimulating hormone level of 7.02 ng/ml, which was corrected with levothyroxine replacement therapy. She had thelarche at 16 years and menarche at 18 years of age, both delayed by about 2 years relative to the secular trend. The bone age at 18 years lagged by 2 years, and the Z score of bone mineral density at the trochanteric level was -1.0, signifying osteopenia. Multiple small gall bladder stones were detected at the same age. We also attributed this delay in catch-up growth and development to a chronic allergic state and chronic ill health due to frequent abdominal cramps and diarrhoea leading to malabsorption and micronutrient deficiency. After replacement therapy with essential minerals (calcium and zinc) and vitamin D, together with probiotics, her condition has improved, but her growth still remains below the 3rd centile, with a moderately enlarged liver and spleen.
After the NGS data became available, it was noted that she was compound heterozygous for mutations in ACTN3, which is thought to be involved in the attachment of actin filaments to intracellular structures. Such mutations could have far-reaching consequences, though in RBCs they would most likely lead to structural destabilization and increased erythrolysis, thus exacerbating the anaemic condition and enforcing a transfusion-dependent state (10). Also present was a mutation in FARP2, a guanine nucleotide exchange factor involved in cytoskeletal remodelling by RAC1, which has been implicated in erythrocyte maturation (12) and is known to be dysregulated in sickling RBCs (11). These mutations could explain the extra burden of ineffective erythropoiesis and intravascular haemolysis, evidenced by hyperbilirubinemia, high LDH and haptoglobin, the compensatory increase in the size of the liver and spleen, and the formation of gall stones at such an early age. Apart from anaemia-related mutations, mutations relating to various pathways of growth and development were identified in both patients, a common motif being genes involved in cellular respiration and fatty acid metabolism. Apart from genes involved in carbon metabolism, the TDT subject possessed mutations in IL12RB1 and TLR4, which may contribute to a defective T helper cell-mediated immune response, gamma interferon production and the related antiviral response (15), and may explain the propensity of this subject to fall sick very often, mostly with influenza-like viral illness that was wrongly interpreted as allergic hyperactive airway disorder (14, 15). Moreover, the IL12RB1 mutation may be responsible for an inflammatory bowel disease-like state (13, 16), which could account for the malabsorption-like symptoms leading to lactose intolerance and micronutrient deficiency. Reduced bone mineral density and the resultant osteopenia may, among other things, be attributable to the menagerie of collagen deficiencies (17).
Given that this patient also presented with very early-onset hypothyroidism, which is unlikely in well-chelated thalassaemic individuals, the association with homozygous mutations in the HLA-DR3, HLA-DPB1 and TPO genes may be corroborative. It has been shown that susceptibility to autoimmune thyroid disorder increases in individuals with an amino acid substitution at position 74 of the DR beta 1 chain of HLA-DR3 (DRb1-Arg74) [18]. Amino acid variants of HLA-DPB1 molecules have been strongly associated with Graves disease, and amino acid signatures of the HLA-DP β chain in particular might contribute to the molecular pathogenesis of early-onset autoimmune thyroid disease (AITD) [19]. TPO gene mutations disrupt thyroid hormone synthesis and are classified as thyroid dyshormonogenesis. The combined effects of these genetic aberrations may have contributed to the individual presenting as clinically and biochemically hypothyroid at such an early age as to require thyroid hormone replacement therapy.
[20] The presence of a homozygous mutation in the FUT3 gene may be of significance in relation to the increased transfusion frequency. FUT3 is the gene for the Lewis antigen (Le a+ or Le b+), a minor blood group antigen, so a person carrying such a mutation will not be able to synthesize the antigen (protein product of the Le gene) and will be Lewis negative (Le a- Le b-). As transfusable packed red cells are generally not screened for such minor blood groups, individuals who receive frequent transfusions may develop anti-Lewis antibodies to this seemingly innocuous antigen, causing partial haemolysis of transfused blood that is positive for the antigen. This small amount of haemolysis is not severe enough to cause a catastrophic transfusion reaction, but it will subtly shorten the transfusion interval and increase the levels of bilirubin and LDH. This could be another reason for the increased haemolysis that persisted even after regular transfusions were started, which should ideally have eliminated the endogenously produced defective red cells. In sporadically transfused individuals the Lewis blood group is seldom responsible for such haemolysis; in multiple and regular transfusion scenarios, however, it compromises transfusion efficiency and hence shortens the interval between transfusions.
[21] Though mutations in FUT3 may be associated with H. pylori infection causing frequent abdominal cramps, etc., this has not been substantiated in our subject. [Ref Table 4] Apart from a considerably larger mutational load on fatty acid metabolic pathway genes in the TDT patient, thyroid hormone production was almost equally affected in both patients. Beyond this pathway, which was also affected in the NTDT patient, angiogenesis, limb and connective tissue (collagen) formation and immune development were severely affected in the TDT individual, in some cases multifactorially.
When we compared the clinical condition of index case 2, the NTDT patient, it was certainly different. She presented to the clinic at the age of 12 years with an average baseline haemoglobin of 6.0 gm/dl, no hepatomegaly and palpable splenomegaly of 6 cm below the left costal margin. LDH, unconjugated bilirubin and haptoglobin were only just over 2 times the upper limit of normal (ULN), the rest of the parameters were normal, and only 20% nucleated red cells were documented in the peripheral circulation. These findings correlate well with the absence of mutations causing defects in the RBC cytoskeleton, in contrast to the TDT subject.
She was found to have biochemical hypothyroidism at the age of 11 years, with a TSH of 6.0 ng/ml, and was put on thyroid replacement therapy. She had spontaneous thelarche at 12 years and menarche at 12 years 7 months. She has not been put on a regular transfusion regimen although her height for age was at the 3rd centile, because her midparental height was also at the 3rd centile and her growth velocity was satisfactory. There was a definite increase in average baseline haemoglobin, and a growth spurt, after thyroid hormone replacement therapy was initiated. The NGS data corroborate the hypothyroidism, showing mutations in the thyroglobulin and iodide-chloride transporter genes.
At least in the case of the NTDT patient, the thyroid mutations also indicate a possible route of therapy via thyroid supplements for ameliorating other issues of growth and development, which in the absence of this exome profiling might have remained attributed to the classical comorbidities of thalassaemia.
With reference to Table 6, which lists the homozygous mutations of the NTDT subject, no correlation could be drawn between the identified mutations and the current status of the patient.
The homozygous genetic mutation load was also notably higher in the TDT individual than in the NTDT subject.
This study identifies a need for widespread genetic profiling of thalassaemia patients, so that clinicians can better understand the genetic spectrum of the common comorbidities of thalassaemia and be better equipped to identify and treat specific symptoms of thalassaemia in the context of the whole.

Conflict of interest:
The authors declare that there is no conflict of interest in the present study.