Characterization of Mutations Causing CYP21A2 Deficiency in Brazilian and Portuguese Populations

Deficiency of 21-hydroxylase enzyme (CYP21A2) represents 90% of cases in congenital adrenal hyperplasia (CAH), an autosomal recessive disease caused by defects in cortisol biosynthesis. Computational prediction and functional studies are often the only way to classify variants to understand the links to disease-causing effects. Here we investigated the pathogenicity of uncharacterized variants in the CYP21A2 gene reported in Brazilian and Portuguese populations. Physicochemical alterations, residue conservation, and effect on protein structure were accessed by computational analysis. The enzymatic performance was obtained by functional assay with the wild-type and mutant CYP21A2 proteins expressed in HEK293 cells. Computational analysis showed that p.W202R, p.E352V, and p.R484L have severely impaired the protein structure, while p.P35L, p.L199P, and p.P433L have moderate effects. The p.W202R, p.E352V, p.P433L, and p.R484L variants showed residual 21OH activity consistent with the simple virilizing phenotype. The p.P35L and p.L199P variants showed partial 21OH efficiency associated with the non-classical phenotype. Additionally, p.W202R, p.E352V, and p.R484L also modified the protein expression level. We have determined how the selected CYP21A2 gene mutations affect the 21OH activity through structural and activity alteration contributing to the future diagnosis and management of CYP21A2 deficiency.


Introduction
Congenital adrenal hyperplasia (CAH) is an autosomal recessive disease caused by defects in steroid biosynthesis [1]. More than 90% of reported CAH cases are due to 21hydroxylase (CYP21A2) deficiency (OMIM # 201910). The CYP21A2 is a member of the cytochrome P450 superfamily and has 495 amino acids forming thirteen α-helix (A-M) and nine β-sheets [2,3]. This protein is located in the endoplasmic reticulum of the adrenal hydroxylations of 17-hydroxyprogesterone into 11-deoxycortisol and progesterone into 11-deoxycorticosterone, which are then converted into cortisol and aldosterone ( Figure 1) [1]. Therefore, defects in CYP21A2 affect both the mineralocorticoid and glucocorticoid biosynthesis, besides the increase in sex steroids biosynthesis due to changes in the steroidogenesis pathway by elevated levels of sex steroid precursors [4,5]. Figure 1. Role of CYP21A2 in human steroid biosynthesis. CYP21A2 is a heme containing cytochrome P450 protein that requires cytochrome P450 reductase as a redox partner for its metabolic reactions [6]. Together with CYP11A1 in mitochondria and CYP17A1 and CYP19A1 in the endoplasmic reticulum, CYP21A2 directs the biosynthesis of steroids.
Glucocorticoid and mineralocorticoid levels determine the CYP21A2 deficiency phenotype, which can be (1) salt wasting, when both types of steroids are not produced, causing the adrenal crisis by electrolytic deregulation with infant mortality risks and severe virilization by elevated sex steroids; (2) simple virilizing, when there is some residual synthesis of both glucocorticoid and mineralocorticoid that is enough to prevent the adrenal crisis; or (3) non-classical (mild form) when just the glucocorticoid synthesis is partially affected and is linked to hyperandrogenism and mild late-onset CAH. The first two are the classical forms of CAH, which have a worldwide incidence ranging from 1:14,000 to 1:18,000 live births [7]. The non-classic form has a frequency around 1:100 to 1:1000 [8]. The frequencies of CAH vary with ethnicity and geographic population groups. However, the description of SW vs. SV, SV vs. NC, etc. is rather arbitrary, hence caution must be employed when following [9]. The diagnosis of CYP21A2 deficiency is confirmed by steroid profile, mainly 17OHP in the first screening, which becomes elevated [1,7]. However, 17OHP can be altered by a deficiency in other enzymes in the steroidogenic pathway, in premature infants, and in unrelated diseases causing physiologic stress [7,8,[10][11][12][13][14][15]. Therefore, the molecular diagnosis is essential for the confirmation of complex cases and followup management of asymptomatic CAH, avoiding unnecessary treatment, along with genetic counseling [11,16,17].
The CYP21A2 gene is located on the short arm of chromosome 6 (6p21.3), situated 30 kb apart from its pseudogene (CYP21A1P). These genes share 98% sequence identity for the exons and 96% among the introns [18]. Besides that, 95% of pathogenic variants of CYP21A2 originate in recombination events [3]. More than 1300 variants in the CYP21A2 have been reported in the human gene mutation database (HGMD), and more than 200 of these are described to affect human health. With high variability between different ethnic groups and single nucleotide variants (SNVs), missense and nonsense account for half of the total variations [19,20]. Role of CYP21A2 in human steroid biosynthesis. CYP21A2 is a heme containing cytochrome P450 protein that requires cytochrome P450 reductase as a redox partner for its metabolic reactions [6]. Together with CYP11A1 in mitochondria and CYP17A1 and CYP19A1 in the endoplasmic reticulum, CYP21A2 directs the biosynthesis of steroids.
Glucocorticoid and mineralocorticoid levels determine the CYP21A2 deficiency phenotype, which can be (1) salt wasting, when both types of steroids are not produced, causing the adrenal crisis by electrolytic deregulation with infant mortality risks and severe virilization by elevated sex steroids; (2) simple virilizing, when there is some residual synthesis of both glucocorticoid and mineralocorticoid that is enough to prevent the adrenal crisis; or (3) non-classical (mild form) when just the glucocorticoid synthesis is partially affected and is linked to hyperandrogenism and mild late-onset CAH. The first two are the classical forms of CAH, which have a worldwide incidence ranging from 1:14,000 to 1:18,000 live births [7]. The non-classic form has a frequency around 1:100 to 1:1000 [8]. The frequencies of CAH vary with ethnicity and geographic population groups. However, the description of SW vs. SV, SV vs. NC, etc. is rather arbitrary, hence caution must be employed when following [9]. The diagnosis of CYP21A2 deficiency is confirmed by steroid profile, mainly 17OHP in the first screening, which becomes elevated [1,7]. However, 17OHP can be altered by a deficiency in other enzymes in the steroidogenic pathway, in premature infants, and in unrelated diseases causing physiologic stress [7,8,[10][11][12][13][14][15]. Therefore, the molecular diagnosis is essential for the confirmation of complex cases and follow-up management of asymptomatic CAH, avoiding unnecessary treatment, along with genetic counseling [11,16,17].
The CYP21A2 gene is located on the short arm of chromosome 6 (6p21.3), situated 30 kb apart from its pseudogene (CYP21A1P). These genes share 98% sequence identity for the exons and 96% among the introns [18]. Besides that, 95% of pathogenic variants of CYP21A2 originate in recombination events [3]. More than 1300 variants in the CYP21A2 have been reported in the human gene mutation database (HGMD), and more than 200 of these are described to affect human health. With high variability between different ethnic groups and single nucleotide variants (SNVs), missense and nonsense account for half of the total variations [19,20].
CYP21A2 variants are classified according to the impact on the enzyme activity. Group Null consists of deletions or nonsense variants that critically affect the enzyme activity, resulting in a complete loss of function due to altered enzyme stability, steroid or heme binding, and membrane anchoring. The most common variants are thirty kilobase deletion (30-kb del), eight base pairs, Cluster E6 (p.I237N, p.V238E, p.M240K), p.Q319X, p.R357W, and p.L307fs [3]. Group A is composed of a variant in which the enzyme activity is minimal, around 0-1%. This group is represented by an intron variant IVS2-13A/C>G, which is created by an additional splice acceptor site causing retention of 19 intronic nucleotides of the intron 2 [21,22]. Homozygous or compound heterozygote variants with the null group are often associated with the salt-wasting form, but approximately 20% of cases have a simple virilizing phenotype [3]. Group B has a residual activity of 1-10%, which is enough to prevent adrenal crisis. The p.I173N variant is representative of this group and is associated with the simple virilizing form of 21-hydroxylase deficiency [23]. Finally, group C has an enzyme activity of about 20-60% and is associated with the mild form of CAH. The variants common in this group are p.V282L, p.P454S, and p.P31L, which show phenotype variability [24].
The biochemical correlation for CYP21A2 activity works well for the CAH diagnosis. However, external factors or a combination of genetic diseases can change the steroid levels, making genotype elucidation an important tool for patient management [11,17]. In general, there is a good genotype-phenotype correlation for CAH, and the elucidation of new variants found in each population is important to improve the correct treatment [25,26]. Initially, the bovine CYP21A2 structure was used as a template to elucidate the impact of variants damage from 2011 to 2015 (PDB ID 3QZ1), but the human CYP21A2 crystal structures have recently become available. Human CYP21A2 is deposited in the RSCB Protein Data Bank (PDB) under entries 4Y8W and 5VBU, with two different steroid ligands (progesterone in 4Y8W and 17-hydroxyprogesterone in 5VBU). The human structure has just one steroid-binding site, which is different from the bovine enzyme, which has two catalytic sites [2,27].
This study investigated the uncharacterized SNVs in the CYP21A2 gene through computational and functional analysis, establishing a correlation with the residual enzyme activity and a possible CAH phenotype. We selected six missense variants in the CYP21A2 gene (p.P35L, p.L199P, p.W202R, p.E352V, p.P433L, and p.R484L) that were previously reported in the Brazilian or Portuguese populations without previous functional characterization. To determine how these variants impair the CYP21A2 enzyme, we performed a detailed analysis based on the chemical changes due to amino acid alterations, residue conservation, and effect on protein structure. In silico predictions using human CYP21A2 structure (PDB ID 4y8w) were used to understand the structural damage, while the in vitro analysis using recombinant protein expression and enzyme kinetics was performed to determine the impact on protein function.

Variant Collection and Description
We selected six missense variants that had patient genotype and phenotype available in our literature review after the initial screening described in methods. Four of these variants were found in Brazilian (p.P35L, p.L199P, p.E352V, and p.R484L) and two in Portuguese (p.W202R and p.P433L) populations. Genetic and clinical data describing the carriers of the selected SNVs are summarized in Table 1. These data were collected from the original papers.  (Table 1). One allele presents a mutation with symptoms of severe enzyme damage, while the other alleles carried three mutations: p.P35L, p.H63L, and 30-kb del. It was concluded that the p.P35L origin was not from the same genetic event of the p.H63L and 30-kb del, as it was not present in the pseudogene screening. Therefore, we analyzed this variant alone.
The variant p.L199P was found as heterozygous in a female patient with mild clitoromegaly atypical genitalia and elevated 17OHP levels detected in the Brazilian newborn screening program (Table 1). According to the authors, the child remained asymptomatic at 3.3 years old, and the clitoromegaly and the high 17OHP levels detected in the screening were likely due to the premature birth. The impact of this new variant has been unknown, as it was not possible to infer the consequence once it was identified in heterozygosis.
The variant p.W202R was found as compound heterozygous with large gene conversion in a Portuguese female (Table 1). This patient presented atypical genitalia and salt wasting, being diagnosed with classical CAH at <2 months old.
The variant p.E352V was identified in a compound heterozygous state in four classical CAH patients from Brazil (Table 1), two of them with the same intronic mutation, IVS2-13A/C>G heterozygous with p.E52V. These two patients showed the classical CAH form with salt-wasting. One female presented p.R484L as compound heterozygous with Cluster E6. These patients presented the simple virilizing CAH form. The latter has the p.R484L and p.G425S, which resulted in the simple virilization form of CAH.
The variant p.P433L was identified in a Portuguese female who presented it as compound heterozygous with the mild mutation p.P454S (Table 1). This patient was diagnosed at 17 years old with a nonclassical form of CAH after ACTH stimulation, which indicated the 17OHP elevation characteristic of that form of CAH.
The variant p.R484L was found as compound heterozygous in two Brazilian patients (Table 1). Both present classical CAH form, with salt-wasting and high 17OHP levels detected during newborn screening. One allele in both cases has p.R484L together with a splice variant IVS2-13A/C>G. The second allele in one patient was a CYP21A2 deletion and in the other p.Q319X.

Computational Characterization Indicated the Structural Impact of the SNVs
We performed a screening with five predictive tools that have different approaches, PolyPhen-2, SNAP2, MutPred2, Meta-SNP, and PredictSNP. Results obtained for the six variants chosen in this work are shown in Table 2. The variants p.L199P, p.E352V, and p.R484L were predicted to damage the protein by all predictor tools, while p.P35L had damage predicted by four tools, p.W202R by three, and p.P433L by two. Table 2. Prediction of the possible impact on the structure and/or function of CYP21A2 by genetic variants. Seven variants were used to validate the method, with four polymorphisms (•) and three with a known negative impact (↓). a SNV nomenclature according to UniProt ID Q16874-1. Scores ≥0.5 by PolyPhen-2, MutPred2, and Meta-SNP indicate protein damage. SNAP2 scores >50 indicate a strong signal for effect, between 50 and −50 weak signal and <−50 strong signal for neutral effect.  Among the variants with known activity, the neutral variants (p.R103K, p.D183E, and p.S269T) agreed with all tools. However, among the three variants known to cause damage, we had different results. The two severe variants (p.I173N and p.R427H) were predicted correctly by all tools; however, the mild (p.V282L) variant was wrongly classified by all of them. The second screening about the protein stability of all variants presented a variation of ∆G, from −1.47 kcal/mol (p.R484L) to 1.23 kcal/mol (p.E352V) (Tables 2 and 3).

Amino Acid Chemical Proprieties, Conservation, and Structural Damage
We analyzed the amino acid chemical properties to assess the characteristics of each exchange and the conservation of that residue across species to determine how it has been retained during the evolution of CYPs P450. We worked with 200 CYP 450 sequences homologous to human CYP21A2. The structural damage was characterized by Gibbs's free energy and gain/loss of hydrogen bonds.
Proline at amino acid position 35 is located in a coil on the protein surface (Figures 2  and 3). This residue shows a high conservation score, with only two residues found on homologous sequences (proline and guanine) ( Figure 4 and Table 4). The variant p.P35L has the amino acid properties changed from a polar and uncharged to a nonpolar and aliphatic residue, increasing the structural stability, ∆∆G −0.58 kcal/mol (Table 3). PolyPhen-2, SNAP, Meta-SNP, and MutPred2 predicted the effect of this mutation on the protein stability and functionality. Besides that, MutPred2 predicts two structural effects: gain of helix and alteration of transmembrane features. PredictSNP classified this variant as neutral ( Table 2).
We analyzed the amino acid chemical properties to assess the characteristics of each exchange and the conservation of that residue across species to determine how it has been retained during the evolution of CYPs P450. We worked with 200 CYP 450 sequences homologous to human CYP21A2. The structural damage was characterized by Gibbs's free energy and gain/loss of hydrogen bonds.
Proline at amino acid position 35 is located in a coil on the protein surface (Figures 2  and 3). This residue shows a high conservation score, with only two residues found on homologous sequences (proline and guanine) ( Figure 4 and Table 4). The variant p.P35L has the amino acid properties changed from a polar and uncharged to a nonpolar and aliphatic residue, increasing the structural stability, ∆∆G −0.58 kcal/mol (Table 3). Poly-Phen-2, SNAP, Meta-SNP, and MutPred2 predicted the effect of this mutation on the protein stability and functionality. Besides that, MutPred2 predicts two structural effects: gain of helix and alteration of transmembrane features. PredictSNP classified this variant as neutral ( Table 2).
Leucine at position 199 is localized at α-helix F, close to the CYP21A2 catalytic site <4 Å from heme, where it forms two hydrogen-bonds, one with p.S203 (on the loop between α-helix F and F') and one with p.I195, on α-helix F (Figure 3 and Table S1). This residue shows mild conservation, being found in 10 different sequences at this position ( Figure 4, Table 4). The variant p.L199P changes a nonpolar and aliphatic residue to a polar and uncharged residue, losing the hydrogen bond with p.I195 (Table S2). All predictor tools showed that this mutation causes damage to the protein, and MutPred2 predicted an alteration of coiled-coil and transmembrane regions; ∆∆G was 0.79 kcal/mol. Residue ligand contact maps were generated with LigPlot+ v.2.2.4 software using as maximum hydrogen -acceptor/-donor distance 2.7 Å and 3.35 Å (green line), respectively, while for non-bonded, the minimum contact distance was 2.9 Å and the maximum 3.9 Å.
Tryptophan at position 202 is also located <4 Å from the CYP21A2 catalytic site, on the α-helix F with a hydrogen bond with p.V198 (Figure 3). This residue presented a low conservation score, being highly variable between its homologous sequences ( Residue ligand contact maps were generated with LigPlot+ v.2.2.4 software using as maximum hydrogen -acceptor/-donor distance 2.7 Å and 3.35 Å (green line), respectively, while for non-bonded, the minimum contact distance was 2.9 Å and the maximum 3.9 Å. coil (p.W406) ( Figure 3 and Table S1). This residue is highly conserved across species, and there is no other variation on CYP21A2 homologs ( Figure 4 and Table 4). The variant p.E352V changes a negatively charged residue to a nonpolar aliphatic residue. The stability of the CYP21A2 structure was increased, and the changed residue lost three hydrogen bonds with W406 and p.R355 (Table S2). All the predictor tools showed that p.E352V results in protein damage ( Table 2). The MutPred2 predicted alteration of an interface, loss of allosteric site at p.E352, and altered metal binding.  Leucine at position 199 is localized at α-helix F, close to the CYP21A2 catalytic site <4 Å from heme, where it forms two hydrogen-bonds, one with p.S203 (on the loop between α-helix F and F') and one with p.I195, on α-helix F (Figure 3 and Table S1). This residue shows mild conservation, being found in 10 different sequences at this position ( Figure 4, Table 4). The variant p.L199P changes a nonpolar and aliphatic residue to a polar and uncharged residue, losing the hydrogen bond with p.I195 (Table S2). All predictor tools showed that this mutation causes damage to the protein, and MutPred2 predicted an alteration of coiled-coil and transmembrane regions; ∆∆G was 0.79 kcal/mol.
Tryptophan at position 202 is also located <4 Å from the CYP21A2 catalytic site, on the α-helix F with a hydrogen bond with p.V198 (Figure 3). This residue presented a low conservation score, being highly variable between its homologous sequences ( Figure 4 and Table 4). The variant p.W202R changes a hydrophobic residue to a hydrophilic and positively charged residue, decreasing the CYP21A2 structural stability with a ∆∆G of 1.11 kcal/mol, but without losing the hydrogen bond (Table S2). Two predictor tools classified that residue replacement as neutral (Meta-SNP and PredictSNP), while the three others predicted damage to the protein. An altered coiled-coil region was predicted by MutPred2.      Glutamate at position 352 is located in α-helix K, presenting five hydrogen bonds, four on the same helix (one with p.A348 and p.L356 and two with p.R355) and one with a coil (p.W406) ( Figure 3 and Table S1). This residue is highly conserved across species, and there is no other variation on CYP21A2 homologs ( Figure 4 and Table 4). The variant p.E352V changes a negatively charged residue to a nonpolar aliphatic residue. The stability of the CYP21A2 structure was increased, and the changed residue lost three hydrogen bonds with W406 and p.R355 (Table S2). All the predictor tools showed that p.E352V results in protein damage ( Table 2). The MutPred2 predicted alteration of an interface, loss of allosteric site at p.E352, and altered metal binding.
Proline at position 433 is located at α-helix M, where it has a hydrogen bond with p.A435 in the same helix ( Figure 3 and Table S1). This residue is close to the central heme group, but it is not a conserved residue, as it is found interchangeable as 14 different residues among the CYP21A2 homologous group (Figure 4 and Table 4). However, this indicates the evolution of this residue across species with different roles and different redox partners. The exchange of a polar and uncharged amino acid with a nonpolar and aliphatic showed ∆∆G 0.39 kcal/mol ( Table 3). The SNAP2 predictor tool showed damage in the protein with the variant and MutPred2 a gain of helix and loss of catalytic site at p.E432; however, the other three predictors indicated a neutral effect (Table 2).
Arginine at position 484 is located on the protein surface in a coil that makes a cluster with nine other amino acids. This residue has three hydrogen bonds, one with p.M486 in the same coil; one with p.Q482, located at stand β9; and one with p.A449, which makes the connection between α-helix M and β8 ( Figure 3 and Table S1). This residue showed the highest conservation score, with just two other amino acids found in the homology analysis ( Figure 4 and Table 4). The variant p.R484L changes a positively charged polar residue to a nonpolar and uncharged residue, losing the two hydrogen bonds with p.M486 and Q482 (Table S2). This amino acid exchange showed damage by all predictor tools used, and stability increased with a ∆∆G of -1.47 kcal/mol (Tables 2 and 3). Loss of intrinsic disorder was predicted by MutPred2, as well.

Functional Testing for the Validation of In Silico Results
HEK293 cells were transfected with plasmids expressing CYP21A2 WT or variants p.P35L, p.L199P, p.W202R, p.E352V, p.P433L, p.R484L, p.I173N, and p.V282L as two controls with known activity (~2% and 18-60%, respectively) [4,23,24]. The CYP21A2 activity was quantified for both WT and variants. Using the TLC analysis, we could assess the conversion ratio of progesterone to 11-deoxycorticosterone and compare the activity of each variant with the WT enzyme ( Figure 5A). Variants p.W202R and p.E352V showed only residual conversion (<2% enzyme activity), similar to p.I73N, which is associated with the classical form of CAH. Variants p.P35L and p.L199P showed partial activity, with the catalytic activities being 13.2% and 10.3% of the WT, similar to the p.V282L variant, which is associated with a nonclassical form of CAH. Variants p.P433L and p.R484L presented activity between the two controls, being 7.5% and 3.4% of the WT activity, respectively ( Figure 5C). The 21OH activity expressed accordingly with TLC spots densitometry and related to the WT. Bars represent the standard error from three samples. Inside the box is plotted the activity (total) and the specific activity (specific) of the three variants that showed statistically significant difference (* p < 0.05) between these values. The specific activity was obtained by dividing the enzyme activity by the 21OH protein estimated by western blot. All results were analyzed on GraphPad software. * p value < 0.05 The 21OH activity expressed accordingly with TLC spots densitometry and related to the WT. Bars represent the standard error from three samples. Inside the box is plotted the activity (total) and the specific activity (specific) of the three variants that showed statistically significant difference (* p < 0.05) between these values. The specific activity was obtained by dividing the enzyme activity by the 21OH protein estimated by western blot. All results were analyzed on GraphPad software. * p value < 0.05.
To determine if the reduction in activity was associated with the decrease in protein expression, we quantified CYP21A2 protein in WT and variants using western blot assay ( Figure 5B). Normalized by β-actin expression, we found the expression level was less than 50% of the WT for the p.P35L, p.V282L, p.E352V, and p.R484L variants, while p.W202R had 119% of WT. Therefore, we also calculated the specific activity of each variant compared to WT by dividing the activity obtained from the TLC results by the CYP21A2 expression level ( Figure 5C). The specificity activity for p.W202R, p.E352V, and p.R484L present a significant difference (p < 0.05) when compared with the non-protein normalized activity. The variant p.W202R had a decrease in activity to <1%, while p.E352V and p.R484L had an increase in activity by two-fold and three-fold, respectively. Altogether, our results indicated that all variants tested significantly impacted the activity of CYP21A2. The variants p.W202R, p.E352V, and p.R484L also impacted the protein expression ( Figure 5C).

Kinetic Analysis of CYP21A2 Variants
The apparent kinetic constant revelated saturation for the 21OH WT with the Michaelis-Menten constant (K m ) of 1.57 µM for progesterone and an apparent maximal reaction velocity (V max ) of 0.360 nmol·min −1 ·mg −1 ( Figure 6 and Table 4). The apparent saturation was a similar rate of the WT for p.P35L (K m 2.06 µM) and p.P433L (Km 1.91 µM), but it was six times higher for p.L199P (K m 10.24 µM), indicating that p.L199P decreases the protein affinity for the substrate progesterone. The apparent V max was lower than the WT for p.Pro35Leu (0.085 nmol·min −1 ·mg −1 ), p.L199P (0.252 nmol·min −1 ·mg −1 ), and p.P433L (0.055 nmol·min −1 ·mg −1 ). The apparent catalytic efficiencies of the mutated proteins were also lower than those of WT, and p.P35L had catalytic efficiency of 18% compared to the WT, p.L199P of 11%, and p.P433L of 13% (Table 5, Figure 6).
Our results show that the p.L199P variant, which is located close to the catalytic site, affects the substrate binding. In contrast, p.P35L and p.P433L showed inhibition of the enzyme activity through a decrease in reaction velocity.    Our results show that the p.L199P variant, which is located close to the catalytic site, affects the substrate binding. In contrast, p.P35L and p.P433L showed inhibition of the enzyme activity through a decrease in reaction velocity.

Discussion
We have presented a detailed investigation of the molecular and functional damage caused by six missense variants of the CYP21A2 gene. These variants were selected from a literature review of uncharacterized missense variants. We performed in silico analysis and functional assays to assess the effect of these CYP21A2 variants. We showed the analysis of two variants with changes in positions close to the catalytic site, p.L199P and p.W202R. Both showed a low conservation score among mammalian CYPs P450. However, looking specifically for the CYP21A2, residue 199 still had variability while 202 had high conservation. The p.W202R changed a heavy and hydrophobic residue to a hydrophilic and positively charged residue. This exchange at 202 abolished the enzyme activity (1.1% of WT), which is in agreement with the severe CAH (SW) form described for the patient [30]. In the other case, the leucine replacement by proline at position 199 broke the hydrogen bond with a residue on the same helix, destabilizing the conformation of helix F. Previous studies with helical substitution of leucine to proline at position 168 (helix E), 262 (helix H), and 322 (helix J) showed a break of the helix, as well, and association with the severe form of CAH [4,25,32]. In our study, this alteration decreased the affinity of the enzyme for the substrate by about six-fold, making the enzyme activity for the L199P variant of CYP21A2 drop to 10% of the WT.
We analyzed two CYP21A2 variants by looking at the protein surface: one at the N-terminal, residue 35, and the other on the C-terminal, residue 484. We showed that the proline at the 35 position has a high conservation score, even being on the protein surface without strong interaction with closer residues. This variant decreases the reaction velocity four times due to the faster enzyme saturation, V max . 0.08552 nmol·min −1 ·mg −1 , compared with WT, V max . 0.3604 nmol·min −1 ·mg −1 . The apparent enzyme activity was around 13 % of the WT. That result suggests an important and conserved role of P35 across species. Variant p.P35L was previously described in the Brazilian population in a compound heterozygous state with the other three variants, two being in the same allele and one in the other [28]. All the four patients described by [28] carry two other pathogenic mutations (30-kb del and p.H63L) at the same allele of p.P35L, which by themselves would transcribe a non-functional protein [3,33]. The p.H63L is derived from pseudogene and has known mild activity in the homozygous state; however, it has a synergistic effect when it is associated with another mild mutation and results in a severe phenotype [33]. The 30-kb del is a gene rearrangement event in the RP1-C4A-CYP21A1P-TNXA and RP2-C4B-CYP21A2-TNXB regions, which causes CYP21A2/CYP21A1P chimera [3]. In [28], p.P35L was not presented in any pseudogene analyzed, which indicated that these appear in the same allele from independent genetics events.
The residue studied at the carboxyl-terminus 484 has high conservation, while all the others around this sequence position have mild or low conservation scores. The p.R484 stabilizes other residues through hydrogen bonds with p.M486 in a coil, p.Q482 at stand β9, and p.A449, a crucial residue between helix M and a sheet β8 [27]. The variant p.R484L changes a polar and positively charged amino acid to a nonpolar and uncharged residue inside a polar cluster, causing an intrinsic disorder. This variant showed enzyme activity around 3% of the WT when normalized by the specific amount of CYP21A2. However, the activity was significantly lower without the protein normalization, suggesting that this variant generates a lower transcription level or higher protein degradation than the WT. The same effect was observed for p.W202R and p.E352V variants. The p.R484L was found in a compound heterozygous state with the other two variants, one in the same allele and another in the opposite allele [29]. Both of these have a known severe impact on enzyme activity [3,26].
Another residue with a high interaction number is the p.E352. This amino acid is highly conserved, and there is no residue variation at the same position in homologous sequences, showing an important and conserved role in the CYP21A2 protein. The p.E352 forms five hydrogen bonds, stabilizing its helix through four interactions (two with p.A348 and p.L356 and two with p.R355) and stabilizing a network with p.W406 in the L-M loop. The tryptophan at 406 is a backbone residue, while arginine at 355 has an extensive network with glutamine at p.E352 and arginine at 409 in the ERR-triad [25]. This triad acts to stabilize the three-dimensional structure that allows covalent binding of the heme group [25,34]. The variant p.E352V causes the loss of three hydrogen bonds (p.W406, and p.R355), destabilizing several structural elements. We showed that the CYP21A2 p.E352V decreased the enzyme activity to only 1.1% of the WT, which would be related to the classical form of CAH. De Carvalho et al. [16] identified four patients with this variant in the compound heterozygous state. Two of them had a simple virilization form of CAH, while two had the salt-wasting form. In the first form, one patient presented Cluster E6, which abolishes the enzyme activity, and the other had p.G425S, which has less than 2% of enzyme activity [4]. The latter form was detected in two patients with the same intronic variant IVS2-13A/C>G, which mostly results in the salt-wasting form (in around 79% of the cases) and simple virilizing form (in 20% of the cases) [3].
Lastly, the residue p.P433 had a low conservation score looking at the mammalian CYPs P450. Nonetheless, the score was high among the CYP21 group, highlighting the importance and conservation of this residue in this specific group. Proline at 433 is on the first helix M turn, and it has one hydrogen bond with a residue in the same helix M, p.A435, helping with the helix stabilization. Besides that, this residue lies adjacent to the L-M loop, where heme-binding residue p.R427 is located. The p.P433L makes the structure more flexible, modifying the optimal conditions for the heme-binding, as previously described [25]. This variant makes the reaction become saturated six times faster than the WT, with a V max . of 0.055 nmol·min −1 ·mg −1 , and decreases the enzyme activity to 7.5%. The p.P433L was identified in the compound heterozygous state with the mild mutation p.P454S in a patient diagnosed with the nonclassical form of CAH [31].

Search and Selection of Variants in CYP21A2
We searched Ensembl, HGMD, OMIM, and ClinVar databases for all CYP21A2 variants with an amino acid exchange, excluding nonsense and frameshift variants. From that first list, we excluded the variants with the enzyme damage that has already been characterized. The clinical phenotype associated with the variants was accessed from the original article when it was not available in the databases. Variants without patient information were excluded, as well as when it had been clinically associated with a nonpathogenic form of CAH or it had been found with either homozygous or heterozygous compounds with a pathogenic variant. A second screening was performed to select variants that have been found in either the Brazilian or Portuguese population.
For the remaining SNPs, we performed a structural impact analysis through the shift of Gibbs free energy (∆∆G) value by STRUM [40]. Gain/loss of hydrogen bonds was calculated on WHAT-IF web (https://swift.cmbi.umcn.nl/servers/html/index.html accessed on 5 December 2021). The human CYP21A2 structure used here was deposited on PDB by Pallan P.S., Lei L., and Egli M. [2]. This structure has residues from 28 to 485 of CYP21A2, so we added the missing amino acid with I-TASSER [41] in the N-and C-terminal. The I-TASSER input was the protein sequence RefSeq NM_000500.9 and PDB ID 4y8w, chain A. The CYP21A2 structures with each variant were built with STRUM, and the Gibbs free energy was compared with the wild-type (WT). Structures built with the variants were aligned with the WT using PyMOL (Schrödinger, LLC), where atoms with distance variation were measured. We used the structure-naming scheme of Pallan, et al.

Conservation Analysis
To access the significance of these variants on the evolutionary conservation, we performed an evolutionary analysis with homologous proteins by ConSurf [42]. The structure of human CYP21A2 deposited under PDB ID 4y8w (chain A) was used as a reference structure, and cytochrome P450 homologous sequences from Homo sapiens, Bos taurus, Canis lupus familiaris, Cavia porcellus, Capra hircus, Felis catus, Gorilla gorilla gorilla, Lynx lynx, Mesocricetus auratus, Macaca fascicularis, Macaca mulatta, Mus musculus, Oryctolagus cuniculus, Ovis aries, Rattus norvegicus, and Sus scrofa were used for conservation analysis. A multiple sequence alignment was built using CLUSTALW using the Bayesian method for the calculation of the conservation score. ConSurf scores range from conserved (magenta or nine) to the variable (cyan or one). Homologous sequences were collected from SWISS-PROT with BLAST algorithm (PSI-BLAST E-value 0.0001, four iterations) [43][44][45].

Construction of Plasmid and Site-Directed Mutagenesis
Mammalian expression plasmid pcDNA3.1+/C-(K)-DYK carrying the ORF sequence of WT CYP21A2 (NM_000500.7) with a C-terminal DYK (FLAG) tag was purchased from GenScript (Piscataway, NJ, USA). We used this plasmid as a template to create the variants through site-directed mutagenesis. Mutagenesis reactions were performed with a QuikChange II Site-Directed Mutagenesis kit (Agilent Technologies, Santa Clara, CA, USA), following the manufacturer protocols. Oligonucleotides sequences designed with the Quik Change Primer Design program (Agilent Technologies, Santa Clara, CA, USA), are shown in Table 6. Correct nucleotide change was confirmed by direct sequencing on an ABI 3500 Genetic Analyzer (Applied Biosystems, Carlsbad, CA, USA).

Cell Transfection and Enzymatic Activity Assay
Transient transfection was performed with the CYP21A2 WT vector and the variants. HEK293 cells were seeded in a six-well plate (6 × 10 5 cells/well). After 24 h, growth media were replaced, and the cells were transfected using Lipofectamine 2000 (Thermo Fisher, Bedford, MA, USA) and 2.6 µg of the plasmid. Transfected cells were reseeded in a 24-well plate (1.5 × 10 5 cells/well) after 24 h of the transfection for uniformity of cell population across wells. At 48 h after transfection, the functional assay was started by changing to 500 µL of fresh media and adding 1 µM of unlabeled progesterone with 10,000 cpm of [ 14 C]-Progesterone as a tracer. Media and cells were collected 45 min after incubation at 37 • C. Steroids were extracted from media with ethyl acetate and isooctane (1:1 vol/vol) dried and dissolved in methylene chloride. Steroids were separated by thinlayer chromatography (TLC), exposed to a phosphor screen, and visualized with Typhoon PhosphorImager FLA-7000 (GE Healthcare Bio-Sciences AB, Uppsala, Sweden). Image intensity was measured and quantified with ImageQuant TL v8 (Cytiva, MA, USA). The CYP21A2 enzyme activity was expressed as relative steroid conversion. Steroids were quantified as a percentage of radioactivity incorporated into 11-deoxyprogesterone to the total radioactivity measured in the whole sample and compared between the WT and variants. Cells were collected with trypsin and washed with 1× PBS to quantify the amount of protein. Results were analyzed from three technical replicates. To ensure a similar amount of CYP21A2 in each reaction, a western blot analysis was performed to normalize the enzyme activity with the relative CYP21A2 expression.

Western Blot
The amount of CYP21A2 expressed was measured from the total protein extraction. Cells were incubated for 1 h with the lysis buffer previously described [46] and centrifuged at 15,000× g for 20 min and 4 • C. The supernatant was collected for the measurement of total protein through a Pierce Coomassie Plus (Bradford) Assay Kit (Thermo Fisher, Hanover Park, IL, USA). Seven µg of total protein were loaded on an SDS-PAGE gel (GenScript, Piscataway, NJ, USA) and then transferred to a PVDF membrane, as previously described [46]. Two primary antibodies were used at the same time: a mouse monoclonal DKY-Tag antibody diluted 1:1000 (GenScript, Cat# A00187) and a mouse monoclonal antiβ-Actin antibody diluted 1:1500 (Sigma Aldrich, St. Louis, MO, USA, Cat# SAB3500350). The secondary antibody, IRDye 800CW-conjugated donkey-anti-mouse (LI-COR, Nebraska, USA-Cat# 926-32212) was diluted at 1:15,000. An Odyssey SA Infrared Imaging system (LI-COR Bioscience Inc. Lincoln, NE, USA) was used to detect the fluorescence signal.

Enzyme Kinetics Assay
To obtain the apparent reaction efficiency from the variants that showed enzyme activity, enzyme kinetic assays were performed. Five unlabeled progesterone concentrations were used (0.1, 0.3, 1, 3, and 5 µM) with 15,000 cpm of [ 14 C]-Progesterone as a tracer, under the same conditions that were used in the functional assay. Enzyme velocity was normalized with the relative CYP21A2 expression derived from western blots. Results were analyzed from three biological replicates with Michaelis Menten kinetics using GraphPad Prism 9.2.0 (GraphPad, San Diego, CA, USA).

Statistics Analysis
Statistical significance was calculated with a one-sample student's t-test, for comparing samples in the same group, and a one-way ANOVA, for comparing two groups (one variant group with the WT group). The p-value was considered significant with p < 0.05. Prism 8 (GraphPad, CA, USA) and Excel (Microsoft, Redmond, WA, USA) were used to perform the calculation and statistical analysis.

Conclusions
Here we elucidated the structural and functional impact of six CYP21A2 variants (p.P35L, p.L199P, p.W202R, p.E352V, p.P433L, and p.R484L) with unknown profiles via in silico and in vitro enzyme analysis. We showed a good correlation between in silico and in vitro functional studies for structural and conserved variants. However, residues with specific roles have ambiguous results, e.g., for the catalytic site, as the homology has a high weight for most of the predictor tools. Especially in these cases, the functional assay to access the enzyme activity was decisive to assign the damage caused by that variant. These results also emphasize the relevance of using multiple algorithms together with functional assays, especially when it is not possible to establish correlations with the phenotype based solely on bioinformatics predictions.