1. Introduction
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused the current pandemic, the largest of the 21st century [
1]. According to the World Health Organization (WHO), the novel coronavirus infected more than 759 million individuals and killed more than 6.8 million people worldwide [
2]. Previous studies performed with SARS-CoV (referring to the coronavirus responsible for the 2002/2003 epidemic) were fundamental in the development of vaccines and studies of SARS-CoV-2 inhibitors of the virus life cycle in the host cell [
3,
4]. Vaccination was essential to drastically reduce the number of infections and deaths caused by the virus [
5,
6]. Although more than 12.4 billion doses of vaccines were applied and more than 5 billion people are fully vaccinated, the vaccination does not prevent the transmission of the virus due to emerging novel SARS-CoV-2 Variants of Concerns (VOCs) that are resistant to neutralizing antibodies [
5,
6]. The emergence of SARS-CoV-2 VOCs resistant to vaccines has concerned the scientific community [
1]. In 2020 December, the Alpha variant (or lineage B.1.1.7, or Clade 20I) emerged and was soon recognized as both more transmissible and more lethal than the ancestral strain [
7]. In 2020 September, the Beta variant (or B.1.351 lineage or Clade 20H) emerged in South Africa, showing greater transmissibility and lethality than the Alpha variant [
8]. Next, the Gamma variant (or lineage P.1, or Clade 20J) originated in Brazil and Japan, becoming predominant in these countries and spreading to neighboring countries [
9]. In 2020 September, the Delta variant (or lineage B.1.617.2 or Clades 21A, 21I, 21J) emerged in India, presenting several mutations that allowed its predominance worldwide [
10]. In November 2021, the Omicron variant (or B.1.529 lineage) emerged and, in less than 3 months, became predominant over the Delta variant worldwide [
1,
11,
12]. The lineage B.1.529 (Clade 21M) gave rise to its sub-lineages, BA.1 (Clade 21K), BA.2 (Clade 21L), BA.2.12.1 (Clade 22C), BA.4 (Clade 22A), and BA.5 (Clade 22B). In 2023 March, the proportions of the Omicron sub-variants represented more than 80%, when compared to other VOCs, being XBB.1 and XBB.1.5 more than 55% prevalence in the reported cases of COVID-19 [
13].
Among the molecular targets, the Spike protein and the 3-chymotrypsin-like protease (3CLpro or Mpro, SARS-CoV-2 main protease) play key roles in vaccine and drug designs, respectively [
14,
15,
16,
17,
18,
19]. In particular, the 3CLpro cleaves 13 sites of two polyproteins,
pp1a and
pp1ab, which are results from translation of the genomic mRNA. The active form of the 3CLpro is a homodimer, in which each protomer has domains I, II, and III. Domains I (residues 10-99) and II (residues 100-182) are formed by six antiparallel β-strands, where the catalytic site is located between them [
20]. Domain III (residues 198-303) is composed of five α-helices that are involved in the regulation of 3CLpro dimerization through salt interactions between residue E290 of one protomer and residue R4 of the neighboring protomer [
21]. Dimerization of 3CLpro is a critical process for catalytic activity because the N-finger of each of the protomers interacts with E166 of the other protomer, adjusting the S1 subsite to bind the substrate [
14,
22]. The interface between domains I and II is composed of the side chains of V13, L115, F150, P122 of one protomer and P9 of the other protomer, together they form strong hydrophobic interactions [
23]. Any substitution of these residues for larger or more polar ones causes a reduction in catalytic activity, suggesting that they are residues important for inter-domain contact [
23]. Conversely, at the dimer interface, between the domain I of one protomer and the domain I of another protomer, it is reported that there is an extensive network of hydrogen bonds from residues S10, G11, and E14 of a protomer with residues S10, G11, and E14 from the other protomer [
23]. This interaction network is thus expected to play a key role in maintaining SARS-CoV-2 3CLpro dimer [
23].
The main catalytic residues of 3CLpro are dyad H41 and C145. Cysteine is conserved among the family
Coronaviridae [
24] and acts as a nucleophile in the proteolytic process [
25,
26]. The active site is composed of four subsites, S1, S1', S2 and S4 [
26], following the Schecter-Berger nomenclature for proteases [
27]. The Sis and Si's represent the subsites of the enzyme and the Pis and Pi's are notations for the substrate, in which the substrate is cleaved between P1 and P1'. Structural studies have shown that the 3CLpro substrate may occupy mainly four subsites, S1', S1, S2, and S4. The S1 subsite is composed of the residues F140, C145, S144, and H163. Specifically, the side chain of F140 makes π-π interaction with H163, interacting with the N-finger of the neighboring protomer. Such interaction networks play an important role for the structural stability of the active site and molecular recognition of the substrate [
26]. The S1' subsite is more exposed to the aqueous environment than other subsites and it is composed of residues T25, T26, L27, and H41. This sub-site has a function of accommodating residues that present small bulk side chains [
26]. The S2 subsite is highly hydrophobic, composed of residues M49, Y54, and M165 and the alkyl part of the side chain of D187 [
26]. Finally, the S4 subsite consists of residues M165, L167, F185, Q192, and Q189 [
26]. A multiple alignment of the amino acid sequences in the vicinity of the cleavage sites has shown that 3CLpro recognizes the consensus sequence X-(L/F/M)-Q↓(G/A/S)-X (where X = any amino acid, the arrow ↓ being the substrate cleavage site) [
18,
28]. Thus, the molecular recognition of 3CLpro by the substrate depends exclusively on the physicochemical features of the specific residues present in the target sequence of each substrate. Therefore, insights into the nature of the interactions involved in substrate recognition by 3CLpro are essential to allow an understanding of how structural information relates to the biological response of the virus, and thus are also essential in drug design of novel anti-SARS-CoV-2 drugs [
18,
29,
30,
31,
32].
Herein, we simulated 3CLpro
WT, 3CLpro
H41A, 3CLpro
Beta, and 3CLpro
Omicron in complex with substrates nsp4|5 and nsp5|6 for 500 ns. Our results reveal that nsp4|5 showed good conformational stability in the active site, but was significantly more stable in 3CLpro
WT and 3CLpro
Omicron. Furthermore, our root-mean-square fluctuation (RMSF) analysis showed that mutations in 3CLpro
Beta and 3CLpro
Omicron make both variants more stable,
i.e. diminishes their intrinsic mobility, thus making these proteins more prone to maintain longer interactions with the nsp4|5 and nsp5|6 substrates. However, 3CLpro
WT, 3CLpro
Beta, and 3CLpro
Omicron did not show significant differences in the estimated
KD values for the aforementioned substrates. Still, our molecular dynamics data suggest that the cleavage rate of nsp4|5 is higher for the 3CLpro
Omicron variant and could be related to the higher viral load of these VOC, as the efficient release of nsp4 is essential for replication and assembly of replicative structures of the virus [
33]. Therefore, our results suggest that higher viral load of VOCs may be related with enhanced enzymatic activity of the main protease of SARS-CoV-2 [
1,
34].