Evaluation of SYK Gene as a Prognostic Biomarker and Suggested Aromatic Potential Phytochemicals to Halt the Colorectal Cancer

Background: Colorectal cancer is considered the third most fatal of all cancer types. Spleen tyrosine kinase (SYK) is a non-receptor type tyrosine-protein kinase that plays a crucial role in immune receptor-mediated signaling. We adopted an onco-informatics approach to evaluate the expression and prognostic value of SYK in colorectal cancer and to identify potential phytochemicals that may inhibit overexpression of the SYK protein and thereby limit colorectal cancer. Materials & Methods: Differential expression of the SYK gene was analyzed using several transcriptomic databases, including Oncomine, UALCAN, GENT2 and GEPIA2. The cBioPortal server was used to analyze mutations and copy number alterations, whereas GENT2, GEPIA, OncoLnc and PrognoScan were employed to examine survival. A protein-protein interaction network of SYK and its co-expressed genes was constructed via GeneMANIA. Considering the protein encoded by the SYK gene as a drug target, selected phytochemicals were assessed by molecular docking using the PyRx 0.8 package. YASARA molecular dynamics simulations were applied to validate the molecular docking data. Results: We observed significant overexpression of SYK mRNA in colorectal adenocarcinoma (COAD) samples compared with normal tissues. A significant methylation level and various genetic alterations accumulate in the SYK gene, which can lead to the development of colorectal cancer. Lower SYK expression was associated with a higher chance of patient survival, and the outcomes from multiple bioinformatics platforms and web resources provide significant evidence that SYK kinase can serve as a potential biomarker for the treatment of colorectal cancer.
Here, the aromatic phytochemicals Kaempferol and Glabridin targeting SYK showed more stability compared to controls and may therefore be useful. The study showed dysregulated expression of SYK in colorectal cancer and its potential to act as a biomarker for the prognosis of CRC. Moreover, we have shown that phytochemicals (Kaempferol and Glabridin) targeting SYK are potential treatment strategies with drug repositioning potential in colorectal cancer. Our research may shed new light on SYK kinase as a novel biomarker and therapeutic target for colorectal cancer, thereby assisting in the translation of genomic information into clinical practice. Furthermore, ADMET analysis, molecular docking, and molecular dynamics were applied to identify suitable phytochemicals against spleen tyrosine kinase (SYK) in human colorectal cancer. Among the 500 phytochemicals tested, the compounds showing strong interactions and binding affinity with all or at least one of the catalytic residues of SYK were Capecitabine (control), Glabridin, Curcumin, Kaempferol, Quercetin, and Genistein. These protein-ligand complexes also show several non-covalent interactions, namely hydrogen bonding, hydrophobic and electrostatic interactions. MDS findings suggest that the protein-ligand interactions are stable under physiological conditions and that the ligands interact with SYK most frequently through hydrogen bonds. The pharmacokinetic and ADMET analyses indicated their suitability as drug molecules with no cellular toxicity. These results suggest that most of the bioactive compounds we identified may show significant efficacy and could be used for designing anti-cancer drugs for human colorectal cancer.


Introduction
Cancer is a serious health problem, and its prevalence continues to rise every year throughout the world. According to the WHO, cancer is a large group of diseases in which abnormal cells grow uncontrollably and invade adjacent organs in the body [1]. Moreover, the Centers for Disease Control and Prevention (CDC) reported that cancer ranks second only to heart disease as the leading cause of death in America and around the world [2]. The International Agency for Research on Cancer (IARC) estimates that the global cancer burden rose to 19.3 million new cases and 10.0 million cancer deaths in 2020. It also reported that, overall, one in 5 people develop cancer during their lifetime, and one in 8 men and one in 11 women die from the disease [3]. Furthermore, the American Cancer Society (ACS) recently published its Cancer Facts and Figures 2021 report, which stated that 1,898,160 new cancer cases and 608,570 cancer deaths were projected to occur in the United States in 2021 [4]. These figures draw researchers and physicians all over the world to place the utmost emphasis on cancer research. Among all 39 different cancer types, colorectal cancer (CRC) is the third most commonly diagnosed cancer and the second leading cause of cancer-related death worldwide in both sexes and all ages [4]. According to GLOBOCAN 2020 data, around 2 million people were diagnosed with CRC, comprising 10% of all cancer cases, and around 1 million people died of it, comprising 9.4% of all cancer deaths, in both sexes and all ages worldwide [5]. It is a matter of global concern that the burden of colorectal cancer is expected to increase by about 60% by 2030 in countries with a high or very high human development index (HDI), particularly in Eastern Europe, Asia, and South America [6].
Thus, it is necessary to establish improved assessment and treatment strategies; however, therapy and prognosis vary widely with malignancy stage and biological features. Based on tumor node metastasis (TNM) staging, CRC patients diagnosed at stage I have a five-year survival rate of more than 90%, while at stage IV it declines to around 12% [7]. Colorectal cancer is a heterogeneous group of diseases caused by a series of genetic and epigenetic alterations in certain cells of the epithelium [8]. Because of the complexity of the disease, a more comprehensive method is required to achieve high accuracy in CRC detection and diagnosis, as a single test may provide low accuracy. In this regard, a multi-omic data mining approach can be a feasible alternative.
The spleen tyrosine kinase (SYK) gene encodes a non-receptor type tyrosine-protein kinase that is mainly associated with adaptive immune receptor signaling [9]. This protein is involved in various signaling-mediated biological processes, including innate immune recognition, platelet activation, and vascular development, along with cellular responses such as differentiation, proliferation, and phagocytosis [9]. In humans, the SYK gene is located at position q22.2 of chromosome 9 [10]. SYK is broadly expressed in hematopoietic cells, and it is also a potent modulator of epithelial cell growth and a potential tumor target in human cancers [11]. The expression of the SYK gene is important for both normal and cancer cells, as it has both tumor-promoting and tumor-suppressing capability [12]. Various lines of evidence suggest that SYK is involved in the formation of different cancers [13]. In particular, tumor growth and metastasis are developed and activated through the SYK signaling pathway as an intermediate [14]. The SYK gene is considered a potent tumor suppressor gene in humans, and suppression of SYK activity increases breast cancer tumorigenicity [11,15]. Apart from breast cancer, lower SYK expression and methylation are connected with metastasis in other cancers, including lung cancer, liver cancer, oral squamous cell cancer, pancreatic cancer, bladder cancer, gastric cancer and urinary cancer [16][17][18]. Colorectal cancer is also associated with this gene, and loss of its expression is found in several colorectal cancer tissues [19]. Thus, SYK promotes the suppression of cancer formation and contributes to inhibiting cell proliferation and metastatic capacity [19].
Current cancer therapies have severe side effects, high costs, and limited survival benefit [20]. In addition, because cancer cells develop drug resistance, new therapeutic agents against cancer are urgently needed. Several studies have shown that plant extracts or purified phytochemicals have produced significant positive results in cancer treatment when used directly or in combination with existing drugs [21]. Various well-known phytochemicals exhibit specific action against tumors, breast cancer, and prostate cancer [22]. Various plant-derived compounds have also been shown to exert anti-proliferative action against cancer by modulating cellular pathways, which makes phytochemicals worthy drug candidates [23]. Plant bioactive compounds suppress the effects of carcinogenic molecules, such as damage to DNA. Phytochemicals are usually considered safe and easily available to consumers, so they have been suggested as effective therapeutics for cancer treatment with fewer side effects in the human body.
In this study, we performed an in-depth analysis to examine the role and clinical relevance of spleen tyrosine kinase (SYK) in colorectal cancer (CRC) development using various cancer databases. We examined the expression patterns of the gene in different malignancy types, clinicopathological parameters, methylation status, genetic alteration frequencies, survival, genes co-expressed with SYK, and the interaction network of these genes. Moreover, we suggest a few phytochemicals as drug-like anticancer compounds targeting SYK, identified via molecular modeling using in silico drug design approaches. To this end, we analyzed the pharmacokinetic properties (ADMET profiling) of the selected phytochemicals and performed protein-ligand docking, post-docking analysis, and molecular dynamics simulation. These multi-omics data mining and in silico drug design approaches may help researchers find new anticancer therapies.

The Analysis of SYK gene Expression in Colorectal Cancer
To examine mRNA transcription levels in various cancers, including colorectal cancer, we employed the Oncomine (https://www.oncomine.org/), UALCAN (http://ualcan.path.uab.edu/), GENT2 (http://gent2.appex.kr/gent2/), and GEPIA2 (http://gepia2.cancer-pku.cn/#index) databases. These are all publicly accessible interactive online platforms that report mRNA transcription levels in different cancer samples compared with the corresponding normal controls. Currently, Oncomine is the world's largest oncogene chip database and integrated data mining platform, containing 715 independent datasets and 86,733 samples [24]. SYK mRNA expression in clinical cancer samples was compared with the corresponding normal controls, with the thresholds for p-value and fold change set to 0.0001 and 2, respectively. In-depth analysis of the SYK gene in colon adenocarcinoma (COAD) using TCGA data was performed on the UALCAN web portal, which provides expression, methylation, pan-cancer view, survival, and correlation data [25]. In addition, DNA methylation levels at the promoter region of the SYK gene, stratified by various parameters, were retrieved from the UALCAN database. The Gene Expression database of Normal and Tumor tissues 2 (GENT2) was also employed to retrieve SYK gene expression data across 72 different paired tissues [26]. Moreover, SYK mRNA expression and expression by cancer stage in TCGA COAD data were further assessed using the Gene Expression Profiling Interactive Analysis 2 (GEPIA2) database [27]. It provides gene-specific comparative analysis of various cancer types using a standard processing pipeline, covering around 8,587 normal and 9,736 tumor samples.
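The two cutoffs quoted above (p-value 0.0001, fold change 2) can be read as a joint filter. The following is a minimal sketch of that filter, not the Oncomine implementation: the function name, the use of mean expression on a linear scale, and the choice to accept a 2-fold change in either direction are assumptions made for illustration.

```python
def passes_expression_cutoffs(tumor_mean, normal_mean, p_value,
                              min_fold_change=2.0, max_p_value=1e-4):
    """Return True when both cutoffs quoted in the text are met.

    Assumes positive mean expression values on a linear (not log) scale.
    """
    fold_change = tumor_mean / normal_mean
    # Assumption: a 2-fold increase or a 2-fold decrease both count.
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude >= min_fold_change and p_value <= max_p_value


# Hypothetical values, for illustration only (not data from the study).
print(passes_expression_cutoffs(tumor_mean=9.0, normal_mean=3.0, p_value=5e-6))
```

A gene is retained only when it clears both thresholds, so a large fold change with a weak p-value is rejected in this sketch.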

Mutation and CNAs Determination of SYK gene
The cBioPortal for Cancer Genomics (https://www.cbioportal.org/) is an open-source interactive web portal for systematic exploration of multidimensional cancer genomic datasets, currently containing 308 cancer studies [28,29]. It provides information from over 5000 tumor samples for mapping the frequency of mutations and other genetic alterations and for molecular profiling of cell lines and cancer tissues across multiple cancer studies. We used cBioPortal to analyze mutations and copy number alterations of the SYK gene in colorectal cancer.

Survival Assay Analysis
Survival plot analysis gives a statistical view of patients' survival rate over time. Various web-based tools, including GENT2, GEPIA, OncoLnc and UALCAN, were employed to examine the survival of colorectal cancer patients in relation to SYK gene expression. GENT2 (http://gent2.appex.kr) was used to examine survival, with the analysis based on cancer subtype and prognosis for a specific tissue [30]. This database provides data for colon cancer based on 1146 samples, analyzed by 5 subtype classifications: Molecular Subtype, AJCC Stage, Duke Stage, Grade and Histology. GEPIA (http://gepia.cancer-pku.cn/) is a database that also analyzes overall survival (OS) or disease-free survival (DFS) [31]. It uses the log-rank test and performs the analysis based on gene expression for any selected cancer type [31]. The OncoLnc database was also used for survival analysis. OncoLnc (http://www.oncolnc.org/) provides survival data for 8,647 patients from 21 cancer studies, linked to mRNA, miRNA, or lncRNA expression [32]. UALCAN (http://ualcan.path.uab.edu) is an online resource that was also used for survival analysis of the SYK gene. It relates gene expression to patient survival by evaluating TCGA patient survival data in Kaplan-Meier analyses and generating overall survival plots [33].
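The Kaplan-Meier analyses mentioned above rest on the product-limit estimator, S(t) = Π (1 - d_i / n_i), where d_i is the number of events at time t_i and n_i the number of patients still at risk. The sketch below is a plain-Python illustration of that estimator under stated assumptions; it is not the code used by GEPIA, OncoLnc or UALCAN, it omits the log-rank test, and the function name and input layout are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data, for illustration only.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

In practice the curve would be computed separately for the low- and high-expression groups and the two curves compared.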

Correlation Analysis and Interaction Network
To better analyze gene expression, it is essential to identify the genes associated with the target gene. We did this by looking up the genes correlated with SYK on the UALCAN and GEPIA websites. The GEPIA server was employed to detect similar genes in TCGA COAD tumors. In addition, using the correlation module of the UALCAN database, genes positively correlated with SYK in COAD were identified; a Pearson correlation coefficient > 0.47 was considered moderately significant for both websites. The identified similar or correlated genes were then submitted to GeneMANIA to predict the protein-protein interaction network. GeneMANIA (https://genemania.org/) is an online tool that predicts an interaction network for a set of input genes. It reports associations based on protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity [34]. GeneMANIA was also used to predict the protein-protein interaction network of SYK itself, to show the relationship between SYK and the genes it interacts with.
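The correlation cutoff described above can be illustrated with a small, self-contained sketch. It computes the Pearson correlation coefficient from its definition and keeps genes whose coefficient with the target exceeds 0.47. The function names, the gene labels and the expression values are hypothetical; UALCAN and GEPIA compute these values internally.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def positively_correlated(target, candidates, cutoff=0.47):
    """Keep candidate genes whose Pearson r with the target exceeds cutoff."""
    kept = {}
    for gene, values in candidates.items():
        r = pearson_r(target, values)
        if r > cutoff:
            kept[gene] = r
    return kept


# Hypothetical expression profiles, for illustration only.
target_expr = [1.0, 2.0, 3.0, 4.0]
candidates = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],   # rises with the target
    "GENE_B": [4.0, 3.0, 2.0, 1.0],   # falls as the target rises
}
print(positively_correlated(target_expr, candidates))
```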

Target Identification of SYK gene for Colorectal Cancer
Spleen tyrosine kinase (SYK) is a non-receptor protein tyrosine kinase that plays a part in different malignancies; thus, this kinase has become an important target for therapeutic drug development. Loss or decrease of SYK expression has been reported in different normal tissue and cancer cell types [35][36][37][38]. The loss of SYK expression is the main reason for the development of colorectal cancer. The expression could be restored in colorectal cancer cell lines by using several selected drugs or phytochemicals. As cell proliferation is influenced by under- or overexpression of the SYK gene, the encoded protein could be used as a therapeutic target against colorectal cancer.

Pharmacokinetics Analysis
All the selected aromatic compounds were initially analyzed via the SwissADME online server (http://www.swissadme.ch/index.php), which evaluates the physicochemical properties of the lead molecules; every candidate followed Lipinski's rule of five. To better understand the pharmacokinetic characteristics (absorption, distribution, metabolism, excretion, and toxicity) of the selected ligands, the pkCSM online server (http://biosig.unimelb.edu.au/pkcsm/prediction) was used [39].
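Lipinski's rule of five, referred to above, can be stated as four simple checks: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors. The sketch below only encodes those thresholds; it is not how SwissADME computes descriptors, and the function name and the descriptor values in the example are hypothetical.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the list of Lipinski rule-of-five criteria that are violated."""
    rules = {
        "molecular weight <= 500 Da": mol_weight <= 500,
        "logP <= 5": log_p <= 5,
        "H-bond donors <= 5": h_donors <= 5,
        "H-bond acceptors <= 10": h_acceptors <= 10,
    }
    return [name for name, satisfied in rules.items() if not satisfied]


# Hypothetical descriptor values, for illustration only.
print(lipinski_violations(mol_weight=300.0, log_p=2.0, h_donors=3, h_acceptors=6))
print(lipinski_violations(mol_weight=650.0, log_p=6.2, h_donors=3, h_acceptors=6))
```

An empty list means the compound satisfies all four criteria; many screening workflows tolerate a single violation, which is a policy choice outside this sketch.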

Extracting Lead Molecules for Optimization
Phytochemicals were first retrieved from the PubChem database to obtain the 3D structures required for molecular docking. The compounds were prepared as ligands using UCSF Chimera 1.14 (https://www.cgl.ucsf.edu/chimera/): the ligands were optimized, hydrogen atoms were added, charges were set to zero, and the output files were saved in PDB format [40].

Extracting Targeted Macromolecule for Optimization
The tertiary, non-mutated structure of the targeted soluble SYK protein (PDB ID: 4XG3) was retrieved from the Protein Data Bank (PDB) (https://www.rcsb.org/). The protein was optimized in sequential steps, first removing unnecessary components such as water molecules, metal ions, ligands, heteroatoms and unwanted extra chains. Hydrogen atoms were then added to the macromolecule for more precise docking. The overall optimization was conducted with UCSF Chimera 1.14 [41].

Molecular Docking and Post Docking Data Visualization
The optimized ligands proposed above were docked against the selected SYK (spleen tyrosine kinase) receptor. Protein-ligand docking was carried out using the PyRx software package version 0.8 (https://pyrx.sourceforge.io/home), which is based on the AutoDock Vina platform. For each docking run, the top ten binding affinities were exported as a .csv file [42]. LigPlot+ version 2.2 (https://www.ebi.ac.uk/thornton-srv/software/LigPlus/) was used to analyze the interactions within the protein-ligand complexes, evaluating non-bonded and non-covalent (polar, hydrophobic) interactions. LigPlot+ runs on a Java interface (Java SE Runtime Environment 8u271) and accepts only PDB files, which were generated with the PyMOL visualization tool [43].

Molecular Dynamics Simulation
The molecular dynamics simulations of the complexes were carried out in YASARA Dynamics, where the AMBER14 force field [44] was used [45]. A cubic simulation cell was created, the complexes were optimized, and the hydrogen bond networks were oriented [46]. The TIP3P water model was used for solvation (0.997 g/mL, 25 °C, 1 atm), and the protein complexes were energy-minimized using a steepest gradient algorithm with simulated annealing. The simulation system was neutralized with 0.9% NaCl at 310 K and pH 7.4. Electrostatic interactions were calculated with the Particle Mesh Ewald method using a cut-off radius of 8 Å [47]. The simulation cell was extended by 20 Å on each side of the system so that the protein could move freely inside the cell [48]. The Berendsen thermostat was used to maintain the simulation temperature. The simulation was run with a time step of 1.25 fs, and trajectories were saved every 100 ps [49]. The simulation was performed for 100 ns, and trajectory analysis was carried out to measure root mean square fluctuation, root mean square deviation, hydrogen bonds, radius of gyration, and solvent-accessible surface area.
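As a quick consistency check on the simulation settings quoted above, the arithmetic below converts the 100 ns run length, 1.25 fs time step and 100 ps save interval into a number of integration steps and saved snapshots. This is only unit conversion under the assumption that the settings apply uniformly to the whole run; the variable names are illustrative and nothing here is YASARA input.

```python
from fractions import Fraction

TOTAL_TIME_NS = 100              # run length from the text
TIME_STEP_FS = Fraction(5, 4)    # 1.25 fs, kept exact as a fraction
SAVE_INTERVAL_PS = 100           # trajectory save interval from the text

FS_PER_PS = 1_000
FS_PER_NS = 1_000_000

total_steps = int(TOTAL_TIME_NS * FS_PER_NS / TIME_STEP_FS)
steps_per_snapshot = int(SAVE_INTERVAL_PS * FS_PER_PS / TIME_STEP_FS)
# Snapshots written during the run, not counting the starting structure.
saved_snapshots = total_steps // steps_per_snapshot

print(total_steps, steps_per_snapshot, saved_snapshots)
```

Under these assumptions each trajectory contains on the order of a thousand frames, which is the sample size behind the RMSD, RMSF, Rg, SASA and hydrogen bond profiles.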

First, the differential expression patterns of the SYK gene in various cancer types were analyzed using the Oncomine, GENT2, and GEPIA2 databases. We found significantly different expression patterns of SYK mRNA in various cancers compared to healthy controls (Figure 1a). Notably, we observed significant upregulation of the SYK gene in colorectal cancer (p-value <0.05) compared to controls. In total, there are 12 datasets for CRC on Oncomine. Statistical details of SYK gene expression in different subtypes of colorectal cancer from the Oncomine database are presented in Supplementary Table 1. According to the GEPIA2 database, the SYK gene is significantly upregulated in COAD compared with paired normal tissues (Figure 1c). A comparison of normal vs cancer tissues also confirmed that this gene is significantly upregulated in various cancer types, including colorectal cancer. Second, we analyzed SYK mRNA expression specifically in colorectal adenocarcinoma (COAD) tissues and their corresponding normal samples across various clinicopathological parameters using the UALCAN and GEPIA2 databases. The results from UALCAN indicated that SYK mRNA expression is significantly increased in TCGA COAD samples compared with their normal counterparts. The relationship between SYK mRNA expression and different clinicopathological parameters in TCGA COAD tissues shows that SYK is significantly upregulated compared to normal tissue across variables such as sample type, individual cancer stage, patient's race, gender, weight, and age, histological subtype, nodal metastasis status, and TP53 mutation status. The data are presented in Figure 2 and Supplementary Table 2. Moreover, the data indicate that SYK overexpression is more statistically significant in male patients than in female patients, and in patients between 41 and 80 years old.
In addition, this overexpression is more significant in African-American patients than in Caucasian and Asian patients. SYK gene expression as a box plot and by pathological stage in colorectal adenocarcinoma was also retrieved from the GEPIA2 database, which gives results similar to those from UALCAN. SYK is overexpressed in COAD samples relative to normal samples on the log2 scale, with a p-value cutoff of 0.01 (Figure 3a). Overexpression of SYK was likewise evaluated across the pathological stages of COAD (Figure 3b). Third, the promoter methylation level of the SYK gene in COAD was determined from the UALCAN database, where the methylation profile is presented on the Beta value scale. The Beta value indicates the degree of promoter DNA methylation and ranges from 0 (unmethylated) to 1 (fully methylated). The promoter methylation level of the SYK gene is significantly increased in COAD compared with normal tissues across sample types, individual cancer stages, patient's race, gender, age, and weight, tumor histology, and TP53 mutation status. The promoter methylation data for SYK in COAD are presented in graphical form in Figure 4 and in tabular form in Supplementary Table 3.
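The Beta value scale described above is commonly defined as the ratio of the methylated probe intensity to the sum of methylated and unmethylated intensities. The sketch below illustrates that ratio only; the function name, the optional stabilizing offset (used in some array pipelines, set to 0 here) and the intensity values are assumptions, not details taken from UALCAN.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Methylation Beta value: methylated / (methylated + unmethylated + offset).

    With offset=0 the result runs from 0 (unmethylated) to 1 (fully methylated).
    """
    total = methylated + unmethylated + offset
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return methylated / total


# Hypothetical probe intensities, for illustration only.
print(beta_value(0, 1000))     # unmethylated end of the scale
print(beta_value(300, 700))    # partially methylated
print(beta_value(1000, 0))     # fully methylated end of the scale
```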

Genetic Alteration Analysis in SYK Protein Sequences Associated with Colorectal Cancer Development
Using cBioPortal data, we generated multiple genetic alteration datasets to examine the functional relevance of the SYK gene in colorectal cancer development. First, we queried alterations of SYK in this database using 3953 samples from 3806 colorectal cancer patients across 10 studies. For the 635-amino-acid human SYK protein, we found 57 mutations, and the somatic mutation frequency of SYK is 1.3% (Table 1). A lollipop plot depicts these 57 mutations, of which 40 were missense and 17 were truncating (Figure 5a). Next, we analyzed the alteration frequency of the SYK gene using data from various colorectal cancer studies. This analysis showed that the SYK alteration frequency varies considerably across colorectal cancer studies. Among these studies, SYK is most frequently altered in colorectal adenocarcinoma, with the highest alteration frequency of 4.17%. In contrast, the lowest alteration rate occurred in colon adenocarcinoma studies (Figure 5b). Finally, the relationship between SYK mRNA expression and putative copy-number alterations was analyzed. Based on the level of SYK mRNA expression on the RNA Seq V2 RSEM scale, amplification is the copy number alteration with the highest expression. On the other hand, based on frequency, shallow deletion is the most common type of alteration (Figure 5c). Overall, it is evident that various genetic alterations accumulate in SYK that can lead to the development of colorectal cancer.

Prognostic Value of SYK and Survival Analysis
To further assess the relationship between SYK expression and the clinical prognosis of patients with colorectal cancer, several online tools were used, including the GEPIA, OncoLnc, UALCAN and GENT2 databases. GEPIA provides survival plots for both overall survival and disease-free survival based on 135 TCGA COAD tumor samples. These survival plots were generated in GEPIA using the median (50%) as the cut-off value, with a 95% confidence interval (CI) and hazard ratio (HR). The plots showed higher survival at lower SYK expression levels for both overall survival and disease-free survival (Figure 6a, 6b, 6c, 6d). For the Kaplan-Meier plot from the OncoLnc database, which analyzes a wide range of TCGA cancer studies, both the upper and lower percentiles were set to 25. The overall survival data were then analyzed in UALCAN by comparing the survival data of colorectal cancer patients with gene expression data. OncoLnc analyzed 110 cancer samples for both low and high SYK expression, whereas UALCAN analyzed 71 COAD samples with higher expression and 208 with lower expression. According to the plots from the OncoLnc and UALCAN databases, lower expression was associated with a favorable prognosis (Figure 6e and 6f). Survival for both lower and higher levels of SYK expression is shown in Figure 7. Overall, overexpression of the SYK gene was associated with poor prognosis in colon cancer, and lower SYK expression was related to a higher chance of patient survival. Therefore, by analyzing prognostic value and survival plots, it can be concluded that SYK can be considered as a tumor suppressor gene for colorectal cancer.
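The grouping step behind these survival plots (a median split in GEPIA, lower and upper 25th percentiles in OncoLnc) can be sketched as a rank-based split of patients by expression. This is an illustrative reading of the cutoffs, not the tools' own code: the function name, the rank-based handling of ties and the patient identifiers are assumptions.

```python
def split_by_expression(expression, lower_pct=50, upper_pct=50):
    """Split patients into low- and high-expression groups by rank.

    expression : dict mapping patient id -> expression value
    lower_pct  : percentage of patients (lowest values) in the low group
    upper_pct  : percentage of patients (highest values) in the high group
    """
    ranked = sorted(expression, key=expression.get)
    n = len(ranked)
    n_low = n * lower_pct // 100
    n_high = n * upper_pct // 100
    low = ranked[:n_low]
    high = ranked[n - n_high:] if n_high else []
    return low, high


# Hypothetical expression values, for illustration only.
expr = {f"p{i}": float(i) for i in range(1, 9)}
print(split_by_expression(expr))            # median split
print(split_by_expression(expr, 25, 25))    # lower and upper quartiles
```

With quartile cutoffs the middle half of the cohort is left out of the comparison, which sharpens the contrast between groups at the cost of sample size.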

Analysis of Correlated genes and Preparing Interaction Network
For the identification of genes correlated with SYK in COAD, we employed two web utilities, UALCAN and GEPIA. A correlation was considered significant when the Pearson correlation coefficient was greater than 0.47 (Supplementary Table 4). Both databases revealed that SECISBP2 showed the highest positive correlation with SYK in COAD. We extracted 20 correlated genes from both websites to prepare an interaction network, which was investigated with the GeneMANIA web tool. These 20 correlated genes (TCF7, CDC14B, FKTN, TBC1D13, GLE1, CCDC170, NAA35, TSTD2, GOLGA1, TMEM8B, SNX30, RNF20, EFCAB14, PCDH19, FGGY, SECISBP2, ZNF782, SMC5, C6orf97 and KIAA0494), along with SYK, were then used to build the network (Figure 8a). The network was built with the automatically selected weighting method and showed 79.13% co-expression and 20.87% co-localization in the PPI network. In addition, GeneMANIA was used to predict the PPI network of SYK itself, to show the relationship between SYK and its significantly interacting genes. The automatically selected weighting method was used for the network of 20 genes (HCLS1, LYN, FCGR1A, WIPF1, POU2AF1, GP6, PIK3CG, ERBB4, CD3E, CBL, PIK3CB, EPOR, RPS6KA2, CTTN, NFAM1, BLK, ITGA2B, APBB1IP, FYN, CD72) related to SYK through several attributes, including co-expression, co-localization, pathway, physical interactions, shared protein domains, predicted, and genetic interactions (Figure 8b). This network shows 67.64% physical interactions, 13.50% co-expression, 6.35% predicted, 6.17% co-localization, 4.35% pathway, 1.40% genetic interactions and 0.59% shared protein domains.

Pharmacokinetics Analysis
In this investigation, six ligands (one of them used as a control) were selected based on their ADMET results; the phytochemicals show a good absorption rate in the human digestive system. In addition, the compounds exhibited a good excretion rate from the body after distribution and metabolism, and they showed no marked toxic effect on organs (mainly hepatocytes) (Table 2). The crystal structure and the potential hydrogen-bond and hydrophobic interactions of all ligands with the protein are displayed in Figure 9.

Molecular Docking and Post Docking Data Analysis
Molecular docking was performed on the PyRx 0.8 platform to identify the best ligand candidates among the selected aromatic phytochemicals. PyRx 0.8 reports the best docking score between the macromolecule and each ligand, and Table 2 lists the highest binding scores. Here, Capecitabine is used as the control ligand, with a docking score of -6.5 kcal/mol, a grid center of X = 36.7526, Y = 3.6328, Z = 25.3921, and grid box dimensions (X × Y × Z) of 35.9128 Å × 33.0428 Å × 25.0000 Å. The aromatic ligand Glabridin displayed the best score, -8.2 kcal/mol, using the control drug's grid parameters. The other potential phytochemicals, Curcumin, Kaempferol, Quercetin, and Genistein, had docking scores of -8.0, -7.3, -7.2, and -7.1 kcal/mol, respectively (Table 3).
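The comparison against the control can be made explicit by ranking the docking scores reported above, where a more negative value indicates a more favorable predicted binding affinity. The sketch below uses only the six scores quoted in the text; the variable names are illustrative, and the snippet does not run or reproduce the docking itself.

```python
# Docking scores in kcal/mol, as reported in the text above.
docking_scores = {
    "Capecitabine": -6.5,   # control ligand
    "Glabridin": -8.2,
    "Curcumin": -8.0,
    "Kaempferol": -7.3,
    "Quercetin": -7.2,
    "Genistein": -7.1,
}
CONTROL = "Capecitabine"

# More negative score = stronger predicted binding, so sort ascending.
ranked = sorted(docking_scores.items(), key=lambda item: item[1])

# Ligands that score better than the control, with the margin in kcal/mol.
better_than_control = {
    name: round(score - docking_scores[CONTROL], 1)
    for name, score in ranked
    if name != CONTROL and score < docking_scores[CONTROL]
}

for name, score in ranked:
    print(f"{name:12s} {score:5.1f}")
print(better_than_control)
```

All five phytochemicals score below the control in this ranking, with Glabridin showing the largest margin.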

Molecular Dynamics Simulation
The molecular dynamics simulation trajectories were used to explore multiple simulation descriptors and to understand the flexibility of the complexes. As shown in Figure 10, the complexes had an upward trend in RMSD until 5 ns and then a stable trend until 30 ns. The RMSD profiles rose again after 40 ns, but all complexes maintained a steady profile. The complexes did not fluctuate much in RMSD, except for those with Curcumin and Quercetin; the other four complexes had a steady RMSD trend, which indicates their rigidity. RMSD remained below 2.5 Å for all complexes except Quercetin and Curcumin. The root mean square fluctuations (RMSF) of the amino acid residues in the protein-ligand complexes were examined to understand flexibility across the residues. Most amino acid residues had an RMSF lower than 2.5 Å, which indicates low flexibility (Figure 10). The solvent-accessible surface area (SASA) was analyzed from the simulation trajectories to understand changes in the protein surface area. The Kaempferol complex had a higher SASA trend in the initial phase, which indicates expansion of the protein surface area. After 30 ns, the Kaempferol complex had a similar SASA profile. The Glabridin complex had a comparatively lower SASA profile than the other complexes, and this trend indicates the truncated nature of the complex. The other complexes had steady profiles and did not deviate much in SASA (Figure 10).
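For readers less familiar with the descriptors above, RMSD measures how far a whole structure has moved from a reference frame, while RMSF measures how much each atom or residue fluctuates around its own average position over the trajectory. The sketch below implements both definitions in plain Python under simplifying assumptions: frames are lists of (x, y, z) tuples, no structural superposition is performed, and all atoms are weighted equally. It is not the YASARA analysis routine, and the coordinates are hypothetical.

```python
import math


def rmsd(coords, reference):
    """Root mean square deviation between two frames (no superposition)."""
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(coords, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(coords))


def rmsf(trajectory):
    """Per-atom root mean square fluctuation over a list of frames."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Average position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        mean_sq = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(mean_sq))
    return fluctuations


# Two hypothetical frames of a two-atom system, in angstroms.
frame_0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_1 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(frame_1, frame_0))
print(rmsf([frame_0, frame_1]))
```

In a real analysis each frame would first be superposed on the reference so that overall rotation and translation do not inflate the values.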
Moreover, the radius of gyration (Rg) of the protein complexes was examined to understand protein mobility and rigidity. The protein-ligand complexes had nearly flat radius of gyration profiles with little fluctuation. Two complexes, Quercetin and Kaempferol, had a higher Rg trend after 60 ns, which might be due to the labile nature of these complexes. The hydrogen bonds of the simulation systems were also analyzed, as they play an important role in stabilizing the complexes (Figure 10). The six complexes had a stable hydrogen bond trend during the simulation, and no significant fluctuations were observed.
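The radius of gyration discussed above is the mass-weighted root mean square distance of the atoms from their center of mass, so a rising Rg indicates a structure that is becoming less compact. The following is a minimal sketch of that definition; the function name, the default of equal masses and the coordinates are assumptions for illustration, not output from the simulations.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of (x, y, z) coordinates."""
    if masses is None:
        # Assumption: equal weights when no masses are supplied.
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    center = [
        sum(m * atom[k] for m, atom in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((atom[k] - center[k]) ** 2 for k in range(3))
        for m, atom in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Hypothetical coordinates in angstroms, for illustration only.
square = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
print(radius_of_gyration(square))
```

Computing this value for every saved frame gives the Rg profile over time that is compared across complexes.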

Discussion
Normal cell division results from strict control of DNA replication, which maintains cellular growth, differentiation, and tissue homeostasis; when this control breaks down, uncontrolled cell growth, i.e., tumor and/or cancer, follows. This multistage cell cycle process involves hundreds of distinct genes whose mutations, amplifications, rearrangements, and modifications cause autonomous growth of cancer cells [50]. Cancer is a broad-spectrum lethal disease characterized by uncontrolled cell division, in which cancer cells cross their boundaries and invade adjoining tissue, and thus spread from the site of origin to other parts of the body [51]. Among the various types of cancer, such as bladder, colorectal, breast, kidney, lung, and oropharyngeal cancer, colorectal cancer is a significant cause of death worldwide. Colorectal cancer, also referred to as colon, rectal, or bowel cancer, arises in the colon or rectum [52]. It has been suggested that abnormal expression of the SYK (spleen tyrosine kinase) gene is associated with the development of colorectal cancer [53]. In recent times, spleen tyrosine kinases have become prominent and effective drug targets because of their close connection with cell cycle regulation and their association with tumorigenesis [54]. However, the distinct role of SYK in colorectal cancer is yet to be established. Therefore, in this study, we performed an integrative bioinformatics analysis of several powerful, publicly available datasets, which indicated that SYK may serve as a potential prognostic biomarker in the treatment of colorectal cancer.
We examined the mRNA expression patterns of the SYK gene across different types of cancer samples, particularly colorectal cancer samples, against their corresponding normal controls. The SYK gene was upregulated in BRCA, CESC, CHOL, ESCA, GBM, HNSC, LAML, READ, STAD, and UCEC, and downregulated in BLCA, KIRC, KIRP, PRAD, SKCM, THCA, and others (Figure 1). Moreover, SYK is significantly upregulated in COAD compared with normal samples across different clinicopathological parameters (Figure 2 and Figure 3). This suggests a positive correlation between SYK expression and an increased risk of tumor metastasis. Several reports also describe the SYK gene as a potential prognostic marker in diverse cancer types, such as human breast carcinoma [55], cervical carcinogenesis [56], human lung cancer [57], ovarian cancer [58], head and neck cancer [59], and glioblastoma [60]. The divergent role of SYK in cancer may be due to phosphorylation or to methylation at the promoter region. To examine this, we explored the SYK gene promoter methylation level across COAD based on different clinicopathological parameters (Figure 4). The promoter methylation level fluctuates significantly across different stages of COAD and is higher in tumor samples than in normal samples. Zuli et al. reported a correlation between clinical relevance and methylation of the SYK gene in colorectal cancer patients, finding SYK methylation in 48.6% of CRC tissue samples and 57.1% of cell lines [61]. SYK expression could be restored by a demethylating agent. However, this requires further in-depth investigation. Furthermore, to investigate the functional relevance of the SYK gene in colorectal cancer development, we analyzed the mutations and copy number alterations of the SYK protein sequence based on 10 cancer studies. In total, 57 mutations were found, of which 40 were missense and 17 were truncating (Figure 5).
Moreover, the highest alteration frequency, 4.17%, was found in colorectal adenocarcinoma. Some researchers have also reported that identifying genomic regions that undergo recurrent alteration might provide a powerful way to discover oncogenes in human cancers [62]. These results suggest that SYK might play an active role in CRC development.
The prognostic relevance of the SYK gene for CRC was examined using Kaplan-Meier plots from various web resources, including GEPIA, OncoLnc, UALCAN, and GENT2. The survival curves showed that lower SYK expression is related to higher overall survival (OS). For disease-free survival (DFS), lower expression was likewise associated with higher survival (Figure 6). Further, lower expression was associated with better prognosis in data from different colon tissue subtypes, covering the Molecular Subtype, AJCC Stage, Duke Stage, Grade, and Histology categories (Figure 7). It is necessary to identify the particular gene responsible for altered gene expression and differences in survival probability, because that gene can be a cancer biomarker [63]. The validation of a potential biomarker is associated with the survival rate [64]. The SYK gene has also been described as a potential marker and tumor suppressor with reduced expression in human breast cancer (BC) and human pancreatic ductal adenocarcinoma (PDAC) [65,66]. Similarly, a positive correlation was noticed between overexpression of the SYK gene and poor prognosis in colorectal cancer. In addition, survival plots from the various databases suggest that SYK expression can contribute to the development and prognosis of colorectal cancer, and that SYK can also be considered a tumor suppressor gene. To explore the activity of the SYK gene, we analyzed correlation and co-expression, using Pearson correlation coefficients to measure the relation. We searched the UALCAN and GEPIA databases and identified genes positively correlated with SYK in COAD tissues. Twenty correlated genes from both websites were collected to prepare an interaction network in GeneMANIA, where a Pearson correlation coefficient > 0.47 was considered moderately significant. Both databases confirmed that the gene SECISBP2 shares the highest Pearson correlation coefficient with SYK.
SECISBP2 is also considered a novel therapeutic target for diffuse large B-cell lymphoma, and mutations in this gene have been related to abnormal thyroid hormone metabolism. The interaction network showed 79.13% co-expression, meaning that the expression levels of those genes are similar across conditions, and 20.87% co-localization (Figure 8a), which indicates that the linked genes are expressed in the same tissue or the same cellular location. To determine a second interaction network, GeneMANIA was employed again with the biological process-based 'automatically selected weighting method' (Figure 8b). All twenty interacting genes are directly or indirectly connected with different cancers. HCLS1 has potential prognostic characteristics in CRC [67]; Lyn can be used as a prime target for the treatment of hormone-refractory human prostate cancer (PC) and is also involved in the activation of CRC [68,69]; PTEN was identified as a tumor suppressor and a therapeutic target for cancer [70]; WIPF1 is a candidate for breast, brain, and colorectal cancer prognosis [71]; ErbB4, a member of the ErbB subfamily, plays a significant role in breast carcinogenesis, prognostics, and therapy [72]; CBL-b can regulate cancer metastasis [73]; EPOR contributes to breast cancer progression [74]; and CTTN can be used as a new molecular therapeutic target for esophageal squamous cell carcinoma (ESCC) and also promotes the proliferation of CRC cells [75,76]. These two categories of protein-protein interaction networks show that SYK interacts with genes that are involved in either tumor suppression or tumor promotion.
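The co-expression filter described above can be sketched as follows. This is an illustrative example only: the gene labels and expression vectors are synthetic placeholders, the helper names are hypothetical, and UALCAN and GEPIA compute their correlations server-side on their own cohorts. Only the Pearson coefficient itself and the 0.47 cut-off are taken from the text.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two expression vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def coexpressed(target, candidates, threshold=0.47):
    """Return {gene: r} for candidates with r above the threshold,
    ordered from the strongest to the weakest correlation."""
    scores = {gene: pearson_r(target, values) for gene, values in candidates.items()}
    kept = {gene: r for gene, r in scores.items() if r > threshold}
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))

# Synthetic expression values across five samples (placeholders, not study data).
target_expr = [1, 2, 3, 4, 5]
candidates = {
    "GENE_A": [2, 4, 6, 8, 10],   # perfectly correlated with the target
    "GENE_B": [5, 4, 3, 2, 1],    # perfectly anti-correlated
    "GENE_C": [1, 3, 2, 5, 4],    # moderately correlated
}

print(coexpressed(target_expr, candidates))
```

Only positively correlated genes above the cut-off are retained, which mirrors the selection of positively correlated genes before the network is built.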
Many therapeutic options are available, such as chemotherapy, radiation therapy, immunotherapy, surgery, and stem cell transplantation, but all of these carry severe side effects and high cost [77]. The development of phytochemical-based therapeutic drugs is therefore a pressing need for treating colorectal cancer, because phytochemical-based agents have fewer side effects on the human body. In recent times, in silico drug design has become very popular because it speeds up drug development by analyzing the results of pharmacophore profiling, molecular docking, post-docking screening, molecular dynamics simulation, and the prediction of novel compounds against various diseases [78]. Here, we selected six compounds, of which five phytochemicals (Figure 1) are functionally active and the remaining one, a drug used against colorectal cancer, serves as the control. ADMET profiling is an effective strategy for substantially reducing the costs of drug development and for offering 'fact tests' and secondary complementary opinions for high-performance assays [79,80]. Using the pkCSM-pharmacokinetics and SwissADME online servers, the pharmacokinetic properties of the six selected compounds were characterized; the data are given in Table 2 [81]. In the pharmacokinetics study, the pharmacophore properties (Lipinski's rule of five) of all selected ligands were evaluated, for instance molecular weight, number of rotatable bonds, log P value, hydrogen bond donors and acceptors, and the number of violations. All selected phytochemicals showed no permeability to the blood-brain barrier, better human intestinal absorption, no carcinogenicity, no hepatotoxicity, and no AMES toxicity for both humans and mice, except for the control ligand. After digestion, the substances excrete only a small level of toxic waste.
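A rule-of-five screen of the kind mentioned above can be sketched in a few lines. The thresholds are the standard Lipinski criteria (molecular weight at most 500 Da, log P at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors); the class and function names, the tolerance of one violation, and the descriptor values are assumptions for illustration and are not the values reported in Table 2. The number of rotatable bonds is not part of the classic rule and is therefore left out of this sketch.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    """Physicochemical descriptors of a ligand (hypothetical container)."""
    mol_weight: float   # in Da
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors

def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Lipinski criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)

def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """A common convention tolerates at most one violation (assumed here)."""
    return lipinski_violations(d) <= max_violations

# Illustrative descriptor values only; not taken from the study.
small_ligand = Descriptors(mol_weight=300.0, log_p=2.0, h_donors=3, h_acceptors=6)
bulky_ligand = Descriptors(mol_weight=650.0, log_p=6.2, h_donors=6, h_acceptors=12)

for name, d in [("small_ligand", small_ligand), ("bulky_ligand", bulky_ligand)]:
    print(name, lipinski_violations(d), passes_rule_of_five(d))
```

In practice the descriptors would be taken from a server such as SwissADME rather than typed in by hand; the check itself stays the same.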
Molecular docking is a process that estimates how two or more molecules bind together with the best structural conformation and the lowest binding energy. The more significant and efficient drug candidates were selected according to their molecular docking scores from the PyRx virtual screening tool [Version: 0.8] [82]. We docked the five phytochemicals and the control molecule against the receptor 4GX3 and found a docking score of -6.5 kcal/mol for the control. Among the five phytochemicals, Glabridin showed the best binding energy (-8.2 kcal/mol), whereas Genistein scored -7.1 kcal/mol. Afterwards, LigPlot+ [Version: 2.2], an investigative tool that runs on a Java interface, was used to analyze the protein-ligand interactions from the 2D protein-ligand interaction scheme. It is important to mention that LigPlot+ was run only on PDB files retrieved from the PyMOL visualizer tool [Version: 2.4.1].
Molecular dynamics simulation (MDS) is a versatile technique for investigating biomolecular interactions and is used in modern drug discovery to evaluate the relationship between protein structure and function from dynamic trajectory data [83,84]. In the current investigation, the dynamics simulator YASARA (Version 11.9.18) was run for 100 ns under physiological and physicochemical conditions (temperature 310 K, pH 7.4) with added ions (Na+, Cl-) [85], and the resulting data were used to assess the dynamic behavior of the selected active compounds in suppressing the soluble receptor molecule (SYK kinase). The root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), number of hydrogen bonds, and solvent-accessible surface area (SASA) were also analyzed with this simulation tool [86]. To quantify the stability of the protein structure and to predict conformational shifts, the RMSD [87] of the backbone of the soluble SYK kinase protein was used; a lower value indicates a more stable complex. An RMSD value of less than 1.5 Å typically indicates greater docking accuracy with respect to the effective binding position. In the present study, the RMSD values of the protein-ligand complexes were in the acceptable range, i.e., minimum values were less than 2 Å (the lowest value, for Capecitabine, is approximately 1.0 Å), whereas maximum values did not exceed 3 Å, indicating a good docking position and that the enzyme structure was not disrupted (Figure 10). The RMSF measures the mean fluctuation of the protein molecule from a reference position, and the residue-level fluctuations are portrayed in the RMSF plots. Furthermore, it is essential for differentiating local changes along the protein chain [88].
For most ligand complexes the fluctuation stayed within 1.5 to 2.5 Å, while the Quercetin and Curcumin complexes fluctuated by no more than 4 Å, which indicates a considerable degree of flexibility (Figure 10). The radius of gyration (Rg) is the mass-weighted root-mean-square distance of the atoms from the center of mass. This parameter therefore measures the compactness of the protein molecule and gives a deeper understanding of the protein's folding properties [89]. A higher Rg value indicates loose packing, whereas a lower Rg value reveals compact packing [90]. alongside their enzymes, have better compactness, and the numerical values presented in both cases 28.45. The weaker binding capacity is 28.39 Quercetin and delta Genistein. Similarly, the solvent-accessible surface area (SASA) reflects the effectiveness of the interaction within the macromolecule-ligand complex, whereas the interaction between the enzyme surface and water reflects the amount of energy per unit area of ligand and macromolecule [91]. High SASA values correspond to an unstable structure in which hydrophobic amino acid residues are in close contact with water molecules [92]. According to the SASA results, Kaempferol showed the highest SASA value (26,250 Å²); Capecitabine, Kaempferol, and Glabridin had the higher SASA values, whereas Turco and Genistein had the lowest SASA values and bind most precisely to the receptor (Figure 10). The conformational stability between the macromolecule and the ligands was assessed from the total number of intermolecular hydrogen bonds [93]; the lowest numbers of hydrogen bonds, 1049 and 928 for Kaempferol and Glabridin, indicate a conformational stability greater than that of the control used in this research (Figure 10).
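As a worked illustration of the compactness measure discussed above, the radius of gyration is commonly computed as the mass-weighted root-mean-square distance of the atoms from the center of mass. The sketch below assumes this standard definition; the function name and the toy coordinates are illustrative and do not come from the YASARA runs.

```python
import numpy as np

def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of atoms from the center of mass.

    coords: array of shape (n_atoms, 3), in Å
    masses: array of shape (n_atoms,)
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    total_mass = masses.sum()
    center_of_mass = (masses[:, None] * coords).sum(axis=0) / total_mass
    sq_dist = ((coords - center_of_mass) ** 2).sum(axis=1)
    return float(np.sqrt((masses * sq_dist).sum() / total_mass))

# Toy system: two equal masses placed 2 Å apart on the x axis.
toy_coords = [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
toy_masses = [12.0, 12.0]

compact = radius_of_gyration(toy_coords, toy_masses)
expanded = radius_of_gyration(np.array(toy_coords) * 2.0, toy_masses)
print(compact, expanded)
```

Doubling every coordinate doubles Rg, which is why a rising Rg trace along a trajectory is read as a loosening of the fold and a flat trace as a stable, compact structure.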
However, this is an in silico study, and there is as yet insufficient evidence from in silico approaches on targeting the SYK kinase in colorectal cancer. Our study offers new insight into colorectal cancer treatment, but further wet-lab and clinical studies will be required to validate these drug-like molecules for the treatment of colorectal cancer by targeting the SYK kinase.

Conclusion
In conclusion, we used multiple online bioinformatics platforms and web resources to perform a systematic analysis of SYK expression, methylation, mutations and CNAs, associated genes, survival status, and prognostic value in various human cancers, especially colorectal cancer. The current findings point to a critical role of SYK expression and potential SYK-related pathways in the development of human colorectal cancer. Our research may shed new light on SYK kinase as a novel biomarker and therapeutic target for colorectal cancer, thereby assisting in the translation of genomic information into clinical practice. Furthermore, ADMET analysis, molecular docking, and molecular dynamics were applied to identify suitable phytochemicals targeting spleen tyrosine kinase (SYK) in human colorectal cancer. Among the 500 phytochemicals tested, the compounds showing strong interactions and binding affinity with all or at least one of the catalytic residues of SYK are Capecitabine (control), Glabridin, Curcumin, Kaempferol, Quercetin, and Genistein. These protein-ligand complexes also show several non-covalent interactions, namely hydrogen bonding, hydrophobic, and electrostatic interactions. The MDS findings suggest that the protein-ligand interactions are most stable under physiological conditions and that the ligands interact with SYK most frequently through hydrogen bonds. The pharmacokinetic and ADMET analyses indicated their suitability as drug molecules with no cellular toxicity. These results suggest that most of the bioactive compounds we have identified may show significant efficacy and could be used for designing successful anti-cancer drugs for human colorectal cancer.