Frutalin Affinity Chromatography on Sepharose Gel as a Strategy for the Identification of Possible Tumor Markers in Myelodysplastic Syndromes

Myelodysplastic syndromes (MDS) are diseases that occur when blood-producing cells in the bone marrow are damaged; such damage can affect one or more types of blood cells. Common types of MDS are refractory anemia with ring sideroblasts and refractory anemia with excess blasts (MDS-RS and MDS-EB, respectively). This work analyzed the proteome of the medullary plasma of 10 patients with MDS-RS and MDS-EB compared to healthy controls. Overexpressed proteins that may be candidate biological markers for the evaluation, study, and diagnosis of these diseases were identified. The samples were subjected to immunodepletion, concentrated, and digested for further analysis by mass spectrometry. The ratios between the selected groups and healthy people were calculated. Seven proteins overexpressed in both syndromes were identified as potential biomarker candidates: (1) vitronectin (VTN), (2) fibrinogen (FGA), (3) pregnancy zone protein (PZP), (4) kininogen (KNG1), (5) immunoglobulin lambda chain (IGLL1), (6) complement factor C4b, and (7) hemopexin (HPX). A modified affinity chromatographic column with the lectin frutalin (FTL) was used for non-depleted samples. Immunoglobulin M (IgM) was expressed in the samples from both syndromes. Surprisingly, IgM from patients with these syndromes was retained more strongly on the FTL column than IgM from the control group. We further hypothesized that the greater retention of this protein by FTL is due to the presence of α-galactosidic residues in the IgM of MDS-RS and MDS-EB patients. Differential recognition of proteins in non-depleted samples by FTL appears to be a powerful tool for proteomic analysis.


Introduction
In recent decades, with the increase in life expectancy in some countries, the prevalence of cancer has increased. Given its impact on public health, interest in seeking alternatives to prevent it or treat it early has grown [1][2][3][4]. The Global Cancer Statistics (GLOBOCAN) foresees that in 2040, the number of people affected by some type of cancer will reach 28.4 million cases worldwide 1 with about 12.0 million cancer deaths glycosylation patterns, thus making glycans qualitative biomarkers of health and disease [22].
Lectins (proteins of non-immune origin that specifically recognize carbohydrates) are capable of specifically binding to residues present in certain polypeptide segments [25]. Affinity chromatography with lectins immobilized on inert supports, combined with protein identification by mass spectrometry (MS), is a consistent, reproducible, and reliable method and is considered an excellent tool for comparative serum studies [26,27]. Its main advantages for separating proteins with different glycosylation patterns are its simplicity and low cost [28]. Frutalin (FTL) is a lectin obtained from the seeds of Artocarpus incisa that specifically binds α-D-galactose [29]. FTL has been successfully used in immunobiology research for the recognition of cancer-associated oligosaccharides owing to its anomeric specificity for α-galactosidic residues [17,18,29-33].
The aim of this report was to detect a panel of proteins that could explain the highly variable clinical course of MDS based on a comparison of proteomes from healthy people with those of patients with MDS. This comparison was made using results from the proteomic analysis of serum samples from the bone marrow of patients with MDS (cases with ring sideroblasts and cases with excess blasts). In addition, affinity chromatography of undepleted samples was performed on FTL with the aim of detecting which proteins are prone to α-glycosylation, and observable differences were found.

Results and Discussion
For identification and comparison of depleted fractions of the MDS-RS (n = 10) and MDS-EB (n = 10) pools compared with the control group (n = 10), 46,221 peptides were identified, with 42,443 peptides having a mass error < 10 ppm. Thirty-one and 33 proteins were identified for MDS-RS and MDS-EB, respectively (Figures A1 and A2), when compared with the control. Proteins that showed a significant increase in expression in one group when compared to the other were described as overexpressed (ratio ≥ 1.5), proteins with a significant decrease in expression were described as underexpressed (ratio ≤ 0.5), and all other proteins (0.5 < ratio < 1.5) were described as having a non-significant expression change. Some proteins overexpressed in MDS-EB were also overexpressed in MDS-RS and are summarized in Table 1. The overexpressed proteins could be grouped into coagulation proteins [34] (vitronectin, fibrinogen, pregnancy zone protein, and kininogen [VTN, FGA, PZP, and KNG1, respectively]), proteins associated with defense functions [35] (immunoglobulin lambda chain and complement factor C4b [IGLL1 and C4b, respectively]), and hemopexin (HPX). The overexpression of these proteins does not seem to be a set of isolated biochemical events, as the proteins appear to be associated with one another (Figure 1). HPX has been suggested as an indicator of the status of the extracellular matrix and has been reported as a very important protein in healing processes [36]. Interactions between the domains of hemopexins form the basis for positive or negative interference with the formation of molecular complexes in the extracellular matrix and can therefore be therapeutically exploited in cancer. In this work, the protein was overexpressed in both types of MDS, possibly playing a role in tumor growth, as it has the capacity to transform the entire molecular context of the extracellular matrix.
The mutual correlation of overexpression with the presence of the MDS makes this protein a potential tumor biomarker.
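The ratio cut-offs used above to label proteins can be expressed as a small helper. This is only a minimal sketch: the function name `classify_ratio` and the example ratios are hypothetical, and only the 1.5 and 0.5 cut-offs come from the text.

```python
def classify_ratio(ratio, over=1.5, under=0.5):
    """Label a protein by its case/control expression ratio.

    The default cut-offs follow the text: ratio >= 1.5 is overexpressed,
    ratio <= 0.5 is underexpressed, anything in between is non-significant.
    """
    if ratio <= 0:
        raise ValueError("an expression ratio must be positive")
    if ratio >= over:
        return "overexpressed"
    if ratio <= under:
        return "underexpressed"
    return "non-significant"


# Hypothetical ratios for illustration only (not data from this study).
example_ratios = {"protein_A": 2.3, "protein_B": 0.4, "protein_C": 1.1}
labels = {name: classify_ratio(r) for name, r in example_ratios.items()}
```

The cut-offs are keyword arguments so the same helper can be reused with a different threshold.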
Pro-coagulant proteins (VTN, FGA, PZP, KNG1) in MDS-RS [37] and -EB [38] play an essential role in hemostasis. Excessive expression of these proteins compared to the control group was found, which suggests that they can be used as biomarkers for the disease. Similar behavior was observed in children with acute lymphoblastic leukemia, in whom the most common associated symptom is venous thromboembolism [39,40]. In addition, Rau et al. [40] detailed that the decrease or dysfunction of some of the VTN, FGA, PZP, and KNG1 factors can trigger bleeding or thrombotic states associated with cancer.
VTN induces cell differentiation in endoderm cells. In human plasma, malignant differentiation of lung and breast cancer has been reported [41,42]. VTN mediates its effects predominantly through αVβ3 integrin and, as a result, causes a change in β-catenin localization [43]. Primary lung cancer has been associated with overexpression of β-catenin protein in tissues. It is possible that overexpression of VTN protein in MDS-RS and MDS-EB contributed to tumor differentiation processes and disease progression.
The role of the PZP protein as a biomarker in cancer has been discussed extensively in the literature [43]. Among reports by various investigators concerning PZP, a positive correlation between serum levels and breast tumors was found [44]. Furthermore, breast tumor tissue seems to synthesize PZP, and serial measurements of plasma PZP have been suggested to accompany the dissipation of non-cancerous micrometastatic breast tumors [45]. However, other reports have been unable to reproduce these results, which suggests that PZP is not useful as a tumor marker [46]. Also, recent reports on the preparation and immunological reactivity of PZP allow us to presume that many previous studies may have been influenced by non-specific antibodies and/or contaminated protein standards [47]. Nevertheless, PZP overexpression in patients with MDS-RS and MDS-EB suggests that it may be useful as a tumor marker in studies of medullary plasma.
Seraglia et al. [48] reported KNG1 as a potential new non-plasma marker for colorectal cancer. KNG1 is a multifunctional protein that plays an important pathophysiological role in many processes and has been described as a possible biomarker for fibrinolysis and oncogenesis [49,50]. In addition, Umemura and coworkers reported KNG1 protein as a possible tumor marker for colorectal cancer [51]. The current study found this protein to be overexpressed in the myelodysplastic syndromes studied and proposes it as an interesting protein that could be part of a protein panel used to study these syndromes.
Among the proteins overexpressed in MDS-RS and MDS-EB, immunoglobulins and a protein of the complement system (C4b) were identified. The activation mechanisms of the complement system are already known and have been described in various studies [52,53]. The complement system is an important component of the innate and adaptive immune system and is composed of more than 40 plasma and membrane proteins. Its most important functions include the recognition of pathogenic agents, the activation of leukocyte chemotaxis, and the induction of cell lysis by formation of the membrane attack complex (MAC). Due to the large number of genetic and epigenetic alterations associated with carcinogenesis, neoplastic transformation can increase the capacity of malignant cells to activate complement [54,55], a fact that can explain the overexpression of C4b and that agrees with other works reporting increased expression of complement proteins in lung cancer cells [56].
Traditionally, the only source of Igs was believed to be mature B lymphocytes, although some researchers have reported that Igs can also be detected in carcinoma cells, which are derived from epithelium. Recently, a study demonstrated that human cancer cell lines express mRNA of the IgG variable region [57]. Immature B lymphocytes generate diversity in the antibody repertoire through rearrangement of the variable chain of Ig. Some proteins that are important for the genetic recombination that generates variability in Igs, such as recombination-activating gene (RAG) proteins, have been found in cancer cells, a finding that suggests that these cells could hypothetically generate Ig. Qiu et al. [57] found gene expression of RAG1 and RAG2 in three epithelial cancer cell lines. Subsequent studies found that other types of Ig could also be produced by cancer cells; they reported that IgM was expressed in some tumor cell lines, although at a much lower level than IgG. In addition, Chen et al. [58] reported expression of the heavy chain of IgM and IgA in non-lymphoid lineage and neoplastic cells. We report overexpression of IGLL1 and expression of IgM in both MDS-RS and MDS-EB. These results place this protein as a very interesting biomarker candidate, given the previously demonstrated capacity of cancer cells to produce immunoglobulins [59][60][61][62][63].
Affinity chromatography columns were prepared using the lectins peanut agglutinin (PNA) and FTL, which were individually immobilized on Sepharose 4B. The FTL column was assembled using an elution solution containing an extract of the seeds of A. incisa. Two peaks were observed after sample elution (Figure A3). Peak (I) represents the fraction not retained on the column, while peak (II) represents the retained fraction. Samples were eluted using Buffers A and B.
Lectin proteins were used to cause a decrease in the number of observed peptides and to expand the detection capacity of less abundant proteins in plasma samples from patients with MDS-RS and MDS-EB.
PNA is a lectin that has been widely described as a tool for detecting the galactose β1-3-N-acetylgalactosamine (Tf) antigen [64]. However, no retained fraction was observed for the column with immobilized PNA. For the FTL Sepharose column, a retained fraction was observed. It was possible to identify 15 proteins for MDS-RS versus the control and 10 proteins for MDS-EB versus the control (Figure 2). The observed proteins were analyzed via mass spectrometry. The over-retained proteins, which presented a ratio > 1, are summarized in Table 2. A comparison of the chromatographic profiles shows that the retained fractions of the samples derived from MDS-RS and MDS-EB are greater than the one obtained for the control group. Notably, the alterations in the glycosylation profiles of MDS-derived proteomes produced a larger retained fraction, which mainly reflects an increase in α-galactosidic residues in glycoproteins. The results are in agreement with those from other studies using FTL that reported larger retained fractions for cancer samples than for control samples on affinity chromatography.
The retention capacity of FTL affinity chromatographic columns allows studies of undepleted samples with the aim of searching for protein interactions. This lectin specifically binds α-D-galactose [29]. FTL has been used successfully in immunobiology research for the recognition of cancer-associated oligosaccharides, taking into account its anomeric recognition specificity for α-galactosidic residues [17,18]. In oncogenesis, glycosylation is affected by alterations in glycosyl transfer such that residues that should commonly be associated with the proteins, such as β-galactosidic residues, appear as α residues [17,18]. The Galβ1-GlcNAc groups and Galβ1-GlcNAc repeats are further elongated by the processes of fucosylation, sialylation, galactosylation, glucuronylation, and/or sulfation. A GalNAcβ1 residue is added to some GlcNAc residues. Variations in the type of linkage from β to α, or exposure of galactose associated with the Tn antigen, are indicators of cell malignancy. Figure 3 shows the interaction between FTL and α-D-galactose. However, the molecular basis by which FTL promotes these specific activities remains poorly understood. The interaction between frutalin and α-D-galactose was previously characterized by X-ray crystallography [29]. The lectin undergoes a post-translational cleavage that produces α- and β-chains (133 and 20 amino acids, respectively). The carbohydrate binding site (CBS), which involves the N-terminus of the α chain, contains four key residues: (1) Gly25, (2) Tyr146, (3) Trp147, and (4) Asp149. Molecular dynamics simulations suggest that cleavage of the Thr-Ser-Ser-Asn (TSSN) peptide leads to a reduction in the stiffness of the FTL CBS, thus increasing the number of interactions with ligands and resulting in multiple binding sites and anomeric recognition of α-D-galactose.
Overexpression of IgM in MDS-EB and underexpression of IgM in MDS-RS were observed by mass spectrometry when compared to profiles of healthy people. However, a higher retention of IgM was demonstrated in the fraction retained on FTL chromatography for MDS samples compared to the control. The affinity of FTL on the chromatography column for α-glycoproteins was increased for the proteins expressed in MDS-RS and MDS-EB. This effect could be due to a conformational change that exposes the α-glycosylation profile of IgM.
IgM plays an important role in primary defense mechanisms [65]. Studies have reported that IgM is involved in the early recognition of external invaders, such as bacteria and viruses, and of cellular and self-modified debris, in addition to the recognition and elimination of precancerous and cancerous lesions [65]. Membrane-attached IgM is found in most normal B lymphocytes together with IgD [66]. Membrane-bound IgM induces the phosphorylation of CD79a and CD79b by the Src family of protein tyrosine kinases, which can induce cell death by apoptosis. IgM has also been found in soluble form and represents around 30% of the total Igs in plasma, in which it is found exclusively as a homopolymer [67]. Once the antigen binds to the B lymphocyte receptor, the secreted form is released in large amounts [68]. Tisch et al. described the recognition of a diversity of tumor antigens by IgM [69]; hence, an increase in IgM has been associated with different types of cancer.

Conclusions
From the 31 proteins overexpressed in MDS-RS and the 33 overexpressed in MDS-EB, a panel of seven proteins overexpressed in both syndromes (VTN, FGA, PZP, KNG1, IGLL1, C4b, and HPX) was generated, which could be used as biomarkers for MDS-RS and MDS-EB. The seven proteins presented strong associations with one another, which suggests that studies should be carried out to understand their mechanisms of expression and self-regulation. It was possible to fractionate the proteins using Sepharose chromatography columns modified with lectins; FTL, unlike PNA, retained a fraction containing α-galactosidic residues. Due to the direct relationship between Igs with altered expression and glycosylation processes in cancer, proteomic studies that use FTL columns make it possible to fractionate non-depleted samples, which makes FTL a powerful tool for cancer studies. Surprisingly, IgM was strongly retained by the FTL column, which we attribute to the α-galactosidic residues present in it. This finding suggests that the glycosylation of IgM during its synthesis could be modified by factors associated with MDS-RS and MDS-EB.

Patients and Samples
All samples were obtained by bone marrow aspiration after obtaining informed consent at diagnosis. All patients were diagnosed according to the WHO 2016 guidelines [70] and the Revised International Prognostic Scoring System (R-IPSS) [71]. A total of 30 marrow serum samples were obtained: 10 from patients diagnosed with MDS with ring sideroblasts (MDS-RS), 10 from patients with MDS with excess blasts (MDS-EB-1 and -2), and 10 from the control group. The study was approved by the Ethics Committee of the NPDM Federal University of Ceará (CAAE 69366217.6.0000.5054). All samples were stored at -80 °C until used, and concentrations were determined using the NanoVue Plus™ instrument (GE Healthcare, Uppsala, Sweden).

Protein management
The protocol of Lobo and collaborators was used with some modifications [17]. Three groups were formed with the different samples: (1) Group 1 - control (healthy marrow plasma donors), (2) Group 2 - samples from people with MDS-RS, and (3) Group 3 - samples from people with MDS-EB-1 and MDS-EB-2. Initially, all samples were centrifuged at 3000 × g after thawing; the resulting supernatant was saved, the pellet was discarded, and individual samples were frozen at -30 °C. Subsequently, pools were prepared, ensuring that the protein quantity contribution of each sample was the same (10.6 mg) for each of the pools, resulting in a pool of 1.6 mL (6.625 mg/mL). These pools were used to perform all assessments and ensured the same amount of protein for each procedure. Each sample, containing 50 μg of protein, was denatured with 0.2% RapiGest™ SF (Waters, Milford, USA), reduced (10 mM dithiothreitol), alkylated (10 mM iodoacetamide), and enzymatically digested with trypsin (Promega, Madison, WI, USA). At the end of this process, the samples were centrifuged, and the supernatant was transferred to a vial to which 5 μL of internal standard, alcohol dehydrogenase (ADH, 50 fmol, access code P00330 on Swiss-Prot), and 85 μL of 3% acetonitrile in 0.1% formic acid were added. The final concentrations of glycoproteins and ADH were estimated at 250 ng/μL and 25 fmol/μL, respectively, and the final volume in the microtube was 200 μL. Quantitative and qualitative experiments using nano ultra-performance liquid chromatography (nanoUPLC) and tandem nano electrospray ionization mass spectrometry (ESI/MSE) of the digested samples were performed using reverse phase chromatography of peptides with 3% to 40% (v/v) of acetonitrile containing 0.1% formic acid for 90 min. A flow rate of 600 nL/min was maintained for 100 min on a nanoACQUITY UPLC core system. A 1.7 μm, 100 μm × 10 cm nanoACQUITY C18 UPLC BEH reverse phase column was used in conjunction with the SCX 5 μm, 180 μm × 23 mm precolumn.
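The pool and digest concentrations quoted above follow from mass-over-volume arithmetic. The sketch below simply reproduces those two figures from the stated masses and volumes; the helper name `concentration` is hypothetical, and no value beyond those given in the text is assumed.

```python
def concentration(mass, volume):
    """Return mass per unit volume; units are whatever the caller supplies."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return mass / volume


# Pool: 10.6 mg of protein in 1.6 mL, as stated in the text.
pool_mg_per_ml = concentration(10.6, 1.6)  # approximately 6.625 mg/mL

# Digest: 50 ug of protein (50,000 ng) in a final volume of 200 uL.
digest_ng_per_ul = concentration(50 * 1000, 200)  # 250.0 ng/uL
```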

Protein Samples Depletion
The protocol of Lobo and collaborators was used [17]. Human serum albumin (HSA) and immunoglobulin G (IgG) were removed from marrow plasma by affinity chromatography performed on a HiTrap Albumin IgG depletion™ column (GE Healthcare) coupled to the ÄKTApurifier 10 automated fast protein liquid chromatography (FPLC) system (GE Healthcare). Initially, marrow plasma samples were filtered through a 0.22 μm membrane (Vertipure™ PVDF syringe filters, Veritical), after which the samples were manually and individually injected into the FPLC system. The matrix was equilibrated with 5 mL of buffer A (20 mM Tris-HCl pH 7.4 in 0.15 M NaCl) at a constant flow of 1 mL/min before the injection of 150 μL of sample, after which the matrix was washed with 8 mL of buffer A followed by 7 mL of buffer B (0.1 M Glycine-HCl pH 2.6 in 0.15 M NaCl) under the same conditions, with protein elution monitored at absorbances of 280 and 220 nm.

Mass Spectrometry
The fractions were digested (50 μg), and tryptic peptides were separated using a nanoACQUITY UPLC system (Waters) equipped with an HSS T3 C18 reverse-phase column (1.8 μm, 75 μm × 20 mm) for 110 min using a 0%-40% gradient for 90 min and a 40%-85% gradient for 5 min, after which the column was re-equilibrated for 15 min at 35 °C. The flow rate was 0.35 μL/min, and mobile phases A and B contained 0.1% formic acid in water and 0.1% formic acid in acetonitrile, respectively.
A data-independent analysis (MSE) of tryptic peptides was performed using a Synapt HDMS mass spectrometer (nanoESI-Qq-oaTOF; Waters, Manchester, UK). All measurements on the mass spectrometer were done in the "V" mode with a resolving power of at least 12,000. All analyses were performed using nanoESI (+). The collection channel of the analyzed sample was closed every 30 s for the passage of the reference ion, human [Glu1]-fibrinopeptide B (Glu-Fib; [M + 2H]2+ = 785.2486). The exact mass retention time nanoLC-MSE data were collected by alternating low (3 eV) and elevated ramped (12-40 eV) collision energies applied to the argon collision cell, using a scan time of 1.5 s with a 0.2-s interscan delay for each MS scan from m/z 50 to 2,000. The radio frequency (RF) offset (MS profile) was adjusted such that the LC/MS data were effectively acquired from m/z 300 to 2,000, which ensured that any mass observed in the LC/MSE data below m/z 300 arose from dissociations in the collision cell.
The selected databases were randomized during database queries and appended to the original database to assess the false-positive identification rates. The identified proteins were organized by ProteinLynx Global Server (PLGS) into a list that corresponded to a single protein for both conditions (study or control group), and a logarithmic ratio between the different groups was plotted on a scatter plot to visualize differences between the groups. Only proteins with presence and confidence greater than 99% (3x3 assays) were accepted in the database searches, and when the same protein was identified from different MS/MS ion fragmentations, those that presented the highest scores were considered for comparisons and data presentation.
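Appending a randomized copy of the database, as described above, allows the false-positive rate to be estimated from how often the randomized (decoy) entries are matched. The text does not state how PLGS computes this rate internally, so the sketch below uses a common estimator, decoy hits divided by target hits, purely as an assumption; the function name and the example flags are hypothetical.

```python
def estimate_false_positive_rate(is_decoy_flags):
    """Estimate the false-positive rate as decoy hits / target hits.

    `is_decoy_flags` holds one boolean per accepted identification:
    True if the match came from the randomized (decoy) database.
    """
    decoys = sum(1 for flag in is_decoy_flags if flag)
    targets = len(is_decoy_flags) - decoys
    if targets == 0:
        raise ValueError("no target identifications to normalize by")
    return decoys / targets


# Hypothetical example: 98 target hits and 2 decoy hits.
example_flags = [False] * 98 + [True] * 2
example_rate = estimate_false_positive_rate(example_flags)
```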
For searching spectra against the database, we used the default parameters of PLGS with a maximum of one missed trypsin cleavage, fixed carbamidomethyl modification, and variable oxidation modification [72,73]. The absolute quantification of each run was calculated according to the three most intense peptides (label-free Hi3 method) using ADH peptides as internal standards [73].
The average quantitative values of all samples were calculated, and the p value (p < 0.05) was calculated using ExpressionE software to assess the differences between biological replicates.
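The Hi3 idea mentioned above can be illustrated as follows: the mean intensity of a protein's three most intense peptides is compared with the same quantity for the internal standard, whose amount is known. This is a minimal sketch of that general principle, not the PLGS implementation; the function names and all intensity values are hypothetical.

```python
def top3_mean(intensities):
    """Mean of the three most intense peptide signals of one protein."""
    top = sorted(intensities, reverse=True)[:3]
    if len(top) < 3:
        raise ValueError("at least three peptide intensities are required")
    return sum(top) / 3


def hi3_amount_fmol(protein_intensities, standard_intensities, standard_fmol):
    """Estimate the protein amount from the top-3 response ratio to the standard."""
    return standard_fmol * top3_mean(protein_intensities) / top3_mean(standard_intensities)


# Hypothetical intensities (arbitrary units), for illustration only.
protein = [900, 600, 300, 100]    # top-3 mean = 600
standard = [400, 300, 200, 50]    # top-3 mean = 300
amount = hi3_amount_fmol(protein, standard, standard_fmol=50)  # 100.0 fmol
```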

Protein interaction network analysis
The protocol of Cavalcante and collaborators was used [18]. The proteins identified in this work as having differential expression were used for an interaction analysis. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) program version 9.05 was used, and the proteins were identified by their accession numbers in the Swiss-Prot protein database. The analysis parameters were Homo sapiens species and a significance level between 0.400 and 0.900 with the prediction methods enabled. Protein lists were then filtered to show only those proteins present in all three repeat injections of each sample, after which PLGS created an output table. This table presented the names and access codes in addition to the expression levels of the proteins based on the ratio of the proteins among the selected groups. Based on this ratio, proteins were classified as overexpressed (up-regulated; ratio ≥ 2), underexpressed (down-regulated; ratio ≤ 0.5), or without significant changes between groups (unchanged; 0.5 < ratio < 2).
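The replicate filter described above, which keeps only proteins present in all three repeat injections, amounts to a set intersection. The sketch below shows this under the assumption that each injection is available as a list of accession codes; the function name and the accession placeholders are hypothetical.

```python
def present_in_all_injections(injections):
    """Return the accession codes found in every replicate injection."""
    accession_sets = [set(injection) for injection in injections]
    if not accession_sets:
        return set()
    return set.intersection(*accession_sets)


# Hypothetical accession placeholders for three repeat injections.
injection_1 = ["ACC1", "ACC2", "ACC3"]
injection_2 = ["ACC2", "ACC3", "ACC4"]
injection_3 = ["ACC3", "ACC2", "ACC5"]
kept = sorted(present_in_all_injections([injection_1, injection_2, injection_3]))
```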

Affinity Chromatographic Column (Frutalin as stationary phase)
The marrow plasma samples were thawed, centrifuged at 12,000 × g for 15 min at 8 °C, and filtered through a 0.22 μm membrane (Vertipure™ PVDF syringe filters, Veritical) to prevent obstruction of the chromatography column. Affinity chromatography of the plasma samples (150 μL) was performed on a matrix with the isolated lectins (frutalin and PNA) coupled to an activated Sepharose 4B column as described in the manufacturer's protocol (Sigma). Peak I was eluted with 20 mM Tris in 150 mM NaCl, pH 7.4. Nonretained fractions with an absorbance ≥ 0.100 at wavelengths of 280 and/or 220 nm were pooled, and peak II elution was performed afterward with 20 mM Gly buffer, pH 2.6. The retained and eluted fractions with absorbances ≥ 0.050 at the aforementioned wavelengths were concentrated and dialyzed against 50 mM ammonium bicarbonate (Nanosep 3 kDa, 10,000 × g, 20 °C) and later stored at -30 °C for further proteomic analysis as indicated by Cavalcante et al. [18].

Frutalin extraction
The D-galactose-binding lectin frutalin was purified from the seeds of Artocarpus incisa. During the purification process, a chromatographic step was carried out on a D-Galactose/Agarose column (PIERCE) previously equilibrated with saline solution (150 mM NaCl). All of the proteins that did not interact with the matrix (non-retained fraction, Peak I) were eluted with the equilibration solution and monitored at an absorbance of 280 nm. The fraction that was retained and then eluted with 0.2 M D-galactose in 150 mM NaCl (retained fraction, Peak II) contained the frutalin that had interacted with the matrix. Because the galactosidic residue exposed on the matrix is fixed, elution with a D-galactose solution shifts the binding equilibrium so that the galactose-binding proteins bind the free galactose in solution.