Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Edeepsadpr: An Extensive Deep-Learning Architecture for Prediction of the in Situ Crosstalks of Serine Phosphorylation and ADP-Ribosylation

Version 1 : Received: 3 January 2023 / Approved: 4 January 2023 / Online: 4 January 2023 (02:26:50 CET)

How to cite: Jiang, H.; Shang, S.; Sha, Y.; Li, L. Edeepsadpr: An Extensive Deep-Learning Architecture for Prediction of the in Situ Crosstalks of Serine Phosphorylation and ADP-Ribosylation. Preprints 2023, 2023010040. https://doi.org/10.20944/preprints202301.0040.v1

Abstract

Protein phosphorylation and ADP-ribosylation (ADPr) are two types of post-translational modification (PTM) that add phosphate groups and ADP-ribose moieties to proteins, respectively. Although both PTM types can occur on many amino acid types, serine is the most common target. Serine phosphorylation (pS), serine ADPr (SADPr), and their in situ crosstalks (pSADPr) play essential roles in biological processes. Although in silico classifiers have been developed for predicting pS and SADPr sites, no classifier is available for predicting pSADPr sites. In this study, we developed classifiers to predict pSADPr sites. Specifically, we collected 3250 human pSADPr, 7520 SADPr, 151,227 pS and 80,096 unmodified serine sites. Based on them, we investigated the characteristics of pSADPr sites and constructed three classifiers to predict pSADPr sites from the pS dataset, the SADPr dataset and the protein sequences, respectively. We built five deep-learning classifiers and evaluated them using ten-fold cross-validation and independent test datasets. Three of them (including a Convolutional Neural Network with One-Hot encoding, dubbed CNNOH) performed better than the other two. For instance, CNNOH had AUC values of 0.700, 0.914 and 0.954 for distinguishing pSADPr sites from SADPr, pS and unmodified serine sites, respectively. Distinguishing pSADPr sites from SADPr sites is therefore more challenging than the other two tasks, which is consistent with our observation that the characteristics of pSADPr sites are more similar to those of SADPr sites than to those of the other site types. Furthermore, we used these classifiers as base classifiers to develop several stacking-based ensemble classifiers in an attempt to improve performance. However, none of the ensemble classifiers performed better, suggesting that the base classifiers already perform well. Finally, we developed an online tool, dubbed EdeepSADPr, for extensively predicting human pSADPr sites based on the CNNOH classifier. It is freely available at http://edeepsadpr.bioinfogo.org/.
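
As an illustration of the One-Hot encoding mentioned in the abstract, the following minimal Python sketch turns a sequence window centered on a serine into a binary matrix of the kind a convolutional network could take as input. The abstract does not state the window length, the padding symbol, or how non-standard residues are handled, so the `flank` value, the `-` pad character, the mapping of unknown residues to the pad column, and the function names `extract_window` and `one_hot` are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch: one-hot encoding of a serine-centered sequence window.
# The window size and padding scheme are assumptions, not taken from the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # assumed symbol for positions beyond the protein termini


def extract_window(sequence, site_index, flank=3):
    """Return the (2 * flank + 1)-residue window centered on a serine."""
    if sequence[site_index] != "S":
        raise ValueError("site_index must point at a serine residue")
    left = sequence[max(0, site_index - flank):site_index]
    right = sequence[site_index + 1:site_index + 1 + flank]
    # Pad both sides so every window has the same length.
    left = PAD * (flank - len(left)) + left
    right = right + PAD * (flank - len(right))
    return left + "S" + right


def one_hot(window):
    """Encode a window as a list of rows, one 21-column row per position."""
    alphabet = AMINO_ACIDS + PAD
    column = {residue: i for i, residue in enumerate(alphabet)}
    matrix = []
    for residue in window:
        row = [0] * len(alphabet)
        # Residues outside the alphabet (e.g. 'X') fall into the pad column.
        row[column.get(residue, column[PAD])] = 1
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    window = extract_window("MKSAT", site_index=2, flank=3)
    encoded = one_hot(window)
    print(window, len(encoded), len(encoded[0]))
```

Each row contains a single 1, so a window of length L becomes an L × 21 matrix under these assumptions.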

Keywords

ADP-ribosylation; proteomics; post-translational modifications; deep-learning; stacking-based ensemble learning; protein network

Subject

Biology and Life Sciences, Biology and Biotechnology
