Submitted: 08 June 2026
Posted: 09 June 2026
Abstract

Keywords:
1. Introduction
2. Materials and Methods
2.1. Feature Selection for the Baseline Regression Model
2.2. Construction of the Baseline Regression Model
2.3. Generation of HIP Candidates
2.4. Construction of Local XGBoost Models
2.5. Construction of the Global Ranking Model for HIP Candidates
3. Results
3.1. Multi-Level Ranking Model for HIP Candidates
3.2. Baseline Ridge Model
3.3. XGBoost-Based Refinement of the Baseline Model and Construction of Local Models
3.4. Global Ranking of HIP Candidates
4. Discussion
5. Conclusions
6. Limitations
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AUC | Area under the receiver operating characteristic curve |
| CGRP | Calcitonin gene-related peptide |
| CHGA | Chromogranin A |
| ELISPOT | Enzyme-linked immunospot |
| FASTA | FAST-All |
| HIPs | Hybrid insulin peptides |
| HLA | Human leukocyte antigen |
| IAPP | Islet amyloid polypeptide |
| IFN-γ | Interferon gamma |
| INS | Insulin/proinsulin |
| INSA | Insulin A-chain |
| INSB | Insulin B-chain |
| INSC | Insulin C-peptide |
| MHC | Major histocompatibility complex |
| MHC-II | Major histocompatibility complex class II |
| NOD | Non-obese diabetic |
| NPY | Neuropeptide Y |
| OOF | Out-of-fold |
| ROC | Receiver operating characteristic |
| SCG1 | Secretogranin-1 |
| SCG2 | Secretogranin-2 |
| T1D | Type 1 diabetes |
| TCR | T-cell receptor |
| XGBoost | Extreme Gradient Boosting |

| № | Local model | Number of HIP candidates | Ridge rank: L26 | Ridge rank: A18 | Ridge rank: G15 | XGB rank: L26 | XGB rank: A18 | XGB rank: G15 | KnownHIPs | Top-k validation: positives in XGB top 5% | Top-k validation: negatives in XGB top 10% | Pearson (r) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | INS_CHGA | 247,050 | 13,798 | 54,937 | 52,124 | 2,300 | 14,617 | 54,992 | 27,139 | Yes | No | 0.635 |
| 2 | INS_IAPP | 45,018 | 2,189 | 6,836 | 6,580 | 393 | 1,159 | 18 | 3 | Yes | No | 0.242 |
| 3 | INS_NPY | 49,410 | 1,901 | 7,682 | 7,633 | 28 | 469 | 125 | 75 | Yes | No | 0.676 |
| 4 | INS_SCG1 | 367,830 | 21,561 | 73,124 | 72,521 | 2,600 | 14,333 | 21,026 | 4,096 | Yes | No | 0.930 |
| 5 | INS_SCG2 | 334,890 | 16,731 | 57,823 | 63,229 | 8,385 | 9,301 | 12,452 | — | Yes | No | 0.749 |
| 6 | INSC_INSA | 3,024 | 11 | 260 | 201 | 6 | 10 | 3 | 4 | Yes | No | 0.306 |
| 7 | INSC_INSB | 4,968 | 161 | 643 | 652 | 23 | 11 | 44 | 888 | Yes | No | 0.171 |
| 8 | INSC_INSC | 5,184 | 143 | 882 | 1,002 | 131 | 415 | 801 | 8 | Yes | No | 0.886 |

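
As a quick consistency check on the table above, the per-model candidate counts can be summed and compared with the global pool size of 1,057,374 quoted for the global ranking. The sketch below is only an illustrative check, not the authors' pipeline; the names `candidates_per_model` and `GLOBAL_POOL_SIZE` are assumptions introduced here.

```python
# Candidate counts per local model, copied from the table above.
# The variable names are illustrative and do not come from the paper.
candidates_per_model = {
    "INS_CHGA": 247_050,
    "INS_IAPP": 45_018,
    "INS_NPY": 49_410,
    "INS_SCG1": 367_830,
    "INS_SCG2": 334_890,
    "INSC_INSA": 3_024,
    "INSC_INSB": 4_968,
    "INSC_INSC": 5_184,
}

# Pool size quoted in the header of the global ranking table.
GLOBAL_POOL_SIZE = 1_057_374

total = sum(candidates_per_model.values())
assert total == GLOBAL_POOL_SIZE, f"expected {GLOBAL_POOL_SIZE}, got {total}"

# Share of the global pool contributed by each local model, largest first.
shares = {name: count / total for name, count in candidates_per_model.items()}
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<10} {share:6.1%}")
```

The check passes with the counts as listed, so the eight local pools together account for the whole global pool. The INS-derived hybrids with granule proteins dominate it, and the three INSC-only models contribute well under 2% of candidates.
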
| № | HIP sequence | Global model position (of 1,057,374) | Local XGB model (position / candidates) | Final score | Length | Left source | Right source |
|---|---|---|---|---|---|---|---|
| 1 | SHLVEALYLFRARAYGFR | 36 | 1,828 / 247,050 | 0.894 | 18 | INSB | CHGA |
| 2 | SHLVEALYLERAHQQKKH | 54 | 4,543 / 247,050 | 0.893 | 18 | INSB | CHGA |
| 3 | VCGERGFFYERAHQQKKH | 61 | 2,368 / 247,050 | 0.892 | 18 | INSB | CHGA |
| 4 | AGSLQPLALRAYGFRGPG | 70 | 4,462 / 247,050 | 0.891 | 18 | INSC | CHGA |
| 5 | SHLVEALYLARAYGFRGP | 80 | 1,793 / 247,050 | 0.891 | 18 | INSB | CHGA |
| 6 | SHLVEALYLHQVEKRKCN | 16 | 467 / 45,018 | 0.898 | 18 | INSB | IAPP |
| 7 | VCGERGFFYHQVEKRKCN | 22 | 77 / 45,018 | 0.898 | 18 | INSB | IAPP |
| 8 | AGSLQPLALHQVEKRKCN | 32 | 381 / 45,018 | 0.895 | 18 | INSC | IAPP |
| 9 | CGERGFFYTQVEKRKCNT | 120 | 594 / 45,018 | 0.889 | 18 | INSB | IAPP |
| 10 | AGSLQPLALNTYGKRNAV | 168 | 84 / 45,018 | 0.887 | 18 | INSC | IAPP |
| 11 | AGSLQPLALQRYGKRSSP | 1 | 395 / 49,410 | 0.905 | 18 | INSC | NPY |
| 12 | SHLVEALYLQRYGKRSSP | 2 | 230 / 49,410 | 0.903 | 18 | INSB | NPY |
| 13 | QCCTSICSLQRYGKRSSP | 4 | 1,191 / 49,410 | 0.902 | 18 | INSA | NPY |
| 14 | AGSLQPLALRYYSALRHY | 5 | 267 / 49,410 | 0.902 | 18 | INSC | NPY |
| 15 | TSICSLYQLITRQRYGKR | 6 | 576 / 49,410 | 0.901 | 18 | INSA | NPY |
| 16 | CGERGFFYTRQVLKTSRK | 8 | 4,898 / 367,830 | 0.901 | 18 | INSB | SCG1 |
| 17 | CTSICSLYQWKSSHFERR | 11 | 3,596 / 367,830 | 0.900 | 18 | INSA | SCG1 |
| 18 | AGSLQPLALVDKRRTRPR | 18 | 2,165 / 367,830 | 0.898 | 18 | INSC | SCG1 |
| 19 | VCGERGFFYVDKRRTRPR | 23 | 1,815 / 367,830 | 0.897 | 18 | INSB | SCG1 |
| 20 | CGERGFFYTVDKRRTRPR | 24 | 7,457 / 367,830 | 0.897 | 18 | INSB | SCG1 |
| 21 | TSICSLYQLQQWPERKLK | 3 | 812 / 334,890 | 0.903 | 18 | INSA | SCG2 |
| 22 | LCGSHLVEAQQWPERKLK | 14 | 6,574 / 334,890 | 0.900 | 18 | INSB | SCG2 |
| 23 | CTSICSLYQQQWPERKLK | 25 | 1,259 / 334,890 | 0.897 | 18 | INSA | SCG2 |
| 24 | ERGFFYTPKPERKLKHMQ | 26 | 3,348 / 334,890 | 0.897 | 18 | INSB | SCG2 |
| 25 | RGFFYTPKTTQQWPERKL | 37 | 2,475 / 334,890 | 0.894 | 18 | INSB | SCG2 |
| 26 | AGSLQPLALYQLENYCN | 2,179 | 15 / 3,024 | 0.869 | 17 | INSC | INSA |
| 27 | GSLQPLALLYQLENYCN | 3,561 | 18 / 3,024 | 0.864 | 17 | INSC | INSA |
| 28 | SLQPLALYQLENYC | 4,779 | 31 / 3,024 | 0.860 | 14 | INSC | INSA |
| 29 | GSLQPLALYQLENYC | 4,879 | 40 / 3,024 | 0.860 | 15 | INSC | INSA |
| 30 | GSLQPLALYQLENYCN | 4,979 | 36 / 3,024 | 0.860 | 16 | INSC | INSA |
| 31 | AGSLQPLALQHLCGSHL | 1,611 | 64 / 4,968 | 0.872 | 17 | INSC | INSB |
| 32 | GSLQPLALQHLCGSHLV | 1,627 | 67 / 4,968 | 0.872 | 17 | INSC | INSB |
| 33 | GPGAGSLQERGFFYTPK | 1,718 | 82 / 4,968 | 0.871 | 17 | INSC | INSB |
| 34 | GSLQPLALRGFFYTPKT | 2,178 | 50 / 4,968 | 0.869 | 17 | INSC | INSB |
| 35 | AGSLQPLALQHLCGSHLV | 2,183 | 61 / 4,968 | 0.869 | 18 | INSC | INSB |
| 36 | AGSLQPLALEDLQVGQV | 15,975 | 52 / 5,184 | 0.842 | 17 | INSC | INSC |
| 37 | AGSLQPLALEAEDLQVG | 17,728 | 54 / 5,184 | 0.839 | 17 | INSC | INSC |
| 38 | AGSLQPLALEAEDLQVGQ | 19,772 | 63 / 5,184 | 0.837 | 18 | INSC | INSC |
| 39 | GSLQPLALEDLQVGQV | 21,493 | 78 / 5,184 | 0.835 | 16 | INSC | INSC |
| 40 | AGSLQPLALEDLQVGQVE | 21,837 | 57 / 5,184 | 0.834 | 18 | INSC | INSC |
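
Because the local pools differ in size by two orders of magnitude, the raw local positions in the table above are easier to compare once each is divided by its pool size. The sketch below parses the "position / candidates" cells into such a fraction. It is an illustrative helper under that assumption, not part of the described method, and the function name `local_rank_fraction` is hypothetical.

```python
def local_rank_fraction(cell: str) -> float:
    """Convert a 'position / candidates' cell into position / candidates.

    Thousands separators (commas) are accepted, as used in the table.
    """
    position_text, candidates_text = cell.split("/")
    position = int(position_text.strip().replace(",", ""))
    candidates = int(candidates_text.strip().replace(",", ""))
    if not 1 <= position <= candidates:
        raise ValueError(f"position must lie in 1..{candidates}, got {position}")
    return position / candidates


# A few cells copied from the table above (sequence -> local rank cell).
examples = {
    "AGSLQPLALQRYGKRSSP": "395 / 49,410",
    "VCGERGFFYHQVEKRKCN": "77 / 45,018",
    "AGSLQPLALYQLENYCN": "15 / 3,024",
}

for sequence, cell in examples.items():
    fraction = local_rank_fraction(cell)
    print(f"{sequence:<20} {cell:>16}  top {fraction:.2%} of local pool")
```

A smaller fraction means a candidate sits closer to the top of its own local model, independently of how many candidates that model scored.
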

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).