Preprint
Article

This version is not peer-reviewed.

RankBridge: Privacy-Preserving Rank-Based Explanation Clustering for Heterogeneous Federated Phishing Detection

Submitted: 07 May 2026

Posted: 08 May 2026

Abstract
Federated learning lets organizations train a shared model without pooling private data. The standard method, Federated Averaging, requires all participants to use the same input features. That condition fails in cross-sector phishing detection, where, for example, banks analyze URL structure and hospitals analyze email content. We present RankBridge, a system that groups participants by comparing ranked lists of SHapley Additive exPlanations (SHAP) feature importance rather than model weights or gradients. Each participant trains a local LightGBM model, extracts the top-K features by SHAP importance, and sends only 60 bytes of ranked feature indices to a central server. The server applies rank correlation and Ward’s hierarchical clustering to identify organizations facing similar threats, then combines models only within each discovered group. Across 32 participants in five organization types, RankBridge achieves F1 = 0.854 on synthetic data and F1 = 0.775 on real phishing data, whereas Federated Averaging collapses to F1 = 0.278 on the same data. RankBridge recovers the correct organizational groupings with Normalized Mutual Information (NMI) = 0.978, while each participant transmits roughly 10,000× less data per round than a full model upload.
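The exchange described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: that K = 30 and each index is a 16-bit unsigned integer (one encoding that happens to yield 60 bytes), that the rank correlation is Spearman's, and that a feature absent from a participant's list is assigned rank K + 1. The function names `pack_top_k`, `unpack_top_k`, `rank_correlation`, and `distance_matrix` are hypothetical.

```python
import struct
from math import sqrt


def pack_top_k(indices):
    """Serialize ranked feature indices as little-endian 16-bit unsigned ints.

    Assumption: 2 bytes per index, so 30 indices occupy 60 bytes.
    """
    return struct.pack(f"<{len(indices)}H", *indices)


def unpack_top_k(payload):
    """Recover the ranked index list from a packed payload."""
    return list(struct.unpack(f"<{len(payload) // 2}H", payload))


def rank_correlation(a, b):
    """Spearman-style correlation between two non-empty top-K lists.

    Ranks are compared over the union of both lists. A feature missing
    from a list gets rank len(list) + 1 (an assumption of this sketch).
    """
    features = sorted(set(a) | set(b))
    rank_a = {f: i + 1 for i, f in enumerate(a)}
    rank_b = {f: i + 1 for i, f in enumerate(b)}
    x = [rank_a.get(f, len(a) + 1) for f in features]
    y = [rank_b.get(f, len(b) + 1) for f in features]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    denom = sqrt(var_x * var_y)
    # Degenerate case (e.g. single-feature lists): treat as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def distance_matrix(rankings):
    """Pairwise distance 1 - rho, the kind of input a server could cluster."""
    return [[1.0 - rank_correlation(p, q) for q in rankings] for p in rankings]
```

A server receiving such payloads could unpack them, build the distance matrix, and pass it to a hierarchical clustering routine using Ward's criterion. By the abstract's own figures, a 60-byte payload that is roughly 10,000× smaller than a full model upload implies a model on the order of 600 KB.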
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2026 MDPI (Basel, Switzerland) unless otherwise stated