Instant messaging platforms such as Telegram enable rapid information exchange but also facilitate deceptive messaging at scale. In this study, we examine Telegram spam detection through a hierarchy of models that vary in linguistic modeling capacity, from interpretable lexical baselines (Logistic Regression, Random Forest, LightGBM) to sequential (GRU) and context-aware transformer representations (ALBERT). Using a harmonized preprocessing and evaluation pipeline on 20,348 labeled messages, we compare predictive performance across metrics (F1, ROC–AUC, PR–AUC, calibration) and assess pairwise differences via McNemar’s test with multiple-comparison correction. Across all metrics, ALBERT achieves the strongest performance and substantially improves spam-class detection relative to lexical models. This performance gap is consistent with the presence of a subset of deceptive messages whose signals are less concentrated in surface keywords and more distributed across context. However, improved performance may also reflect differences in model capacity and inductive bias, benefits from large-scale pretraining, and stronger handling of sparse patterns via contextual and subword representations. Accordingly, we interpret the proposed “complex tier” as an operational characterization of lexically subtle spam in this corpus, and we suggest that keyword-based moderation may be insufficient on its own to capture the full spectrum of deceptive messaging observed here.
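
To illustrate the pairwise comparison step, the following minimal sketch computes McNemar's test for two classifiers evaluated on the same messages and then adjusts a set of p-values for multiple comparisons. It is not the study's implementation. Two choices here are assumptions, since the text above does not specify them: the exact binomial form of McNemar's test, and the Holm step-down procedure as the correction. The function names `mcnemar_exact` and `holm_adjust` are hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts items only model A classifies
    correctly and c counts items only model B classifies correctly.
    Assumption: the exact binomial variant is used for illustration.
    """
    b = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a == y and p != y)
    c = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a != y and p == y)
    n = b + c
    if n == 0:
        # No discordant pairs: the two models are indistinguishable here.
        return b, c, 1.0
    k = min(b, c)
    # Under H0 each discordant pair favors either model with probability 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order.

    Assumption: Holm is one common correction; the source names none.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Toy data (fabricated for illustration only, not from the corpus).
y_true = [1] * 10
pred_a = [1] * 10            # model A is right on all ten items
pred_b = [0] * 8 + [1] * 2   # model B is right on the last two only
b, c, p = mcnemar_exact(y_true, pred_a, pred_b)
print(b, c, p)
```

In a setting with five models there would be ten pairwise tests, and the raw p-values from `mcnemar_exact` would be passed together to `holm_adjust` before any pair is declared significantly different.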