Preprint
Article

This version is not peer-reviewed.

DR-Transformer: A Dual-Regularized Transformer Combining Sparse Attention and Supervised Contrastive Learning for Interpretable Stress Detection in Social Media

Submitted: 21 May 2026

Posted: 22 May 2026

Abstract
Automatic detection of stress from social-media text holds promise for digital mental health, but most existing Transformer-based approaches are opaque and computationally demanding. This work presents DR-Transformer, a Dual-Regularized Transformer that combines two complementary mechanisms: (i) a group sparsity penalty (L2,1/L2 elastic net) applied to the query and key projection matrices of every attention head, which encourages whole-row sparsity for token-level interpretability; and (ii) a supervised contrastive loss on the [CLS] projection, which organizes the latent space according to the stress label. The architecture is intentionally lightweight (6 layers, 8 heads, 256-dim embeddings; ∼9.5M parameters) and runs entirely on consumer-grade hardware (NVIDIA GTX 1660, 6 GB). Experiments on the publicly available Dreaddit dataset (binary stress classification, 2,838 train / 715 test segments) compare DR-Transformer against logistic regression, BiLSTM, a standard Transformer of identical architecture, and MentalBERT. Across five seeded runs, DR-Transformer (Full) reaches F1 = 0.876 (bootstrap 95% CI 0.852–0.898), outperforming the Standard Transformer (F1 = 0.842; McNemar p < 0.001 with Bonferroni correction) and performing comparably to the much larger MentalBERT (F1 = 0.879; p = 0.421). Sparse regularization increases the fraction of near-zero attention weights (below 0.01) from 0.215 to 0.682, yielding visibly focused attention maps, while the supervised contrastive loss improves the silhouette score of [CLS] embeddings from 0.312 to 0.483. Dual regularization thus combines accuracy, interpretability, and efficiency in a single model trainable without specialized infrastructure.
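The two regularizers described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the exact form of the elastic-net group penalty, the temperature, and the loss weights (`lam_group`, `lam_ridge`, `alpha`) are assumptions chosen for illustration, and all function names are hypothetical. The contrastive term follows the commonly used supervised contrastive formulation, in which each anchor is pulled toward the other samples in the batch that share its label.

```python
import math


def group_sparsity_penalty(W, lam_group=1.0, lam_ridge=0.0):
    """Assumed form: lam_group * sum_i ||W[i]||_2 + lam_ridge * ||W||_F^2."""
    row_norms = [math.sqrt(sum(w * w for w in row)) for row in W]
    l21 = sum(row_norms)                     # L2,1 norm: sum of row-wise L2 norms
    frob_sq = sum(n * n for n in row_norms)  # squared Frobenius norm
    return lam_group * l21 + lam_ridge * frob_sq


def _unit(v):
    """L2-normalize a vector (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors with >= 1 positive."""
    z = [_unit(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Scaled cosine similarity of the anchor to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n) if a != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total -= sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def dual_regularized_loss(ce_loss, qk_matrices, embeddings, labels,
                          lam_group=1e-4, lam_ridge=0.0, alpha=0.1):
    """Cross-entropy + group penalty on Q/K projections + weighted SupCon.

    The weights here are placeholders, not values reported in the paper.
    """
    penalty = sum(group_sparsity_penalty(W, lam_group, lam_ridge)
                  for W in qk_matrices)
    return ce_loss + penalty + alpha * supcon_loss(embeddings, labels)


def near_zero_fraction(attn, threshold=0.01):
    """Fraction of attention weights strictly below `threshold`."""
    flat = [a for row in attn for a in row]
    return sum(a < threshold for a in flat) / len(flat)
```

Because the group term sums unsquared row norms, it tends to drive entire rows of a projection matrix to zero rather than shrinking individual entries. This is the mechanism the abstract credits for the increased share of attention weights below 0.01, which `near_zero_fraction` measures on a row-stochastic attention map.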
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
