A Systemic Immune-State Axis Distinguishes Psoriatic Arthritis from Psoriasis

Yoon Kyeong Lee; Hyun-A Seong

doi:10.20944/preprints202604.1871.v1

Submitted:

26 April 2026

Posted:

27 April 2026

You are already at the latest version

Abstract

Psoriasis and psoriatic arthritis (PsA) are systemic immune-mediated diseases, but the features that distinguish cutaneous-dominant psoriasis from musculoskeletal involvement remain unclear. We analyzed four core public cross-sectional datasets spanning whole-blood methylation, PBMC single-cell RNA sequencing summarized at the subject level, skin RNA sequencing, and purified CD4+ T-cell methylation, and used two additional public skin cohorts for external contextual checks to define an inflammatory disease axis (DIR) and a contrast-resolved systemic-state coordinate (CRS) representing additional systemic immune-state variation associated with PsA. In whole-blood methylation, DIR primarily separated healthy controls from psoriasis, whereas CRS separated psoriasis from PsA with minimal correlation to DIR. In PBMC single-cell data, CRS was higher in PsA and in the source-defined PSX subgroup (joint pain without CASPAR-classified PsA) than in PsO. Cell-type-resolved analyses localized CRS-related shifts to CD8 naive T cells, NK cells, CD14 monocytes, and regulatory T cells and identified multicompartment pathway-state remodeling along the CRS continuum. In contrast, skin RNA sequencing mainly captured lesional inflammatory burden and showed only limited additional PsA-related separation within the same tissue state. These findings support a model in which PsA is distinguished from psoriasis by an additional systemic immune-state axis rather than by skin inflammatory burden alone.

Keywords:

psoriasis

;

psoriatic arthritis

;

systemic inflammation

;

immune-state axis

;

DNA methylation

;

single-cell RNA sequencing

;

skin transcriptomics

;

disease axis

Subject:

Biology and Life Sciences - Immunology and Microbiology

1. Introduction

Psoriasis and psoriatic arthritis (PsA) are chronic immune-mediated disorders that are increasingly recognized as components of a broader systemic inflammatory spectrum rather than isolated organ-specific diseases [1,2,3]. Although psoriasis and PsA share substantial immunologic and inflammatory features, only a subset of patients with psoriasis develop musculoskeletal involvement [3,4]. The key unresolved question is therefore not whether psoriasis is systemic, but which additional systemic features distinguish psoriasis from PsA.

Previous studies have identified immune, metabolic, epigenetic, and tissue-level abnormalities in both psoriasis and PsA; however, many analyses have remained centered on a single tissue, a dominant molecular layer, or a limited set of pairwise contrasts, even when multimodal data were available [5,6,7,8]. These approaches are useful for identifying differentially expressed genes, methylation changes, enriched pathways, and cohort-specific classifiers, but they are less well suited to separating biological programs that are shared across psoriatic disease from those that are specifically associated with PsA. Recent integrative frameworks in psoriasis have also illustrated the value of cross-layer public-data synthesis, but they have not been designed to distinguish biology shared with PsA from the systemic immune-state shift associated with arthritis development [9]. In practice, skin inflammatory activity, circulating immune changes, and cell-composition effects are often interpreted together, which makes it difficult to determine whether PsA represents a quantitatively stronger form of psoriasis or a qualitatively distinct systemic immune state.

Accordingly, the present study was designed less as a search for another single-layer biomarker and more as a cross-dataset coordinate framework. Relative to earlier analyses emphasizing cell-type markers, machine-learning classification, or purified-cell methylation differences within one cohort [5,7], the specific aim here was to place blood- and skin-derived signals into a common biological structure that separates shared psoriatic inflammatory burden from the additional systemic state associated with PsA.

A framework that separates shared inflammatory disease activity from the additional biology associated with PsA would be biologically informative and clinically relevant [10,11,12]. If psoriasis and PsA share a common inflammatory foundation but diverge at the level of systemic immune organization, then disease progression should not be modeled as a single severity continuum [10,13]. Instead, it may be better represented by at least two partially distinct coordinates: one axis reflecting the shared inflammatory component of psoriatic disease and a second axis reflecting positional shift toward a PsA-associated systemic state.

To test this concept, we integrated four core public datasets spanning whole-blood DNA methylation, PBMC single-cell RNA sequencing summarized at the subject level, skin RNA sequencing, and purified CD4⁺ T-cell methylation, and used two additional public skin cohorts for external contextual checks. Using these complementary data layers, we defined two disease coordinates: an inflammatory disease axis (DIR) and a contrast-resolved systemic-state coordinate (CRS) representing the additional systemic component associated with PsA. We then examined whether these coordinates could distinguish shared psoriatic inflammatory burden from the additional systemic biology associated with PsA. Our results support a model in which PsA is distinguished from psoriasis by an additional systemic immune-state axis rather than by increased skin inflammatory burden alone.

2. Materials and Methods

2.1. Study Design and Public Datasets

This study integrated four core publicly available transcriptomic and epigenomic datasets representing complementary biological layers of psoriatic disease (Supplementary Table S1). GSE200376, a peripheral blood DNA methylation cohort containing healthy controls (HC), psoriasis without arthritis (PsO), and psoriatic arthritis (PsA), was used as the primary systemic discovery dataset to define the disease coordinates [14,15]. GSE194315, a subject-level peripheral blood mononuclear cell (PBMC) single-cell RNA sequencing dataset with surface epitope profiling, was used as an independent systemic transcriptomic comparison cohort for subject-level positioning, intermediate-state analysis in PSX, early-mover compartment analysis, and pathway-state remodeling [5,16]. GSE186063, a skin RNA sequencing dataset with lesional and non-lesional biopsies plus healthy-appearing skin from ankylosing spondylitis comparators, was used as the cutaneous anchoring cohort to determine whether the psoriasis-versus-PsA distinction could be explained by skin inflammatory burden [6,17]. GSE236694, a small exploratory purified circulating CD4⁺ T-cell DNA methylation cohort, was included only to examine whether disease-axis structure might also be detectable in a more cell-enriched methylation context [7,18]. Two additional public skin cohorts, GSE205748 and GSE202011, were used later as external contextual checks for the skin-layer interpretation [19,20,21,22].

All analyses were performed on publicly available, de-identified data. No new human participants were recruited for this study. Additional notes on how DIR and CRS were operationalized and interpreted across modalities are summarized in Appendix Table A1.

2.2. Metadata Harmonization and Cohort Definitions

Sample- and subject-level metadata were harmonized across datasets using the integrated project metadata tables and dataset-specific design tables generated during preprocessing. Disease groups were standardized to HC, PsO, PsA, and, where available, PSX. In GSE194315, the source study defined PSX as the subgroup of psoriasis subjects reporting joint pain but not meeting CASPAR classification criteria for PsA [5,23]. We therefore treated PSX here as a dataset-defined arthralgia subgroup rather than as a newly assigned clinical label. Because the public source records did not adjudicate alternative causes of joint pain, the PSX label in the current study should not be interpreted as proof of prodromal PsA. For GSE186063, skin state was retained separately as healthy-appearing ankylosing spondylitis comparator skin (AS-H), non-lesional (N), or lesional (L), and the analysis unit was defined as AS-H, PsO-N, PsO-L, PsA-N, or PsA-L. Accordingly, AS-H in this dataset denotes a within-dataset non-psoriatic skin comparator rather than skin from unaffected healthy volunteers. For GSE194315, subject rather than individual cell was treated as the primary statistical unit. Subject-level metadata were linked to cell-level metadata by subject identifier, and treated subjects were excluded from the primary analysis set. A prespecified threshold of 100 retained cells per subject was used as a basic quality-control safeguard to avoid unstable subject-level cell-type proportion estimates and pseudobulk analyses in very low-coverage subjects. No subject in the final primary cohort fell below this threshold. A disease-stage factor was prespecified in the fixed order HC → PsO → PSX → PsA. For ordered trend analyses, this stage order was encoded as an ordinal numeric variable (HC = 0, PsO = 1, PSX = 2, and PsA = 3) to test monotonic associations of cell-type proportions and signature scores with disease-stage rank. This coding was used only to represent rank order and was not intended to imply equal biological spacing or observed longitudinal progression between groups.

Across the four core datasets, group standardization and subject-level alignment were applied during metadata harmonization, and harmonized age- and sex-adjusted score-level consistency checks were performed. Dataset characteristics are summarized in Supplementary Table S1, and harmonized age- and sex-adjusted score-level consistency checks are summarized in Supplementary Table S2.

2.3. Preprocessing of DNA Methylation Datasets

Raw Illumina methylation IDAT files were processed in R using minfi (v1.56.0) [24]. For both GSE200376 and GSE236694, raw Illumina methylation IDAT basenames were matched to dataset-specific target tables keyed to GEO sample identifiers (GSM) to define the analysis cohorts. Detection p values were computed with minfi::detectionP, and per-sample quality-control metrics, including call rate and median methylated and unmethylated signal intensities, were recorded. Background correction and normalization were performed using preprocessNoob [25]. Beta values and M values were extracted with getBeta and getM, respectively.

For the primary GSE200376 differential methylation analysis, probes were retained only if detection

p \leq 0.01

in at least 95% of samples, after which non-CpG probes, sex-chromosome probes, and published cross-reactive and SNP-overlapping probes were excluded using the EPIC blacklist files corresponding to Supplementary Tables S1, S4, S5, and S6 of the original resource [26].

For GSE236694, the primary differential methylation analysis likewise retained probes with detection

p \leq 0.01

in at least 95% of samples and then restricted the analysis to autosomal probes. Sample-level quality-control metrics were reviewed for both methylation cohorts, and this review did not identify any sample requiring additional exclusion in the primary differential methylation analyses. Filtered beta matrices, M-value matrices, detection p matrices, and per-sample quality-control tables were retained as downstream analysis inputs.

Because GSE200376 served as the primary whole-blood discovery methylation cohort, a stringent blacklist-based probe filter was applied, whereas GSE236694 was treated as a small exploratory purified CD4+ T-cell cohort and therefore used a minimal probe filter limited to detection-passing autosomal probes in order to preserve analyzable probe coverage.

2.4. Construction of the GSE200376 Discovery Coordinate System

GSE200376 was used to construct the primary discovery coordinate system. Two axes were defined. The inflammatory disease axis (DIR) was anchored to the PsO-versus-HC contrast and was intended to capture the shared inflammatory component of psoriatic disease. The contrast-resolved systemic-state coordinate (CRS) was anchored to the PsA-versus-PsO contrast and was intended to capture the additional systemic state associated with PsA relative to PsO.

Differentially methylated probes (DMPs) were identified from M values using limma (v3.66.0), and differentially methylated regions (DMRs) were identified using DMRcate (v3.6.0) [27,28]. For pairwise comparisons, the primary total-signal model was specified as

\sim 0 + group + Age + Sex + sentrix_id,

where group denotes the harmonized disease-group factor (HC, PsO, or PsA) and sentrix_id denotes the Illumina BeadChip/Sentrix slide identifier derived from the IDAT basename. A cell-adjusted sensitivity model additionally included the estimated whole-blood cell fractions CD8T, CD4T, NK, Bcell, and Mono. Ordered trend analyses across HC, PsO, and PsA used a numeric trend term coded as HC

= 0

, PsO

= 1

, and PsA

= 2

. This coding was used only to represent ordered disease-stage rank for trend testing and was not intended to imply equal biological spacing or observed longitudinal progression. The total-signal trend model was specified as

\sim trend + Age + Sex + sentrix_id

and a cell-adjusted sensitivity trend model additionally included the same whole-blood cell-fraction covariates. Statistical testing was performed on M values, and strict DMPs were defined as probes with Benjamini–Hochberg false discovery rate (FDR)

< 0.05

and

| Δ β | \geq 0.02

. DMRs were called using a strict seed threshold of FDR

< 0.05

in the reported primary run.

Whole-blood cell fractions were estimated with the FlowSorted.Blood.EPIC reference-based deconvolution workflow using estimateCellCounts2 with the preprocessNoob processing method and IDOL probe selection [29]. Sample identifiers were harmonized to GSM IDs before merging estimated cell fractions with the design table, and cell-fraction row sums were checked and renormalized when minor numerical drift from 1.0 was detected.

Final GSE200376 DIR and CRS scores were computed from CpG signatures derived from the internal five-fold resampling procedure used for score construction. Within each training fold, CpGs were selected de novo from the relevant limma contrast, and probes selected in at least three valid folds, defined here as folds in which the training contrast was estimable and a fold-specific score model could be fit, were retained for final scoring; this three-fold rule was used as a heuristic stability filter rather than as a formal optimality criterion. This yielded 25 CpGs for DIR and 45 CpGs for CRS in the final scoring set, reflecting axis-specific fold stability rather than a matched target feature count. For each sample j, the raw score was computed as

{Score}_{j}^{raw} = \sum_{i} w_{i} z (M_{i j}),

where

w_{i}

is the limma moderated t statistic for CpG i from the relevant discovery contrast, and

z (M_{i j})

denotes CpG-wise standardization of the sample M value using the mean and standard deviation of that CpG within the relevant reference set. For each axis, the M value of each retained CpG was first standardized using the mean and standard deviation of that CpG within the relevant reference groups (DIR: HC+PsO; CRS: PsO+PsA). These standardized CpG values were then combined as a weighted sum to obtain the raw score. The raw score was subsequently centered and scaled using the mean and standard deviation of the raw-score distribution within the same reference set, and the resulting reference-centered standardized scores were saved as the final DIR and CRS values used for downstream positioning and visualization. These axis-specific reference sets were chosen to preserve contrast-specific interpretability rather than to place DIR and CRS on a single shared assay-level scale.

2.5. Internal Stability and Calibration Assessment in GSE200376

The GSE200376 discovery framework was internally evaluated using leakage-aware nested resampling, but this procedure was used only to characterize the stability of the coordinate system and the discrimination and calibration of the derived score-based prediction models; it was not treated as external validation. Five-fold outer cross-validation was performed. Fold assignment was grouped by sentrix_id whenever feasible to reduce batch leakage; when grouped splitting was infeasible, a stratified fallback preserving class balance was used.

Within each training fold, feature selection was performed de novo using limma, and the weighted score was reconstructed using training data only. Fold-specific score standardization was based exclusively on the training mean and standard deviation. For each axis, prediction models were then fitted for the corresponding discovery contrast (DIR: HC versus PsO; CRS: PsO versus PsA) using ridge logistic regression (glmnet, v4.1-10), with a generalized linear model fallback implemented as a safeguard when penalized fitting was infeasible in a given fold [30]. The ridge penalty parameter was selected within each training fold by inner cross-validation using the 1-SE rule, defined as the largest penalty yielding cross-validated error within one standard error of the minimum [31]. The full prediction model included the fold-specific standardized score, age, sex, and the first two principal components of the cell-fraction matrix.

The area under the receiver operating characteristic curve (AUC), Brier score, calibration intercept, and calibration slope were recorded together with five-bin calibration tables and fold-level diagnostics.

2.6. Subject-Level and Cell-Type-Level DIR/CRS Scoring in GSE194315

GSE194315 was analyzed at the subject level. Subject-level cell-type proportions were first calculated from the filtered cell metadata. Pseudobulk RNA count matrices were then generated for each subject and annotated cell type by summing pre-normalization single-cell counts within each subject–cell type combination. Subject–cell type combinations represented by fewer than 30 retained cells were excluded from pseudobulk generation. Per-cell-type pseudobulk matrices and matching subject-level metadata tables were then constructed for downstream scoring and pathway analysis.

Cell-type-specific DIR and CRS signatures were derived independently within GSE194315 from pseudobulk contrasts using edgeR (v4.8.2), and signature scores were subsequently computed by single-sample gene set enrichment analysis using GSVA (v2.4.4) [32,33]. For each cell type, a signed ranking statistic was defined as

signed_stat = sign (log F C) \sqrt{F},

where F is the quasi-likelihood test statistic from glmQLFTest [34]. This quantity was used only to rank genes for signature construction and was not interpreted as a formal inferential effect-size estimate.

DIR signatures were learned using only HC and PsO subjects, whereas CRS signatures were learned using only PsO and PsA subjects; PSX subjects were explicitly excluded from signature learning and were scored only after the signatures had been fixed. For each contrast and cell type, signature learning required at least five subjects in each contrast group. The top 100 genes with the most positive signed statistic and the top 100 genes with the most negative signed statistic were retained as the UP and DOWN components of the signature, respectively. Per-cell-type scoring was then performed only when at least five genes remained in both the UP and DOWN components after matching to the expression matrix.

Per-cell-type scores were computed by ssGSEA using GSVA, with kcdf = "Gaussian", absRanking = TRUE, and ssgsea.norm = TRUE. The final score was defined as

Score = ssGSEA (UP) - ssGSEA (DOWN) .

Subject-level overall DIR and CRS scores were then computed as weighted means across cell types, using weights proportional to

\sqrt{n_{cells}}

when subject–cell type cell counts were available and equal weighting otherwise. This

\sqrt{n_{cells}}

weighting scheme was used as a heuristic compromise between equal weighting and direct cell-count weighting.

2.7. Definition of the CRS Reference Band and Early-Mover Compartments

To formalize a reference interval along the systemic CRS continuum, CRS scores in GSE200376 were first normalized relative to the PsO distribution:

C R S_{norm} = \frac{C R S - μ_{PsO}}{σ_{PsO}} .

A logistic model for

P (PsA ∣ C R S_{norm})

was then fitted using the PsO and PsA samples only. The initial lower and upper band limits were defined as the normalized CRS values corresponding to predicted probabilities of 0.20 and 0.80, respectively. To summarize uncertainty and obtain the final applied band, class-balanced bootstrap resampling (1000 iterations) was performed, and the bootstrap medians of the lower and upper bounds were retained as the final CRS reference band. These probability cut points were used as a heuristic reference interval for the central reference zone rather than as optimized diagnostic thresholds.

Band stability in the discovery cohort was further assessed by repeated stratified five-fold cross-validation (50 repeats), in which CRS normalization and logistic fitting were repeated within each training split, and the resulting band limits were summarized across repeats. Concise summaries of the bootstrap, repeated-CV, and cutoff-sensitivity analyses are reported in Supplementary Figure S1.

The resulting GSE200376-derived band was transferred to GSE194315 after recalculating the PsO mean and standard deviation within the scRNA-seq cohort: for subject-level overall scores, normalization used the overall PsO distribution, whereas for subject–cell type scores, normalization was performed separately within each cell type using the corresponding PsO distribution. This transfer was based on contrast-level rather than feature-level equivalence: in both datasets, CRS was defined independently from the same PsO-versus-PsA biological contrast within the assay-specific data layer. PsO-centered normalization therefore aligned each cohort to its own internal psoriasis reference state, so that transferred band labels reflected relative displacement away from PsO rather than equality of methylation-derived and transcript-derived measurement scales. The transfer was interpreted only as a test of whether a discovery-derived positional interval preserved PsO-like versus PsA-like ordering in an independent transcriptomic cohort. Subject-level and subject–cell type observations were then classified as below-band (PsO-like), in-band (central reference zone), or above-band (PsA-like).

Early-mover compartments were defined as cell types in which PSX subjects were enriched in the above-band state relative to PsO subjects. For each cell type, this was tested only when at least six PsO and six PSX subjects were available, using one-sided Fisher’s exact tests for enrichment of PSX in the above-band state relative to PsO and permutation testing based on 5000 shuffles of the PSX/PsO labels, followed by Benjamini–Hochberg correction across cell types.

2.8. Cell-Type Pathway Dynamics along the CRS Continuum

To characterize within-cell-type pathway-state remodeling associated with CRS, a focused Hallmark pathway panel was evaluated in GSE194315 pseudobulk data. The panel included interferon-alpha response, interferon-gamma response, TNF/NF

κ

B signaling, IL6/JAK/STAT3 signaling, inflammatory response, complement, IL2/STAT5 signaling, and TGF-

β

signaling, together with a custom cytotoxicity gene set defined as NKG7, PRF1, GZMB, GZMH, GNLY, KLRD1, FCGR3A, TRAC, TRBC1, and TRBC2. Gene symbols were harmonized across pseudobulk matrices, and Ensembl identifiers were mapped to symbols using the org.Hs.eg.db annotation package when necessary.

Pathway analysis was restricted to a focused eight-cell-type panel selected for focused modeling on sufficiently represented immune compartments: CD8.Naive, NK, CD14.Mono, Treg, B.naive, CD4.Naive, CD4.TCM, and CD16.Mono. This analysis was intended as a focused rather than exhaustive survey of all annotated cell types. For each selected cell type, pseudobulk counts were normalized using edgeR::calcNormFactors and transformed to log-CPM values using cpm(log = TRUE, prior.count = 1), and ssGSEA pathway scores were then computed per subject using GSVA (v2.4.4). Cell types with fewer than eight subjects in common with the subject-level CRS table were excluded from the primary pathway analysis. A broader exploratory survey across all eligible annotated pseudobulk cell types, defined here as cell types with at least eight subjects in common with the subject-level CRS table, is reported in Supplementary Figure S7; this broader screen was not used to redefine the primary eight-cell-type panel.

Within each selected cell type and pathway, pathway activity was modeled across the cross-sectional CRS continuum spanning PsO, PSX, and PsA subjects as a smooth function of

C R S_{norm}

using generalized additive models implemented in the mgcv package (v1.9-4) [35] of the form

pathway score \sim s (C R S_{norm}, k = 5) .

For each pathway–cell type pair, the pathway inflection point was operationally defined as the

C R S_{norm}

value at which the absolute first derivative of the fitted smooth function was maximal. This procedure was used as a descriptive summary of pathway-specific positional dynamics across the cross-sectional CRS continuum.

2.9. Skin Transcriptomic Analysis in GSE186063

GSE186063 skin RNA-seq data were analyzed from a preprocessed dataset containing an edgeR DGEList, a logCPM expression matrix, and parsed GEO series-matrix metadata. Metadata rows were aligned to the expression matrix primarily by sample identifier, with fallback matching by sample title and GSM when necessary. Disease group, tissue state, and detailed unit-group labels were standardized to AS-H, PsO-N, PsO-L, PsA-N, and PsA-L, where AS-H denotes healthy-appearing ankylosing spondylitis comparator skin rather than skin from unaffected healthy volunteers. Subject identifiers were parsed from the aligned sample identifiers to allow evaluation of possible paired lesional/non-lesional analyses.

Differential expression analysis was performed using the edgeR quasi-likelihood framework with robust dispersion estimation. The tested contrasts were PsA-L versus AS-H, PsA-N versus AS-H, PsA-L versus PsA-N, PsA-L versus PsO-L, PsA-N versus PsO-N, PsO-L versus AS-H, PsO-N versus AS-H, and PsO-L versus PsO-N. For unpaired contrasts, the design was specified as

\sim 0 + group + Sex + Age .

For within-disease lesional-versus-non-lesional comparisons, paired models of the form

\sim subject_block + group

were prespecified when at least five paired subjects were available, where group denotes the harmonized analysis-group factor (AS-H, PsO-N, PsO-L, PsA-N, or PsA-L) and subject_block denotes the subject-specific blocking factor for paired lesional/non-lesional samples; otherwise an unpaired design was used. In the final analysis-ready dataset, both within-disease contrasts were analyzed using the unpaired design. Differential expression results were exported in full, and FDR-significant gene counts were summarized across contrasts.

Pathway analysis was performed from the differential expression results using pre-ranked GSEA with fgsea (v1.36.2) [36]. Ranking was based on signed

- {log}_{10} (P)

statistics derived from the edgeR quasi-likelihood test, defined as

sign (log F C) \times [- {log}_{10} (P)]

, where P denotes the edgeR quasi-likelihood test p value. Hallmark, GO biological process, KEGG legacy, and Reactome collections were analyzed by GSEA. In addition, Hallmark over-representation analyses for upregulated and downregulated gene sets were performed using clusterProfiler::enricher (v4.18.4) [37] when at least 10 FDR-significant mapped genes were available in the corresponding direction, with all mapped tested genes in the corresponding contrast used as the background universe. For the primary skin-axis scores, DIR_skin was anchored to the PsO-L versus AS-H contrast and CRS_skin> to the PsA-L versus PsO-L contrast. Scores were computed by ssGSEA as UP minus DOWN using the same GSVA-based ssGSEA settings described in Section 2.6. Core signatures from the prespecified strict and relaxed GMT signature sets defined in the core-signature pipeline were used when available, with fallback first to the relaxed GMT set and then to ranked-signature TOP-100 sets when core signatures were unavailable or too small. This ranked-signature fallback was required for the primary CRS_skin score because the corresponding strict and relaxed core signatures were empty.

2.10. Exploratory Purified CD4⁺ T-Cell Methylation Analysis in GSE236694

GSE236694 was analyzed as an exploratory purified-cell methylation cohort. Upstream preprocessing was performed from raw IDAT files using the same minfi Noob-based workflow described above, and the present analysis used the resulting analysis-ready beta-value, M-value, and detection-p matrices. Differential methylation analysis was performed on M values using limma with robust empirical Bayes moderation [27]. The design matrix included disease group and, when available and non-degenerate within the analyzed subset, age, sex, batch_slide, and batch_array. Three contrasts were evaluated: PsO versus HC (DIR), PsA versus PsO (CRS), and PsA versus HC (disease spectrum).

DMRs were identified using DMRcate, with array type inferred automatically from probe identifiers. Primary DMR calling used CpG-level annotation with a strict seed threshold of FDR

< 0.05

and region calling under the default strict DMRcate setting (without relaxed pcutoff override). When no reportable regions were returned, pcutoff was relaxed sequentially to 0.05, 0.10, and then 0.20 while other DMRcate settings were held constant. In the reported run, all three contrasts yielded reportable regions at the first pcutoff-relaxation step (pcutoff = 0.05), and no further relaxation was required [28]. Gene-set enrichment was performed with missMethyl::gometh (v1.44.0) for GO and KEGG categories [38]. Because no contrast yielded FDR-significant CpG sets of sufficient size for enrichment, the reported gometh analyses used the fallback set of probes with nominal

P < 0.05

.

Exploratory DIR and CRS methylation scores were constructed separately from the GSE236694 DMP tables for the DIR (PsO versus HC) and CRS (PsA versus PsO) contrasts using a tiered feature-selection strategy, because the prespecified strict thresholds yielded too few FDR-significant CpGs to support stable score construction in this small cohort. The selection tiers were applied in the following order: (i) FDR

\leq 0.10

and

| Δ β | \geq 0.05

; (ii) FDR

\leq 0.20

and

| Δ β | \geq 0.02

; (iii) FDR

\leq 0.30

and

| Δ β | \geq 0.01

; (iv) fallback to top

| Δ β |

probes irrespective of FDR; and (v) fallback to top

| t |

probes. The top 30 CpGs passing the best available tier were used for each axis. For each probe, beta values were standardized relative to the appropriate baseline group (HC for DIR and PsO for CRS), multiplied by the sign of

Δ β

(or the sign of the t statistic if needed), averaged across probes, and then re-standardized within the baseline group. Because these scores were built and evaluated within the same small cohort, the resulting DIR and CRS scores from GSE236694 were interpreted only as exploratory within-cohort findings.

2.11. Additional Robustness Analyses

Several additional robustness analyses were performed using previously generated subject-level, cell-type, and pathway summary outputs in order to quantify the stability of the main subject-level and pathway-level conclusions. In GSE194315, subject-level DIR and CRS scores were recalculated using alternative cell-type aggregation weights proportional to 1,

\sqrt{n_{cells}}

,

n_{cells}

, and

log (1 + n_{cells})

, and the resulting subject-level scores were compared with the primary implementation by Pearson and Spearman correlation, preservation of the broad HC/PsO/PSX/PsA group ordering, and group-wise above-band occupancy.

In parallel, the discovery-derived CRS reference interval was transferred to GSE194315 using three positional strategies: the primary PsO-centered z-score transfer, pooled quantile transfer, and a within-cohort PsO-versus-PsA refit.

To refine interpretation of the early-mover Fisher analyses when PsO above-band counts were zero, Haldane–Anscombe continuity-corrected odds ratios and bounded Newcombe–Wilson risk-difference confidence intervals were also computed as descriptive sensitivity summaries; these quantities did not replace the primary Fisher/permutation inference.

For the main-text pathway-dynamic analyses, evaluated across the 72 pathway–cell type pairs in the primary pathway-localization panel, the generalized additive model basis dimension was varied from

k = 4

to

k = 7

, and bootstrap resampling (100 resamples per pathway–cell type pair) was used to summarize the directional stability of the inferred pathway-localization patterns. These robustness summaries are reported in Supplementary Tables S3 and S4.

2.12. External Skin Cohort Checks in GSE205748 and GSE202011

Two additional public skin datasets were analyzed to provide external context for the limited additional PsA-related separation observed in GSE186063.

GSE205748 is a small independent bulk skin RNA-seq cohort containing healthy control skin, uninvolved PsA skin, and PsA lesional skin (9 samples per group) [19]. Gene-level count data were normalized with edgeR::calcNormFactors and transformed to logCPM values using cpm(log = TRUE, prior.count = 2) [32]. LogCPM expression values were then scored by ssGSEA against a focused Hallmark panel comprising TNF/NF

κ

B signaling, interferon-alpha response, interferon-gamma response, IL6/JAK/STAT3 signaling, inflammatory response, and TGF-

β

signaling, using the same GSVA-based ssGSEA settings applied elsewhere in the manuscript [33]. The tested contrasts were PsA lesional versus healthy control skin and PsA lesional versus uninvolved PsA skin, evaluated by exact rank-sum permutation testing with Benjamini–Hochberg correction. Because the cohort did not include psoriasis without arthritis, it was used only for pathway-level PsA skin corroboration rather than for direct PsO-versus-PsA testing.

GSE202011 is an external spatial transcriptomic skin cohort containing healthy control, PsO, and PsA samples, including both lesional and non-lesional skin states [21]. Because the aim of this analysis was sample-level external projection of the GSE186063-derived bulk-like skin signatures rather than inference on spatial microenvironments, raw spot-level matrices were collapsed to sample-level pseudobulk counts by summing transcript counts across spots within each Visium sample. These sample-level pseudobulk counts were normalized with edgeR::calcNormFactors and transformed to logCPM values using cpm(log = TRUE, prior.count = 2) [32]. The official GSE186063-derived DIR_skin and CRS_skin signatures were then projected to these pseudobulk matrices using the same ssGSEA UP-minus-DOWN scoring logic and GSVA-based settings described above [33]. External score contrasts were evaluated by exact rank-sum permutation testing with Benjamini–Hochberg correction for the lesional comparisons PsO-L versus healthy control skin, PsA-L versus healthy control skin, and PsA-L versus PsO-L.

2.13. Covariate-Adjusted Consistency Checks

To verify that the principal directional conclusions were not explained solely by age- or sex-imbalance, we performed harmonized score-level consistency checks across the four core datasets. For selected key disease-axis findings emphasized in the main figures, the relevant DIR/CRS score was fitted in a pairwise linear model including disease group, age, and sex:

score \sim group + Age + Sex .

These checks were performed on the already generated dataset-specific DIR/CRS score tables rather than by repeating each full modality-specific pipeline, and they were interpreted as harmonized sensitivity summaries rather than as replacements for the primary analyses. The resulting summary is provided in Supplementary Table S2.

2.14. Statistical Analysis and Software

Correlations between DIR and CRS were assessed using Pearson and Spearman coefficients. Multiple-testing correction was performed using the Benjamini–Hochberg method where relevant. All tests were two-sided unless a directional alternative was explicitly specified (e.g., in the early-mover enrichment analysis).

Figure 3(b)–(d) should be interpreted as descriptive summaries of pathway-dynamic localization and representative fitted trajectories rather than as per-tile or per-curve hypothesis tests.

All analyses were performed in R version 4.5.1, and principal software packages and version numbers are reported at first use in the relevant subsections. During preparation of this study and manuscript, OpenAI GPT-5.4 Pro was used to assist with code review and debugging, as well as English-language editing and wording refinement. All analytical decisions, code execution, result verification, and final interpretation were performed and verified directly by the authors.

3. Results

3.1. Internal Resampling Supports Biological Coordinate Interpretation Rather Than Clinical Classification

We next assessed the internal stability of the GSE200376 framework using leakage-aware resampling grouped by Sentrix ID where feasible. Discrimination was moderate for both axes. The full-model area under the receiver operating characteristic curve (AUC) was

0.650

(95% CI,

0.474

–

0.826

) for DIR and

0.682

(95% CI,

0.521

–

0.843

) for CRS. Calibration was also imperfect, with calibration slopes of

0.504

for DIR and

0.430

for CRS (Supplementary Figure S1). These findings do not support interpretation of the discovery framework as a clinical prediction tool in its current form. Instead, they support its use as a biologically interpretable positional coordinate framework for organizing the systemic relationship among HC, PsO, and PsA (Figure 1). Consistent with this interpretation, logistic modeling of normalized CRS yielded a discovery-derived reference interval that was subsequently transferred as a positional reference to the independent single-cell cohort.

3.2. Subject-Level Single-Cell Profiles Recapitulate a CRS-Intermediate State between Psoriasis and Psoriatic Arthritis

To determine whether the two-axis structure was recapitulated in an independent systemic cohort, we examined subject-level PBMC single-cell RNA sequencing data from GSE194315 (Figure 2). DIR again aligned more strongly with the shared psoriatic disease burden, whereas CRS was higher in PSX and PsA than in PsO, with PSX occupying an intermediate systemic position. At the overall subject level, the mean DIR score was

0.016

in HC,

0.383

in PsO,

0.321

in PSX, and

0.243

in PsA. In contrast, the mean CRS score increased from

- 0.455

in PsO to

- 0.081

in PSX and

0.054

in PsA, supporting the interpretation that this dataset-defined PSX arthralgia subgroup occupies an intermediate systemic position rather than a simple extension of psoriasis.

The same overall pattern was retained in harmonized age- and sex-adjusted score-level checks, in which DIR remained higher in PsO than in HC and CRS remained higher in both PSX and PsA than in PsO (Supplementary Table S2).

Application of the discovery-derived CRS reference band further reinforced this structure. All PsO subjects were positioned below the band, whereas 25 of 28 PsA subjects were positioned above it. Notably, 10 of 14 PSX subjects were already above the band, supporting enrichment of PsA-aligned systemic states within this symptom-defined subgroup despite the absence of overt PsA classification. Within this cohort-specific signature-learning framework, PsO-versus-PsA separation based on normalized CRS was nearly complete (AUC

= 0.997

), with above-band sensitivity of

0.893

and specificity of

1.000

. Because the GSE194315 CRS signature was learned from PsO and PsA subjects within the same cohort, this PsO-versus-PsA separation should be interpreted only as an internal anchoring check rather than as independent classification performance. The key non-training observation was the PSX position after exclusion from signature learning. Additional CRS-localization analyses are shown in Supplementary Figure S2. Together, these findings indicate that the systemic distinction between psoriasis and PsA is already detectable at the subject level and that PSX occupies an intermediate position along the CRS coordinate, although the public metadata do not establish whether PSX joint pain reflects prodromal PsA or alternative musculoskeletal conditions.

Robustness analyses supported this interpretation. Alternative subject-level weighting schemes yielded highly correlated overall scores relative to the primary implementation (DIR Pearson

= 0.933

–

0.980

; CRS Pearson

= 0.962

–

0.982

) and preserved the broad HC/PsO/PSX/PsA ordering, although exact above-band occupancy varied modestly under direct cell-count weighting. In addition, the subject-level PSX above-band count remained 10 of 14 under the primary z-score transfer, pooled quantile transfer, and within-cohort refit strategies, indicating that the main PSX localization result was not driven by a single band-transfer rule (Supplementary Table S3).

3.3. PSX-Enriched High-CRS Compartments and Multicompartment Pathway-State Remodeling Localize CRS-Related Biology

We next asked whether the CRS-related signal was uniformly distributed across immune compartments or preferentially localized to specific cell types. Early-mover analysis showed that the CRS shift was not generalized across all cell populations, but instead was concentrated in a limited set of compartments (Figure 3). The strongest early-mover compartments were CD8 naive T cells, NK cells, CD14 monocytes, and regulatory T cells (Treg). For example, no PsO subjects (0/18) occupied the above-band state in CD8 naive T cells, whereas 8 of 12 PSX subjects did so (Fisher FDR

= 1.10 \times 10^{- 3}

; permutation FDR

= 1.73 \times 10^{- 3}

). Similar patterns were observed for NK cells (0/22 vs 7/13 PSX above-band), CD14 monocytes (0/24 vs 7/14), and Treg cells (0/10 vs 7/9).

Because several of these PsO-versus-PSX contingency tables contained zero above-band PsO observations, we additionally computed continuity-corrected effect-size summaries. The strongest early-mover compartments retained large corrected odds ratios and absolute risk differences, including Haldane–Anscombe odds ratios of 69.9 for CD8 naive T cells, 51.9 for NK cells, 49.0 for CD14 monocytes, and 63.0 for Treg cells, with corresponding risk differences of 0.50–0.78 (Supplementary Table S4).

We then examined whether CRS reflected changes in cell abundance or changes in cell-intrinsic transcriptional state. Stage-associated shifts in cell-type proportion were relatively limited, with only four cell populations showing FDR-significant trends across HC, PsO, PSX, and PsA: Treg, CD4 proliferating cells, CD16 monocytes, and double-negative T cells. In contrast, 36 biologically annotated cell type–signature combinations showed FDR-significant stage trends, indicating that the CRS-related pattern was more strongly represented by within-compartment state reprogramming than by simple changes in abundance. Additional abundance-focused and pathway-detail panels are shown in Supplementary Figure S3.

Pathway modeling along the normalized CRS continuum identified 72 inflection events across eight selected cell types: CD8 naive T cells, NK cells, CD14 monocytes, Treg cells, B naive cells, CD4 naive T cells, CD4 central memory T cells, and CD16 monocytes. Of these, 51 inflection events were downward and 21 were upward. Recurrent pathway changes involved TNF/NF

κ

B signaling, TGF-

β

signaling, IL6/JAK/STAT3 signaling, inflammatory response, interferon-response programs, and cytotoxicity-associated programs. Representative curves showed that these changes were pathway- and cell-type-specific rather than consistent with a uniform increase in inflammatory signaling across all compartments. Together, these summaries localized CRS-related pathway remodeling to innate, regulatory, and naive-memory lymphoid compartments, together with additional B-cell and monocyte involvement. A broader exploratory all-cell-type screen likewise supported multicompartment pathway localization beyond the focused main-text panel (Supplementary Figure S7 and Table S5).

The inferential support for this interpretation derived from the early-mover enrichment tests and the FDR-controlled stage-trend analyses, whereas the inflection heatmap and representative pathway curves provided descriptive summaries of where along the CRS continuum these transcriptional changes were concentrated. Additional robustness checks supported directional stability of these summaries. Across basis dimensions

k = 4

–7, 70 of 72 pathway–cell type pairs retained the same directional change, and bootstrap resampling yielded a median directional concordance of 0.905, with 52 of 72 pairs showing concordance

\geq 0.80

(Supplementary Table S3). These results indicate that the pathway summaries were more stable at the level of directional localization than at the level of exact inflection-point placement.

3.4. Skin RNA Sequencing Captures Inflammatory Tissue Burden and Shows Limited Additional PsA-Related Separation

We next asked whether the psoriasis-to-PsA distinction could be explained primarily at the level of skin inflammatory burden. In GSE186063, skin transcriptomes strongly separated inflammatory skin states from the healthy-appearing ankylosing spondylitis comparator skin (AS-H) and generated a robust lesion-driven differential-expression burden (Figure 4). PsO lesional skin versus the comparator AS-H state yielded 6619 FDR-significant genes, and PsA lesional skin versus the comparator AS-H state yielded 8019 FDR-significant genes. Within-disease lesional versus non-lesional contrasts were similarly strong (6415 FDR-significant genes in PsO and 8861 in PsA).

In contrast, psoriasis-versus-PsA comparisons within the same tissue state showed minimal additional separation at the individual-gene level. PsA lesional skin versus PsO lesional skin yielded no FDR-significant genes, and PsA non-lesional skin versus PsO non-lesional skin yielded only one FDR-significant feature, KRT16P3 (ENSG00000214822). Ranked Hallmark enrichment analyses refined this interpretation further. Relative to the robust lesional-versus-AS-H contrasts, psoriasis-versus-PsA differences within the same tissue state were more evident at the pathway level than at the level of individually significant genes. In the lesional comparison, interferon-response, inflammatory-response, TNF/NF

κ

B, and IL6/JAK/STAT3 programs remained detectably enriched in PsA relative to PsO (Figure 4). Harmonized age- and sex-adjusted score-level checks sharpened this interpretation further, showing strong DIR_skin separation for PsO-L versus AS-H, no additional DIR_skin separation for PsA-L versus PsO-L, and a modest but significant CRS_skin shift for PsA-L versus PsO-L (Supplementary Table S2). Together, these results indicate that skin transcriptomes primarily capture the shared inflammatory burden of psoriatic disease, while only limited additional PsA-related separation remains detectable within the same tissue state and does not account for the stronger systemic distinction observed in blood-derived datasets. Because the official skin CRS_skin score required a ranked-signature fallback rather than a stable core lesional signature, this residual CRS_skin signal should be interpreted cautiously.

External skin checks sharpened this interpretation. In the small balanced PsA skin bulk RNA-seq cohort GSE205748, exact permutation rank-sum tests showed reproducible lesional enrichment of IL6/JAK/STAT3, inflammatory-response, interferon-alpha, interferon-gamma, and TNF/NF

κ

B programs relative to both HC skin and uninvolved PsA skin (all FDR

\leq 9.87 \times 10^{- 5}

), whereas TGF-

β

signaling remained weak (Supplementary Figure S5). Because GSE205748 did not include psoriasis without arthritis, these data corroborated lesion-associated PsA skin inflammation but did not directly test psoriasis-versus-PsA skin separation. In the external spatial skin cohort GSE202011, projected DIR_skin scores again separated lesional PsO and lesional PsA from HC skin (both FDR

= 1.75 \times 10^{- 3}

) without additional PsA-versus-PsO lesional separation (FDR

= 0.547

), whereas projected CRS_skin scores showed only modest lesion-versus-HC shifts (FDR

= 2.62 \times 10^{- 2}

for PsO-L versus HC; FDR

= 2.21 \times 10^{- 2}

for PsA-L versus HC) and no lesional PsA-versus-PsO separation (FDR

= 0.902

; Supplementary Figure S6). This external non-replication of lesional CRS_skin separation is consistent with instability introduced by the ranked-signature fallback used for the official skin CRS_skin score.

3.5. Purified CD4⁺ T-Cell Methylation as an Exploratory Within-Cohort Analysis

Finally, we examined purified CD4⁺ T-cell methylation as a small exploratory layer. Upstream differential methylation support was limited in this cohort: the PsO-versus-HC contrast yielded one FDR-significant CpG (cg17864916), whereas the PsA-versus-PsO contrast yielded none. Exploratory score construction nevertheless produced directional shifts that were broadly compatible with the disease-axis framework. The corresponding DIR–CRS map and group-wise score distributions are shown in Supplementary Figure S4. We therefore interpret GSE236694 only as a small exploratory within-cohort analysis.

4. Discussion

The main contribution of this study is not to argue that psoriasis is systemic, which is already well supported, but to define which additional systemic features distinguish psoriasis without arthritis from psoriatic arthritis [1,2,3]. Across the analyzed datasets, the psoriasis–PsA spectrum was better represented by at least two partially distinct coordinates: a shared inflammatory disease axis (DIR) and an additional CRS coordinate that captured the systemic component most closely aligned with PsA. In this framework, PsA is not simply a more severe version of psoriasis. Rather, it is associated with an added systemic immune-state component superimposed on a shared inflammatory background [10,11,13].

This framing also clarifies the novelty of the study. The goal was not to claim a new fixed biomarker panel from any single cohort, nor to replace earlier single-layer analyses that identified cell-type markers, machine-learning classifiers, or purified-cell methylation differences [5,7]. Instead, the advance here lies in organizing those disparate signals into a single cross-dataset coordinate structure that separates shared psoriatic inflammatory burden from the additional systemic state associated with PsA and then testing where that structure reproduces, weakens, or fails across blood- and skin-derived layers.

Several findings support this interpretation. In the discovery blood methylation cohort, DIR and CRS were only weakly correlated, arguing against a single-axis severity model. DIR primarily tracked the shift from HC to PsO, whereas CRS primarily tracked the shift from PsO to PsA. In the independent PBMC single-cell cohort, the same broad geometry was preserved, with the source-defined PSX arthralgia subgroup occupying an intermediate CRS position and frequently falling on the above-band, PsA-aligned side of the discovery-derived reference band. Together, these observations are more consistent with an added systemic immune-state dimension than with a simple escalation of generalized inflammation [11,12,39].

An additional strength of the current framework is that it localizes the CRS-related signal to specific immune compartments. The strongest PSX-enriched high-CRS compartments were CD8 naive T cells, NK cells, CD14 monocytes, and Treg cells, implicating cell populations involved in immune priming, regulation, and innate surveillance rather than only terminal inflammatory effector responses. More broadly, the GSE194315 results indicate that the psoriasis-to-PsA distinction is represented more strongly by coordinated multicompartment state reprogramming than by a simple redistribution of major blood cell populations [8,39].

The descriptive pathway analyses refine this interpretation further. CRS was associated with patterns consistent with structured pathway-state remodeling across TNF/NF

κ

B, TGF-

β

, IL6/JAK/STAT3, inflammatory-response, interferon-response, and cytotoxicity-related programs. These changes did not resemble uniform hyperinflammation. Instead, different pathways changed in different directions across different cell types, supporting the interpretation that PsA involves a qualitatively different systemic immune configuration [13,40]. This distinction matters because it suggests that the central biology of the psoriasis-to-PsA distinction lies in altered immune organization, not merely in greater magnitude of inflammation.

The skin RNA-seq results further sharpen the biological interpretation of the two-axis model. Lesional skin robustly reflected inflammatory burden, as expected for active psoriatic disease [6,9], whereas psoriasis-versus-PsA comparisons within the same tissue state showed minimal additional separation at the individual-gene level and only more modest ranked pathway-level differences. This does not diminish the importance of skin inflammation in psoriatic disease. Rather, it suggests that skin inflammation corresponds more closely to the shared disease component represented by DIR, whereas the systemic distinction between psoriasis and PsA is better captured by blood-derived molecular and cellular states. In practical terms, this means that the psoriasis-versus-PsA distinction cannot be adequately explained by cutaneous inflammatory burden alone.

The external skin analyses were informative precisely because they tested different pieces of the skin argument. GSE205748 reinforced that lesional PsA skin reproducibly carries strong inflammatory pathway activity relative to healthy and uninvolved skin, but it could not address psoriasis-versus-PsA skin separation because psoriasis without arthritis was not represented in that cohort. GSE202011 addressed that narrower question more directly at the sample level and reproduced lesion-versus-healthy inflammatory burden without reproducing additional lesional CRS_skin separation between PsO and PsA. Taken together, these external checks support DIR_skin as a lesion-burden summary more strongly than they support CRS_skin as a stable external skin marker of PsA-specific separation.

The present study also clarifies how the CRS reference band should be interpreted. The band derived from GSE200376 was useful for positioning subjects along a PsO-like to PsA-like continuum, but it should not be treated as a clinical threshold or a deployable diagnostic rule. The discovery methylation model showed only moderate internal discrimination and imperfect calibration, and the band itself was derived from cross-sectional data. Accordingly, the framework is best understood as a biological coordinate system for organizing systemic disease states rather than as a ready-to-use clinical classifier.

Several limitations should be acknowledged. First, all analyses were performed on public datasets, so the study necessarily inherits the cohort composition, annotation quality, and technical heterogeneity of the original studies. Second, the data are cross-sectional and do not directly establish temporal ordering or causality; where dynamic language is used in this manuscript, it refers to inferred biological positioning rather than observed longitudinal movement. Third, the exact GSE200376-derived whole-blood methylation score model was not externally validated in an independent whole-blood methylation cohort. The current study therefore supports cross-modality consistency of the framework more strongly than transportability of a fixed same-modality methylation predictor. Fourth, the GSE194315 PSX subgroup was defined retrospectively from source metadata as psoriasis with joint pain but without CASPAR-classified PsA, and the public records did not adjudicate alternative causes of musculoskeletal pain. The PSX findings should therefore be interpreted as enrichment within a symptom-defined subgroup rather than as proof of imminent PsA conversion. This cautious framing is consistent with the 2023 EULAR points to consider, which treat arthralgia in people with psoriasis as a risk context while emphasizing the need to consider alternative diagnoses [41]. Fifth, the discovery and comparison layers do not all share the same assay or tissue context, which limits any claim of a single transferable biomarker panel. Sixth, the purified CD4⁺ T-cell methylation cohort was small, had limited upstream DMP support, and used the same cohort for exploratory score construction and evaluation; those findings should therefore be interpreted only as exploratory within-cohort observations.

Despite these limitations, convergent evidence from whole-blood methylation, subject-level PBMC single-cell transcriptomics, skin RNA sequencing, and exploratory purified CD4⁺ T-cell methylation supports the biological utility of the proposed framework. More broadly, these findings are consistent with a model in which psoriasis and PsA share a systemic inflammatory backbone but diverge through an additional multicompartment immune-state axis. Future studies should test this model in longitudinal cohorts, determine whether CRS-like profiles can be detected before clinically overt arthritis and how early they emerge, and define how metabolic, microbial, and immune-network perturbations converge on the systemic-state difference identified here.

5. Conclusions

This study supports a model in which psoriasis without arthritis and psoriatic arthritis share a systemic inflammatory foundation but are separated by an additional immune-state axis. DIR captures the common inflammatory component of psoriatic disease, whereas CRS captures multicompartment systemic reorganization associated with PsA. Across independent blood-derived datasets, the psoriasis–PsA distinction was represented more clearly by systemic molecular and cellular states than by skin inflammatory burden alone. Together, these findings support the interpretation that PsA is distinguished from psoriasis by a qualitatively distinct systemic immune-state configuration that cannot be explained by cutaneous inflammatory burden alone.

6. Patents

The authors declare that no patents resulted from the work reported in this manuscript.

Supplementary Materials

Figure S1: Internal leakage-aware performance assessment of the GSE200376 discovery framework; Figure S2: Additional CRS localization analyses in GSE194315; Figure S3: Additional cell-type abundance and pathway-detail analyses in GSE194315; Figure S4: Exploratory purified CD4⁺ T-cell methylation support in GSE236694; Figure S5: External PsA skin pathway corroboration in GSE205748; Figure S6: External spatial skin score check in GSE202011; Figure S7: Exploratory all-cell-type pathway-dynamics screen in GSE194315; Table S1: Public datasets included in the multimodal disease-axis analysis; Table S2: Harmonized age- and sex-adjusted summary of the key disease-axis findings across the four core datasets; Table S3: Robustness analyses for weighting, band transfer, and pathway-dynamic stability; Table S4: Continuity-corrected effect-size summaries for early-mover compartments; Table S5: Exploratory all-cell-type pathway-screen eligibility and summary in GSE194315.

Author Contributions

Conceptualization, Y.K.L.; methodology, Y.K.L.; software, Y.K.L.; validation, Y.K.L.; formal analysis, Y.K.L.; investigation, Y.K.L.; data curation, Y.K.L.; visualization, Y.K.L.; writing—original draft preparation, Y.K.L.; writing—review and editing, H.A.S.; supervision, H.A.S.; funding acquisition, H.A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Global-Learning & Academic Research Institution for Master’s·PhD students and Postdocs (G-LAMP) Program of the National Research Foundation of Korea (NRF) funded by the Ministry of Education, grant number RS-2024-00445180.

Institutional Review Board Statement

Ethical review and approval were not required for this study because it exclusively analyzed publicly available de-identified datasets and did not involve direct recruitment of human participants, access to identifiable private information, or new intervention or sample collection.

Informed Consent Statement

Patient consent was waived because this study exclusively analyzed publicly available de-identified datasets and did not involve direct participant enrollment.

Data Availability Statement

The public datasets analyzed in this study are available from the Gene Expression Omnibus (GEO) under accession numbers GSE200376, GSE194315, GSE186063, GSE236694, GSE205748, and GSE202011. Derived analysis tables supporting the findings of this study are provided within the article and its Supplementary Materials. Additional processed outputs and analysis scripts are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank the investigators who generated and publicly released the datasets analyzed in this study through the Gene Expression Omnibus. During the preparation of this study and manuscript, the authors used OpenAI GPT-5.4 Pro to support code review and debugging, as well as English-language editing and wording refinement. The authors reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

AUC	Area under the receiver operating characteristic curve
BMI	Body mass index
CRS	Contrast-resolved systemic-state coordinate
DIR	Inflammatory disease axis
DMP	Differentially methylated probe
DMR	Differentially methylated region
FDR	False discovery rate
GAM	Generalized additive model
GEO	Gene Expression Omnibus
GO	Gene Ontology
GSEA	Gene set enrichment analysis
AS-H	Healthy-appearing ankylosing spondylitis comparator skin
HC	Healthy control
IDAT	Illumina intensity data file
IFN	Interferon
KEGG	Kyoto Encyclopedia of Genes and Genomes
NK	Natural killer
PBMC	Peripheral blood mononuclear cell
PsA	Psoriatic arthritis
PsO	Psoriasis without arthritis
PSX	Psoriasis with joint pain but without confirmed PsA (source-dataset label)
RNA-seq	RNA sequencing
ROC	Receiver operating characteristic
ssGSEA	Single-sample gene set enrichment analysis
TNF	Tumor necrosis factor
Treg	Regulatory T cell

Appendix A. Cross-Modality Interpretation of DIR and CRS

DIR and CRS were designed as harmonized biological coordinates rather than as identical assay-level variables across all datasets. Accordingly, the mathematical implementation of each score depended on the modality and cohort context, but the biological interpretation was held constant. The intended commonality was contrast-level: each modality independently instantiated DIR from the HC-versus-PsO comparison and CRS from the PsO-versus-PsA comparison using assay-appropriate features and scoring rules. Cross-modality transfer therefore relied on shared biological anchoring and PsO-relative positioning rather than on direct feature reuse or absolute scale equivalence. DIR was intended to capture the shared inflammatory disease component of the psoriatic spectrum, whereas CRS was intended to capture the additional systemic state associated with psoriasis-versus-PsA positioning.

In GSE200376, DIR and CRS were derived directly from weighted whole-blood methylation contrasts. DIR was anchored to the HC-versus-PsO comparison, and CRS was anchored to the PsA-versus-PsO comparison. In GSE194315, the same biological contrasts were implemented using subject-level pseudobulk transcriptomic signatures and ssGSEA-based up-minus-down scoring. The GSE200376-derived CRS reference band was then applied to GSE194315 after cohort-specific normalization to the PsO distribution, providing a shared positional frame for comparing subject and cell-type states along a PsO-like to PsA-like continuum.

In GSE186063, the same two-axis framework was adapted to the skin context. DIR_skin was anchored to the PsO-L-versus-AS-H contrast to represent inflammatory tissue burden, whereas CRS_skin was anchored to the PsA-L-versus-PsO-L contrast to test whether lesional skin within the same tissue state could distinguish arthritis-related biology beyond shared lesion activity. In GSE236694, DIR and CRS were constructed as exploratory purified-cell methylation scores using the same biological contrast logic, but those analyses were interpreted cautiously because the same small cohort was used for both score construction and evaluation.

Table A1. Cross-modality implementation and interpretation of DIR and CRS.

Dataset	Biological Layer	DIR Implementation and Interpretation	CRS Implementation and Interpretation
GSE200376	Whole-blood DNA methylation	Weighted methylation score anchored to the HC-versus-PsO contrast; represents the shared inflammatory disease component across the psoriatic spectrum.	Weighted methylation score anchored to the PsA-versus-PsO contrast; represents the additional systemic state associated with musculoskeletal involvement.
GSE194315	Subject-level PBMC single-cell RNA-seq	Weighted average of cell-type ssGSEA up-minus-down transcriptomic scores learned from HC and PsO subjects; independently instantiates the same HC-versus-PsO contrast in the transcriptomic layer and positions systemic disease activity at the subject level.	Weighted average of cell-type ssGSEA up-minus-down transcriptomic scores learned from PsO and PsA subjects; independently instantiates the same PsO-versus-PsA contrast in the transcriptomic layer. After PsO-centered normalization, the GSE200376-derived band was used only as a positional reference to place the source-defined PSX arthralgia subgroup, identify PSX-enriched above-band compartments, and summarize pathway-state remodeling along the CRS continuum.
GSE186063	Skin RNA-seq	ssGSEA up-minus-down score anchored to the PsO-L-versus-AS-H contrast, where AS-H denotes healthy-appearing ankylosing spondylitis comparator skin; represents cutaneous inflammatory burden.	ssGSEA up-minus-down score anchored to the PsA-L-versus-PsO-L contrast; tests whether lesional skin within the same tissue state carries additional PsA-related separation beyond shared lesion biology.
GSE236694	Purified CD4⁺ T-cell DNA methylation	Exploratory purified-cell methylation score anchored to the PsO-versus-HC contrast; interpreted only as an exploratory within-cohort result because of the small cohort and limited upstream DMP support.	Exploratory purified-cell methylation score anchored to the PsA-versus-PsO contrast; interpreted only as an exploratory within-cohort result and not as an independent validation layer.

References

Griffiths, C.E.M.; Armstrong, A.W.; Gudjonsson, J.E.; Barker, J.N.W.N. Psoriasis. Lancet 2021, 397, 1301–1315. [Google Scholar] [CrossRef] [PubMed]
Korman, N.J. Management of psoriasis as a systemic disease: what is the evidence? Br. J. Dermatol. 2020, 182, 840–848. [Google Scholar] [CrossRef]
FitzGerald, O.; Ogdie, A.; Chandran, V.; Coates, L.C.; Kavanaugh, A.; Tillett, W.; et al. Psoriatic arthritis. Nat. Rev. Dis. Prim. 2021, 7, 59. [Google Scholar] [CrossRef]
Eder, L.; Haddad, A.; Rosen, C.F.; Lee, K.A.; Chandran, V.; Cook, R.; Gladman, D.D. The Incidence and Risk Factors for Psoriatic Arthritis in Patients With Psoriasis: A Prospective Cohort Study. Arthritis Rheumatol. 2016, 68, 915–923. [Google Scholar] [CrossRef]
Liu, J.; Kumar, S.; Hong, J.; Huang, Z.M.; Paez, D.; Castillo, M.; Calvo, M.; Chang, H.W.; Cummins, D.D.; Chung, M.; et al. Combined Single Cell Transcriptome and Surface Epitope Profiling Identifies Potential Biomarkers of Psoriatic Arthritis and Facilitates Diagnosis via Machine Learning. Front. Immunol. 2022, 13, 835760. [Google Scholar] [CrossRef]
Deng, J.; Leijten, E.; Zhu, Y.; Olde Nordkamp, M.; Ye, S.; Pouw, J.; et al. Multi-omics approach identifies PI3 as a biomarker for disease severity and hyper-keratinization in psoriasis. J. Dermatol. Sci. 2023, 111, 101–108. [Google Scholar] [CrossRef] [PubMed]
Natoli, V.; Charras, A.; Hofmann, S.R.; Northey, S.; Russ, S.; Schulze, F.; McCann, L.; Abraham, S.; Hedrich, C.M. DNA methylation patterns in CD4+ T-cells separate psoriasis patients from healthy controls, and skin psoriasis from psoriatic arthritis. Front. Immunol. 2023, 14, 1245876. [Google Scholar] [CrossRef]
Jin, J.Q.; Wu, D.; Spencer, R.; Elhage, K.G.; Liu, J.; Davis, M.; et al. Biologic insights from single-cell studies of psoriasis and psoriatic arthritis. Expert Opin. Biol. Ther. 2022, 22, 1449–1461. [Google Scholar] [CrossRef]
Lee, Y.K.; Kim, H.Y.; Shim, D. A Triple-Hit Multi-Omics Framework for Psoriasis: Microbial Metabolic Remodeling and Immune Cell Methylome Signature Associated with an AMP-Dominant Lesional Program. Life 2026, 16, 516. [Google Scholar] [CrossRef] [PubMed]
Scher, J.U.; Eder, L.; Aydin, S.Z.; Ogdie, A. Preventing psoriatic arthritis: focusing on patients with psoriasis at increased risk of transition. Nat. Rev. Rheumatol. 2019, 15, 153–166. [Google Scholar] [CrossRef]
Pennington, S.R.; FitzGerald, O. Early Origins of Psoriatic Arthritis: Clinical, Genetic and Molecular Biomarkers of Progression From Psoriasis to Psoriatic Arthritis. Front. Med. 2021, 8, 723944. [Google Scholar] [CrossRef] [PubMed]
Zabotti, A.; Tinazzi, I.; Aydin, S.Z.; McGonagle, D. From Psoriasis to Psoriatic Arthritis: Insights From Imaging on the Transition to Psoriatic Arthritis and Implications for Arthritis Prevention. Curr. Rheumatol. Rep. 2020, 22, 24. [Google Scholar] [CrossRef] [PubMed]
Schett, G.; Rahman, P.; Ritchlin, C.; McInnes, I.B.; Elewaut, D.; Scher, J.U. Psoriatic arthritis from a mechanistic perspective. Nat. Rev. Rheumatol. 2022, 18, 311–325. [Google Scholar] [CrossRef]
Deng, M.; Su, Y.; Wu, R.; Li, S.; Zhu, Y.; Tang, G.; et al. DNA methylation markers in peripheral blood for psoriatic arthritis. J. Dermatol. Sci. 2022, 108, 39–47. [Google Scholar] [CrossRef]
NCBI Gene Expression Omnibus. GSE200376: Whole blood DNA methylation arrays of individuals with psoriatic arthritis, psoriasis vulgaris, and healthy control subjects. Gene Expression Omnibus, 2023. Available online: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE200376 (accessed on 2026-04-24).
NCBI Gene Expression Omnibus. GSE194315: RNA and surface epitope sequencing of single cells involved in spondyloarthritis. Gene Expression Omnibus, 2022. Available online: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194315 (accessed on 2026-04-24).
NCBI Gene Expression Omnibus. GSE186063: Multi-omics profiling reveals the interactions of host defense and hyper-keratinization in psoriasis. Gene Expression Omnibus. 2022. Available online: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE186063 (accessed on 2026-04-24).
NCBI Gene Expression Omnibus. GSE236694: DNA methylation patterns in CD4+ T cells separate psoriasis patients from healthy controls, and skin psoriasis from psoriatic arthritis I. Gene Expression Omnibus, 2023. Available online: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE236694 (accessed on 2026-04-24).
Johnsson, H.; Cole, J.; Siebert, S.; McInnes, I.B.; Graham, G. Cutaneous lesions in psoriatic arthritis are enriched in chemokine transcriptomic pathways. Arthritis Res. Ther. 2023, 25, 73. [Google Scholar] [CrossRef]
NCBI Gene Expression Omnibus. GSE205748: Psoriatic arthritis skin lesions and uninvolved bulk RNA-seq. Gene Expression Omnibus, 2023. Available online: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE205748 (accessed on 2026-04-24).
Castillo, R.L.; Sidhu, I.; Dolgalev, I.; Chu, T.; Prystupa, A.; Subudhi, I.; Yan, D.; Konieczny, P.; Hsieh, B.; Haberman, R.H.; et al. Spatial transcriptomics stratifies psoriatic disease severity by emergent cellular ecosystems. Sci. Immunol. 2023, 8, eabq7991. [Google Scholar] [CrossRef]
NCBI Gene Expression Omnibus. GSE202011: Spatial transcriptomics stratifies psoriatic disease by emergent cellular ecosystems. Gene Expression Omnibus. 2022. Available online: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE202011 (accessed on 2026-04-24).
Taylor, W.J.; Gladman, D.D.; Helliwell, P.S.; Marchesoni, A.; Mease, P.J.; Mielants, H. the CASPAR Study Group. Classification criteria for psoriatic arthritis: Development of new criteria from a large international study. Arthritis Rheum. 2006, 54, 2665–2673. [Google Scholar] [CrossRef] [PubMed]
Aryee, M.J.; Jaffe, A.E.; Corrada-Bravo, H.; Ladd-Acosta, C.; Feinberg, A.P.; Hansen, K.D.; Irizarry, R.A. Minfi: A flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 2014, 30, 1363–1369. [Google Scholar] [CrossRef]
Triche, T.J.J.; Weisenberger, D.J.; Van Den Berg, D.; Laird, P.W.; Siegmund, K.D. Low-level processing of Illumina Infinium DNA methylation BeadArrays. Nucleic Acids Res. 2013, 41, e90. [Google Scholar] [CrossRef]
Zhou, W.; Laird, P.W.; Shen, H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 2017, 45, e22. [Google Scholar] [CrossRef]
Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47. [Google Scholar] [CrossRef] [PubMed]
Peters, T.J.; Buckley, M.J.; Statham, A.L.; Pidsley, R.; Samaras, K.; Lord, R.V.; Clark, S.J.; Molloy, P.L. De novo identification of differentially methylated regions in the human genome. Epigenet. Chromatin 2015, 8, 6. [Google Scholar] [CrossRef]
Salas, L.A.; Koestler, D.C.; Butler, R.A.; Hansen, H.M.; Wiencke, J.K.; Kelsey, K.T.; Christensen, B.C. An optimized library for reference-based deconvolution of whole-blood biospecimens assayed using the Illumina HumanMethylationEPIC BeadArray. Genome Biol. 2018, 19, 64. [Google Scholar] [CrossRef]
Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22. [Google Scholar] [CrossRef]
Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2 ed.; Springer: New York, NY, USA, 2009. [Google Scholar] [CrossRef]
Chen, Y.; Chen, L.; Lun, A.T.L.; Baldoni, P.; Smyth, G.K. edgeR v4: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets. Nucleic Acids Res. 2025, 53, gkaf018. [Google Scholar] [CrossRef] [PubMed]
Hänzelmann, S.; Castelo, R.; Guinney, J. GSVA: gene set variation analysis for microarray and RNA-Seq data. BMC Bioinform. 2013, 14, 7. [Google Scholar] [CrossRef]
Chen, Y.; Lun, A.T.L.; Smyth, G.K. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline. F1000Research 2016, 5, 1438. [Google Scholar] [CrossRef]
Wood, S.N. Generalized Additive Models: An Introduction with R, 2 ed.; Chapman and Hall/CRC: Boca Raton, FL, 2017. [Google Scholar] [CrossRef]
Korotkevich, G.; Sukhov, V.; Sergushichev, A. Fast gene set enrichment analysis. bioRxiv 2019, 060012. [Google Scholar] [CrossRef]
Wu, T.; Hu, E.; Xu, S.; Chen, M.; Guo, P.; Dai, Z.; Feng, T.; Zhou, L.; Tang, W.; Zhan, L.; et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation 2021, 2, 100141. [Google Scholar] [CrossRef]
Phipson, B.; Maksimovic, J.; Oshlack, A. missMethyl: an R package for analyzing data from Illumina’s HumanMethylation450 platform. Bioinformatics 2016, 32, 286–288. [Google Scholar] [CrossRef]
Graßhoff, H.; Comdühr, S.; Nording, H.; von Esebeck, J.; Letz, P.; Prohaczka, M.; et al. Transition From Psoriasis to Psoriatic Arthritis Is Characterized by Distinct Alterations in Peripheral Blood Tc17, Th17, and CD4+ Effector Memory Cells. Arthritis Rheumatol. 2026, 78, 332–343. [Google Scholar] [CrossRef] [PubMed]
Kim, J.J.; Krueger, J.G. Highly Effective New Treatments for Psoriasis Target the IL-23/Type 17 T Cell Autoimmune Axis. Annu. Rev. Med. 2017, 68, 255–269. [Google Scholar] [CrossRef] [PubMed]
Zabotti, A.; De Marco, G.; Gossec, L.; Baraliakos, X.; Aletaha, D.; Iagnocco, A.; Gisondi, P.; Balint, P.V.; Bertheussen, H.; Boehncke, W.H.; et al. EULAR points to consider for the definition of clinical and imaging features suspicious for progression from psoriasis to psoriatic arthritis. Ann. Rheum. Dis. 2023, 82, 1162–1170. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Discovery of DIR and CRS in whole-blood methylation. (a) Two-dimensional DIR–CRS map in GSE200376 showing the distribution of healthy controls (HC), psoriasis without arthritis (PsO), and psoriatic arthritis (PsA) samples. (b) Group-wise DIR and CRS distributions demonstrate preferential separation of HC from PsO by DIR and of PsO from PsA by CRS. (c) Logistic modeling of normalized CRS defines a discovery-derived reference band used for downstream positioning of systemic states. Together, these data support a model in which psoriatic disease contains both a shared inflammatory axis and an additional PsA-related systemic immune-state axis.

Figure 2. Subject-level single-cell transcriptomic profiles recapitulate the systemic immune-state coordinate and position PSX between psoriasis and overt PsA at the systemic level. (a) Overall DIR–CRS map across HC, PsO, PSX, and PsA subjects in GSE194315. (b) Discovery-aligned normalized CRS values show a progressive shift from PsO toward the source-defined PSX arthralgia subgroup and PsA. (c) Among PSX subjects only, occupancy on the above-band, PsA-aligned side of the discovery-derived CRS reference band supports systemic enrichment within this symptom-defined subgroup rather than proving imminent arthritis conversion.

Figure 3. CRS localizes PSX-enriched high-CRS compartments and multicompartment pathway-state remodeling associated with the PsA-related systemic state. (a) The fraction of PSX subjects occupying high-CRS states differs by cell type, highlighting PSX-enriched high-CRS compartments such as CD8 naive T cells, NK cells, CD14 monocytes, and Treg cells. (b) Hallmark inflection mapping along the CRS continuum summarizes descriptive pathway-position patterns across multiple immune compartments. (c) A representative NK cytotoxicity curve shows pathway-state remodeling across the CRS continuum. (d) A representative Treg interferon-alpha response curve illustrates regulatory pathway-state remodeling across the same systemic CRS continuum. In panels (c,d), HC points are shown for visual context only and were not included in the pooled PsO/PSX/PsA CRS-continuum GAM fits.

Figure 4. Skin transcriptomic profiles primarily reflect shared inflammatory burden and show limited additional PsA-related separation within the same tissue state. (a) DIR and CRS distributions in GSE186063 skin RNA-seq samples show strong separation by inflammatory skin state. In this dataset, AS-H denotes healthy-appearing skin from ankylosing spondylitis comparators rather than skin from unaffected healthy volunteers. (b) Hallmark enrichment for psoriatic lesional skin versus the dataset-specific non-psoriatic comparator highlights robust inflammatory tissue activation. (c) Hallmark enrichment for PsA lesional skin versus PsO lesional skin shows that, despite minimal additional separation at the individual-gene level, ranked pathway-level differences remain detectable within the lesional state.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.