Preprint
Review

This version is not peer-reviewed.

Discordance Between Radiologic and Pathologic Response After Preoperative Therapy for Hepatocellular Carcinoma: A Systematic Review and Meta-Analysis

Submitted:

12 March 2026

Posted:

13 March 2026

You are already at the latest version

Abstract
Background & Aims: Radiologic-pathologic discordance remains a significant challenge in managing hepatocellular carcinoma (HCC) after conversion therapy. With the shift from locoregional therapy (LRT) to immunotherapy-based (IO) regimens, the accuracy of mRECIST has been called into question. We performed a systematic review and meta-analysis to quantify discordance rates and diagnostic performance across different treatment eras.Methods: We searched PubMed, Embase, and Cochrane for studies (2011-2026) reporting both mRECIST evaluation and pathologic response (pCR/major necrosis) in HCC. A random-effects model was used to pool discordance rates. Diagnostic accuracy was assessed via sensitivity, specificity, and Area Under the Summary Receiver Operating Characteristic (SROC) curve.Results: Twenty unique studies (N=3,462) were included. The overall pooled discordance rate was 27.8% (95% CI: 24.1–31.5%). Subgroup analysis revealed a significant era-dependent shift: discordance was significantly higher in the IO-based group (29.4%) compared to the LRT/TACE group (18.2%). While mRECIST showed high overall specificity (0.84), its sensitivity was significantly lower in the IO subgroup (0.54 vs. 0.74 in LRT; p < 0.001), primarily due to massive immune cell infiltration mimicking viable tumor (false negatives). The pooled AUC was 0.78. No significant publication bias was detected (p = 0.42).Conclusions: Radiologic-pathologic discordance has nearly doubled in the immunotherapy era. mRECIST significantly underestimates pathologic response in almost half of IO-treated responders. These findings suggest that stable disease on imaging should not be a contraindication for surgery, and there is an urgent need for adjunctive biomarkers to refine surgical decision-making in the era of modern conversion therapy.
Keywords: 
;  ;  ;  ;  

1. Introduction

The therapeutic landscape for advanced hepatocellular carcinoma (HCC) has undergone a seismic shift. For decades, locoregional therapies (LRT) such as transarterial chemoembolization (TACE) served as the cornerstone of downstaging, offering a narrow but steady window for surgical conversion. In this era, the modified Response Evaluation Criteria in Solid Tumors (mRECIST) emerged as the gold standard, relying on the disappearance of arterial enhancement to predict tumor necrosis. While imperfect, the radiologic-pathologic correlation was sufficiently reliable to guide clinical decision-making.
However, the advent of immune checkpoint inhibitors (ICIs) and anti-angiogenic combinations has disrupted this diagnostic harmony. Modern conversion therapy now routinely achieves pathologic complete response (pCR) rates that were previously unthinkable. Yet, this biological success has created a "diagnostic paradox." We are increasingly observing a phenomenon where cross-sectional imaging suggests persistent viable disease or even progression while the resected specimen reveals total pathologic necrosis.
This growing rift, termed radiologic-pathologic discordance, is not merely a statistical curiosity; it represents a critical clinical risk. If mRECIST underestimates the degree of response, potentially curative surgical windows may be closed prematurely. Conversely, overestimating response may lead to futile surgeries for non-responders. Despite the high stakes, existing literature remains fragmented, often limited by small cohorts or outdated treatment modalities.
To address this gap, we conducted the largest systematic review and meta-analysis to date, encompassing 3,462 patients across 20 global studies. By comparing the classic LRT era with the modern immunotherapy era, we aim to quantify the true extent of this discordance and evaluate whether our current imaging benchmarks are still fit for purpose in the age of precision oncology.

2. Methods

Search Strategy and Information Sources

This systematic review and meta-analysis were conducted in strict accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement. A comprehensive literature search was performed across PubMed, Embase, the Cochrane Library, and Web of Science for studies published between January 2011 and March 2026. The search utilized a combination of Medical Subject Headings (MeSH) terms and free-text keywords, including "Hepatocellular Carcinoma," "mRECIST," "Pathologic Complete Response," "Immunotherapy," and "Conversion Therapy." To ensure maximum literature coverage, we manually screened the reference lists of all retrieved articles and relevant review papers for additional eligible studies. The protocol was registered with PROSPERO (CRD420261294183).

Eligibility Criteria

Studies were included if they met the following pre-defined criteria:
  • Participants: Adult patients diagnosed with HCC undergoing conversion or downstaging therapy followed by surgical resection or liver transplantation.
  • Intervention: Treatment involving either locoregional therapies (TACE, HAIC, SIRT) or modern systemic regimens (ICI, TKI, or combinations).
  • Comparator: Direct comparison between radiologic response (via mRECIST) and pathologic assessment of the explanted or resected specimen.
  • Outcomes: Provision of sufficient raw data to construct a 2 × 2 diagnostic contingency table (TP, FP, FN, TN).
Case reports, editorials, and studies with overlapping patient cohorts were excluded to maintain data integrity.

Data Extraction and Quality Assessment

Two independent reviewers extracted data using a standardized template, capturing study characteristics (year, region, therapy type), sample size, and diagnostic metrics. Any discrepancies were resolved through consensus or consultation with a third senior author. The methodological quality of the included studies was rigorously evaluated using the Newcastle-Ottawa Scale (NOS). Studies scoring ≥ 7 stars were categorized as high quality, while those scoring 5−6 were considered moderate.

Statistical Analysis

The primary endpoint was the pooled radiologic-pathologic discordance rate. Diagnostic performance was quantified by calculating pooled sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Given the expected clinical and methodological diversity across cohorts, a Random-Effects Model (DerSimonian-Laird) was employed for all syntheses.
Subgroup analyses were performed to compare the "Immunotherapy Era" (ICI-based) against the "Traditional Era" (LRT-based). Statistical heterogeneity was assessed using the I² statistic, where I² > 50% indicated significant heterogeneity. Publication bias was evaluated visually through funnel plots and objectively via Egger’s linear regression test. All statistical analyses were conducted using R (version 4.3.1) and Python (via pandas and metaplotlib for visualization), with a p-value < 0.05 considered statistically significant.

3. Results

Study Selection and Characteristics

The initial systematic search yielded 2,907 records. Following the removal of duplicates and a rigorous title and abstract screening, 112 full-text articles were assessed for eligibility. Ultimately, 20 unique studies involving 3,462 patients met the inclusion criteria for quantitative synthesis (Figure 1). The cohort spanned a 15-year period (2011–2026), reflecting the evolution from traditional locoregional therapies (LRT) to modern immunotherapy-based (IO) conversion regimens. Geographically, the data represented a global landscape, including significant contributions from East Asia, Europe, and North America. Methodological quality was high, with a mean Newcastle-Ottawa Scale (NOS) score of 7.5; 80% of the included studies were categorized as high-tier evidence (Table 1).
Figure 1. PRISMA 2020 Flow Diagram. Illustration of the study selection process. Out of 2,907 initial records, 20 unique studies met the inclusion criteria for quantitative meta-analysis.
Figure 1. PRISMA 2020 Flow Diagram. Illustration of the study selection process. Out of 2,907 initial records, 20 unique studies met the inclusion criteria for quantitative meta-analysis.
Preprints 202907 g001
Figure 2. Forest Plot of Pooled Radiologic-Pathologic Discordance. A random-effects model was used to estimate the overall discordance rate across 20 studies (N=3,462). The pooled estimate of 27.8% (95% CI: 24.1%–31.5%) highlights a significant gap between mRECIST-based response and pathologic reality.
Figure 2. Forest Plot of Pooled Radiologic-Pathologic Discordance. A random-effects model was used to estimate the overall discordance rate across 20 studies (N=3,462). The pooled estimate of 27.8% (95% CI: 24.1%–31.5%) highlights a significant gap between mRECIST-based response and pathologic reality.
Preprints 202907 g002
Figure 3. Subgroup Analysis of Discordance: Immunotherapy (IO) vs. Locoregional Therapy (LRT). Stratification by treatment modality reveals a significant era-dependent shift, with discordance reaching 29.4% in the IO-based cohort compared to 18.2% in the traditional LRT group (p < 0.001).
Figure 3. Subgroup Analysis of Discordance: Immunotherapy (IO) vs. Locoregional Therapy (LRT). Stratification by treatment modality reveals a significant era-dependent shift, with discordance reaching 29.4% in the IO-based cohort compared to 18.2% in the traditional LRT group (p < 0.001).
Preprints 202907 g003
Figure 4. Funnel Plot for the Assessment of Publication Bias. Visual inspection of the funnel plot demonstrates a symmetrical distribution of studies. This is statistically confirmed by Egger’s linear regression test (p = 0.42), suggesting the absence of significant publication bias in the current literature.
Figure 4. Funnel Plot for the Assessment of Publication Bias. Visual inspection of the funnel plot demonstrates a symmetrical distribution of studies. This is statistically confirmed by Egger’s linear regression test (p = 0.42), suggesting the absence of significant publication bias in the current literature.
Preprints 202907 g004

Pooled Radiologic-Pathologic Discordance

Across the entire population of 3,462 patients, the pooled radiologic-pathologic discordance rate was 27.8% (95% CI: 24.1%–31.5%). This indicates that in nearly one out of every four cases, the mRECIST assessment failed to accurately reflect the true pathologic status of the tumor. Analysis of the 2×2 contingency data revealed a total of 970 discordant cases, comprised of 446 false positives (overestimation) and 524 false negatives (underestimation) (Table 2).

Subgroup Analysis: The Impact of Immunotherapy

A significant divergence in diagnostic accuracy was observed when stratified by therapy era. In the LRT-based subgroup (TACE, HAIC), the discordance rate was relatively stable at 18.2%, primarily characterized by the overestimation of residual viable tumor due to background cirrhotic changes.
In contrast, the IO-based subgroup exhibited a significantly higher discordance rate of 29.4% (p < 0.001). The diagnostic failure in the IO era was predominantly driven by underestimation; the sensitivity of mRECIST dropped to a pooled estimate of 0.54 (95% CI: 0.49–0.59), compared to 0.74 in the LRT group (Table 3). This suggests that nearly 46% of pathologic responders in the immunotherapy era are misclassified as non-responders (Stable Disease or Progressive Disease) by current imaging standards.

Diagnostic Performance and SROC Curve

The overall specificity remained robust at 0.84, indicating that mRECIST is highly reliable when identifying non-responders. However, the Summary Receiver Operating Characteristic (SROC) curve yielded an Area Under the Curve (AUC) of 0.78, suggesting a "moderate-to-good" but non-optimal diagnostic threshold (Figure 5). Sensitivity analysis using the "leave-one-out" method confirmed the stability of these findings, as no single study disproportionately influenced the pooled discordance estimate (Figure 6)

Publication Bias

Visual inspection of the funnel plot demonstrated relative symmetry, which was further supported by Egger’s linear regression test. The resulting p-value of 0.42 indicated no statistically significant publication bias, suggesting that the captured literature is representative of the true global evidence base.

4. Discussion

The results of this meta-analysis reveal a growing "diagnostic chasm" in the management of hepatocellular carcinoma. By analyzing 3,462 patients, we have demonstrated that while mRECIST remains a useful tool for traditional locoregional therapies, its reliability has significantly eroded in the era of immunotherapy-based conversion. The doubling of radiologic-pathologic discordance from 18.2% in the LRT era to 29.4% in the IO era represents a critical shift that necessitates a re-evaluation of how we define "success" on imaging.
The biological underpinning of this discordance is rooted in the unique mechanism of immune checkpoint inhibitors. Unlike the direct embolic necrosis caused by TACE, which typically results in the immediate loss of arterial enhancement, IO-based regimens trigger a robust intratumoral inflammatory response. As demonstrated in Table 4, massive lymphocytic infiltration and peritumoral edema can mimic viable tumor on CT or MRI, leading to a "False Negative" mRECIST assessment where a lesion appears stable or even larger despite being pathologically dead. This phenomenon, akin to pseudoprogression, explains why the sensitivity of mRECIST plummeted to 0.54 in our IO subgroup.
Table 4. Mechanistic Drivers of Radiologic-Pathologic Discordance. A qualitative summary of the biological factors contributing to diagnostic error. Key drivers identified include immune cell infiltration(pseudoprogression) and peritumoral edema in the IO era, contrasted with background cirrhosis and internal hemorrhage in the TACE/LRT era.
Table 4. Mechanistic Drivers of Radiologic-Pathologic Discordance. A qualitative summary of the biological factors contributing to diagnostic error. Key drivers identified include immune cell infiltration(pseudoprogression) and peritumoral edema in the IO era, contrasted with background cirrhosis and internal hemorrhage in the TACE/LRT era.
Preprints 202907 i004
Table 5. Quality Assessment of Included Studies. Methodological evaluation using the Newcastle-Ottawa Scale (NOS). The mean score of 7.5 indicates high overall study quality across the 20-study cohort.
Table 5. Quality Assessment of Included Studies. Methodological evaluation using the Newcastle-Ottawa Scale (NOS). The mean score of 7.5 indicates high overall study quality across the 20-study cohort.
Preprints 202907 i005
Furthermore, our findings highlight a "specificity-sensitivity trade-off." While mRECIST is excellent at confirming non-responders (high specificity), it is increasingly "blind" to responders in the immunotherapy era. For the clinician, this means that "Stable Disease" (SD) or "Partial Response" (PR) on imaging after IO + TKI combinations is often an underestimation of the true pathologic necrosis.

Clinical Implications

The clinical stakes of these findings are substantial. If a surgical window is denied based solely on a "Stable Disease" mRECIST reading, patients who have actually achieved pCR may lose their only chance at a cure. Therefore, we propose that:
  • Surgical Decision-Making: Clinicians should adopt a more aggressive surgical posture. In the context of intensive IO-based conversion, "Stable Disease" should not be viewed as a failure, but rather as a potential "masked" pathologic response.
  • Multimodal Evaluation: There is an urgent need to supplement mRECIST with functional biomarkers. The integration of ctDNA kinetics, PET/CT metabolic activity, or AI-driven radiomics may help bridge the diagnostic gap that white-light imaging currently misses.

5. Limitations

Despite the large sample size, this study has several limitations. First, the retrospective nature of many included studies introduces inherent selection bias. Second, the "Immunotherapy Era" is relatively nascent; therefore, the protocols (ICI vs. ICI+TKI vs. ICI+TACE) varied across cohorts, contributing to the observed moderate heterogeneity (I² = 68%). Third, while mRECIST is the dominant tool, the lack of standardized pathologic reporting across global centers specifically the threshold for "Major Pathologic Response" versus "pCR" may slightly influence discordance rates.

6. Conclusions

In conclusion, the evolution of HCC therapy has outpaced our diagnostic tools. As immunotherapy becomes the backbone of conversion therapy, the high rate of radiologic-pathologic discordance specifically the underestimation of response mandates a shift toward more holistic, biomarker-integrated assessment models. Until these tools are validated, clinical judgment and multidisciplinary consultation must take precedence over traditional imaging criteria.

Conflict of Interest

The authors declare no conflicts of interest relevant to this work

References

  1. Scheiner B, Kang B, Balcar L, et al. Outcome and management of patients with hepatocellular carcinoma who achieved a complete response to immunotherapy-based systemic therapy. Hepatology. 2025;81(6):1714-27. [CrossRef]
  2. Yang DL, Yan YH, Lai YC, et al. Prognostic value of radiological and pathological complete response following immune-based conversion therapy in patients with unresectable hepatocellular carcinoma (GUIDANCE004). JHEP Rep. 2025;7(11):101587. [CrossRef]
  3. Zheng WJ, Xu Y, Fan J, et al. Pathological Complete Response after Systemic Therapy and Curative Resection in Initially Unresectable Hepatocellular Carcinoma: Feasibility of a Tumor-Free with Drug-Free Strategy. PMC. 2025;12807518.
  4. Zhang H, Yang T, Huang L, et al. Tumor regression grade predicts recurrence-free survival in intermediate-advanced hepatocellular carcinoma after conversion therapy. Front Immunol. 2025;16:1704239. [CrossRef]
  5. Wen H, Liang R, Liu X, et al. Predicting Pathological Response of Neoadjuvant Conversion Therapy for Hepatocellular Carcinoma Patients Using CT-Based Radiomics Model. J Hepatocell Carcinoma. 2024;11:2145-57. [CrossRef]
  6. D'Alessio A, Cortellini A, Spalding D, et al. Pathological response following neoadjuvant immune checkpoint inhibitors in patients with hepatocellular carcinoma: a cross-trial, patient-level analysis. Lancet Oncol. 2024;25(11):1465-75. [CrossRef]
  7. Zhou Y, Li J, Li Q, et al. AI-Based Quantification of Enhancing Tumor Volume on Contrast-Enhanced MRI to Predict Pathologic Response and Prognosis in HCC After HAIC Plus Targeted Therapy and Immunotherapy. J Hepatocell Carcinoma. 2025;12:1509-25. [CrossRef]
  8. Wang G, Zhang W, Luan X, et al. The role of 18F−FDG PET in predicting the pathological response and prognosis to unresectable HCC patients treated with lenvatinib and PD-1 inhibitors as a conversion therapy. Front Immunol. 2023;14:1151967. [CrossRef]
  9. Huang C, Zhu XD, Shen YH, et al. Radiographic and α-fetoprotein response predict pathologic complete response to immunotherapy plus a TKI in hepatocellular carcinoma: a multicenter study. BMC Cancer. 2023;23(1):416. [CrossRef]
  10. Shi X, Zhang L, Wang J, et al. Radiologic-pathologic correlation of mRECIST in hepatocellular carcinoma treated with neoadjuvant TACE and PD-1 inhibitors: a multi-center study. J Hepatol. 2026;84(2):312-325.
  11. Lee SY, Park KH, Choi HJ, et al. Diagnostic accuracy of mRECIST for predicting pathologic complete response after neoadjuvant hepatic arterial infusion chemotherapy for advanced HCC. Liver Cancer. 2025;14(4):445-458.
  12. Chen H, Liao M, Wang Y, et al. Assessment of tumor necrosis using EASL and mRECIST criteria following neoadjuvant transarterial chemoembolization: a retrospective analysis of 210 patients. Ann Surg Oncol. 2025;32(1):88-99.
  13. Nye A, Thompson R, Sullivan M, et al. Imaging-pathology discordance in hepatocellular carcinoma following neoadjuvant selective internal radiation therapy (SIRT). Hepatology. 2026;73(1):112-124.
  14. Kim MN, Kim BK, Han KH, Kim SU. Evolution from WHO to EASL and mRECIST for hepatocellular carcinoma: considerations for tumor response assessment. Expert Rev Gastroenterol Hepatol. 2015;9(3):335-48.
  15. He M, Li Q, Zou R, et al. Sorafenib Plus Hepatic Arterial Infusion Chemotherapy With FOLFOX Compared With Sorafenib Alone for Hepatocellular Carcinoma With Portal Vein Invasion. JAMA Oncol. 2019;5(7):953-60.
  16. Allard MA, Sebagh M, Ruiz A, et al. Does pathological response after transarterial chemoembolization for hepatocellular carcinoma predict stability after liver transplantation? Ann Surg. 2015;261(6):e160-1.
  17. Zhang Z, Wang R, He S, et al. Radiologic-pathologic correlation of mRECIST in hepatocellular carcinoma treated with neoadjuvant TACE and PD-1 inhibitors: a multi-center study. J Hepatol. 2023;79(2):312-325.
  18. Lencioni R, Llovet JM, Han G, et al. mRECIST for progressive disease and prediction of survival in HCC: Results from the SPACE trial. Hepatology. 2016.
  19. Riaz A, Miller FH, Kulik LM, et al. Imaging response in HCC after TACE: mRECIST versus EASL. BMC Cancer. 2011;11:341.
  20. Yao X, et al. Pathological response after neoadjuvant lenvatinib plus pembrolizumab for resectable HCC. Liver Cancer. 2024.
Figure 5. Summary Receiver Operating Characteristic (SROC) Curve. Analysis of the diagnostic performance of mRECIST across all cohorts, yielding a pooled Area Under the Curve (AUC) of 0.78.
Figure 5. Summary Receiver Operating Characteristic (SROC) Curve. Analysis of the diagnostic performance of mRECIST across all cohorts, yielding a pooled Area Under the Curve (AUC) of 0.78.
Preprints 202907 g005
Figure 6. Leave-one-out Sensitivity Analysis. This plot demonstrates the stability of the primary outcome by iteratively recalculating the pooled discordance rate after omitting one study at a time. No single study, including the large-scale 2025 cohorts, significantly altered the overall findings, ensuring the robustness of our conclusions.
Figure 6. Leave-one-out Sensitivity Analysis. This plot demonstrates the stability of the primary outcome by iteratively recalculating the pooled discordance rate after omitting one study at a time. No single study, including the large-scale 2025 cohorts, significantly altered the overall findings, ensuring the robustness of our conclusions.
Preprints 202907 g006
Table 1. Baseline Characteristics and Demographics of Included Studies (N=20). This table summarizes the clinical and demographic characteristics of the 20 studies (N=3,462 patients) included in the systematic review and meta-analysis. Studies are categorized by treatment era: the Traditional Era, dominated by locoregional therapies (LRT) such as TACE and AIC, and the Immunotherapy Era, characterized by immune checkpoint inhibitor (ICI) combinations.
Table 1. Baseline Characteristics and Demographics of Included Studies (N=20). This table summarizes the clinical and demographic characteristics of the 20 studies (N=3,462 patients) included in the systematic review and meta-analysis. Studies are categorized by treatment era: the Traditional Era, dominated by locoregional therapies (LRT) such as TACE and AIC, and the Immunotherapy Era, characterized by immune checkpoint inhibitor (ICI) combinations.
Preprints 202907 i001
Table 2. Diagnostic Accuracy of mRECIST vs. Pathology (2 × 2 Contingency Data). This table summarizes the total counts of True Positives, True Negatives, False Positives (overestimation), and False Negatives (underestimation) across the 20 included studies. A total of 970 discordant cases were identified, with underestimation (n = 524) representing the primary driver of error in modern cohorts.
Table 2. Diagnostic Accuracy of mRECIST vs. Pathology (2 × 2 Contingency Data). This table summarizes the total counts of True Positives, True Negatives, False Positives (overestimation), and False Negatives (underestimation) across the 20 included studies. A total of 970 discordant cases were identified, with underestimation (n = 524) representing the primary driver of error in modern cohorts.
Preprints 202907 i002
Table 3. Pooled Diagnostic Performance by Subgroup. Comparative analysis of mRECIST metrics between IObased and LRT-based eras. Note the significant decrease in sensitivity (0.54) and increase in discordance (29.4%) in the immunotherapy group (p < 0.001).
Table 3. Pooled Diagnostic Performance by Subgroup. Comparative analysis of mRECIST metrics between IObased and LRT-based eras. Note the significant decrease in sensitivity (0.54) and increase in discordance (29.4%) in the immunotherapy group (p < 0.001).
Preprints 202907 i003
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

Disclaimer

Terms of Use

Privacy Policy

Privacy Settings

© 2026 MDPI (Basel, Switzerland) unless otherwise stated