Submitted: 08 June 2026
Posted: 09 June 2026
Abstract
Keywords:
1. Introduction
1.1. What Is Industrial Anomaly Detection?
1.2. The Five Generalization Mechanisms
1.3. Why This Article
1.4. Comparison With Prior IAD Surveys
| | | Topical axes of the wave | | | | | | | Reciprocal axes | |
| Prior work | Year | Closed-set 2D | FM era (CLIP/VFM) | MLLM-AD | 3D / multimodal | Synthesis | Open-world bench | Eval audit | Earlier-era method depth | Cross-domain (med./aerial) |
| Liu et al. [2] | 2024 | • | ∘ | — | ∘ | ∘ | — | • | • | — |
| Xie et al. (IM-IAD)† [3] | 2024 | • | ∘ | ∘ | — | ∘ | — | • | • | — |
| Cao et al. [4] | 2024 | • | • | ∘ | • | ∘ | — | ∘ | • | — |
| Liang et al. [18] | 2025 | — | ∘ | ∘ | • | ∘ | — | — | ∘ | — |
| Lin et al. [17] | 2025 | • | • | • | • | • | — | • | • | — |
| Wang et al. (IAS) [19] | 2025 | — | — | ∘ | — | • | — | — | — | — |
| Cheng et al. (RWIDD) [22] | 2026 | • | ∘ | ∘ | • | ∘ | ∘ | — | • | — |
| This review | 2026 | ∘ | • | • | • | • | • | • | ∘ | — |
1.5. Scope And Corpus
1.6. Roadmap
| Section | Mechanism / topic | Anchor methods | Replaces what | Central tension |
| Section 2 | Established baseline | PatchCore, EfficientAD, DRAEM | — | MVTec-AD saturated |
| Section 3 | M1 Visual priors | CLIP, DINOv2, AnomalyVFM, MoECLIP | WinCLIP CLIP-adapter saturation | Saturation band crossed only by architectural change |
| Section 4 | M2 Reasoning priors | IAD-R1, JUDO, AD-FM, EAGLE | AnomalyGPT SFT-only template | MMAD gains regress binary AUC at deployment FPR |
| Section 5 | M3 Geometric / multimodal | UniMMAD, GS-CLIP, SiM3D | M3DM/CPMF memory-bank monoculture | MVTec-3D saturated; CAD-to-production gap unresolved |
| Section 6 | M4 Universal / MoE | AnomalyMoE, AdaptCLIP, UniSpector | UniAD per-dataset | Cross-domain transfer wall (UniSpector AP50 69.1%→14.1%) |
| Section 7 | M5 Synthesis priors | ARAS, QARAD, MAGIC, FAST | AnomalyDiffusion per-category | No tripartite synthesis benchmark adopted |
| Section 8 | Evaluation frontier | Kaputt, MMR-AD, ASBench | MVTec/VisA single-tier evaluation | Tier-1 benchmarks wide open |
| Section 9 | Cross-cutting bottlenecks | — (8 gaps catalogued) | — | No paper addresses >1 bottleneck |
| Section 10 | Agenda + reforms | — | — | Evidence needed for strong-version paradigm shift |
2. The Established Baseline Being Replaced
2.1. Established Method Lines and Their Saturation
The closed-set, normal-only era and its saturation.
The first multi-class wave: UniAD and one-model-per-dataset.
The first foundation-model bridge: WinCLIP descendants and their saturation.
The first MLLM bridge: AnomalyGPT and MMAD.
Multimodal and 3D foundations.
Anomaly synthesis foundations.
2.2. The Established Benchmark Ecosystem
- MVTec-AD [1]—15 categories, saturated above 99% I-AUROC since 2022.
- VisA [67]—12 categories, harder per-pixel metrics, still active.
- Real-IAD [70]—30 categories, ∼150K images, five viewpoints per part.
- MMAD [44]—the MLLM-AD VQA leaderboard.
- IM-IAD [16]—the deployment-realistic protocol (FPR-bound recall).
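The FPR-bound recall named in the IM-IAD entry can be illustrated with a minimal sketch. The function name `recall_at_fpr`, the default 1% bound, and the threshold sweep are assumptions made for illustration; this is not the IM-IAD reference implementation.

```python
import numpy as np

def recall_at_fpr(scores, labels, max_fpr=0.01):
    """Highest recall achievable while the false-positive rate stays <= max_fpr.

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for anomalous samples, 0 for normal samples.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    normal = scores[labels == 0]
    anomalous = scores[labels == 1]
    if normal.size == 0 or anomalous.size == 0:
        raise ValueError("need at least one normal and one anomalous sample")

    best = 0.0
    # Candidate thresholds: every observed score, plus +inf (flag nothing).
    for thr in np.append(np.unique(scores), np.inf):
        fpr = np.mean(normal >= thr)          # share of normals flagged
        if fpr <= max_fpr:
            recall = float(np.mean(anomalous >= thr))
            best = max(best, recall)
    return best

# Toy scores (hypothetical): four normal parts, three defective parts.
toy_scores = [0.1, 0.2, 0.3, 0.9, 0.5, 0.8, 0.95]
toy_labels = [0, 0, 0, 0, 1, 1, 1]
print(recall_at_fpr(toy_scores, toy_labels, max_fpr=0.0))
print(recall_at_fpr(toy_scores, toy_labels, max_fpr=0.25))
```

Unlike AUROC, which averages over all operating points, this quantity reports only what a line operator would see once the false-alarm budget is fixed, so a single high-scoring normal sample can sharply reduce it.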
3. Generalization Mechanism 1—Pretrained Visual Priors
3.1. The CLIP Prompt-Engineering Line and Its Saturation
3.2. Architectural Responses
The pure-VFM wave (text-free).
Retrieval-augmented and memory-driven.
Training-free, non-conformity, and geometric variants.
3.3. Cross-Modal Bridges
LMM-assisted zero-shot segmentation (bridge to Section 4).
3D-meets-VFM (bridge to Section 5).
3.4. Honest Assessment
4. Generalization Mechanism 2—Language and Reasoning Priors
4.1. Capability Axes: Detection versus Reasoning
| Method | Venue | Base model | Paradigm | MMAD (1-shot) | Binary detection |
| AnomalyGPT | AAAI 2024 | LLaVA-7B | SFT | 36.52% | ∼60% acc. |
| AnomalyR1 | preprint 2025 | Qwen2-VL-3B | SFT+GRPO | 76.96% | ∼70% acc. |
| IAD-R1 | AAAI 2026 | LLaVA-OV-7B | PA-SFT+SC-GRPO | — | 86.1% acc. |
| JUDO | ICLR 2026 | Qwen2.5-VL-7B | SFT×2+GRPO | 81.20% | 65.04% |
| AD-FM | AAAI 2026 | Qwen2.5-VL-7B | GRPO (GIoU) | 83.56% | 73.15% |
| ADSeeker | CVPR 2026 | Qwen2.5-VL-7B | RAG (no RL) | 69.90% | ∼74% |
| Reason-IAD | preprint | Qwen3-VL-8B | Training-free | 79.43% | — |
| EAGLE | preprint 2026 | PatchCore+MLLM | Tuning-free | not reported | expert-aug. |
| Triad | ICCV 2025 | LLaVA-OV-7B | SFT | — | 94.1% |
| SAGE | ACM MM 2025 | InternVL2 | SFT+DPO | not reported | MANTA/MPDD |
| AD-Copilot | preprint | Qwen2.5-VL-7B | 4-stage+RLVR | 82.3% | 3.35× BBox |
| MAU-GPT | AAAI 2026 | ∼4B LLaVA-style | SFT (AMoE-LoRA) | 61.41% | — |
| ReADL | CVPR 2026 | unspecified | GRPO+CGRO | — | competitive |
4.2. Three Replacement Mechanisms
Training-paradigm shift: SFT to GRPO/RL.
Knowledge-grounded inference.
Visual comparison: the new architectural frontier.
4.3. Benchmarks: From MMAD to the Second Generation
4.4. Failure Modes
4.5. Honest Assessment
5. Generalization Mechanism 3—Geometric and Multimodal Priors
| Method | Venue | Modality | Benchmark(s) | Key metric | Notes |
| M3DM [50] | CVPR 2023 | RGB-D | MVTec-3D-AD | 94.0% O-AUROC | Foundation baseline |
| CPMF [53] | PR 2024 | RGB-D | MVTec-3D-AD | >95% O-AUROC | Per-modality scoring |
| PointAD [56] | NeurIPS 2024 | Point cloud | Real3D-AD, AS | Zero-shot 3D AD | PLM zero-shot scoring |
| G2SF [109] | ICCV 2025 | RGB-D | MVTec-3D-AD, Eyecandies | SOTA AUPRO@1% | Anisotropic memory metric |
| PASDF [110] | ICCV 2025 | Point cloud | Real3D-AD, AS | 80.2% O-AUROC | First detect-plus-repair via continuous SDF |
| SiM3D [59] | ICCV 2025 | RGB+PC, multi-view | SiM3D | Baselines fail synth2real | Single-instance; CAD-to-real benchmark |
| Reg2Inv [111] | NeurIPS 2025 | Point cloud | Real3D-AD, AS | SOTA on both | Registration-as-learning |
| CASL [112] | AAAI 2026 | Point cloud | Real3D-AD, AS | +5.6% O-AUROC vs. SSL | Multi-scale curvature beats PointMAE |
| MiniShift [58] | AAAI 2026 | Point cloud | MiniShift, Real3D-AD, AS | 80.4/92.3% O-AUROC | 500k-pt, <1% anomaly cov.; deep nets near-random |
| U-MV [113] | AAAI 2026 | Multi-view RGB | Real-IAD, MANTA | SOTA on both | Homography alignment; pose-free; 2D only |
| HPF-APC [114] | CVPR 2026 | Point cloud | Real3D-AD, AS | 84.2% O-AUROC | Patch codebook; angular/planar deformation |
| SeDiR [115] | CVPR 2026 | Point cloud | Real3D-AD, AS | +2.8/+9.1% vs. MC3D-AD | Multi-class; category-entanglement fix |
| UniMMAD [60] | CVPR 2026 | 12 modalities | 9 datasets, 66 classes | SOTA on all 9 | Cross-MoE decoder; anchor paper |
| IB-IUMAD [116] | CVPR 2026 | RGB-D | MVTec-3D-AD, Eyecandies | 91.0% O-AUROC | Info-bottleneck fusion; incremental |
| PIRN [117] | ICLR 2026 | RGB-D | MVTec-3D-AD, Eyecandies | Best 10-shot Eyecandies | Prototype codebook; few-shot multimodal |
| IMDD-1M [10] | CVPR 2026 | RGB+text | IMDD-1M | <5% data ≈ expert | 1.24M image-text pairs; 2D only |
| BTP [94] | CVPR 2026 | PC+CLIP 2D proj. | Real3D-AD, AS | Zero-shot; >PointAD pt-level | Cross-tag Section 3.3 |
| GS-CLIP [95] | CVPR 2026 | PC+CLIP 2D proj. | Real3D-AD, AS | Zero-shot SOTA | Geometry-aware prompts; cf. Section 3.3 |
5.1. The MVTec-3D-AD Saturation Problem
5.2. Stronger Geometric and Multimodal Representations
Rotation-invariance and geometric-feature rethinking.
RGB-D fusion quality.
5.3. Generalist, Language-Grounded, and Zero-Shot 3D
Multi-class unified 3D and multimodal AD.
Language-grounded industrial defects.
3D-meets-VFM revisited.
5.4. Open Problems
5.5. Honest Assessment
6. Generalization Mechanism 4—Universal Task Priors
6.1. The Arc From "One Model Per Class" To "One Model Per Domain"
6.2. Routing and Adapter Generalists
MoE as the convergent design pattern.
CLIP-adapter universal detectors.
6.3. Reference- and Prototype-Based Generalists
Language-free visual generalists.
Meta-learning with richer reference sets.
6.4. Multi-Class Unified and Open-Set Tracks
Multi-class unified track.
Open-set defect recognition.
6.5. Honest Assessment
7. Generalization Mechanism 5—Generated Abnormal Priors
7.1. Break from the AnomalyDiffusion Era: Five Simultaneous Dimensions
7.2. Method Families
Few-shot fine-tuning: MAGIC and SeaS.
Training-free one-shot: O2MAG and TF-IDG.
Zero-shot VLM-grounded: AnomalyPainter, Anomagic, AnoStyler.
Architecture alternatives: ARAS/QARAD and SynSur.
7.3. Evaluating Synthesis
Synthesis-as-benchmark: ASBench.
Reality check: what happens off MVTec-AD.
7.4. Honest Assessment
8. Evaluation Frontier—What Counts as a Real Advance?
8.1. Background: Established Benchmarks Are Saturated Or Constrained
8.2. The New Benchmark Catalog
| Failure mode exposed | Recommended benchmarks | Primary task | What is hard |
| Open-world product diversity | Kaputt [8]; Real-IAD-Variety [156]; Real-IAD D3 [157]; 3CAD [158]; PKU-GoodsAD [159]; MANTA [160] | 2D, RGB-D, MV, Text | 48k items, arbitrary poses, ≤3 refs; viewpoint+material variation, tiny objects, real 3C parts |
| Multi-domain breadth, “one model many domains” | ADNet (380 categories); MMR-AD [9]; Omni-AD [161]; M3-AD [48]; MMAD [44] | 2D, VQA | 380+ categories across 8+ domains; reflection-aware multi-dimensional |
| High-resolution / 3D realism | MiniShift [58]; 3D-ADAM [162]; IEC3D-AD [163]; SiM3D [59]; MulSen-AD [164]; Real-IAD D3 [157] | 3D / RGB-D | 500k pts, <1% anomaly cov.; synth-to-real (CAD→scan); single-instance multiview |
| View-illumination interplay / harder 2D realism | M2AD [165]; MVTec-AD 2 [166]; VISION Datasets [167]; Texture-AD [168]; HSS-IAD [169] | 2D | illumination drift, harder logical anomalies, heterogeneous same-sort |
| Domain-specific specialized inspection | CPS2D-AD [170] (IC substrates); WFDD [171] (fabrics); InsPLAD [172] (power lines); CableInspect-AD [173]; CrashCar101; PeanutAD; CID; CXR-AD; RAD | 2D | micrometer-scale defects; specialized appearance and supervision |
| MLLM / VQA / reasoning evaluation | MMAD [44]; MMR-AD [9]; M3-AD [48]; MAU-Set [98]; Chat-AD [49]; Anomaly-OV [42] | VQA | multiple-choice and free-form defect QA; reasoning consistency |
| Open-vocabulary defect understanding | IMDD-1M [10] (1.24M pairs, 63 domains); UniSpector [142]; MAU-Set [98] | 2D, Text | language-grounded defect classes; cross-domain open-set |
| Synthesis evaluation (decoupled from detection) | ASBench [11]; Defect Spectrum [174] | 2D | synthesis quality vs. downstream detection delta |
| Long-tail / online / deployment dynamics | LTOAD [175] (eight streaming configs) | 2D | long-tailed online updates; class-agnostic concepts |
- MVTec-AD saturation (Section 8.1): PatchCore’s 99.1% I-AUROC (2022) is within 0.5 pp of the best 2026 numbers, inside the test-set noise floor [12,127,135,136].
- Open-world ceiling: Kaputt best unsupervised AUROC 56.96% under the prescribed ≤3-reference protocol over 48,376 unique items—no method clears 60% [8].
- Cross-domain transfer wall: UniSpector InsA drops from 69.1% to 14.1% AP50 between Real-IAD and 3CAD—a five-fold drop in the same model across imaging conditions [142].
- Class-count scaling wall: CCL drops from 90.6% I-AUROC in the all-in-all setting (MVTec+VisA+BTAD) to 65.2% on the COCO-derived COCOAD ()—a sharp bend rather than gradual decay [135].
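The size of the two drops quoted in the bullets above can be checked with a few lines of arithmetic. This is a sketch using only the figures stated in the text; the helper name `drop` is illustrative.

```python
def drop(before, after):
    """Return (absolute drop in percentage points, before/after ratio)."""
    return before - after, before / after

# UniSpector AP50, Real-IAD -> 3CAD (figures from the text).
unispector_pp, unispector_ratio = drop(69.1, 14.1)
# CCL I-AUROC, all-in-all setting -> COCOAD (figures from the text).
ccl_pp, ccl_ratio = drop(90.6, 65.2)

print(f"UniSpector: -{unispector_pp:.1f} pp, ratio {unispector_ratio:.2f}x")
print(f"CCL:        -{ccl_pp:.1f} pp, ratio {ccl_ratio:.2f}x")
```

The UniSpector ratio comes out at roughly 4.9, which the text rounds to a five-fold drop; the CCL drop is 25.4 percentage points.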
8.3. A Three-Tier Evaluation Recommendation
Protocols and threshold rationale
8.4. Summary: Evaluation Is The Bottleneck
9. Cross-Cutting Bottlenecks and Failure Modes
9.1. The Cross-Domain Generalization Gap
9.2. The Pose-And-Identity-Variation Problem
9.3. Temporal Non-Stationarity
9.4. The Detection-Vs-Reasoning Trade-Off In MLLM-AD
9.5. The MLLM Grounding Gap
9.6. The Generation–Detection Evaluation Gap
9.7. The CAD-to-production Gap (3D)
9.8. The Medical-Industrial Transfer Gap
9.9. Failure-Mode Summary Matrix
| Bottleneck | M1 Visual | M2 Reasoning | M3 Geometric | M4 Universal | M5 Synthesis |
| Cross-domain generalization (Section 9.1) | × | × | × | × | × |
| Pose-and-identity (Section 9.2) | × | × | — | × | — |
| Temporal non-stationarity (Section 9.3) | × | × | × | × | × |
| Detection-vs-reasoning (Section 9.4) | — | × | — | (×) | (×) |
| MLLM grounding (Section 9.5) | — | × | — | — | — |
| Generation–detection evaluation (Section 9.6) | — | — | — | — | × |
| CAD-to-production (Section 9.7) | — | — | × | — | × |
| Medical-industrial transfer (Section 9.8) | × | × | × | × | × |
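The matrix above can be transcribed into a small data structure and tallied per mechanism. This is a sketch under one assumption: the table legend is not reproduced here, so the code only counts plain marks and parenthesised marks separately and does not interpret what the parentheses mean. The names `MATRIX`, `tally`, and `shared_by_all` are illustrative.

```python
MECHANISMS = ["M1 Visual", "M2 Reasoning", "M3 Geometric", "M4 Universal", "M5 Synthesis"]

# "x" = plain mark, "(x)" = parenthesised mark, "-" = no mark (transcribed from the matrix).
MATRIX = {
    "Cross-domain generalization":     ["x", "x", "x", "x", "x"],
    "Pose-and-identity":               ["x", "x", "-", "x", "-"],
    "Temporal non-stationarity":       ["x", "x", "x", "x", "x"],
    "Detection-vs-reasoning":          ["-", "x", "-", "(x)", "(x)"],
    "MLLM grounding":                  ["-", "x", "-", "-", "-"],
    "Generation-detection evaluation": ["-", "-", "-", "-", "x"],
    "CAD-to-production":               ["-", "-", "x", "-", "x"],
    "Medical-industrial transfer":     ["x", "x", "x", "x", "x"],
}

def tally(matrix, mechanisms):
    """Count plain and parenthesised marks in each mechanism column."""
    counts = {m: {"x": 0, "(x)": 0} for m in mechanisms}
    for marks in matrix.values():
        for mech, mark in zip(mechanisms, marks):
            if mark in counts[mech]:
                counts[mech][mark] += 1
    return counts

def shared_by_all(matrix):
    """Bottlenecks that carry a plain mark in every column."""
    return [name for name, marks in matrix.items() if all(m == "x" for m in marks)]

counts = tally(MATRIX, MECHANISMS)
universal = shared_by_all(MATRIX)
for mech in MECHANISMS:
    print(mech, counts[mech])
print("Marked for all five mechanisms:", universal)
```

Under this transcription, three rows (cross-domain generalization, temporal non-stationarity, medical-industrial transfer) are marked for all five mechanisms, and the M2 column carries the most plain marks (six).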
10. Outlook and Research Agenda
10.1. Five Concrete Problems
Problem 1: Open-world performance on Tier-1 benchmarks
Problem 2: Maintain detection AUC while gaining MLLM reasoning accuracy
Problem 3: Zero-shot 3D AD with language grounding (no 2D projection)
Problem 4: Predictive synthesis-quality metric
Problem 5: Generalist with >90% AUROC on a held-out industrial domain
10.2. Three Structural Recommendations
R1: Adopt three-tier evaluation (Section 8.3)
R2: Mandate tripartite synthesis evaluation
R3: Cross-group governance for MLLM-AD benchmarks
10.3. Adjacent Emerging Directions IAD Will Likely Absorb Next
10.4. On The "Paradigm Shift" Framing
- The shift is real at the method level for at least three of the five mechanisms: SFT→RL in Section 4; pure-VFM-replaces-CLIP-text in Section 3; the MoE convergent design pattern in Section 6; the training-free/VLM-grounded synthesis explosion in Section 7; the language-grounded industrial-defect substrate (IMDD-1M) in Section 5.
- The shift is not yet real at the deployment level: none of the five mechanisms beats classical methods (PatchCore, EfficientAD, DRAEM) at deployment-faithful operating points on hard benchmarks.
- The shift is partially circular in evaluation infrastructure (Section 4.4 MMAD/AD-Copilot overlap; Section 8.1 MVTec saturation; Section 9.6 metric mismatch).
10.5. Deployment Economics And The Certification Frontier
11. Conclusions
Disclosures.
References
- Bergmann, P.; Fauser, M.; Sattlegger, D.; Steger, C. MVTec AD – A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019; pp. 9592–9600.
- Liu, J.; Xie, G.; Wang, J.; Li, S.; Wang, C.; Zheng, F.; Jin, Y. Deep Industrial Image Anomaly Detection: A Survey. Mach. Intell. Res. 2024, 21, 104–135.
- Xie, G.; Wang, J.; Liu, J.; Lyu, J.; Liu, Y.; Wang, C.; Zheng, F.; Jin, Y. IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing. IEEE Trans. Cybern. 2024, 54, 2720–2733. Benchmark and protocol paper, not a survey.
- Cao, Y.; Xu, X.; Zhang, J.; Cheng, Y.; Huang, X.; Pang, G.; Shen, W. A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect. arXiv 2024, arXiv:2401.16402.
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning; PMLR, 2021; Vol. 139, Proceedings of Machine Learning Research; pp. 8748–8763.
- Oquab, M.; Darcet, T.; Moutakanni, T.; Vo, H.V.; Szafraniec, M.; Khalidov, V.; Fernandez, P.; Haziza, D.; Massa, F.; El-Nouby, A.; et al. DINOv2: Learning Robust Visual Features without Supervision. Transactions on Machine Learning Research 2024.
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2023; pp. 4015–4026.
- Höfer, S.; Henning, D.F.; Amiranashvili, A.; Morrison, D.; Tzes, M.; Posner, I.; Matvienko, M.; Rennola, A.; Milan, A. Kaputt: A Large-Scale Dataset for Visual Defect Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 24224–24233.
- Yao, X.; Qian, Z.; Shi, C.; Song, J.; Zhang, C. MMR-AD: A Large-Scale Multimodal Dataset for Benchmarking General Anomaly Detection with Multimodal Large Language Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 43072–43082.
- Ni, T.C.; Chen, C.C.; Yang, Y.F. Towards Open-Vocabulary Industrial Defect Understanding with a Large-Scale Multimodal Dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 13059–13068.
- Zhang, Q.; Zhang, S.; Liu, J.; Wang, J.; Lei, X.; Xie, G.; Jiang, G.; Lu, Z. ASBench: Image Anomalies Synthesis Benchmark for Anomaly Detection. IEEE Transactions on Artificial Intelligence 2026. Accepted for publication.
- Roth, K.; Pemula, L.; Zepeda, J.; Schölkopf, B.; Brox, T.; Gehler, P. Towards Total Recall in Industrial Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022; pp. 14318–14328.
- Rolih, B.; Fučka, M.; Skočaj, D. SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection. In Proceedings of the Pattern Recognition: 27th International Conference, ICPR 2024, Proceedings, Part X; Springer, 2025; Vol. 15310, Lecture Notes in Computer Science; pp. 47–65.
- Jeong, J.; Zou, Y.; Kim, T.; Zhang, D.; Ravichandran, A.; Dabeer, O. WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023; pp. 19606–19616.
- Bergmann, P.; Fauser, M.; Sattlegger, D.; Steger, C. Uninformed Students: Student-Teacher Anomaly Detection With Discriminative Latent Embeddings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020; pp. 4183–4192.
- Xie, G.; Wang, J.; Liu, J.; Lyu, J.; Liu, Y.; Wang, C.; Zheng, F.; Jin, Y. IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing. IEEE Trans. Cybern. 2024, 54, 2720–2733.
- Lin, Y.; Chang, Y.; Tong, X.; Yu, J.; Liotta, A.; Huang, G.; Song, W.; Zeng, D.; Wu, Z.; Wang, Y.; et al. A survey on RGB, 3D, and multimodal approaches for unsupervised industrial image anomaly detection. Inf. Fusion 2025, 121, 103139.
- Liang, H.; Guo, B.; Huang, Y.; Lyu, J.; Gao, C.; Cao, Y.; Wang, J.; Yu, R.; Shen, L.; Li, P. 3D Anomaly Detection: A Survey. ResearchGate preprint, 2025. Living survey accompanying M-3LAB awesome-3d-anomalydetection repo.
- Wang, Y.; Xu, X.; Liu, J.; Lei, X.; Xie, G.; Jiang, G.; Lu, Z. A Survey on Industrial Anomalies Synthesis. arXiv 2025.
- Lu, X.; Liu, H.; Shang, F.; Hui, Y.; Wan, L. PDD: Manifold-Prior Diverse Distillation for Medical Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 28534–28544.
- Li, H.; Zhuang, Z.; Lin, J.; Liu, Y.; Chen, Y.; Peng, Q.; Yu, L.; Wang, L. FDP: A Frequency-Decomposition Preprocessing Pipeline for Unsupervised Anomaly Detection in Brain MRI. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2026; Vol. 40, pp. 6118–6126.
- Cheng, Y.; Cao, Y.; Yao, H.; Luo, W.; Jiang, C.; Zhang, H.; Shen, W. A comprehensive survey for real-world industrial surface defect detection: Challenges, approaches, and prospects. J. Manuf. Syst. 2026, 84, 152–172.
- Bachem, O.; Lucic, M.; Krause, A. Practical Coreset Constructions for Machine Learning. arXiv 2017, arXiv:stat.
- Deng, H.; Li, X. Anomaly Detection via Reverse Distillation From One-Class Embedding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022; pp. 9737–9746.
- Batzner, K.; Heckler, L.; König, R. EfficientAD: Accurate Visual Anomaly Detection at Millisecond-Level Latencies. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), January 2024; pp. 127–137.
- Zavrtanik, V.; Kristan, M.; Skočaj, D. DRAEM - A Discriminatively Trained Reconstruction Embedding for Surface Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2021; pp. 8330–8339.
- Zavrtanik, V.; Kristan, M.; Skočaj, D. DSR - A Dual Subspace Re-Projection Network for Surface Anomaly Detection. In Proceedings of the Computer Vision – ECCV 2022; Springer, 2022; Vol. 13691, Lecture Notes in Computer Science; pp. 539–554.
- Liu, Z.; Zhou, Y.; Xu, Y.; Wang, Z. SimpleNet: A Simple Network for Image Anomaly Detection and Localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2023; pp. 20402–20411.
- You, Z.; Cui, L.; Shen, Y.; Yang, K.; Lu, X.; Zheng, Y.; Le, X. A Unified Model for Multi-class Anomaly Detection. Proc. Adv. Neural Inf. Process. Syst. 2022, 35, 4571–4584.
- Lu, R.; Wu, Y.; Tian, L.; Wang, D.; Chen, B.; Liu, X.; Hu, R. Hierarchical Vector Quantized Transformer for Multi-class Unsupervised Anomaly Detection. In Proceedings of the Advances in Neural Information Processing Systems, 2023; Vol. 36.
- Gao, B.B. Learning to Detect Multi-class Anomalies with Just One Normal Image Prompt. In Proceedings of the Computer Vision – ECCV 2024; Springer, 2024; pp. 454–470.
- Gao, B.B. MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning. In Proceedings of the Advances in Neural Information Processing Systems, 2024; Vol. 37.
- Li, X.; Zhang, Z.; Tan, X.; Chen, C.; Qu, Y.; Xie, Y.; Ma, L. PromptAD: Learning Prompts with only Normal Samples for Few-Shot Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024; pp. 16838–16848.
- Zhou, Q.; Pang, G.; Tian, Y.; He, S.; Chen, J. AnomalyCLIP: Object-Agnostic Prompt Learning for Zero-Shot Anomaly Detection. In Proceedings of the International Conference on Learning Representations (ICLR), 2024.
- Cao, Y.; Zhang, J.; Frittoli, L.; Cheng, Y.; Shen, W.; Boracchi, G. AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection. In Proceedings of the Computer Vision – ECCV 2024; Springer Nature Switzerland, 2024; Vol. 15093, Lecture Notes in Computer Science; pp. 55–72.
- Qu, Z.; Tao, X.; Prasad, M.; Shen, F.; Zhang, Z.; Gong, X.; Ding, G. VCP-CLIP: A Visual Context Prompting Model for Zero-Shot Anomaly Segmentation. In Proceedings of the Computer Vision – ECCV 2024; 2024; Vol. 15127, Lecture Notes in Computer Science; pp. 301–317.
- Cao, Y.; Xu, X.; Cheng, Y.; Sun, C.; Du, Z.; Gao, L.; Shen, W. Personalizing Vision-Language Models With Hybrid Prompts for Zero-Shot Anomaly Detection. IEEE Trans. Cybern. 2025, 55, 1917–1929.
- Li, S.; Cao, J.; Ye, P.; Ding, Y.; Tu, C.; Chen, T. ClipSAM: CLIP and SAM collaboration for zero-shot anomaly segmentation. Neurocomputing 2025, 618, 129122.
- Gu, Z.; Zhu, B.; Zhu, G.; Chen, Y.; Tang, M.; Wang, J. AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models. Proc. AAAI Conf. Artif. Intell. 2024, 38, 1932–1940.
- Cao, Y.; Xu, X.; Sun, C.; Huang, X.; Shen, W. Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes the Lead. arXiv 2023.
- Zhang, Y.; Cao, Y.; Xu, X.; Shen, W. LogiCode: An LLM-Driven Framework for Logical Anomaly Detection. IEEE Trans. Autom. Sci. Eng. 2025, 22, 7712–7723.
- Xu, J.; Lo, S.Y.; Safaei, B.; Patel, V.M.; Dwivedi, I. Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 20370–20382.
- Deng, H.; Luo, H.; Zhai, W.; Guo, Y.; Cao, Y.; Kang, Y. VMAD: Visual-Enhanced Multimodal Large Language Model for Zero-Shot Anomaly Detection. IEEE Trans. Autom. Sci. Eng. 2026, 23, 3607–3618.
- Jiang, X.; Li, J.; Deng, H.; Liu, Y.; Gao, B.B.; Zhou, Y.; Li, J.; Wang, C.; Zheng, F. MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection. In Proceedings of the Thirteenth International Conference on Learning Representations (ICLR), 2025.
- Li, Y.; Cao, Y.; Liu, C.; Xiong, Y.; Dong, X.; Huang, C. IAD-R1: Reinforcing Consistent Reasoning in Industrial Anomaly Detection. Proc. AAAI Conf. Artif. Intell. 2026, 40, 6583–6591.
- Kang, H.; Lee, W.; Kim, J.; Park, H. JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA. In Proceedings of the International Conference on Learning Representations (ICLR), 2026.
- Liao, J.; Su, Y.; Tu, R.C.; Jin, Z.; Sun, W.; Li, Y.; Xu, X.; Tao, D.; Yang, X. AD-FM: Multimodal LLMs for Anomaly Detection via Multi-Stage Reasoning and Fine-Grained Reward Optimization. Proc. AAAI Conf. Artif. Intell. 2026, 40, 15234–15242.
- Huang, C.; Li, Y.; Cao, Y.; Wang, W.; Huang, H.; Wen, J.; Ren, W.; Cao, X. M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection. arXiv 2026.
- Jiang, X.; Guo, Y.; Li, J.; Liu, Y.; Gao, B.B.; Deng, H.; Liu, J.; Zhao, H.; Wang, C.; Zheng, F. AD-Copilot: A Vision-Language Assistant for Industrial Anomaly Detection via Visual In-context Comparison. arXiv 2026.
- Wang, Y.; Peng, J.; Zhang, J.; Yi, R.; Wang, Y.; Wang, C. Multimodal Industrial Anomaly Detection via Hybrid Fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023; pp. 8032–8041.
- Pang, Y.; Wang, W.; Tay, F.E.H.; Liu, W.; Tian, Y.; Yuan, L. Masked Autoencoders for Point Cloud Self-Supervised Learning. In Proceedings of the Computer Vision – ECCV 2022; Springer, 2022; Vol. 13662, Lecture Notes in Computer Science; pp. 604–621. [Google Scholar] [CrossRef]
- Caron, M.; Touvron, H.; Misra, I.; Jégou, H.; Mairal, J.; Bojanowski, P.; Joulin, A. Emerging Properties in Self-Supervised Vision Transformers. In Proceedings of the Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2021; pp. 9630–9640. [Google Scholar] [CrossRef]
- Cao, Y.; Xu, X.; Shen, W. Complementary Pseudo Multimodal Feature for Point Cloud Anomaly Detection. Pattern Recognit. 2024, 156, 110761. [Google Scholar] [CrossRef]
- Liu, J.; Xie, G.; Chen, R.; Li, X.; Wang, J.; Liu, Y.; Wang, C.; Zheng, F. Real3D-AD: A Dataset of Point Cloud Anomaly Detection. In Proceedings of the Advances in Neural Information Processing Systems, 2023; Datasets and Benchmarks Track; Vol. 36. [Google Scholar]
- Bergmann, P.; Jin, X.; Sattlegger, D.; Steger, C. The MVTec 3D-AD Dataset for Unsupervised 3D Anomaly Detection and Localization. Proceedings of the Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2022) - 2022, Volume 5, 202–213. [Google Scholar] [CrossRef]
- Zhou, Q.; Yan, J.; He, S.; Meng, W.; Chen, J. PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection. Proc. Adv. Neural Inf. Process. Syst. 2024, Vol. 37, 84866–84896. [Google Scholar]
- Xue, L.; Gao, M.; Xing, C.; Martín-Martín, R.; Wu, J.; Xiong, C.; Xu, R.; Niebles, J.C.; Savarese, S. ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2023; pp. 1179–1189. [Google Scholar]
- Cheng, Y.; Sun, Y.; Zhang, H.; Shen, W.; Cao, Y. Towards High-Resolution 3D Anomaly Detection: A Scalable Dataset and Real-Time Framework for Subtle Industrial Defects. Proc. Proc. AAAI Conf. Artif. Intell. (AAAI) 2026, Vol. 40, 3327–3334. [Google Scholar] [CrossRef]
- Costanzino, A.; Ramirez, P.Z.; Lella, L.; Ragaglia, M.; Oliva, A.; Lisanti, G.; Di Stefano, L. SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark. In Proceedings of the Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 20944–20953. [Google Scholar]
- Zhao, Y.; Pang, Y.; Zhang, L.; Liu, H.; Zuo, J.; Lu, H.; Zhao, X. UniMMAD: Unified Multi-Modal and Multi-Class Anomaly Detection via MoE-Driven Feature Decompression. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 28502–28511. [Google Scholar]
- Li, C.L.; Sohn, K.; Yoon, J.; Pfister, T. CutPaste: Self-Supervised Learning for Anomaly Detection and Localization. In Proceedings of the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2021; pp. 9664–9674. [Google Scholar] [CrossRef]
- Hu, T.; Zhang, J.; Yi, R.; Du, Y.; Chen, X.; Liu, L.; Wang, Y.; Wang, C. AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model. Proc. AAAI Conf. Artif. Intell. 2024, 38, 8526–8534.
- Zhang, X.; Xu, M.; Zhou, X. RealNet: A Feature Selection Network with Realistic Synthetic Anomaly for Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024; pp. 16699–16708.
- Hu, J.; Huang, Y.; Lu, Y.; Xie, G.; Jiang, G.; Zheng, Y.; Lu, Z. AnomalyXFusion: Multi-modal Anomaly Synthesis with Diffusion. arXiv 2024.
- Xu, X.; Wang, Y.; Wang, J.; Lei, X.; Xie, G.; Jiang, G.; Lu, Z. FAST: Foreground-aware Diffusion with Accelerated Sampling Trajectory for Segmentation-oriented Anomaly Synthesis. In Proceedings of the Advances in Neural Information Processing Systems, 2025.
- Dai, Z.; Zeng, S.; Liu, H.; Li, X.; Xue, F.; Zhou, Y. SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 23135–23144.
- Zou, Y.; Jeong, J.; Pemula, L.; Zhang, D.; Dabeer, O. SPot-the-Difference Self-supervised Pre-training for Anomaly Detection and Segmentation. In Proceedings of the Computer Vision – ECCV 2022; Lecture Notes in Computer Science, 2022; Vol. 13690, pp. 392–408.
- Jezek, S.; Jonak, M.; Burget, R.; Dvorak, P.; Skotak, M. Deep Learning-Based Defect Detection of Metal Parts: Evaluating Current Methods in Complex Conditions. In Proceedings of the 2021 13th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), 2021; pp. 66–71.
- Mishra, P.; Verk, R.; Fornasier, D.; Piciarelli, C.; Foresti, G.L. VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization. In Proceedings of the 2021 IEEE 30th International Symposium on Industrial Electronics (ISIE), June 2021; pp. 1–6.
- Wang, C.; Zhu, W.; Gao, B.B.; Gan, Z.; Zhang, J.; Gu, Z.; Qian, S.; Chen, M.; Ma, L. Real-IAD: A Real-World Multi-View Dataset for Benchmarking Versatile Industrial Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024; pp. 22883–22892.
- Li, W.; Xu, X.; Gu, Y.; Zheng, B.; Gao, S.; Wu, Y. Towards Scalable 3D Anomaly Detection and Localization: A Benchmark via 3D Anomaly Synthesis and A Self-Supervised Learning Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024; pp. 22207–22216.
- Gao, B.B.; Zhou, Y.; Yan, J.; Cai, Y.; Zhang, W.; Wang, M.; Liu, J.; Liu, Y.; Wang, L.; Wang, C. AdaptCLIP: Adapting CLIP for Universal Visual Anomaly Detection. Proc. AAAI Conf. Artif. Intell. 2026, 40, 4095–4103.
- Zhu, J.; Ong, Y.S.; Shen, C.; Pang, G. Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 22241–22251.
- Qu, Z.; Tao, X.; Gong, X.; Qu, S.; Chen, Q.; Zhang, Z.; Wang, X.; Ding, G. Bayesian Prompt Flow Learning for Zero-Shot Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 30398–30408.
- Ma, W.; Zhang, X.; Yao, Q.; Tang, F.; Wu, C.; Li, Y.; Yan, R.; Jiang, Z.; Zhou, S.K. AA-CLIP: Enhancing Zero-Shot Anomaly Detection via Anomaly-Aware CLIP. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025; pp. 4744–4754.
- Shao, Y.; Wang, L.; Li, C.; Chen, P.; Liu, Q. PromptMoE: Generalizable Zero-Shot Anomaly Detection via Visually-Guided Prompt Mixtures. Proc. AAAI Conf. Artif. Intell. 2026, 40, 8878–8886.
- Park, J.Y.; Seo, J.; Kang, M.; Park, Y.R. MoECLIP: Patch-Specialized Experts for Zero-shot Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 35534–35544.
- Chen, Q.; Qu, Z.; Luo, W.; Yao, H.; Cao, Y.; Jiang, Y.; Duan, Y.; Luo, H.; Lv, C.; Zhang, Z. CoPS: Conditional Prompt Synthesis for Zero-Shot Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings, June 2026.
- Hu, M.; Huo, Y.; Dou, M.; Yin, J.; Zhao, P.; Wang, Y.; Hu, C.; Hu, B.; Wang, Q. FB-CLIP: Fine-Grained Zero-Shot Anomaly Detection with Foreground-Background Disentanglement. arXiv 2026.
- Li, X.; Xue, F.; Zhou, Y. MuSc-V2: Zero-Shot Multimodal Industrial Anomaly Classification and Segmentation with Mutual Scoring of Unlabeled Samples. IEEE Trans. Pattern Anal. Mach. Intell. 2026, Early Access.
- He, J.; Cao, M.; Peng, S.; Xie, Q. RareCLIP: Rarity-aware Online Zero-shot Industrial Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 24478–24487.
- Fučka, M.; Zavrtanik, V.; Skočaj, D. AnomalyVFM – Transforming Vision Foundation Models into Zero-Shot Anomaly Detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 35555–35566.
- Zhai, G.; Zhou, Y.; Deng, X.; Heckler-Kram, L.; Navab, N.; Busam, B. Foundation Visual Encoders Are Secretly Few-Shot Anomaly Detectors. In Proceedings of the International Conference on Learning Representations (ICLR), 2026.
- Lendering, C.; Akdag, E.; Bondarau, E. SubspaceAD: Training-Free Few-Shot Anomaly Detection via Subspace Modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 28557–28566.
- Damm, S.; Laszkiewicz, M.; Lederer, J.; Fischer, A. AnomalyDINO: Boosting Patch-Based Few-Shot Anomaly Detection with DINOv2. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), February 2025; pp. 1319–1329.
- Hou, Y.; Li, P.; Liu, Z.; Wang, Y.; Ruan, Y.; Qiu, J.; Xu, K. VisualAD: Language-Free Zero-Shot Anomaly Detection via Vision Transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
- Xu, C.; Lv, C.; Chen, Q.; Zhang, F.; Zhang, Z. MRAD: Zero-Shot Anomaly Detection with Memory-Driven Retrieval. In Proceedings of the Fourteenth International Conference on Learning Representations (ICLR), 2026.
- Cai, M.; Zhang, Z.; Wu, G.; Chai, T.; Zhu, X. RAID: Retrieval-Augmented Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 21367–21378.
- Tian, L.; Li, Y.; Dai, Y.; Chen, W.; Liu, X.; Chen, B. FastRef: Fast Prototype Refinement for Few-Shot Industrial Anomaly Detection. arXiv 2025.
- Qu, Z.; Tao, X.; Gong, X.; Qu, S.; Zhang, X.; Wang, X.; Shen, F.; Zhang, Z.; Prasad, M.; Ding, G. DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 20519–20528.
- Bhunia, A.; Li, C.; Bilen, H. Odd-One-Out: Anomaly Detection by Comparing with Neighbors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 20395–20404.
- Wang, F.; Zhang, T.; Wang, Y.; Qiu, Y.; Liu, X.; Guo, X.; Cui, Z. Distribution Prototype Diffusion Learning for Open-set Supervised Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 20416–20426.
- Qu, Z.; Tao, X.; Bao, X.; Wang, D.; Qu, S.; Zhang, Z.; Wang, X. AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026; pp. 14126–14136.
- Li, K.; Li, G.; Zhou, M.; Li, M.; Han, D.; Wan, J. Back to Point: Exploring Point-Language Models for Zero-Shot 3D Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 14167–14177.
- Deng, Z.; Liu, A.; Wang, Y. GS-CLIP: Zero-shot 3D Anomaly Detection by Geometry-Aware Prompt and Synergistic View Representation Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 35587–35596.
- Jin, Y.; Feng, Y.; Zhang, J.; Wang, P.; Liu, Q.; Wang, Y. Reasoning-Driven Anomaly Detection and Localization with Image-Level Supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
- Li, Y.; Yuan, S.; Wang, H.; Li, Q.; Liu, M.; Xu, C.; Shi, G.; Zuo, W. Triad: Empowering LMM-based Anomaly Detection with Expert-guided Region-of-Interest Tokenizer and Manufacturing Process. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 21917–21926.
- Wang, Z.; Fan, Z.; Tan, S.; Zhong, Y.; Yuan, Y.; Li, H.; Jiang, H.; Zhang, W.; Shao, F.; Wang, H.; et al. MAU-GPT: Enhancing Multi-type Industrial Anomaly Understanding via Anomaly-aware and Generalist Experts Adaptation. Proc. AAAI Conf. Artif. Intell. 2026, 40, 26787–26795.
- Chao, Y.; Liu, J.; Tang, J.; Wu, G. AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection. arXiv 2025.
- Zhang, K.; Zhang, Z.; Sun, X.; Wang, A.; Nie, J.; Chen, Q.; Hao, H.; Guo, J.; Zhang, J. ADSeeker: A Knowledge-Grounded Reasoning Framework for Industry Anomaly Detection and Reasoning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026; pp. 21379–21388.
- Chen, P.; Huang, C.; Cao, Y.; Liu, C.; Wang, W.; Wang, W.; Yang, M.; Shen, L.; Ren, W.; Cao, X. Towards Explainable Industrial Anomaly Detection via Knowledge-Guided Latent Reasoning. arXiv 2026.
- Peng, X.; Huang, X.; Choi, S.H. EAGLE: Expert-Augmented Attention Guidance for Tuning-Free Industrial Anomaly Detection in Multimodal Large Language Models. arXiv 2026.
- Li, Z.; Yu, Z.; Ye, Q.; Xie, W.; Zhuo, W.; Shen, L. IAD-GPT: Advancing Visual Knowledge in Multimodal Large Language Model for Industrial Anomaly Detection. IEEE Trans. Instrum. Meas. 2025, 74, 1–12.
- Zang, G.; Li, X.; Di, D.; Nie, L.; Zhan, D.; Song, Y.; Fan, L. SAGE: A Visual Language Model for Anomaly Detection via Fact Enhancement and Entropy-aware Alignment. In Proceedings of the 33rd ACM International Conference on Multimedia, 2025; pp. 5030–5039.
- Zhao, S.; Lin, Y.; Han, L.; Zhao, Y.; Wei, Y. OmniAD: Detect and Understand Industrial Anomaly via Multimodal Reasoning. arXiv 2025.
- Fučka, M.; Zavrtanik, V.; Skočaj, D. SALAD – Semantics-Aware Logical Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 21843–21852.
- Xu, Y.; Zhang, H.; Ma, Y.; Zhu, Y.; Ting, K.M. SCoNE: Spherical Consistent Neighborhoods Ensemble for Effective and Efficient Multi-View Anomaly Detection. Proc. AAAI Conf. Artif. Intell. 2026, 40, 16083–16090.
- Zhang, Q.; Shao, M.; Chen, X.; Lv, X.; Xu, K. Wave-MambaAD: Wavelet-driven State Space Model for Multi-class Unsupervised Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025; pp. 20868–20877.
- Tao, C.; Cao, X.; Du, J. G2SF: Geometry-Guided Score Fusion for Multimodal Industrial Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 20551–20560.
- Zheng, B.; Gan, J.; Xu, X.; Chen, X.; Li, W.; Huang, X.; Ni, N.; Wu, Y. Bridging 3D Anomaly Localization and Repair via High-Quality Continuous Geometric Representation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
- Yu, Y.; Chen, Z.; Xu, X.; Zhang, L.; Yang, H.; Nie, Y.; He, S. Registration is a Powerful Rotation-Invariance Learner for 3D Anomaly Detection. In Proceedings of the Advances in Neural Information Processing Systems, 2025.
- Zha, Y.; Xue, Y.; Fan, C.; Wang, Y.; Dai, T.; Chen, K.; Xia, S.T. CASL: Curvature-Augmented Self-supervised Learning for 3D Anomaly Detection. Proc. AAAI Conf. Artif. Intell. 2026, 40, 12340–12348.
- Chen, X.; Xu, X.; Zheng, B.; Liu, Y.; Wu, Y. Unsupervised Multi-View Visual Anomaly Detection via Progressive Homography-Guided Alignment. Proc. AAAI Conf. Artif. Intell. 2026, 40, 3065–3073.
- Kang, X.; Li, Z.; Lan, T.; Gong, D.; Khoshelham, K.; Nan, L. Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026.
- Kim, S.; Lee, W.; Cho, M. A Semantically Disentangled Unified Model for Multi-category 3D Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 33036–33045.
- Long, K.; Ma, L.; Liu, J.; Liu, L.; Xie, G. Towards an Incremental Unified Multimodal Anomaly Detection: Augmenting Multimodal Denoising From an Information Bottleneck Perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 14116–14125.
- Li, Y.; Yang, X.; Zhang, J.; Tian, S.; Liao, J.; Liu, F. PIRN: Prototypical-based Intra-modal Reconstruction with Normality Communication for Multi-modal Anomaly Detection. In Proceedings of the Fourteenth International Conference on Learning Representations (ICLR), 2026.
- Costanzino, A.; Ramirez, P.Z.; Lisanti, G.; Di Stefano, L. Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024; pp. 17234–17243.
- Ye, J.; Zhao, W.; Yang, X.; Cheng, G.; Huang, K. PO3AD: Predicting Point Offsets toward Better 3D Point Cloud Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 1353–1362.
- Huang, C.; Guan, H.; Jiang, A.; Zhang, Y.; Spratling, M.W.; Wang, Y.F. Registration Based Few-Shot Anomaly Detection. In Proceedings of the Computer Vision – ECCV 2022; Springer, 2022; Vol. 13684, Lecture Notes in Computer Science; pp. 303–319.
- Zhu, H.; Xie, G.; Hou, C.; Dai, T.; Gao, C.; Wang, J.; Shen, L. Towards High-resolution 3D Anomaly Detection via Group-Level Feature Contrastive Learning. In Proceedings of the 32nd ACM International Conference on Multimedia, 2024; pp. 4680–4689.
- Lin, Y.; Yan, H.; Tong, X.; Chang, Y.; Wang, H.; Zhou, Z.; Gao, S.; Wang, Y.; Zhang, W. Commonality in Few: Few-Shot Multimodal Anomaly Detection via Hypergraph-Enhanced Memory. Proc. AAAI Conf. Artif. Intell. 2026, 40, 7015–7023.
- Cheng, J.; Gao, C.; Zhou, J.; Wen, J.; Dai, T.; Wang, J. MC3D-AD: A Unified Geometry-aware Reconstruction Model for Multi-category 3D Anomaly Detection. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI), 2025; pp. 837–845.
- Tang, J.; Lu, H.; Xu, X.; Wu, R.; Hu, S.; Zhang, T.; Cheng, T.W.; Ge, M.; Chen, Y.C.; Tsung, F. An Incremental Unified Framework for Small Defect Inspection. In Proceedings of the Computer Vision – ECCV 2024; Lecture Notes in Computer Science, 2024; pp. 307–324.
- Zhu, J.; Pang, G. Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024; pp. 17826–17836.
- Yao, X.; Chen, Z.; Gao, C.; Zhai, G.; Zhang, C. ResAD: A Simple Framework for Class Generalizable Anomaly Detection. Proc. Adv. Neural Inf. Process. Syst. 2024, 37, 125287–125311.
- Gu, Z.; Zhu, B.; Zhu, G.; Chen, Y.; Ge, W.; Tang, M.; Wang, J. AnomalyMoE: Towards a Language-free Generalist Model for Unified Visual Anomaly Detection. Proc. AAAI Conf. Artif. Intell. 2026, 40, 4348–4356.
- Lee, Y.; Kim, S.; Moon, D.; Jang, S.; Yoon, H. Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 35577–35586.
- Luo, W.; Cao, Y.; Yao, H.; Zhang, X.; Lou, J.; Cheng, Y.; Shen, W.; Yu, W. Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 9974–9983.
- Gu, Z.; Zhu, B.; Zhu, G.; Chen, Y.; Tang, M.; Wang, J. UniVAD: A Training-free Unified Model for Few-shot Visual Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 15194–15203.
- Sadikaj, Y.; Zhou, H.; Halilaj, L.; Schmid, S.; Staab, S.; Plant, C. MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 22978–22988.
- Wang, Y.; Wang, X.; Gong, Y.; Xiao, J. Normal-Abnormal Guided Generalist Anomaly Detection. In Proceedings of the Advances in Neural Information Processing Systems, 2025.
- Lu, R.; Liu, G.; Li, K.; Tian, L.; Zhang, J. MaskAD: Parallel Masked Autoencoder for Multi-class Unsupervised Anomaly Detection. Proc. AAAI Conf. Artif. Intell. 2026, 40, 15457–15465.
- He, H.; Bai, Y.; Zhang, J.; He, Q.; Chen, H.; Gan, Z.; Wang, C.; Li, X.; Tian, G.; Xie, L. MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection. In Proceedings of the Advances in Neural Information Processing Systems, 2024; Vol. 37.
- Fan, L.; Huang, J.; Di, D.; Su, A.; Song, T.; Pagnucco, M.; Song, Y. Salvaging the Overlooked: Leveraging Class-Aware Contrastive Learning for Multi-Class Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 21419–21428.
- Guo, J.; Lu, S.; Zhang, W.; Chen, F.; Li, H.; Liao, H. Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 20405–20415.
- Guo, J.; Lu, S.; Fan, L.; Li, Z.; Di, D.; Song, Y.; Zhang, W.; Zhu, W.; Yan, H.; Chen, F.; et al. One Dinomaly2 Detect Them All: A Unified Framework for Full-Spectrum Unsupervised Anomaly Detection. arXiv 2025.
- Wei, S.; Jiang, J.; Xu, X. UniNet: A Contrastive Learning-guided Unified Framework with Feature Selection for Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 9994–10003.
- Zhang, Z.; Cai, M.; Wang, H.; Wu, G.; Chai, T.; Zhu, X. CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering. In Proceedings of the 42nd International Conference on Machine Learning; PMLR, 2025; Vol. 267, Proceedings of Machine Learning Research; pp. 74540–74564.
- Wang, X.; Wang, X.; Bai, H.; Lim, E.G.; Xiao, J. DecAD: Decoupling Anomalies in Latent Space for Multi-Class Unsupervised Anomaly Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 21568–21577.
- Beizaee, F.; Lodygensky, G.A.; Desrosiers, C.; Dolz, J. Correcting Deviations from Normality: A Reformulated Diffusion Model for Multi-Class Unsupervised Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 19088–19097.
- Kim, G.; Kim, M.; Lee, K.; Kim, M.; Jeon, H.; Han, J.; Lim, H.; Yim, J. UniSpector: Towards Universal Open-set Defect Recognition via Spectral-Contrastive Visual Prompting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 6261–6270.
- Yao, X.; Luo, Y.; Qian, Z.; Zhang, C. ADPretrain: Advancing Industrial Anomaly Detection via Anomaly Representation Pretraining. In Proceedings of the Advances in Neural Information Processing Systems, 2025; Vol. 38.
- So, Y.; Kang, S. AnoStyler: Text-Driven Localized Anomaly Generation via Lightweight Style Transfer. Proc. AAAI Conf. Artif. Intell. 2026, 40, 15734–15742.
- Jiang, Y.; Luo, W.; Zhang, H.; Chen, Q.; Yao, H.; Shen, W.; Cao, Y. Anomagic: Crossmodal Prompt-driven Zero-shot Anomaly Generation. Proc. AAAI Conf. Artif. Intell. 2026, 40, 5485–5493.
- Lai, Z.; Lu, Y.; Li, X.; Lin, J.; Qu, Y.; Li, M.; Cao, L. AnomalyPainter: Vision-Language-Diffusion Synergy for Realistic and Diverse Unseen Industrial Anomaly Synthesis. Proc. AAAI Conf. Artif. Intell. 2026, 40, 5800–5808.
- Qian, L.; Zhu, B.; Chen, Y.; Tang, M.; Wang, J. Quality-Aware Language-Conditioned Local Auto-Regressive Anomaly Synthesis and Detection. Proc. AAAI Conf. Artif. Intell. 2026, 40, 15626–15634.
- Kühn, P.J.; Pommeranz, M.; Kuijper, A.; Sinha, S.N. SynSur: An end-to-end generative pipeline for synthetic industrial surface defect generation and detection. arXiv 2026.
- Rao, H.; Wang, Z.; Si, C.; Lyu, Y.; Duan, Y.; Zhao, F.; Shan, C. One-to-More: High-Fidelity Training-Free Anomaly Generation with Attention Control. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026.
- Xu, R.; Chiu, Y.T.; Chen, T.I.; Chew, O.; Chuang, Y.Y.; Cheng, W.H. Training-Free Industrial Defect Generation with Diffusion Models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 24214–24223.
- Choi, J.; Kim, M.; Hong, J.H. MAGIC: Few-Shot Mask-Guided Anomaly Inpainting with Prompt Perturbation, Spatially Adaptive Guidance, and Context Awareness. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings, 2026.
- Song, J.; Park, D.; Baek, K.; Lee, S.; Choi, J.; Kim, E.; Yoon, S. DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 18718–18727.
- Jin, Y.; Peng, J.; He, Q.; Hu, T.; Wu, J.; Chen, H.; Wang, H.; Zhu, W.; Chi, M.; Liu, J.; et al. Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 30420–30429.
- Sun, H.; Cao, Y.; Dong, H.; Fink, O. Unseen Visual Anomaly Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 25508–25517.
- Sakai, S.; He, X.; Gu, C.; Sigal, L.; Hasegawa, T. InvAD: Inversion-based Reconstruction-Free Anomaly Detection with Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
- Zhu, W.; Wang, C.; Gao, B.B.; Zhang, J.; Jiang, G.; Hu, J.; Gan, Z.; Wang, L.; Zhou, Z.; Cheng, L.; et al. Real-IAD Variety: Pushing Industrial Anomaly Detection Dataset to a Modern Era. arXiv 2025.
- Zhu, W.; Wang, L.; Zhou, Z.; Wang, C.; Pan, Y.; Zhang, R.; Chen, Z.; Cheng, L.; Gao, B.B.; Zhang, J.; et al. Real-IAD D3: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 15214–15223.
- Yang, E.; Xing, P.; Sun, H.; Guo, W.; Ma, Y.; Li, Z.; Zeng, D. 3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection. Proc. AAAI Conf. Artif. Intell. 2025, 39, 9175–9183.
- Zhang, J.; Ding, R.; Ban, M.; Dai, L. PKU-GoodsAD: A Supermarket Goods Dataset for Unsupervised Anomaly Detection and Segmentation. IEEE Robot. Autom. Lett. 2024, 9, 2008–2015.
- Fan, L.; Fan, D.; Hu, Z.; Ding, Y.; Di, D.; Yi, K.; Pagnucco, M.; Song, Y. MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025; pp. 25518–25527.
- Shi, D.; He, C.; Zhang, S.; Qian, B.; Quan, X.; Zhang, W.; Wei, X. Omni-AD: A Large-scale and Versatile Benchmark for Industrial Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2026; pp. 14157–14166.
- McHard, P.M.; Audonnet, F.P.; Summerell, O.; Andraos, S.; Henderson, P.; Aragon-Camarasa, G. 3D-ADAM: A Dataset for 3D Anomaly Detection in Additive Manufacturing. In Proceedings of the 2026 IEEE International Conference on Robotics and Automation (ICRA), 2026.
- Guo, B.; Li, H.; Yu, R.; Liang, H.; Wang, J. IEC3D-AD: A 3D Dataset of Industrial Equipment Components for Unsupervised Point Cloud Anomaly Detection. arXiv 2025.
- Li, W.; Zheng, B.; Xu, X.; Gan, J.; Lu, F.; Li, X.; Ni, N.; Tian, Z.; Huang, X.; Gao, S.; et al. Multi-Sensor Object Anomaly Detection: Unifying Appearance, Geometry, and Internal Properties. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 9984–9993.
- Cao, Y.; Cheng, Y.; Zhang, Y.; Xu, X.; Zhang, Y.; Sun, Y.; Tan, Y.; Huang, X.; Huang, C.; Shen, W. Visual Anomaly Detection under Complex View-Illumination Interplay: A Large-Scale Benchmark. Pattern Recognit. 2026, 179, 113666.
- Heckler-Kram, L.; Neudeck, J.H.; Scheler, U.; König, R.; Steger, C. The MVTec AD 2 Dataset: Advanced Scenarios for Unsupervised Anomaly Detection. Int. J. Comput. Vis. 2026, 134, 175.
- Bai, H.; Mou, S.; Likhomanenko, T.; Cinbis, R.G.; Tuzel, O.; Huang, P.; Shan, J.; Shi, J.; Cao, M. VISION Datasets: A Benchmark for Vision-based InduStrial InspectiON, 2023. Presented at the CVPR 2023 Workshop on Vision-Based Industrial Inspection.
- Lei, T.; Wang, B.; Chen, S.; Cao, S.; Zou, N. Texture-AD: An Anomaly Detection Dataset and Benchmark for Real Algorithm Development. arXiv 2024.
- Wang, Q.; Gao, S.; Hu, J.; Yu, J.; Tong, X.; Li, Y.; Zhang, W. HSS-IAD: A Heterogeneous Same-Sort Industrial Anomaly Detection Dataset. In Proceedings of the 2025 IEEE International Conference on Multimedia and Expo (ICME), 2025; pp. 1–6.
- Yu, R.; Guo, B.; Li, H. Anomaly Detection of Integrated Circuits Package Substrates Using the Large Vision Model SAIC: Dataset Construction, Methodology, and Application. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 22563–22574.
- Chen, Q.; Luo, H.; Lv, C.; Zhang, Z. A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization. In Proceedings of the Computer Vision – ECCV 2024; Springer; Lecture Notes in Computer Science; 2024; Vol. 15125, pp. 37–54.
- Vieira e Silva, A.L.B.; Felix, H.d.C.; Simões, F.P.M.a.; Teichrieb, V.; dos Santos, M.; Santiago, H.; Sgotti, V.; Lott Neto, H. InsPLAD: A Dataset and Benchmark for Power Line Asset Inspection in UAV Images. Int. J. Remote Sens. 2023, 44, 7294–7320.
- Arodi, A.; Luck, M.; Bedwani, J.L.; Zaimi, A.; Li, G.; Pouliot, N.; Beaudry, J.; Marceau Caron, G. CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset. Proc. Adv. Neural Inf. Process. Syst. Datasets Benchmarks Track 2024, 37, 64703–64716.
- Yang, S.; Chen, Z.; Chen, P.; Fang, X.; Liang, Y.; Liu, S.; Chen, Y. Defect Spectrum: A Granular Look of Large-Scale Defect Datasets with Rich Semantics. In Proceedings of the Computer Vision – ECCV 2024; Lecture Notes in Computer Science; Springer, 2025; Vol. 15065, pp. 187–203.
- Yang, C.A.; Peng, K.C.; Yeh, R.A. Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025; pp. 23419–23430.
- Hu, L.; Gan, Z.; Deng, L.; Liang, J.; Liang, L.; Huang, S.; Chen, T. ReplayCAD: Generative Diffusion Replay for Continual Anomaly Detection. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI), 2025; pp. 2946–2954.
- Safarov, S.; Park, J.; Jung, Y.G.; Peng, K.C.; Kim, W.; Bang, S.; Camps, O. Memory-Distilled Selection for Noise-Robust Anomaly Detection. In Proceedings of the 43rd International Conference on Machine Learning (ICML), 2026.
- Horwitz, E.; Hoshen, Y. Back to the Feature: Classical 3D Features Are (Almost) All You Need for 3D Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2023; pp. 2968–2977.
- Jiang, X.; Zhao, Y.; Yang, Z.; Zheng, F. AnomalyClaw: A Universal Visual Anomaly Detection Agent via Tool-Grounded Refutation. arXiv 2026.
- Liu, J.; Yan, Y.; Li, J.; Zhao, W.; Chu, P.; Sheng, X.; Liu, Y.; Yang, X. IPAD: Industrial Process Anomaly Detection Dataset. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 380–393.
- Li, W.; Gu, Y.; Chen, X.; Xu, X.; Hu, M.; Huang, X.; Wu, Y. Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025; pp. 30409–30419.
- Dabouei, A.; Parayil Shibu, J.; Dalal, V.; Cao, C.; MacWilliams, A.; Kangas, J.; Xu, M. Deep video anomaly detection in automated laboratory setting. Expert Syst. Appl. 2025, 271, 126581.
| 1 | |
| 2 | IAD metric vocabulary. I-AUROC (image-level AUROC): whole-image normal/abnormal classification. P-AUROC (pixel-level AUROC): the same classification per pixel; measures localization. PRO (Per-Region Overlap, 15): mean predicted/ground-truth region overlap, more deployment-faithful than P-AUROC on small defects. AP (Average Precision): area under the precision–recall curve. Bound-FPR metrics (e.g., AP at FPR≤1%, recall at fixed precision): performance at manufacturing operating points, where false positives stop production. |
| 3 | MLLM-AD training vocabulary. MLLM / VLM: a large language model with a vision encoder, jointly processing image and text. VQA: visual question-answering protocol. SFT (supervised fine-tuning): training on (input, gold-answer) pairs with cross-entropy. GRPO (group relative policy optimization, introduced in DeepSeekMath and popularized by DeepSeek-R1): an RL-style post-training step that rewards correct, well-structured answers relative to the model’s other samples for the same prompt; it is applied after SFT. RAG (retrieval-augmented generation): retrieves knowledge into context at inference. DPO, RLVR: other post-training objectives. CoT: chain-of-thought. |
| 4 | 3D/multimodal vocabulary. Point cloud: unordered 3D points; RGB-D: paired RGB + depth; MV: multi-view RGB. O-AUROC / P-AUROC: object- and point/pixel-level AUROC. Synth2real: train on synthetic (e.g., CAD) data, test on real scans. Point-MAE / Point-BERT / PointGPT: 3D self-supervised pretraining. ULIP / PointCLIP: 3D–language alignment. SDF: signed-distance function. FPFH: hand-crafted 3D feature (fast point-feature histogram). MoE: mixture-of-experts routing. |
| 5 | Diffusion / synthesis vocabulary. LDM (latent diffusion model): diffusion in a learned latent space, e.g., Stable Diffusion. DDIM: deterministic sampler that can run in far fewer steps than the ∼1000 of the training schedule. SDXL: a larger LDM with two text encoders. ControlNet: conditions LDM output on edges, depth or masks. DiT (diffusion transformer): transformer diffusion backbone, e.g., Flux. LoRA: low-rank parameter-efficient fine-tuning. DreamBooth: subject-specific LDM fine-tuning from a few references. CLIPScore / DreamSim: image–text / image–image similarity metrics. FID / KID / LPIPS / IS: generation-quality metrics (lower better for FID/KID/LPIPS, higher for IS). |
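To make the IAD metric vocabulary in the notes above concrete, the following is a minimal pure-Python sketch of image-level AUROC, AP, and one illustrative bounded-FPR metric (recall at FPR ≤ 1%). The function names and the toy scores are assumptions made for illustration and do not come from any benchmark toolkit; AP uses the common step-wise definition (sum of recall increments weighted by precision), and the bounded-FPR variant shown is only one of several possible choices.

```python
from typing import List, Sequence, Tuple


def threshold_sweep(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[int, int]]:
    """Cumulative (true positives, false positives) after each distinct threshold.

    labels: 1 = anomalous, 0 = normal; a higher score means "more anomalous".
    Tied scores are grouped so that they share a single threshold.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if sum(labels) == 0 or sum(labels) == len(labels):
        raise ValueError("need at least one normal and one anomalous sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    counts, tp, fp, i = [], 0, 0, 0
    while i < len(order):
        current = scores[order[i]]
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        counts.append((tp, fp))
    return counts


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the trapezoidal rule."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pts = [(0.0, 0.0)] + [(fp / n_neg, tp / n_pos) for tp, fp in threshold_sweep(scores, labels)]
    return sum((x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Step-wise AP: sum over thresholds of (recall increment) * precision."""
    n_pos = sum(labels)
    ap, prev_recall = 0.0, 0.0
    for tp, fp in threshold_sweep(scores, labels):
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return ap


def recall_at_max_fpr(scores: Sequence[float], labels: Sequence[int], max_fpr: float = 0.01) -> float:
    """Best recall among thresholds whose false-positive rate stays <= max_fpr."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for tp, fp in threshold_sweep(scores, labels):
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best


# Toy data (hypothetical): four normal images, one of which scores very high,
# and four anomalous images.
toy_scores = [0.10, 0.20, 0.30, 0.95, 0.50, 0.60, 0.70, 0.99]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

print("I-AUROC          :", auroc(toy_scores, toy_labels))              # 0.8125
print("AP               :", round(average_precision(toy_scores, toy_labels), 4))
print("recall @ FPR<=1% :", recall_at_max_fpr(toy_scores, toy_labels))  # 0.25
```

On this toy data a single high-scoring normal image leaves I-AUROC at 0.8125 but caps recall at 0.25 once no false positive is tolerated, which illustrates why bounded-FPR numbers can diverge sharply from threshold-free ones at manufacturing operating points.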







Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).