Submitted: 10 February 2026
Posted: 10 February 2026
Abstract
Keywords:
1. Introduction
- We validate an unsupervised segmentation pipeline (SLIC + GMM + Graph Cut) for automatically generating semantic masks from marble imagery, demonstrating a practical solution to the data scarcity challenge where obtaining pixel-perfect annotations is economically prohibitive.
- We conduct a systematic benchmark comparing four cGAN architectures trained and evaluated under identical conditions on real industrial data (289 high-resolution marble scans acquired from production lines), providing evidence-based architecture selection guidance for practitioners.
- We implement a dual-evaluation framework contrasting automated metrics (FID, IS, MS-SSIM) with human-centered assessment (Visual Turing Test [10], Mean Opinion Scores from domain experts), revealing significant metric-perception discrepancies with direct implications for deployment decisions.
- We demonstrate that GauGAN achieves human-indistinguishable synthesis quality despite inferior FID scores, while Pix2Pix exhibits the opposite pattern, establishing empirically that automated metrics alone are insufficient for architecture selection in quality-critical manufacturing applications [61,62].
- We provide comprehensive methodology documentation to enable replication and extension to other natural material synthesis tasks (wood, fabric, geological samples).
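The human-centered side of the dual-evaluation framework rests on a simple statistical idea: in a two-alternative forced-choice (2AFC) trial, a rater who cannot tell real from synthetic marble is fooled about half the time. The sketch below shows one way a pass rate could be checked against that chance level with an exact binomial test. The function names, the definition of a "pass", and the counts are hypothetical illustrations, not the study's protocol or data.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_test_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    observed = binom_pmf(k, n, p)
    tail = sum(
        binom_pmf(i, n, p)
        for i in range(n + 1)
        if binom_pmf(i, n, p) <= observed * (1 + 1e-9)
    )
    return min(1.0, tail)


def pass_rate(fooled: int, trials: int) -> float:
    """Share of trials in which the synthetic image was taken for real
    (an assumed definition of a Visual Turing pass rate)."""
    return fooled / trials


# Hypothetical counts, for illustration only (not results from the study).
trials = 200
for label, fooled in [("generator A", 96), ("generator B", 41)]:
    rate = pass_rate(fooled, trials)
    p_value = binom_test_two_sided(fooled, trials)
    verdict = "consistent with chance" if p_value >= 0.05 else "differs from chance"
    print(f"{label}: pass rate {rate:.2f}, p = {p_value:.4f} ({verdict})")
```

A pass rate that is statistically consistent with 0.5 is what "human-indistinguishable" would mean under this reading; a rate far below 0.5 indicates that raters reliably spot the synthetic images.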
2. Materials and Methods
2.1. Dataset and Unsupervised Mask Generation
2.2. Conditional GAN Architectures and Training
2.3. Dual-Evaluation Framework
3. Results
3.1. Qualitative Assessment: Visual Comparison Across Architectures
3.2. Quantitative Metrics: Automated Performance Evaluation
3.3. Human-Centered Evaluation: Perceptual Quality Assessment
3.4. The Metric-Perception Divergence
3.5. Computational Efficiency and Deployment Feasibility
4. Discussion
4.1. Synthesis of Findings and Implications
4.2. Training Dynamics and Architectural Insights
4.3. Practical Validation of Unsupervised Mask Generation
4.4. Limitations and Directions for Future Research
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| 2AFC | Two-Alternative Forced Choice |
| CC | Correlation Coefficient |
| cGAN | Conditional Generative Adversarial Network |
| FID | Fréchet Inception Distance |
| FMI | Feature Mutual Information |
| GAN | Generative Adversarial Network |
| GFLOPS | Giga Floating Point Operations Per Second |
| GMM | Gaussian Mixture Model |
| GPU | Graphics Processing Unit |
| IS | Inception Score |
| MOS-MA | Mean Opinion Score on Marble Authenticity |
| MS-SSIM | Multi-Scale Structural Similarity Index |
| MSE | Mean Squared Error |
| PSNR | Peak Signal-to-Noise Ratio |
| SCD | Structural Content Dissimilarity |
| SLIC | Simple Linear Iterative Clustering |
| SPADE | Spatially-Adaptive Denormalization |
| VTPR | Visual Turing Pass Rate |
Appendix A
Appendix A.1. Unsupervised Segmentation Pipeline Details

Appendix A.2. Conditional GAN Architecture Diagrams

Appendix A.3. Quantitative Metrics Distributions

Appendix A.4. Excluded Samples Documentation

References
- Jimeno-Morenilla, A.; Azariadis, P.; Molina-Carmona, R.; Kyratzi, S.; Moulianitis, V. Technology Enablers for the Implementation of Industry 4.0 to Traditional Manufacturing Sectors: A Review. Comput. Ind. 2021, 125.
- Loy, J.; Canning, S.; Little, C. Industrial Design Digital Technology. Procedia Technology 2015, 20, 32–38.
- Xian, W.; Sangkloy, P.; Agrawal, V.; Raj, A.; Lu, J.; Fang, C.; Yu, F.; Hays, J. TextureGAN: Controlling Deep Image Synthesis with Texture Patches. 2018.
- Weinberger, P.; Gall, A.; Heim, A.; Yosifov, M.; Kastner, J.; Schwarz, L.; Fröhler, B.; Bodenhofer, U.; Sascha, S. Unsupervised Segmentation of Industrial X-Ray Computed Tomography Data with the Segment Anything Model. 2024.
- Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. 2014.
- Park, T.; Liu, M.-Y.; Wang, T.-C.; Zhu, J.-Y. Semantic Image Synthesis with Spatially-Adaptive Normalization. 2019.
- Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017; Institute of Electrical and Electronics Engineers Inc., November 6 2017; Vol. 2017-January, pp. 5967–5976.
- Era, I.Z.; Ahmed, I.; Liu, Z.; Das, S. An Unsupervised Approach towards Promptable Defect Segmentation in Laser-Based Additive Manufacturing by Segment Anything. arXiv:2312.04063v3 [cs.CV] 2024.
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. In Proceedings of the 31st International Conference on Neural Information Processing Systems NIPS’17, December 4 2017; pp. 6629–6640.
- Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved Techniques for Training GANs. In Proceedings of the Advances in Neural Information Processing Systems 29 (NIPS 2016); D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, R. Garnett, Eds.; 2016.
- Gatys, L.A.; Ecker, A.S.; Bethge, M. A Neural Algorithm of Artistic Style. Computing Research Repository (CoRR) 2015.
- Zhou, S.; Gordon, M.L.; Krishna, R.; Narcomey, A.; Fei-Fei, L.; Bernstein, M.S. HYPE: A Benchmark for Human EYe Perceptual Evaluation of Generative Models. Proceedings of the 33rd International Conference on Neural Information Processing Systems 2019, 3449–3461.
- Stein, G.; Cresswell, J.C.; Hosseinzadeh, R.; Sui, Y.; Leigh Ross, B.; Villecroze, V.; Liu, Z.; Caterini, A.L.; Eric Taylor, J.T.; Loaiza-Ganem, G. Exposing Flaws of Generative Model Evaluation Metrics and Their Unfair Treatment of Diffusion Models; 2023.
- Borji, A. Pros and Cons of GAN Evaluation Measures: New Developments. Computer Vision and Image Understanding 2022, 215.
- Liu, L.; Duan, H.; Hu, Q.; Yang, L.; Cai, C.; Ye, T.; Liu, H.; Zhang, X.; Zhai, G. F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) 2025, 10982–10994.
- Perlin, K. Improving Noise. In Proceedings of the 29th annual conference on Computer graphics and interactive techniques, San Antonio, TX, 2002; pp. 681–682.
- Perlin, K. An Image Synthesizer. In Proceedings of the 12th annual conference on Computer graphics and interactive techniques, San Francisco, CA, 1985; Vol. 19, pp. 287–296.
- Turk, G. Generating Textures on Arbitrary Surfaces Using Reaction-Diffusion; 1991; Vol. 25.
- Worley, S. A Cellular Texture Basis Function. SIGGRAPH ’96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques 1996, 291–294.
- Efros, A.A.; Leung, T.K. Texture Synthesis by Non-Parametric Sampling. In Proceedings of the IEEE International Conference on Computer Vision, Corfu, Greece, September 1999.
- Efros, A.A.; Freeman, W.T. Image Quilting for Texture Synthesis and Transfer. In SIGGRAPH ’01: Proceedings of the 28th annual conference on Computer graphics and interactive techniques; ACM Digital Library, 2001; pp. 341–346.
- Kwatra, V.; Schödl, A.; Essa, I.; Turk, G.; Bobick, A. Graphcut Textures: Image and Video Synthesis Using Graph Cuts. ACM Transactions on Graphics (TOG) 2003, 22, 277–286.
- Levina, E.; Bickel, P.J. Texture Synthesis and Nonparametric Resampling of Random Fields. Ann. Stat. 2006, 34, 1751–1773.
- Aguerrebere, C.; Gousseau, Y.; Tartavel, G. Exemplar-Based Texture Synthesis: The Efros-Leung Algorithm. Image Processing On Line 2013, 3, 223–241.
- Karras, T.; Laine, S.; Aila, T. A Style-Based Generator Architecture for Generative Adversarial Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 4217–4228.
- Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022, 10674–10685.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv: Machine Learning (stat.ML) 2014.
- Gatys, L.A.; Ecker, A.S.; Bethge, M. Texture Synthesis Using Convolutional Neural Networks. Computing Research Repository (CoRR) 2015.
- Jetchev, N.; Bergmann, U.; Vollgraf, R. Texture Synthesis with Spatial Generative Adversarial Networks. Computing Research Repository (CoRR) 2016.
- Bergmann, U.; Jetchev, N.; Vollgraf, R. Learning Texture Manifolds with the Periodic Spatial GAN. Computing Research Repository (CoRR) 2017.
- Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision; Institute of Electrical and Electronics Engineers Inc., December 22 2017; Vol. 2017-October, pp. 2242–2251.
- Zhu, J.-Y.; Zhang, R.; Pathak, D.; Darrell, T.; Efros, A.A.; Wang, O.; Shechtman, E. Toward Multimodal Image-to-Image Translation. 2018.
- Tan, Z.; Chen, D.; Chu, Q.; Chai, M.; Liao, J.; He, M.; Yuan, L.; Hua, G.; Yu, N. Efficient Semantic Image Synthesis via Class-Adaptive Normalization. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 4852–4866.
- Zhang, L.; Rao, A.; Agrawala, M. Adding Conditional Control to Text-to-Image Diffusion Models. In Proceedings of the IEEE International Conference on Computer Vision; Institute of Electrical and Electronics Engineers Inc., 2023; pp. 3813–3824.
- Cao, P.; Zhou, F.; Yang, L.; Huang, T.; Song, Q. Image Is All You Need to Empower Large-Scale Diffusion Models for In-Domain Generation. CVPR 2025.
- Borovec, J. Fully Automatic Segmentation of Stained Histological Cuts. In Proceedings of the Poster 2013: 17th International Student Conference on Electrical Engineering, Prague, May 16 2013.
- Borovec, J.; Svihlík, J.; Kybic, J.; Habart, D. Supervised and Unsupervised Segmentation Using Superpixels, Model Estimation, and Graph Cut. J. Electron. Imaging 2017, 26.
- Andreini, P.; Ciano, G.; Bonechi, S.; Graziani, C.; Lachi, V.; Mecocci, A.; Sodi, A.; Scarselli, F.; Bianchini, M. A Two-Stage GAN for High-Resolution Retinal Image Generation and Segmentation. Electronics (Switzerland) 2022, 11.
- Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2281.
- Giraud, R.; Clément, M. Superpixel Segmentation: A Long-Lasting Ill-Posed Problem. arXiv: Computer Vision and Pattern Recognition (cs.CV) 2024.
- Jampani, V.; Sun, D.; Liu, M.-Y.; Yang, M.-H.; Kautz, J. Superpixel Sampling Networks. In Proceedings of the European Conference on Computer Vision – ECCV 2018; Vittorio Ferrari, Martial Hebert, Eds.; Munich, Germany, November 8 2018.
- Fouad, S.; Randell, D.; Galton, A.; Mehanna, H.; Landini, G. Unsupervised Superpixel-Based Segmentation of Histopathological Images with Consensus Clustering. In Proceedings of the Communications in Computer and Information Science; Springer Verlag, 2017; Vol. 723, pp. 767–779.
- Boykov, Y.Y.; Jolly, M.-P. Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images. In Proceedings of the 8th IEEE Int. Conf. on Computer Vision (ICCV), July 2001; Vol. I, pp. 105–112.
- Shi, J.; Malik, J. Normalized Cuts and Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22.
- Rother, C.; Kolmogorov, V.; Blake, A. “GrabCut”-Interactive Foreground Extraction Using Iterated Graph Cuts. ACM Transactions on Graphics (TOG) 2004, 23, 309–314.
- Liu, B.; Zhang, T.; Yu, Y.; Miao, L. A Data Generation Method with Dual Discriminators and Regularization for Surface Defect Detection under Limited Data. Comput. Ind. 2023, 151.
- Jha, S.B.; Babiceanu, R.F. Deep CNN-Based Visual Defect Detection: Survey of Current Literature. Comput. Ind. 2023, 148.
- He, X.; Chang, Z.; Zhang, L.; Xu, H.; Chen, H.; Luo, Z. A Survey of Defect Detection Applications Based on Generative Adversarial Networks. IEEE Access 2022, 10, 113493–113512.
- Gan, Y.; Ji, Y.; Jiang, S.; Liu, X.; Feng, Z.; Li, Y.; Liu, Y. Integrating Aesthetic and Emotional Preferences in Social Robot Design: An Affective Design Approach with Kansei Engineering and Deep Convolutional Generative Adversarial Network. Int. J. Ind. Ergon. 2021, 83.
- Kumar, V.; Hernández, N.; Jensen, M.; Pal, R. Deep Learning Based System for Garment Visual Degradation Prediction for Longevity. Comput. Ind. 2023, 144.
- Hu, W.; Wang, T.; Chu, F. A Wasserstein Generative Digital Twin Model in Health Monitoring of Rotating Machines. Comput. Ind. 2023, 145.
- Kim, S.; Jang, H.; Yoon, B. Developing a Data-Driven Technology Roadmapping Method Using Generative Adversarial Network (GAN). Comput. Ind. 2023, 145.
- Johnson, J.; Alahi, A.; Fei-Fei, L. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. 2016.
- Wang, R. Research on Image Generation and Style Transfer Algorithm Based on Deep Learning. Open Journal of Applied Sciences 2019, 09, 661–672.
- Bernardi, M. Generating Realistic Marble Textures Using Generative Adversarial Networks, Università degli Studi di Padova: Padova, 2023.
- Guo, Y.; Smith, C.; Hašan, M.; Sunkavalli, K.; Zhao, S. MaterialGAN: Reflectance Capture Using a Generative SVBRDF Model. ACM Trans. Graph. 2020, 39.
- International Telecommunication Union. Subjective Video Quality Assessment Methods for Multimedia Applications; 2008.
- Radiocommunication Bureau, I. Recommendation ITU-R BT.500-14 Methodologies for the Subjective Assessment of the Quality of Television Images 2020.
- Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multi-Scale Structural Similarity for Image Quality Assessment. In Proceedings of the 37th IEEE Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, January 9 2003.
- Bermano, A.H.; Gal, R.; Alaluf, Y.; Mokady, R.; Nitzan, Y.; Tov, O.; Patashnik, O.; Cohen-Or, D. State-of-the-Art in the Architecture, Methods and Applications of StyleGAN. Computer Graphics Forum 2022, 41, 591–611.
- Salih, M.E.; Zhang, X.; Ding, M. Two Modifications of Weight Calculation of the Non-Local Means Denoising Method. Engineering 2013, 05, 522–526.
- Chen, Y. 3D Texture Mapping for Rapid Manufacturing; 2007; Vol. 4.
- Saad, M.M.; Rehmani, M.H.; O’Reilly, R. Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging. In Proceedings of the IEEE Irish Signals and Systems Conference (ISSC 2024), May 31 2024; pp. 1–7.
- Haghighat, M.; Razian, M.A. Fast-FMI: Non-Reference Image Fusion Metric. In Proceedings of the 8th IEEE International Conference on Application of Information and Communication Technologies, AICT 2014 - Conference Proceedings; Institute of Electrical and Electronics Engineers Inc., 2014.
- Fabre-Thorpe, M. The Characteristics and Limits of Rapid Visual Categorization. Front. Psychol. 2011, 2.







| Architecture | Generator LR | Discriminator LR | Encoder LR | λ_L1 | λ_latent | Latent Dim | Total Training Epochs |
|---|---|---|---|---|---|---|---|
| cGAN | 1×10⁻⁴ | 5×10⁻⁵ | n/a | n/a | n/a | 100 | 6,900 |
| Pix2Pix | 2×10⁻⁴ | 2×10⁻⁴ | n/a | 100 | n/a | n/a | 5,100 |
| BicycleGAN | 2×10⁻⁴ | 2×10⁻⁴ | 2×10⁻⁴ | 100 | 10 | 8 | 3,400 |
| GauGAN | 5×10⁻⁵ | 5×10⁻⁵ | n/a | 350 | n/a | n/a | 3,100 |
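
To make the role of the L1 weight column concrete, the objective from the original Pix2Pix formulation (Isola et al.) combines a conditional adversarial term with a weighted L1 reconstruction term. The block below restates that published form with the table's Pix2Pix value of 100 for the L1 weight. The symbols $x$ (input mask), $y$ (real image), $z$ (noise), $G$, and $D$ are notation assumed here, and the exact loss terms used for the other architectures are not stated in this section.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{cGAN}}(G, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big], \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_{1}\big], \\
G^{*} &= \arg\min_{G}\max_{D}\; \mathcal{L}_{\mathrm{cGAN}}(G, D)
  + \lambda_{L1}\,\mathcal{L}_{L1}(G), \qquad \lambda_{L1} = 100 .
\end{aligned}
```

A larger L1 weight pulls the generator toward pixel-wise agreement with the reference image, at the possible cost of sharper, adversarially driven detail.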

Metrics are grouped as pixel-based and structural metrics, statistical metrics, and learned distributional metrics.

| Architecture | MSE (↓) | PSNR (↑) | MS-SSIM (↑) | SCD (↑) | SD (↑) | CC (↑) | Entropy (↑) | FMI-pixel (↑) | IS (↑) | FID (↓) |
|---|---|---|---|---|---|---|---|---|---|---|
| cGAN | 0.009 ± 0.004 | 20.931 ± 1.898 | 0.636 ± 0.065 | 0.980 ± 0.010 | 0.123 ± 0.015 | 0.780 ± 0.047 | 6.433 ± 0.226 | 0.643 ± 0.093 | 1.790 ± 0.154 | 94.623 |
| Pix2Pix | 0.007 ± 0.003 | 21.710 ± 1.883 | 0.680 ± 0.076 | 0.981 ± 0.010 | 0.125 ± 0.011 | 0.829 ± 0.056 | 6.439 ± 0.222 | 0.794 ± 0.154 | 1.940 ± 0.275 | 85.286 |
| BicycleGAN | 0.007 ± 0.004 | 22.165 ± 2.207 | 0.693 ± 0.080 | 0.982 ± 0.009 | 0.119 ± 0.011 | 0.839 ± 0.057 | 6.385 ± 0.214 | 0.826 ± 0.171 | 1.766 ± 0.197 | 100.071 |
| GauGAN | 0.006 ± 0.003 | 22.626 ± 2.514 | 0.713 ± 0.098 | 0.983 ± 0.010 | 0.120 ± 0.014 | 0.847 ± 0.065 | 6.381 ± 0.231 | 0.886 ± 0.225 | 1.903 ± 0.264 | 87.308 |
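
The automated metrics in the table above do not all rank the four architectures the same way. As a rough illustration, the sketch below ranks the architectures by the reported mean FID, IS, and MS-SSIM and computes Spearman rank correlations between those rankings. The helper names are hypothetical, standard deviations are ignored, and with only four architectures the result is indicative at best.

```python
# Mean scores copied from the metrics table above.
FID = {"cGAN": 94.623, "Pix2Pix": 85.286, "BicycleGAN": 100.071, "GauGAN": 87.308}
INCEPTION_SCORE = {"cGAN": 1.790, "Pix2Pix": 1.940, "BicycleGAN": 1.766, "GauGAN": 1.903}
MS_SSIM = {"cGAN": 0.636, "Pix2Pix": 0.680, "BicycleGAN": 0.693, "GauGAN": 0.713}


def ranks(scores, higher_is_better):
    """Map each architecture to its rank (1 = best); assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings."""
    n = len(rank_a)
    d_squared = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1 - 6 * d_squared / (n * (n**2 - 1))


fid_rank = ranks(FID, higher_is_better=False)      # lower FID is better
is_rank = ranks(INCEPTION_SCORE, higher_is_better=True)
ssim_rank = ranks(MS_SSIM, higher_is_better=True)

print("FID ranking:", fid_rank)
print("MS-SSIM ranking:", ssim_rank)
print("Spearman FID vs IS:", spearman(fid_rank, is_rank))
print("Spearman FID vs MS-SSIM:", spearman(fid_rank, ssim_rank))
```

On these means, FID and IS order the architectures identically (correlation 1.0), while the FID and MS-SSIM orderings are uncorrelated (0.0): Pix2Pix leads on FID and GauGAN leads on MS-SSIM.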

| Architecture | # of Parameters (Millions) | Computational Cost (GFLOPS) | Avg. Time/Epoch (min) | Total Epochs |
|---|---|---|---|---|
| cGAN | 63.9 | 170.2 | 0.11 | 6,900 |
| Pix2Pix | 61.4 | 170.2 | 0.12 | 5,100 |
| BicycleGAN | 85.9 | 105.4 | 0.17 | 3,400 |
| GauGAN | 53.1 | 1761.3 | 0.82 | 3,100 |
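
The per-epoch times and epoch counts in the table above can be combined into a rough total training time. The sketch below does that arithmetic, assuming the average epoch time holds for the whole run and ignoring evaluation or checkpointing overhead. The dictionary and function names are hypothetical.

```python
# (avg. minutes per epoch, total epochs) copied from the table above.
TRAINING = {
    "cGAN": (0.11, 6900),
    "Pix2Pix": (0.12, 5100),
    "BicycleGAN": (0.17, 3400),
    "GauGAN": (0.82, 3100),
}


def total_hours(minutes_per_epoch, epochs):
    """Estimated wall-clock training time, assuming a constant epoch time."""
    return minutes_per_epoch * epochs / 60


estimates = {name: total_hours(*row) for name, row in TRAINING.items()}

# Print from fastest to slowest estimated run.
for name, hours in sorted(estimates.items(), key=lambda item: item[1]):
    print(f"{name}: {hours:.1f} h")
```

Under these assumptions GauGAN needs roughly 42 hours, against about 10 to 13 hours for the other three architectures, even though it trains for the fewest epochs.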
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).