Preprint
Article

This version is not peer-reviewed.

On the Complementarity of Classical Convolution and Quantum Neural Networks in Image Classification

Submitted: 13 December 2025

Posted: 15 December 2025


Abstract
This work deals with the design of a hybrid classification model that uses two complementary, parallel data-processing branches. The aim was to verify whether combining different input representations within a common decision mechanism can support the stability and reliability of classification. The outputs of both branches are continuously integrated and together form the final decision of the model. On the validation set, the model achieved an accuracy of 0.9750, precision of 1.0000, recall of 0.9500, and F1-score of 0.9744 at a decision threshold of 0.5. These results suggest that parallel, complementary processing may be a promising direction for further development and optimization of the model, especially in tasks requiring high accuracy while maintaining robust detection of positive cases.

1. Introduction

This work investigates a hybrid quantum-classical model that combines a compact Convolutional Neural Network (CNN) [1,2,3] with a two-qubit Quantum Neural Network (QNN) [4,5,6] for binary image classification [7,8,9]. The motivation is preliminary testing of such hybrid architectures in a controlled environment where model stability, interpretability, and complementarity between branches can be observed clearly. To enable this type of controlled study, we deliberately use an artificially generated dataset consisting of small 16 × 16 grayscale images containing one or more geometric shapes embedded in structured noise. This design choice allows us to isolate core visual features such as contrast, edge strength, and spatial compactness, without additional complications arising from real satellite imagery such as compression artifacts, atmospheric distortion, or variable illumination.
The objective of this work is not to outperform large classical architectures on high-resolution, real-world benchmarks. Instead, the aim is to answer whether a minimal quantum component can be trained stably alongside a classical convolutional backbone, and whether the quantum branch contributes a useful and complementary signal to the final decision. The use case should therefore be understood as a proof-of-concept filter for distinguishing prominent structure from background noise in constrained computational settings such as early onboard preprocessing or embedded lightweight detection pipelines.
We demonstrate that the hybrid model can be trained end-to-end without collapse, that its predicted probabilities are smooth and well-calibrated, and that late fusion of the CNN and QNN outputs provides strong performance in this controlled setting. The results indicate that hybrid models of this scale are viable and warrant further exploration on progressively more realistic datasets.
To situate this work within the broader context, we note that the field of quantum machine learning, and quantum computing in general, has broadened considerably in recent years. Benchmarking and feature selection in quantum neural networks have been examined [10,11,12,13,14,15], optimization in noisy variational settings has been studied [16,17,18,19,20], and it has been shown that the choice of optimization strategy is vital, with multiple viable options available [21,22,23,24]. Hybrid quantum-classical approaches have been explored in various settings [25,26,27], alongside thorough benchmarking of various phenomena [28,29,30,31]. Related advances in variational eigensolver theory and ansatz construction [32,33,34,35,36,37,38] further motivate this study. In this context, we evaluate whether a minimal two-qubit circuit can provide a complementary signal when fused with a lightweight convolutional branch.

2. Methodology

In this section, the dataset, its generation, and the model are described in order to precisely define the setup used to obtain the results presented in Section 3.

Data Generation

The dataset is synthetically generated to allow precise control over shape structure and noise characteristics. Negative samples, labeled y = 0, consist solely of random noise uniformly distributed in a limited intensity range, while positive samples, labeled y = 1, contain either one or two randomly positioned shapes selected from {square, rectangle, circle, ellipse, right triangle}.
Once a shape is selected and its placement computed, it brightens the corresponding pixels and adds a slight halo effect that simulates diffuse scattering. All images are 16 × 16 pixels and normalized to [0, 1]; a selection of 16 samples is shown in Figure 1.
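As an illustration, the generation procedure could be sketched as follows. The exact intensity ranges, shape sizes, and halo width used here are assumptions for readability and are not taken from the study, and only filled squares are drawn in this sketch, whereas the full generator samples from the five shape classes listed above.

```python
import numpy as np

def make_sample(rng, positive, size=16):
    """Sketch of one 16x16 grayscale sample; ranges and sizes are illustrative assumptions."""
    # Background: uniform noise in a limited intensity range (assumed here to be [0.0, 0.3]).
    img = rng.uniform(0.0, 0.3, size=(size, size))
    if positive:
        # One or two bright shapes; a filled square stands in for the full shape set.
        for _ in range(rng.integers(1, 3)):
            s = int(rng.integers(3, 6))                 # shape side length (assumed)
            r, c = rng.integers(0, size - s, size=2)    # top-left corner
            # Halo: slightly brighten a one-pixel border around the shape.
            r0, r1 = max(r - 1, 0), min(r + s + 1, size)
            c0, c1 = max(c - 1, 0), min(c + s + 1, size)
            img[r0:r1, c0:c1] += 0.1
            # The shape itself.
            img[r:r + s, c:c + s] += rng.uniform(0.5, 0.7)
    return np.clip(img, 0.0, 1.0)   # normalize to [0, 1]

rng = np.random.default_rng(0)
images = np.stack([make_sample(rng, positive=(i % 2 == 1)) for i in range(8)])
labels = np.array([i % 2 for i in range(8)])
```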

Model Architecture

The architecture of the employed model can be described as follows. The input, a single-channel grayscale image,
$X \in \mathbb{R}^{1 \times 16 \times 16},$
is processed by two parallel branches: a classical CNN branch that extracts spatial features directly from the image, and a QNN branch that operates on a reduced set of handcrafted global features. The outputs of the two branches are fused at the logit level to form the final classification.

CNN Branch

The classical branch learns local spatial patterns using convolutional filters. The first convolution increases the number of channels from one to eight and applies a nonlinear activation,
$H_1 = \sigma(W_1 * X + b_1), \quad W_1 \in \mathbb{R}^{8 \times 1 \times 3 \times 3}, \quad H_1 \in \mathbb{R}^{8 \times 16 \times 16}.$
Here, σ ( · ) denotes the ReLU activation, which introduces non-linearity and helps the network learn more expressive features.
To reduce spatial resolution and improve invariance to small spatial shifts, a 2 × 2 max-pooling operation is applied,
$H_1^{\mathrm{pool}} = \mathrm{MaxPool}_{2 \times 2}(H_1) \in \mathbb{R}^{8 \times 8 \times 8}.$
A second convolutional layer further enriches feature complexity by mapping eight channels to sixteen,
$H_2 = \sigma(W_2 * H_1^{\mathrm{pool}} + b_2), \quad W_2 \in \mathbb{R}^{16 \times 8 \times 3 \times 3}, \quad H_2 \in \mathbb{R}^{16 \times 8 \times 8}.$
This is again followed by max-pooling to compress spatial dimensions,
$H_2^{\mathrm{pool}} = \mathrm{MaxPool}_{2 \times 2}(H_2) \in \mathbb{R}^{16 \times 4 \times 4}.$
The resulting tensor is flattened into a feature vector,
$h_{\mathrm{cnn}} = \mathrm{vec}(H_2^{\mathrm{pool}}) \in \mathbb{R}^{256},$
which is mapped to a single scalar logit by a fully connected layer,
$z_{\mathrm{cnn}} = w_{\mathrm{cnn}}^{\top} h_{\mathrm{cnn}} + b_{\mathrm{cnn}}, \quad w_{\mathrm{cnn}} \in \mathbb{R}^{256}, \quad z_{\mathrm{cnn}} \in \mathbb{R}.$
This logit represents the classical model’s confidence before sigmoid activation.
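For concreteness, the CNN branch described by the equations above could be implemented as the following sketch. PyTorch is assumed here, since the text does not name a framework, and padding of 1 is assumed so that the 3 × 3 convolutions preserve the stated spatial resolutions.

```python
import torch
import torch.nn as nn

class CNNBranch(nn.Module):
    """Classical branch: two conv + ReLU + 2x2 max-pool stages, then a linear logit."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)   # 1 -> 8 channels, keeps 16x16
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)  # 8 -> 16 channels, keeps 8x8
        self.pool = nn.MaxPool2d(2)                               # halves each spatial dimension
        self.fc = nn.Linear(16 * 4 * 4, 1)                        # 256-dim feature -> scalar logit

    def forward(self, x):                          # x: (batch, 1, 16, 16)
        h = self.pool(torch.relu(self.conv1(x)))   # (batch, 8, 8, 8)
        h = self.pool(torch.relu(self.conv2(h)))   # (batch, 16, 4, 4)
        return self.fc(h.flatten(1))               # (batch, 1), the logit z_cnn
```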

QNN Branch

In parallel, the quantum branch operates not on the image directly, but on a global handcrafted feature representation. For each image, six statistical and structural descriptors are extracted,
$f \in \mathbb{R}^{6},$
these are further described in Table 1. To reduce redundancy and improve conditioning, Principal Component Analysis (PCA) maps these features to a two-dimensional latent representation,
$u = W_{\mathrm{PCA}}(f - \mu), \quad W_{\mathrm{PCA}} \in \mathbb{R}^{2 \times 6}, \quad u \in \mathbb{R}^{2}.$
This vector is then scaled into rotation angles suitable for quantum state encoding,
$\theta = \alpha u + \beta, \quad \theta \in [0, 2\pi)^{2}.$
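A possible realization of this feature-to-angle pipeline is sketched below. Scikit-learn's PCA is assumed, min-max scaling into [0, 2π) is one plausible choice of α and β (the text does not specify the scaling constants), and `F_train` (the training feature matrix) and `f` (one descriptor vector) are hypothetical names.

```python
import numpy as np
from sklearn.decomposition import PCA

# Reduce the six descriptors to two latent components: u = W_PCA (f - mu).
pca = PCA(n_components=2).fit(F_train)           # F_train: (n_samples, 6) feature matrix
u = pca.transform(f.reshape(1, -1))[0]           # u in R^2 for a single image

# Scale each latent component into [0, 2*pi) using the training range (one choice of alpha, beta).
U_train = pca.transform(F_train)
u_min, u_max = U_train.min(axis=0), U_train.max(axis=0)
theta = 2.0 * np.pi * (u - u_min) / (u_max - u_min + 1e-12)   # rotation angles theta
```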
The quantum circuit begins in the ground state,
$|\psi_{0}\rangle = |00\rangle.$
A feature-encoding map embeds the classical angles into the quantum state,
$|\psi_{\mathrm{enc}}(\theta)\rangle = U_{\mathrm{FM}}(\theta)\,|\psi_{0}\rangle,$
after which a trainable variational layer applies rotations parameterized by ϕ ,
$|\psi_{\mathrm{var}}(\theta, \phi)\rangle = U_{\mathrm{VA}}(\phi)\,|\psi_{\mathrm{enc}}(\theta)\rangle, \quad \phi \in \mathbb{R}^{k}.$
The circuit output is obtained via the expectation value of the two-qubit Pauli operator $Z \otimes Z$,
$q(\theta, \phi) = \langle \psi_{\mathrm{var}}(\theta, \phi) |\, Z \otimes Z \,| \psi_{\mathrm{var}}(\theta, \phi) \rangle, \quad q(\theta, \phi) \in [-1, 1].$
A linear projection converts this value to a logit matching the classical branch output scale,
$z_{\mathrm{qnn}} = w_{q}\, q(\theta, \phi) + b_{q}, \quad w_{q} \in \mathbb{R}, \quad z_{\mathrm{qnn}} \in \mathbb{R}.$
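The two-qubit circuit could be realized, for instance, with PennyLane as sketched below. The specific gate choices for $U_{\mathrm{FM}}$ and $U_{\mathrm{VA}}$ (RY encoding rotations, a CNOT entangler, and trainable RY rotations) are assumptions, as the text does not fix the gate set, and the numerical values are purely illustrative.

```python
import numpy as np
import pennylane as qml

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def qnn_circuit(theta, phi):
    """Feature map U_FM(theta) followed by a variational layer U_VA(phi) on two qubits."""
    qml.RY(theta[0], wires=0)      # encode the two scaled PCA features as rotation angles
    qml.RY(theta[1], wires=1)
    qml.CNOT(wires=[0, 1])         # entangling gate of the variational layer (assumed)
    qml.RY(phi[0], wires=0)        # trainable rotations phi
    qml.RY(phi[1], wires=1)
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))   # q(theta, phi) in [-1, 1]

theta = np.array([0.7, 2.1])       # angles from the scaled PCA features (example values)
phi = np.array([0.1, -0.3])        # variational parameters (example values)
q = qnn_circuit(theta, phi)
z_qnn = 1.5 * q + 0.2              # linear projection w_q * q + b_q (illustrative weights)
```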

Fusion and Output

Finally, the two logits are combined through a learnable linear fusion layer,
$z_{\mathrm{final}} = w_{f,1}\, z_{\mathrm{cnn}} + w_{f,2}\, z_{\mathrm{qnn}} + b_{f}, \quad z_{\mathrm{final}} \in \mathbb{R}.$
The final classification probability is obtained via the sigmoid function,
$p(y = 1 \mid X) = \dfrac{1}{1 + e^{-z_{\mathrm{final}}}}.$
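Put together, the late-fusion step amounts to a single linear layer over the two branch logits followed by a sigmoid. A minimal PyTorch sketch, with hypothetical module names, is given below.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Learnable late fusion: z_final = w_f1 * z_cnn + w_f2 * z_qnn + b_f, then sigmoid."""
    def __init__(self):
        super().__init__()
        self.fuse = nn.Linear(2, 1)                  # weights (w_f1, w_f2) and bias b_f

    def forward(self, z_cnn, z_qnn):
        z = torch.cat([z_cnn, z_qnn], dim=1)         # (batch, 2) stacked branch logits
        return torch.sigmoid(self.fuse(z))           # p(y = 1 | X), shape (batch, 1)
```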

Training Procedure

The dataset is balanced and split into training and validation partitions. Optimization uses the Adam optimizer with distinct learning rates for the quantum and classical components. Across 12 epochs, we record training loss, validation probabilities, and confusion matrices to evaluate model behavior and threshold sensitivity.
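A training loop consistent with this description is sketched below. The concrete learning rates, the use of binary cross-entropy, and the names `model`, `classical_params`, `qnn_params`, and `train_loader` are assumptions, since the text only states that Adam is used with distinct rates for the two branches over 12 epochs.

```python
import torch

# Distinct learning rates for the classical and quantum parameters (values are illustrative).
optimizer = torch.optim.Adam([
    {"params": classical_params, "lr": 1e-3},   # CNN branch and fusion head
    {"params": qnn_params,       "lr": 1e-2},   # variational parameters phi and projection (w_q, b_q)
])
criterion = torch.nn.BCELoss()

for epoch in range(12):
    for x_batch, f_batch, y_batch in train_loader:   # images, handcrafted features, labels
        p = model(x_batch, f_batch)                   # hybrid forward pass -> probabilities
        loss = criterion(p.squeeze(1), y_batch.float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```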
Figure 2. Training loss over 12 epochs for the hybrid model.

3. Results

This section evaluates the performance of the hybrid quantum-classical model on the held-out validation set. The model was trained for 12 epochs, during which the training loss decreased smoothly without instability. This indicates that the joint optimization of the classical and quantum parameters was well-behaved, and that the learning-rate separation between the branches was effective.
Using the standard decision threshold of 0.5, the model achieved strong classification performance on the validation set. The confusion matrix in Figure 3 shows that the hybrid architecture successfully distinguishes between images containing geometric shapes and pure noise backgrounds. The final values are shown in Table 2.
The model produced no false positives, meaning that any sample classified as containing a shape indeed contained one. The remaining classification errors were false negatives, where faint or partially obscured shapes were classified as background. This suggests that the model is conservative in its detection behavior, preferring to avoid false alarms.
The results indicate that the classical and quantum components contribute complementary information: the CNN branch captures local spatial structure and shape boundaries directly from pixel data, while the QNN branch responds to global scene properties such as contrast and edge energy derived from the handcrafted descriptors. The fusion layer then combines these representations to improve confidence and robustness. The complementarity of the two branches is also supported quantitatively: the CNN and QNN logits on the validation set are only weakly correlated, r = 0.28, indicating that they respond to different aspects of the input. Moreover, removing either branch reduces the F1-score compared to the hybrid value of 0.97, confirming that both components contribute unique predictive information.
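The reported correlation can be obtained directly from the stored validation logits; a minimal sketch, with `z_cnn_val` and `z_qnn_val` as hypothetical arrays of per-sample branch logits, is:

```python
import numpy as np

# Pearson correlation between per-sample CNN and QNN logits on the validation set.
r = np.corrcoef(z_cnn_val, z_qnn_val)[0, 1]
print(f"logit correlation r = {r:.2f}")   # reported as r = 0.28 in the text
```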
This behavior shows that even a quantum circuit with only two qubits can provide a meaningful signal when integrated in a structured hybrid architecture, particularly when operating on compact, semantically interpretable feature inputs. The experiment therefore demonstrates the feasibility of stable, end-to-end training of hybrid models in small-scale, controlled settings.

4. Conclusions

This work demonstrates that hybrid quantum-classical architectures can be trained stably and effectively, even at very small quantum scales. By combining a compact convolutional network with a two-qubit variational quantum circuit, the model successfully integrates local spatial information with global statistical descriptors derived from the input. The resulting hybrid classifier achieves high validation performance, with an F1-score of 0.97, and shows clear evidence that the quantum and classical branches contribute complementary information rather than redundant representations.
The experiment therefore provides a controlled proof-of-concept that a quantum component can enhance a lightweight classical model when the quantum input is appropriately structured and semantically meaningful. While the setting is deliberately simplified, the core result is clear: the quantum branch can act as a useful co-processor. This suggests that hybrid models merit further investigation in more realistic scenarios, including higher-resolution imagery and resource-constrained onboard analysis, where compact yet expressive representations are valuable.

Data Availability Statement

For reproducibility, all data, model configurations, and code used in this study are openly available in the associated GitLab repository at https://gitlab.com/illesova.silvie.scholar/complementarity-of-classical-convolution-and-quantum-neural-networks.

Acknowledgments

Martin Beseda is supported by Italian Government (Ministero dell’Università e della Ricerca, PRIN 2022 PNRR) – cod.P2022SELA7: ”RECHARGE: monitoRing, tEsting, and CHaracterization of performAnce Regressions“ – Decreto Direttoriale n. 1205 del 28/7/2023. Vojtěch Novák is supported by Grant of SGS No. SP2025/072, VSB-Technical University of Ostrava, Czech Republic.

References

  1. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: an overview and application in radiology. Insights into imaging 2018, 9, 611–629. [Google Scholar] [CrossRef] [PubMed]
  2. Wu, J. Introduction to convolutional neural networks. National Key Lab for Novel Software Technology. Nanjing University. China 2017, 5, 495. [Google Scholar]
  3. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE transactions on neural networks and learning systems 2021, 33, 6999–7019. [Google Scholar] [CrossRef]
  4. Schuld, M.; Sinayskiy, I.; Petruccione, F. The quest for a quantum neural network. Quantum Information Processing 2014, 13, 2567–2586. [Google Scholar] [CrossRef]
  5. Jia, Z.A.; Yi, B.; Zhai, R.; Wu, Y.C.; Guo, G.C.; Guo, G.P. Quantum neural network states: A brief review of methods and applications. Advanced Quantum Technologies 2019, 2, 1800077. [Google Scholar] [CrossRef]
  6. Abbas, A.; Sutter, D.; Zoufal, C.; Lucchi, A.; Figalli, A.; Woerner, S. The power of quantum neural networks. Nature Computational Science 2021, 1, 403–409. [Google Scholar] [CrossRef]
  7. Lu, D.; Weng, Q. A survey of image classification methods and techniques for improving classification performance. International journal of Remote sensing 2007, 28, 823–870. [Google Scholar] [CrossRef]
  8. Rawat, W.; Wang, Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural computation 2017, 29, 2352–2449. [Google Scholar] [CrossRef]
  9. Chandra, M.A.; Bedi, S. Survey on SVM and their application in image classification. International Journal of Information Technology 2021, 13, 1–11. [Google Scholar] [CrossRef]
  10. Nembrini, R.; Ferrari Dacrema, M.; Cremonesi, P. Feature selection for recommender systems with quantum computing. Entropy 2021, 23, 970. [Google Scholar] [CrossRef] [PubMed]
  11. Illésová, S.; Rybotycki, T.; Beseda, M. QMetric: Benchmarking Quantum Neural Networks Across Circuits, Features, and Training Dimensions. arXiv 2025, arXiv:2506.23765. [CrossRef]
  12. Illésová, S.; Rybotycki, T.; Gawron, P.; Beseda, M. On the Importance of Fundamental Properties in Quantum-Classical Machine Learning Models. arXiv 2025, arXiv:2507.10161. [CrossRef]
  13. Alvarez-Estevez, D. Benchmarking quantum machine learning kernel training for classification tasks. IEEE Transactions on Quantum Engineering 2025. [Google Scholar]
  14. Tychola, K.A.; Kalampokas, T.; Papakostas, G.A. Quantum machine learning—an overview. Electronics 2023, 12, 2379. [Google Scholar] [CrossRef]
  15. Trovato, A.; Beseda, M.; Di Nucci, D. A Preliminary Investigation on the Usage of Quantum Approximate Optimization Algorithms for Test Case Selection. arXiv 2025, arXiv:2504.18955. [CrossRef]
  16. Liu, X.; Angone, A.; Shaydulin, R.; Safro, I.; Alexeev, Y.; Cincio, L. Layer VQE: A variational approach for combinatorial optimization on noisy quantum computers. IEEE Transactions on Quantum Engineering 2022, 3, 1–20. [Google Scholar] [CrossRef]
  17. Novák, V.; Zelinka, I.; Snášel, V. Optimization Strategies for Variational Quantum Algorithms in Noisy Landscapes. arXiv 2025, arXiv:2506.01715. [CrossRef]
  18. Illésová, S.; Novák, V.; Bezděk, T.; Beseda, M.; Possel, C. Numerical Optimization Strategies for the Variational Hamiltonian Ansatz in Noisy Quantum Environments. arXiv 2025, arXiv:2505.22398. [CrossRef]
  19. Illésová, S.; Bezděk, T.; Novák, V.; Senjean, B.; Beseda, M. Statistical Benchmarking of Optimization Methods for Variational Quantum Eigensolver under Quantum Noise. arXiv 2025, arXiv:2510.08727. [Google Scholar] [CrossRef]
  20. Larson, J.; Menickelly, M.; Shi, J. A novel noise-aware classical optimizer for variational quantum algorithms. INFORMS Journal on Computing 2025, 37, 63–85. [Google Scholar] [CrossRef]
  21. Abbas, A.; Ambainis, A.; Augustino, B.; Bärtschi, A.; Buhrman, H.; Coffrin, C.; Cortiana, G.; Dunjko, V.; Egger, D.J.; Elmegreen, B.G.; et al. Challenges and opportunities in quantum optimization. Nature Reviews Physics 2024, 1–18. [Google Scholar] [CrossRef]
  22. Zelinka, I.; Kojecký, L.; Lampart, M.; Nowaková, J.; Plucar, J. iSOMA swarm intelligence algorithm in synthesis of quantum computing circuits. Applied Soft Computing 2023, 142, 110350. [Google Scholar] [CrossRef]
  23. Powell, M.J.; et al. The BOBYQA algorithm for bound constrained optimization without derivatives; Cambridge NA Report NA2009/06, University of Cambridge: Cambridge, 2009; Volume 26, p. 1. [Google Scholar]
  24. Nomura, M.; Shibata, M. cmaes: A simple yet practical python library for cma-es. arXiv 2024, arXiv:2402.01373. [Google Scholar] [CrossRef]
  25. Novák, V.; Zelinka, I.; Přibylová, L.; Martínek, L.; Benčurik, V. Predicting Post-Surgical Complications with Quantum Neural Networks: A Clinical Study on Anastomotic Leak. arXiv 2025, arXiv:2506.01708. [CrossRef]
  26. Horáčková, L.; Kojecký, L.; Fiore, U.; Vozňák, M.; Zelinka, I. Evolving Quantum Circuits: iSOMA-Driven Synthesis of Toffoli Gates. In Proceedings of the 27th International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM ’25), 2025; Association for Computing Machinery. [Google Scholar]
  27. Gupta, M.K.; Beseda, M.; Gawron, P. How Quantum Computing-Friendly Multispectral Data can be? In Proceedings of the IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium, 2022; pp. 4153–4156. [Google Scholar] [CrossRef]
  28. Eisert, J.; Hangleiter, D.; Walk, N.; Roth, I.; Markham, D.; Parekh, R.; Chabaud, U.; Kashefi, E. Quantum certification and benchmarking. Nature Reviews Physics 2020, 2, 382–390. [Google Scholar] [CrossRef]
  29. Bílek, A.; Hlisnikovský, J.; Bezděk, T.; Kukulski, R.; Lewandowska, P. Experimental study of multiple-shot unitary channels discrimination using the IBM Q computers. arXiv 2025, arXiv:2505.17731. [Google Scholar] [CrossRef]
  30. Resch, S.; Karpuzcu, U.R. Benchmarking quantum computers and the impact of quantum noise. ACM Computing Surveys (CSUR) 2021, 54, 1–35. [Google Scholar] [CrossRef]
  31. Lewandowska, P.; Beseda, M. Benchmarking gate-based quantum devices via certification of qubit von Neumann measurements. arXiv 2025, arXiv:2506.03514. [Google Scholar] [CrossRef]
  32. Tilly, J.; Chen, H.; Cao, S.; Picozzi, D.; Setia, K.; Li, Y.; Grant, E.; Wossnig, L.; Rungger, I.; Booth, G.H.; et al. The variational quantum eigensolver: a review of methods and best practices. Physics Reports 2022, 986, 1–128. [Google Scholar] [CrossRef]
  33. Rajamani, A.; Beseda, M.; Lasorne, B.; Senjean, B. How an Equi-ensemble Description Systematically Outperforms the Weighted-ensemble Variational Quantum Eigensolver. arXiv 2025, arXiv:quant. [Google Scholar]
  34. Magnusson, E.; Fitzpatrick, A.; Knecht, S.; Rahm, M.; Dobrautz, W. Towards efficient quantum computing for quantum chemistry: Reducing circuit complexity with transcorrelated and adaptive ansatz techniques. Faraday Discussions 2024, 254, 402–428. [Google Scholar] [CrossRef] [PubMed]
  35. Beseda, M.; Illésová, S.; Yalouz, S.; Senjean, B. State-Averaged Orbital-Optimized VQE: A quantum algorithm for the democratic description of ground and excited electronic states. Journal of Open Source Software 2024. [Google Scholar] [CrossRef]
  36. Illésová, S.; Beseda, M.; Yalouz, S.; Lasorne, B.; Senjean, B. Transformation-Free Generation of a Quasi-Diabatic Representation from the State-Average Orbital-Optimized Variational Quantum Eigensolver. Journal of Chemical Theory and Computation 2025, 21, 5457–5480. [Google Scholar] [CrossRef] [PubMed]
  37. Fedorov, D.A.; Peng, B.; Govind, N.; Alexeev, Y. VQE method: a short survey and recent developments. Materials Theory 2022, 6, 2. [Google Scholar] [CrossRef]
  38. Ciaramelletti, C.; Beseda, M.; Consiglio, M.; Lepori, L.; Apollaro, T.J.G.; Paganelli, S. Detecting quasidegenerate ground states in topological models via the variational quantum eigensolver. Phys. Rev. A 2025, 111, 022437. [Google Scholar] [CrossRef]
Figure 1. Artificially generated samples of the dataset used in the training and validation process.
Figure 3. Confusion matrix at decision threshold 0.5.
Table 1. Global descriptors used as input to the QNN branch.
Descriptor | Definition | Interpretation
Mean intensity | $\frac{1}{HW}\sum_{i,j} X_{i,j}$ | Overall brightness.
Horizontal variation | $\frac{1}{H(W-1)}\sum_{i,j} |X_{i,j+1} - X_{i,j}|$ | Texture change left-right.
Vertical variation | $\frac{1}{(H-1)W}\sum_{i,j} |X_{i+1,j} - X_{i,j}|$ | Texture change top-bottom.
Global contrast | $\max_{i,j} X_{i,j} - \min_{i,j} X_{i,j}$ | Dynamic range of intensities.
Bright pixel share | $\frac{1}{HW}\sum_{i,j} \mathbf{1}[X_{i,j} > \tau]$ | Presence of strongly illuminated regions.
Laplacian edge energy | $\frac{1}{HW}\sum_{i,j} |4X_{i,j} - X_{i\pm 1,j} - X_{i,j\pm 1}|$ | Strength of edges and boundaries.
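The six descriptors of Table 1 could be computed, for example, as in the following sketch; the bright-pixel threshold τ and the interior-only evaluation of the Laplacian are assumptions not fixed by the text.

```python
import numpy as np

def global_descriptors(X, tau=0.6):
    """Six global descriptors of Table 1 for one image X with values in [0, 1]."""
    H, W = X.shape
    mean_intensity = X.mean()                              # overall brightness
    horizontal_var = np.abs(np.diff(X, axis=1)).mean()     # texture change left-right
    vertical_var = np.abs(np.diff(X, axis=0)).mean()       # texture change top-bottom
    contrast = X.max() - X.min()                           # dynamic range of intensities
    bright_share = (X > tau).mean()                        # share of strongly illuminated pixels
    lap = np.abs(4 * X[1:-1, 1:-1]                         # 4-neighbour Laplacian (interior pixels)
                 - X[:-2, 1:-1] - X[2:, 1:-1]
                 - X[1:-1, :-2] - X[1:-1, 2:])
    edge_energy = lap.mean()                               # strength of edges and boundaries
    return np.array([mean_intensity, horizontal_var, vertical_var,
                     contrast, bright_share, edge_energy])
```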
Table 2. Validation performance of the hybrid model at threshold 0.5.
Metric | Value | Definition
Accuracy | 0.9750 | Share of all predictions that are correct.
Precision | 1.0000 | Share of predicted positives that are correct.
Recall | 0.9500 | Share of actual positives that are detected.
F1-score | 0.9744 | Harmonic mean of precision and recall.