Submitted:
03 November 2023
Posted:
06 November 2023
Abstract
Keywords:
1. Introduction
- We introduce SMAformer, a novel transformer-based network that exploits the flexibility of the transformer's attention mechanism to effectively extract intricate long-range information in both the spatial and spectral dimensions.
- We replace the standard self-attention mechanism with a sparse attention mechanism, which yields a sparse representation of spatial information while suppressing interfering spatial noise.
- Extensive experiments on three benchmark datasets (CAVE, Harvard, and Pavia Center) show that SMAformer outperforms state-of-the-art methods.
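The idea of sparsifying attention can be illustrated with a minimal top-k variant: keep only the k largest similarity scores per query and mask the rest before the softmax, so each output attends to a sparse subset of keys. This is an illustrative NumPy sketch of the general technique, not SMAformer's exact SSMAB design; the function name and shapes are assumptions for the example.

```python
import numpy as np

def sparse_attention(Q, K, V, k):
    """Top-k sparse attention (illustrative sketch, not SMAformer's exact block).

    Q: (n_q, d) queries, K: (n_k, d) keys, V: (n_k, d) values.
    Only the k largest scores per query row survive; the rest are
    masked to -inf, so their softmax weights become exactly zero.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # (n_q, n_k) scaled dot products
    # threshold = k-th largest score in each row; mask everything below it
    kth = np.sort(scores, axis=-1)[:, -k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    # numerically stable softmax over the surviving entries
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                # (n_q, d)
```

With k equal to the number of keys this reduces to ordinary dense attention, which makes the masking step easy to sanity-check.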
2. Related Work
2.1. Traditional Work
2.2. Deep Learning Based Work
3. Method
3.1. Network Architecture
3.2. Spectral Mix Attention Block
3.3. Sparse Spectral Mix Attention Block
3.4. Loss Function
4. Experiments
4.1. Experimental Settings
4.2. Datasets
4.3. Evaluation Metrics
4.3.1. PSNR
4.3.2. SSIM
4.3.3. SAM
4.3.4. ERGAS
4.3.5. UIQI
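For reference, these five metrics are commonly defined as follows (standard formulations; the paper's exact per-band averaging conventions may differ). Here MAX is the peak signal value, MSE the mean squared error, μ and σ denote means and (co)variances over a window or band, SAM is computed between spectral vectors x and y, h/l is the ratio of high to low spatial resolution, and B is the number of bands:

```latex
\begin{align}
\mathrm{PSNR} &= 10 \log_{10}\!\frac{\mathrm{MAX}^2}{\mathrm{MSE}} \\
\mathrm{SSIM}(x,y) &= \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}
                          {(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \\
\mathrm{SAM}(x,y) &= \arccos\frac{\langle x, y\rangle}{\lVert x\rVert_2\,\lVert y\rVert_2} \\
\mathrm{ERGAS} &= 100\,\frac{h}{l}\sqrt{\frac{1}{B}\sum_{b=1}^{B}\frac{\mathrm{MSE}_b}{\mu_b^2}} \\
\mathrm{UIQI}(x,y) &= \frac{4\,\sigma_{xy}\,\mu_x\mu_y}{(\sigma_x^2 + \sigma_y^2)(\mu_x^2 + \mu_y^2)}
\end{align}
```

Higher is better for PSNR, SSIM, and UIQI; lower is better for SAM and ERGAS, matching the arrows in the result tables.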
4.4. Quantitative Analysis
4.5. Qualitative Analysis
4.6. Ablation Study
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
1. Sun, L.; Zhao, G.; Zheng, Y.; Wu, Z. Spectral–spatial feature tokenization transformer for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing 2022, 60, 1–14.
2. Uzkent, B.; Hoffman, M.J.; Vodacek, A. Real-time vehicle tracking in aerial video using hyperspectral features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016, pp. 36–44.
3. Hong, D.; Yao, J.; Meng, D.; Xu, Z.; Chanussot, J. Multimodal GANs: Toward crossmodal hyperspectral–multispectral image segmentation. IEEE Transactions on Geoscience and Remote Sensing 2020, 59, 5103–5113.
4. Aiazzi, B.; Baronti, S.; Selva, M. Improving component substitution pansharpening through multivariate regression of MS + Pan data. IEEE Transactions on Geoscience and Remote Sensing 2007, 45, 3230–3239.
5. Chavez, P.; Sides, S.C.; Anderson, J.A.; et al. Comparison of three different methods to merge multiresolution and multispectral data: Landsat TM and SPOT panchromatic. Photogrammetric Engineering and Remote Sensing 1991, 57, 295–303.
6. Burt, P.J.; Adelson, E.H. The Laplacian pyramid as a compact image code. In Readings in Computer Vision; Elsevier, 1987; pp. 671–679.
7. Loncan, L.; De Almeida, L.B.; Bioucas-Dias, J.M.; Briottet, X.; Chanussot, J.; Dobigeon, N.; Fabre, S.; Liao, W.; Licciardi, G.A.; Simoes, M.; et al. Hyperspectral pansharpening: A review. IEEE Geoscience and Remote Sensing Magazine 2015, 3, 27–46.
8. Starck, J.L.; Fadili, J.; Murtagh, F. The undecimated wavelet decomposition and its reconstruction. IEEE Transactions on Image Processing 2007, 16, 297–309.
9. Bungert, L.; Coomes, D.A.; Ehrhardt, M.J.; Rasch, J.; Reisenhofer, R.; Schönlieb, C.B. Blind image fusion for hyperspectral imaging with the directional total variation. Inverse Problems 2018, 34, 044003.
10. Akhtar, N.; Shafait, F.; Mian, A. Bayesian sparse representation for hyperspectral image super resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3631–3640.
11. Dian, R.; Fang, L.; Li, S. Hyperspectral image super-resolution via non-local sparse tensor factorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5344–5353.
12. Li, S.; Dian, R.; Fang, L.; Bioucas-Dias, J.M. Fusing hyperspectral and multispectral images via coupled sparse tensor factorization. IEEE Transactions on Image Processing 2018, 27, 4118–4130.
13. Kawakami, R.; Matsushita, Y.; Wright, J.; Ben-Ezra, M.; Tai, Y.W.; Ikeuchi, K. High-resolution hyperspectral imaging via matrix factorization. In CVPR 2011; IEEE, 2011, pp. 2329–2336.
14. Akhtar, N.; Shafait, F.; Mian, A. Sparse spatio-spectral representation for hyperspectral image super-resolution. In Computer Vision – ECCV 2014; Springer, 2014, pp. 63–78.
15. Shen, D.; Liu, J.; Wu, Z.; Yang, J.; Xiao, L. ADMM-HFNet: A matrix decomposition-based deep approach for hyperspectral image fusion. IEEE Transactions on Geoscience and Remote Sensing 2021, 60, 1–17.
16. Liu, J.; Shen, D.; Wu, Z.; Xiao, L.; Sun, J.; Yan, H. Patch-aware deep hyperspectral and multispectral image fusion by unfolding subspace-based optimization model. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2022, 15, 1024–1038.
17. Yao, J.; Hong, D.; Chanussot, J.; Meng, D.; Zhu, X.; Xu, Z. Cross-attention in coupled unmixing nets for unsupervised hyperspectral super-resolution. In Computer Vision – ECCV 2020; Springer, 2020, pp. 208–224.
18. Yang, J.; Zhao, Y.Q.; Chan, J.C.W. Hyperspectral and multispectral image fusion via deep two-branches convolutional neural network. Remote Sensing 2018, 10, 800.
19. Dong, C.; Loy, C.C.; He, K.; Tang, X. Learning a deep convolutional network for image super-resolution. In Computer Vision – ECCV 2014; Springer, 2014, pp. 184–199.
20. Dong, C.; Loy, C.C.; Tang, X. Accelerating the super-resolution convolutional neural network. In Computer Vision – ECCV 2016; Springer, 2016, pp. 391–407.
21. Palsson, F.; Sveinsson, J.R.; Ulfarsson, M.O. Multispectral and hyperspectral image fusion using a 3-D-convolutional neural network. IEEE Geoscience and Remote Sensing Letters 2017, 14, 639–643.
22. Xie, Q.; Zhou, M.; Zhao, Q.; Xu, Z.; Meng, D. MHF-Net: An interpretable deep network for multispectral and hyperspectral image fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020, 44, 1457–1473.
23. Hu, J.F.; Huang, T.Z.; Deng, L.J.; Dou, H.X.; Hong, D.; Vivone, G. Fusformer: A transformer-based fusion network for hyperspectral image super-resolution. IEEE Geoscience and Remote Sensing Letters 2022, 19, 1–5.
24. Jia, S.; Min, Z.; Fu, X. Multiscale spatial–spectral transformer network for hyperspectral and multispectral image fusion. Information Fusion 2023, 96, 117–129.
25. Nunez, J.; Otazu, X.; Fors, O.; Prades, A.; Pala, V.; Arbiol, R. Multiresolution-based image fusion with additive wavelet decomposition. IEEE Transactions on Geoscience and Remote Sensing 1999, 37, 1204–1211.
26. Yokoya, N.; Yairi, T.; Iwasaki, A. Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion. IEEE Transactions on Geoscience and Remote Sensing 2011, 50, 528–537.
27. Zhang, K.; Wang, M.; Yang, S.; Jiao, L. Spatial–spectral-graph-regularized low-rank tensor decomposition for multispectral and hyperspectral image fusion. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2018, 11, 1030–1040.
28. Xu, Y.; Wu, Z.; Chanussot, J.; Wei, Z. Hyperspectral images super-resolution via learning high-order coupled tensor ring representation. IEEE Transactions on Neural Networks and Learning Systems 2020, 31, 4747–4760.
29. Dian, R.; Li, S.; Guo, A.; Fang, L. Deep hyperspectral image sharpening. IEEE Transactions on Neural Networks and Learning Systems 2018, 29, 5345–5355.
30. Chen, X.; Li, H.; Li, M.; Pan, J. Learning a sparse transformer network for effective image deraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5896–5905.
31. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing 2004, 13, 600–612.
32. Yasuma, F.; Mitsunaga, T.; Iso, D.; Nayar, S.K. Generalized assorted pixel camera: Postcapture control of resolution, dynamic range, and spectrum. IEEE Transactions on Image Processing 2010, 19, 2241–2253.
33. Chakrabarti, A.; Zickler, T. Statistics of real-world hyperspectral images. In CVPR 2011; IEEE, 2011, pp. 193–200.
34. Yuanji, W.; Jianhua, L.; Yi, L.; Yao, F.; Qinzhong, J. Image quality evaluation based on image weighted separating block peak signal to noise ratio. In Proceedings of the 2003 International Conference on Neural Networks and Signal Processing; IEEE, 2003, Vol. 2, pp. 994–997.
35. Yuhas, R.H.; Goetz, A.F.; Boardman, J.W. Discrimination among semi-arid landscape endmembers using the spectral angle mapper (SAM) algorithm. In Summaries of the Third Annual JPL Airborne Geoscience Workshop, Volume 1: AVIRIS Workshop; JPL, 1992.
36. Wald, L. Quality of high resolution synthesised images: Is there a simple criterion? In Third Conference "Fusion of Earth Data: Merging Point Measurements, Raster Maps and Remotely Sensed Images"; SEE/URISCA, 2000, pp. 99–103.
37. Wang, Z.; Bovik, A.C. A universal image quality index. IEEE Signal Processing Letters 2002, 9, 81–84.
38. Dian, R.; Li, S.; Fang, L. Learning a low tensor-train rank representation for hyperspectral image super-resolution. IEEE Transactions on Neural Networks and Learning Systems 2019, 30, 2672–2683.
39. Wang, W.; Zeng, W.; Huang, Y.; Ding, X.; Paisley, J. Deep blind hyperspectral image fusion. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 4150–4159.
40. Ran, R.; Deng, L.J.; Jiang, T.X.; Hu, J.F.; Chanussot, J.; Vivone, G. GuidedNet: A general CNN fusion framework via high-resolution guidance for hyperspectral image super-resolution. IEEE Transactions on Cybernetics 2023.

| Method | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓ | UIQI↑ |
|---|---|---|---|---|---|
| LTTR[38] | 41.20 | 0.980 | 4.25 | 1.90 | 0.984 |
| MHFnet[22] | 45.23 | 0.988 | 4.88 | 0.71 | 0.981 |
| DBIN[39] | 45.02 | 0.981 | 3.38 | 0.71 | 0.992 |
| ADMM-HFNet[15] | 45.48 | 0.992 | 3.39 | 0.71 | 0.992 |
| SpfNet[16] | 46.29 | 0.990 | 4.24 | 1.46 | 0.980 |
| Fusformer[23] | 42.18 | 0.993 | 3.07 | 1.25 | 0.992 |
| GuidedNet[40] | 45.41 | 0.991 | 4.03 | 0.97 | - |
| Ours | 46.56 | 0.994 | 2.92 | 0.64 | 0.995 |

| Method | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓ | UIQI↑ |
|---|---|---|---|---|---|
| LTTR[38] | 40.06 | 0.999 | 4.69 | 1.29 | 0.993 |
| MHFnet[22] | 44.50 | 0.981 | 3.68 | 1.21 | 0.991 |
| DBIN[39] | 45.33 | 0.983 | 3.04 | 1.09 | 0.995 |
| ADMM-HFNet[15] | 45.53 | 0.983 | 3.04 | 1.08 | 0.995 |
| SpfNet[16] | 45.09 | 0.984 | 2.31 | 0.65 | 0.997 |
| Fusformer[23] | 41.96 | 0.995 | 3.33 | 2.86 | 0.995 |
| GuidedNet[40] | 41.64 | 0.981 | 2.85 | 1.20 | - |
| Ours | 47.86 | 0.995 | 2.25 | 0.75 | 0.997 |

| Stages | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓ | UIQI↑ | Time↓ |
|---|---|---|---|---|---|---|
| 1-stage | 43.06 | 0.986 | 3.69 | 1.19 | 0.993 | 0.45 |
| 2-stages | 45.38 | 0.990 | 3.20 | 0.97 | 0.991 | 0.51 |
| 3-stages | 46.56 | 0.994 | 2.92 | 0.64 | 0.995 | 0.55 |
| 4-stages | 46.01 | 0.995 | 2.91 | 0.78 | 0.993 | 0.68 |

| Method | PSNR↑ | SSIM↑ | SAM↓ | ERGAS↓ | UIQI↑ |
|---|---|---|---|---|---|
| w/o SMAB | 44.38 | 0.991 | 3.24 | 0.96 | 0.990 |
| w/o SSMAB | 45.02 | 0.990 | 2.96 | 0.77 | 0.992 |
| Ours | 46.56 | 0.994 | 2.92 | 0.64 | 0.995 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).