ARTICLE | doi:10.20944/preprints202304.0086.v1
Subject: Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Keywords: image fusion; generative adversarial network (GAN); local binary patterns (LBP); multi-modal images
Online: 6 April 2023 (10:03:31 CEST)
Image fusion is the process of combining multiple input images, from a single imaging modality or several, into one fused image that is expected to be more informative for human or machine perception than any of the inputs. In this paper, we propose a novel deep learning method for fusing infrared and visible images, named the LBP-based proportional input generative adversarial network (LPGAN). In the image fusion task, preserving structural similarity and preserving image gradient information are conflicting objectives, and it is difficult to perform well on both at the same time. To address this problem, we introduce Local Binary Patterns (LBP) into generative adversarial networks (GANs); LBP effectively exploits the texture features of the source images, giving the network stronger feature extraction and anti-interference ability. In the feature extraction stage, we equip the generator with a pseudo-Siamese network to extract detail features and contrast features separately. At the same time, considering the characteristic distributions of the different image modalities, we propose a 1:4 scale input mode. Extensive experiments on the publicly available TNO and CVC14 datasets show that the proposed method achieves state-of-the-art performance. We also test the generality of LPGAN by fusing RGB and infrared images on the RoadScene dataset, and further apply LPGAN to multi-spectral remote sensing image fusion. Both qualitative and quantitative experiments demonstrate that LPGAN not only achieves good structural similarity but also retains rich detail information.
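For readers unfamiliar with the LBP operator used above, the following is a minimal sketch of the standard 8-neighbour Local Binary Pattern in NumPy. This is a generic textbook formulation, not the paper's implementation; the function name, border handling (border pixels left at 0), and bit ordering are our own assumptions.

```python
import numpy as np

def lbp_8(image):
    """Basic 8-neighbour Local Binary Pattern map.

    Each interior pixel is compared with its 8 neighbours; a neighbour
    that is >= the centre contributes a 1-bit, and the 8 bits are packed
    into a texture code in [0, 255]. Border pixels are left as 0 here
    for simplicity (an assumption, not the paper's choice).
    """
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    codes = np.zeros((h, w), dtype=np.uint8)
    # Clockwise neighbour offsets, starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:h - 1, 1:w - 1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes[1:h - 1, 1:w - 1] |= (neighbour >= centre).astype(np.uint8) << bit
    return codes
```

Because the code depends only on the sign of intensity differences, the resulting texture map is largely invariant to monotonic illumination changes, which is the property that makes LBP attractive as an auxiliary texture cue in a fusion network.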