Zhu, L.; Song, J.; Wei, X.; Jun, L. Adversarial Learning Based Semantic Correlation Representation for Cross-Modal Retrieval. Preprints 2020, 2020010288. https://doi.org/10.20944/preprints202001.0288.v1
APA Style
Zhu, L., Song, J., Wei, X., & Jun, L. (2020). Adversarial Learning Based Semantic Correlation Representation for Cross-Modal Retrieval. Preprints. https://doi.org/10.20944/preprints202001.0288.v1
Chicago/Turabian Style
Zhu, L., J. Song, Xiangxiang Wei, and Long Jun. 2020. "Adversarial Learning Based Semantic Correlation Representation for Cross-Modal Retrieval." Preprints. https://doi.org/10.20944/preprints202001.0288.v1
Abstract
With the rapid development of the Internet and the widespread use of smart devices, massive amounts of multimedia data are generated, collected, stored, and shared online. This trend has made cross-modal retrieval a hot research topic in recent years. Many existing works focus on correlation learning to generate a common subspace for cross-modal correlation measurement, while others use adversarial learning techniques to reduce the heterogeneity of multi-modal data. However, very few works combine correlation learning and adversarial learning to bridge the inter-modal semantic gap and diminish cross-modal heterogeneity. This paper proposes a novel cross-modal retrieval method, named ALSCOR, an end-to-end framework that integrates cross-modal representation learning, correlation learning, and adversarial learning. A canonical correlation analysis (CCA) model, accompanied by two representation models, VisNet and TxtNet, is proposed to capture non-linear correlation. In addition, an intra-modal classifier and a modality classifier are used to learn intra-modal discrimination and minimize inter-modal heterogeneity. Comprehensive experiments are conducted on three benchmark datasets. The results demonstrate that the proposed ALSCOR outperforms state-of-the-art methods.
Keywords
Cross-modal retrieval; Adversarial learning; Semantic correlation; Deep learning
Subject
Computer Science and Mathematics, Information Systems
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.