Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval

Version 1 : Received: 12 March 2024 / Approved: 13 March 2024 / Online: 13 March 2024 (07:29:35 CET)

A peer-reviewed article of this Preprint also exists.

Ma, T.; Organisciak, D.; Ma, W.; Long, Y. Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval. Electronics 2024, 13, 1660.

Abstract

The pursuit of Artificial Intelligence (AI) that emulates human cognitive processes is a cornerstone of ethical AI development, ensuring that emerging technologies can integrate seamlessly into societal frameworks requiring nuanced understanding and decision-making. Zero-Shot Instance Retrieval (ZSIR) stands at the forefront of this endeavour, potentially providing a robust platform for AI systems, particularly large visual language models, to demonstrate and refine cognition-aligned learning without the need for direct experience. In this paper, we critically evaluate current cognition alignment methodologies within traditional zero-shot learning paradigms, using visual attributes and word embeddings generated by large AI models. We propose a unified similarity function that quantifies the level of cognitive alignment, bridging the gap between AI processes and human-like understanding. Through extensive experimentation, we show that this similarity function can effectively mirror the visual-semantic gap, steering the model towards enhanced performance in zero-shot instance retrieval. This work not only benchmarks the cognition alignment of AI, but also sets a new precedent for the development of visual language models attuned to the complexities of human cognition.

Keywords

Large Visual Language Models; Zero-Shot Instance Retrieval; Cognition-Alignment

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
