From Seeing to Knowing the World: A Survey of Vision World Models

Xiao Yu; Yichen Zhang; Mingzhang Wang; Shifang Zhao; Weizhe Liu; Yuyang Yin; Zhongwei Ren; Ning An; Xinglong Wu; Hao Liu; Houwen Peng; Yao Zhao; Jianchao Yang; Jiashi Feng; Shuicheng Yan; Yunchao Wei; Xiaojie Jin

doi:10.20944/preprints202604.2072.v1

Submitted: 28 April 2026

Posted: 29 April 2026

Abstract

Acquiring world knowledge directly from visual observation is fundamental to Artificial General Intelligence (AGI). To support this capability, the Vision World Model (VWM) has emerged as a key paradigm, which learns how the world evolves over time from visual streams. However, recent progress has been driven by diverse research communities, resulting in inconsistent problem formulations, disconnected taxonomies, and divergent evaluation protocols. We argue that addressing this gap requires a conceptual shift: vision should not be treated merely as an input modality, but as the primary driver shaping how world models are represented, learned, and evaluated. Guided by this vision-centric perspective, we introduce a unified framework that organizes VWM research into three core components: vision encoding, knowledge learning, and controllable simulation, and use it to analyze existing model designs and evaluation methodologies. Finally, we outline future research directions that emphasize stronger physical and causal grounding, more meaningful evaluation beyond visual appearance, and scaling toward more general and reliable world modeling capabilities.

Keywords: survey; world model; vision world model

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
