Traditional social navigation systems often treat perception and motion as decoupled tasks, leading to reactive behaviors and perceptual surprise caused by a limited field of view. Active vision, the ability to choose where to look, offers a remedy, yet most existing frameworks still decouple sensing from execution to simplify learning. This article introduces Active Vision for Social Navigation, a joint reinforcement learning (RL) framework that unifies locomotion and discrete gaze control within a single end-to-end policy. Unlike existing factored approaches, our method leverages a model-based RL architecture with a latent world model to explicitly address the credit assignment problem inherent in active sensing. Experimental results in cluttered, dynamic environments demonstrate that our joint policy outperforms factored sensing-action approaches by prioritizing viewpoints relevant to social safety, such as checking blind spots and tracking human trajectories. Our findings suggest that tight sensorimotor coupling is essential for reducing perceptual surprise and enabling safe, socially aware navigation in unstructured spaces.
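
As a minimal sketch of the sensorimotor coupling described above, the snippet below shows one way a single policy head could emit both a continuous locomotion command and a discrete gaze choice from a shared world-model latent, so that credit for a useful viewpoint can propagate through the same trunk as the motion decision. All names and dimensions here (JointPolicy, N_GAZE_DIRECTIONS, the 2-D velocity command) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

N_GAZE_DIRECTIONS = 5  # assumed discretization of gaze targets, e.g. far-left .. far-right


class JointPolicy(nn.Module):
    """One end-to-end head for locomotion + gaze, instead of two factored policies."""

    def __init__(self, latent_dim: int = 256, hidden: int = 128):
        super().__init__()
        # Shared trunk over the world-model latent: both action heads read the
        # same features, which is what couples sensing to execution.
        self.trunk = nn.Sequential(nn.Linear(latent_dim, hidden), nn.ReLU())
        # Continuous locomotion: mean of (linear velocity, angular velocity).
        self.loco_head = nn.Linear(hidden, 2)
        # Discrete gaze: logits over candidate viewpoints (blind spots, humans, path).
        self.gaze_head = nn.Linear(hidden, N_GAZE_DIRECTIONS)

    def forward(self, latent: torch.Tensor):
        h = self.trunk(latent)
        loco_mean = torch.tanh(self.loco_head(h))  # bounded velocity command
        gaze_dist = torch.distributions.Categorical(logits=self.gaze_head(h))
        return loco_mean, gaze_dist


# Usage: both actions are sampled from the same latent state, so a gaze choice
# that reduces surprise is credited jointly with the motion it informed.
policy = JointPolicy()
z = torch.randn(1, 256)        # stand-in for a latent produced by the world model
velocity_cmd, gaze = policy(z)
gaze_choice = gaze.sample()    # index of the viewpoint to attend to next
```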