Preprint
Article

This version is not peer-reviewed.

AI-Driven Multi-Modal Assessment of Visual Impression in Architectural Event Spaces: A Cross-Cultural Behavioral and Sentiment Analysis

Submitted: 22 December 2025

Posted: 24 December 2025

Abstract
Visual Impression in Architectural Space (VIAS) plays a central role in how users intuitively respond to their surrounding environment, where visual stimuli such as signage, layout, and spatial density immediately shape attention, movement, and engagement. While designers intentionally deploy these visual attractors, the resulting perceptual and behavioural responses remain uncertain and vary across cultural and methodological contexts. To address this challenge, this study reframes urban public space, taking event space as a case study, by integrating architecture and data science into a framework that combines VIAS theory, behaviour-perception analysis, and sentiment-aware linguistic modelling. First, we introduce a visual behavioural layer that identifies how spatial attractors such as advertising banners, product displays, and event layouts shape attention, movement, and engagement. Second, we extend the dataset from previous research with eight native participants interviewed in their native language, enabling linguistically accurate and culturally grounded comparison with the previous English-based mixed cohort. Third, we develop a multi-modal sentiment-weighted keyword extraction algorithm that captures participant-initiated perceptual themes while suppressing interviewer influence and modality-specific bias, enabling alignment between verbal impressions and visual-behavioural evidence. Finally, we compare three interview modalities (onsite, video-based, and virtual-environment) against behavioural observation data collected at a small-scale event in Matsue City, Japan. Results demonstrate that onsite participants exhibit a systematic positive bias driven by the festive atmosphere, while remote modalities elicit more balanced assessments of visual clarity, signage effectiveness, stall arrangement, and missing spatial amenities.
Furthermore, cross-linguistic analysis reveals cultural differences: native participants emphasise holistic spatial atmosphere, whereas international participants identify discrete visual focal points. By integrating visual attractors, behavioural metrics, and sentiment-aware linguistic patterns, the proposed framework provides a replicable method for explaining how designed visual elements trigger, reinforce, or contradict actual user behaviour. The findings offer evidence-based guidance for designing inclusive temporary event spaces, highlighting how architectural visual elements can be validated and refined through multi-modal computational analysis.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.