Inclusive communication remains a critical challenge for individuals with hearing impairments or speech disorders, and for those facing language barriers in multilingual educational and urban settings. This paper proposes a multimodal LLM framework that transforms auditory and visual inputs, such as noisy or accented speech and sign language, into coherent synthesized speech and written text, enabling seamless accessibility. Leveraging Transformer-Conformer architectures, our system fuses audio spectrograms, lip-reading visuals, and textual context via cross-modal attention mechanisms, achieving superior performance in real-time transcription (WER < 5% on diverse datasets) and voice cloning tailored to user prosody. Key innovations include adaptive noise suppression for hearing aid integration, ethical personalization that preserves speaker identity, and deployment on edge devices for low-latency applications such as VR classrooms. Evaluations on benchmarks (e.g., LibriSpeech, VoxCeleb) and user trials with 50 participants, including seniors and hard-of-hearing students, demonstrate a 30% improvement in comprehension accuracy and user satisfaction over baselines such as Whisper and GPT-4V. By bridging auditory-to-text/speech gaps, this framework advances AI pedagogies for immersive learning, promotes equity in communication, and lays the foundation for scalable IoT-enhanced inclusive tools. Future directions explore federated learning for privacy-preserving multilingual expansion.
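
To make the cross-modal attention fusion concrete, the sketch below shows one way audio-spectrogram, lip-reading, and textual embeddings could be combined with attention. It is a minimal PyTorch illustration under stated assumptions: the module name `CrossModalFusion`, the feature dimensions, and the fusion order (audio queries attending to visuals, then to text) are illustrative choices, not the implementation described in this paper.

```python
# Minimal sketch of cross-modal attention fusion (assumptions: module names,
# dimensions, and fusion order are illustrative, not the paper's implementation).
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # Audio queries attend to lip-reading visuals, then to textual context.
        self.attn_visual = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.attn_text = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, audio, visual, text):
        # audio: (B, T_a, d), visual: (B, T_v, d), text: (B, T_t, d)
        fused, _ = self.attn_visual(audio, visual, visual)  # audio queries, visual keys/values
        fused = self.norm(audio + fused)
        out, _ = self.attn_text(fused, text, text)          # refine with textual context
        return self.norm(fused + out)

# Example: fuse 100 audio frames with 50 video frames and 20 text tokens.
fusion = CrossModalFusion()
a, v, t = torch.randn(1, 100, 256), torch.randn(1, 50, 256), torch.randn(1, 20, 256)
print(fusion(a, v, t).shape)  # torch.Size([1, 100, 256])
```

The residual-plus-normalization pattern keeps the audio stream dominant while letting visual and textual cues correct it, which is one plausible way a Conformer-style encoder could be conditioned for noisy or accented speech.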