The Power of Words: Leveraging Deep Learning Techniques to Predict Hotel Ratings from User Reviews

Milena Nikolić; Miloš Stojanović; Marina Marjanović

doi:10.20944/preprints202604.0921.v1

Submitted: 10 April 2026

Posted: 14 April 2026

Abstract

Online reviews represent a major source of information for evaluating customer experience and supporting decision making in the hospitality industry, yet rating prediction from review content remains challenging because review text is often short, noisy, and internally inconsistent. This study presents a deep learning framework for predicting hotel ratings from guest reviews while explicitly addressing data quality before model training. Data reliability is treated as a central modeling concern. The proposed methodology combines review titles, review texts, and associated tags with a structured preprocessing pipeline that incorporates sentiment inconsistency detection, textual similarity analysis, deviation analysis based on correlation, and reviewer behavior profiling to identify unreliable observations. On the filtered corpus, we evaluate multiple predictive architectures, including LSTM, Bidirectional LSTM variants, and DistilBERT, for review-level rating prediction, and we further examine hotel-level temporal forecasting through aggregated historical review signals over a 30-day horizon. The results indicate that model performance depends strongly on both data reliability and architectural choice. Among recurrent models, BiLSTM with self-attention achieves the best performance, while DistilBERT yields the strongest overall results. Ablation analysis confirms that the full preprocessing pipeline consistently improves prediction quality, and the forecasting experiments indicate that aggregated review features contain useful information for short-term hotel rating dynamics. The study contributes a systematic and practically relevant framework for rating prediction and hospitality analytics in support of reputation management.

Keywords: hotel reviews; rating prediction; sentiment analysis; deep learning; natural language processing; anomaly detection; hospitality analytics; recurrent neural networks

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
