This article presents PrevOccupAI-HAR, a new publicly available dataset designed to advance smartphone-based human activity recognition (HAR) in office environments. PrevOccupAI-HAR comprises two sub-datasets: (1) a model development dataset collected under controlled conditions, featuring 20 subjects performing nine sub-activities associated with three main activity classes (sitting, standing, and walking), and (2) a real-world dataset captured in an unconstrained office setting from 13 subjects carrying out their daily office work for six hours continuously. Three machine learning models, namely k-nearest neighbors (KNN), support vector machine (SVM), and random forest, were trained on the model development dataset to classify the three main classes independently of sub-activity variation. On the development dataset, the models achieved accuracies of 90.94 %, 92.33 %, and 93.02 % for KNN, SVM, and random forest, respectively. When deployed on the real-world dataset, the models attained mean accuracies of 69.32 %, 79.43 %, and 77.81 %, respectively, reflecting performance degradations of between 12.90 and 21.62 percentage points. Analysis of sequential predictions revealed frequent short-duration misclassifications, predominantly between sitting and standing, resulting in unstable model outputs. The findings highlight key challenges in transitioning HAR models from controlled to real-world contexts and point to future research directions involving temporal deep learning architectures or post-processing methods to enhance prediction consistency.
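
To make concrete the kind of post-processing the abstract points to, the following is a minimal sketch of a sliding-window majority vote over a sequence of per-window class predictions. It is an illustrative assumption, not the method used or evaluated in this article: the function name `majority_smooth`, the `window` parameter, and the tie-breaking rule are all hypothetical choices.

```python
from collections import Counter


def majority_smooth(labels, window=5):
    """Replace each label with the most common label in a centered window.

    Hypothetical helper for illustration only; `window` must be a positive
    odd integer. Near the sequence boundaries the window is truncated.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        top = max(counts.values())
        if counts[labels[i]] == top:
            # The current label is (tied for) the most common: keep it.
            smoothed.append(labels[i])
        else:
            # Otherwise adopt the most common label in the window.
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


# A single isolated "stand" inside a run of "sit" predictions is removed,
# while the genuine change to "walk" is preserved.
predictions = ["sit", "sit", "stand", "sit", "sit", "walk", "walk", "walk"]
print(majority_smooth(predictions, window=3))
```

A filter of this kind trades responsiveness for stability: a larger `window` suppresses more short-duration flips between sitting and standing, but it also delays or erases genuinely brief activity changes, so the window length would need to be tuned against the real-world data.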