Preprint
Article

This version is not peer-reviewed.

Smart Bedside Traceability of Caregiver-Patient Interactions Using Wearables and Tiny Localization Anchors

Submitted: 01 June 2026

Posted: 02 June 2026


Abstract
Caregiver–patient traceability is essential for measuring care workload and interaction time in shared hospital rooms, where a single caregiver attends multiple patients and manual documentation is intrusive, time-consuming, and prone to errors. This paper proposes a non-invasive smart bedside sensing approach based on a commercial smartwatch worn by the caregiver and two compact ultra-wideband (UWB) localization anchors placed near the monitored beds. The smartwatch provides Magnetic, Angular Rate, and Gravity (MARG) data, while the UWB anchors enable bed-level proximity estimation without any device worn by the patient. The system was evaluated in a shared hospital room of approximately 5.0 × 4.8 m², with two hospital beds separated by 1.3–1.7 m. A dataset of four scenes totalling 19.9 min was collected, producing 1,140 one-second labeled samples and 1,074 valid windows after past–delayed sliding-window segmentation. Five supervised models were evaluated, including Long Short-Term Memory (LSTM) networks, a hybrid Convolutional Neural Network plus LSTM (CNN+LSTM) architecture, Extreme Gradient Boosting (XGBoost), and a non-linear Support Vector Machine (SVM). The Simple LSTM achieved the best performance, with 96% accuracy and a macro F1-score of 0.89. The results demonstrate accurate identification of the attended bed from wrist-worn and UWB signals alone, supporting scalable and objective caregiver–patient traceability in shared hospital rooms with minimal infrastructure.

1. Introduction

The increasing demand for healthcare and long-term care services, driven by population aging and the growing prevalence of chronic conditions, has intensified the need for objective, continuous, and scalable methods to characterize care workload and caregiver–patient interactions [1]. This need is particularly acute in multi-occupancy care spaces—shared hospital rooms and residential bedrooms—where a single caregiver attends multiple patients sequentially and manual documentation is both disruptive and unreliable [2]. Accurately determining who is being attended, for how long, and in which spatial context is fundamental to support workload assessment, resource allocation, quality-of-care auditing, and evidence-based staffing decisions. Yet these interactions are still commonly documented through manual records or self-reported logs, which are time-consuming, intrusive, and prone to incompleteness [3].
Ambient Assisted Living (AAL) technologies, wearable devices, and non-invasive sensing systems have opened new opportunities for monitoring health- and care-related behaviors in situ [4,5]. By enabling the continuous acquisition of contextual and behavioral data, these technologies support the extraction of indicators related to mobility, routines, social interaction, and caregiver support [6]. Nevertheless, transforming raw sensor streams into actionable caregiver–patient interaction records remains challenging, especially in shared bedrooms where several individuals coexist simultaneously and the spatial separation between care recipients may be as small as one or two meters. Real-Time Locating Systems (RTLS) and wearable inertial sensing have been separately explored for staff tracking and nurse workflow analysis [7,8]. Ultra-Wideband (UWB) technology is especially promising for indoor localization due to its centimeter-level ranging accuracy and suitability for compact, low-power devices [9]. In parallel, wearable Magnetic, Angular Rate, and Gravity (MARG) sensors embedded in commercial smartwatches can capture body motion and orientation with minimal disruption to care workflows. However, combining UWB proximity with wrist-worn MARG data for fine-grained, patient-level interaction traceability in shared bedrooms remains an open and insufficiently addressed problem [10].
Existing approaches to caregiver–patient proximity detection suffer from one or more of the following limitations: (i) they rely on patient-worn devices, which raises acceptability and infection-control concerns in clinical settings; (ii) they deploy dense RTLS infrastructure (many anchors, dedicated servers), which is costly and difficult to scale; or (iii) they resolve location at room level rather than bed level, which is insufficient when two care recipients occupy the same room. There is therefore a clear need for lightweight, non-invasive systems that can distinguish between beds within a shared room using only caregiver-worn sensing and minimal bedside hardware.
This paper addresses that need with the following three contributions:
  • A lightweight sensing configuration that requires only a caregiver-worn commercial smartwatch and two compact bedside UWB anchors, with no device worn by the patient and no dedicated localization server.
  • A multimodal dataset of four scenes collected in a shared hospital room, designed to capture realistic caregiver–patient interaction patterns under clinically representative conditions.
  • A comparative evaluation of machine learning and deep learning models for bed-level interaction classification, demonstrating accurate caregiver–patient traceability from MARG and UWB signals alone.
The remainder of this paper is organized as follows. Section 2 reviews related work on care time measurement, AAL monitoring, indoor localization, wearable sensing, and multimodal human activity recognition. Section 3 describes the sensing system, deployment scenario, data acquisition protocol, preprocessing pipeline, and classification models. Section 4 presents the experimental results. Section 5 discusses the findings and their implications. Section 6 summarizes the main conclusions and outlines future research directions.

3. Materials and Methods

3.1. Sensing Architecture and Hardware

The proposed system combines three hardware elements, depicted in Figure 1: a commercial smartwatch worn by the caregiver, two compact UWB anchors fixed at the headboard of each monitored bed, and a Raspberry Pi 4B edge gateway. Care recipients carry no device. Figure 2 shows the wearable and UWB components alongside the smartwatch application interface, which displays real-time ranging status and the estimated attended zone.
The wearable sensing platform is a Google Pixel Watch 4 [24], which provides continuous MARG data from its on-board accelerometer, gyroscope, and magnetometer. The acquired stream comprises tri-axial acceleration $\{\mathrm{acc}_x, \mathrm{acc}_y, \mathrm{acc}_z\}$, tri-axial angular rate $\{\mathrm{gyr}_x, \mathrm{gyr}_y, \mathrm{gyr}_z\}$, tri-axial magnetic field $\{\mathrm{mag}_x, \mathrm{mag}_y, \mathrm{mag}_z\}$, and rotation-vector components $\{\mathrm{rot}_x, \mathrm{rot}_y, \mathrm{rot}_z, \mathrm{rot}_w, \mathrm{rot\_accuracy}\}$, yielding 14 channels in total. The ambient localization component relies on Qorvo DWM3001CDK UWB development kits [25]. One-to-one Two-Way Ranging (TWR) sessions between a UWB tag carried by the caregiver and each bedside anchor produce two bed-level proximity streams, denoted distance_00_01 (Bed A) and distance_00_02 (Bed B). The Raspberry Pi 4B gateway hosted a local Mosquitto MQTT broker using the default configuration on Transmission Control Protocol (TCP) port 1883. UWB ranging measurements from the two bedside anchors were transmitted as lightweight MQTT messages within the local wireless network and stored locally with gateway-side timestamps. Messages were published using Quality of Service (QoS) 0 (at-most-once delivery), which minimizes transmission overhead and is suitable for real-time sensor streaming where occasional missing samples can be handled during preprocessing. No cloud broker or external server was used. In the recorded dataset, the median UWB update interval was approximately 2.7–2.9 s per anchor, with occasional longer gaps associated with ranging dropouts. The smartwatch orchestrates each acquisition session by assigning a unique identifier shared across all data streams, which allows the MARG and UWB data to be temporally aligned during postprocessing.

3.2. Deployment Scenario and Dataset

Data were collected in a shared hospital room equipped with two hospital beds, bedside clinical equipment, wall-mounted medical infrastructure, and clinical training mannequins (Figure 3). The room measured approximately 5.0 × 4.8 m², with a ceiling height of approximately 2.7 m. The two beds were arranged in parallel with a mattress-edge separation of 1.3–1.7 m, representing a compact shared clinical space in which bed-level discrimination is realistically challenging. One UWB anchor was placed near the headwall of each bed at a height of approximately 1.2–1.5 m, yielding a horizontal anchor separation of approximately 3.0 m.
A single caregiver performed structured interaction sessions at each bed across four scenes. Ground-truth labels were recorded in real time via the smartwatch interface, which timestamped the start and end of each interaction with Bed A, Bed B, or neither (Class 0, corresponding to transitions or non-care activities). Table 2 summarizes the dataset statistics.

3.3. Data Preprocessing and Temporal Segmentation

Raw recordings were stored in long format and transformed into a multivariate second-level representation by aggregating sensor samples using the per-channel median and assigning labels by modal class within each one-second interval. The resulting tensor combines 14 MARG channels with 2 UWB ranging channels, yielding $D = 16$ features per time step. Temporal samples were generated using a past–delayed sliding-window strategy (Figure 4): each window of $L = p + 1 + d = 11$ time steps is centered on the instant to be labeled, with $p = 5$ past seconds and $d = 5$ delayed (future) seconds. Sequential models received the window as a tensor $W_t \in \mathbb{R}^{11 \times 16}$, while classical methods used its flattened form $z_t \in \mathbb{R}^{176}$. Windows were retained only when at least one valid UWB measurement was available from each anchor within the segment; missing readings caused by occasional ranging dropouts were encoded as $-1$. Feature standardization (zero mean, unit variance) was fitted exclusively on the training scenes of each cross-validation fold to prevent data leakage.
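The segmentation steps above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the position of the UWB columns (indices 14 and 15), and the dropout sentinel value are assumptions made for illustration only.

```python
from collections import Counter
from statistics import median

P, D = 5, 5        # past and delayed (future) seconds, as stated in the text
MISSING = -1.0     # assumed sentinel for a missing UWB reading


def aggregate_second(samples, labels):
    """Collapse the raw samples of one second into per-channel medians and a modal label."""
    channels = list(zip(*samples))                  # one tuple per channel
    modal_label = Counter(labels).most_common(1)[0][0]
    return [median(ch) for ch in channels], modal_label


def make_windows(features, labels, uwb_cols=(14, 15)):
    """Build centered windows of P + 1 + D seconds over a single scene.

    A window is kept only if each UWB column holds at least one valid reading.
    The first P and last D seconds cannot be window centers and are skipped.
    """
    windows, targets = [], []
    for t in range(P, len(features) - D):
        win = features[t - P:t + D + 1]
        if all(any(row[c] != MISSING for row in win) for c in uwb_cols):
            windows.append(win)
            targets.append(labels[t])
    return windows, targets


def flatten(window):
    """Flatten an 11 x 16 window into the 176-value vector used by classical models."""
    return [value for row in window for value in row]
```

Windows are built per scene, so a window never spans two recordings; this is also why each scene loses ten candidate centers.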

3.4. Classification Models and Evaluation Protocol

Five supervised classifiers were trained and evaluated, spanning sequential deep learning architectures and classical machine learning baselines. Two LSTM variants were considered. The Simple LSTM uses a single recurrent layer with 64 hidden units, followed by dropout ($p = 0.3$), a dense layer with 32 neurons and Rectified Linear Unit (ReLU) activation, and a softmax output layer. The Stacked LSTM uses two recurrent layers with 128 and 64 hidden units, respectively, with dropout ($p = 0.3$) after each recurrent block, followed by a dense layer with 64 neurons, dropout ($p = 0.2$), and a softmax output. The CNN+LSTM model first applies two one-dimensional convolutional layers with 32 and 64 filters, kernel size 3, same padding, ReLU activation, layer normalization, and dropout ($p = 0.2$). The resulting sequence is then processed by two LSTM layers with 128 and 64 hidden units, followed by dropout, a dense layer with 64 neurons, and a final softmax classifier.
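To give a sense of the relative size of the two LSTM variants, the following sketch counts their trainable parameters from the layer sizes given above. It assumes the standard LSTM parametrization with four gates and a single bias vector per gate; the deep learning framework is not stated in the text, so the exact counts of the original models may differ. The helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and one bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Fully connected layer: weight matrix plus bias."""
    return input_dim * units + units


D_FEATURES, N_CLASSES = 16, 3   # 14 MARG + 2 UWB channels; classes 0, A, B

# Simple LSTM: LSTM(64) -> Dense(32, ReLU) -> softmax(3). Dropout adds no parameters.
simple_lstm = (
    lstm_params(D_FEATURES, 64)
    + dense_params(64, 32)
    + dense_params(32, N_CLASSES)
)

# Stacked LSTM: LSTM(128) -> LSTM(64) -> Dense(64) -> softmax(3).
stacked_lstm = (
    lstm_params(D_FEATURES, 128)
    + lstm_params(128, 64)
    + dense_params(64, 64)
    + dense_params(64, N_CLASSES)
)

print(simple_lstm, stacked_lstm)   # 22915 128003
```

Under these assumptions the Stacked LSTM has roughly 5.6 times as many parameters as the Simple LSTM, which is worth keeping in mind given that only about a thousand windows are available for training and testing.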
All neural models were trained for 40 epochs with a batch size of 2 using sparse categorical cross-entropy loss $\mathcal{L} = -\sum_{c=1}^{C} y_c \log \hat{y}_c$. The Adam optimizer was used with a learning rate of $10^{-3}$. No early stopping was applied, so all neural models were trained for the same number of epochs in every fold. The random seed was fixed to 42 for reproducibility. As a tree-based baseline, XGBoost was configured with 300 estimators, maximum depth 3, learning rate 0.05, subsampling ratio 0.9, column subsampling ratio 0.9, histogram-based tree construction, and multiclass log-loss on the flattened window vector $z_t \in \mathbb{R}^{176}$. A non-linear SVM with an RBF kernel $K(z_i, z_j) = \exp(-\gamma \lVert z_i - z_j \rVert^2)$, $C = 1.0$, $\gamma = \text{scale}$, balanced class weights, and probabilistic outputs completed the evaluation as a standard non-parametric baseline.
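The loss and kernel used in this subsection can be written out in a few lines of plain Python. This is an illustrative sketch only: it assumes that "scale" refers to the scikit-learn heuristic $\gamma = 1 / (n_{\text{features}} \cdot \mathrm{Var}(X))$, with the variance taken over all entries of the training matrix, which the text does not state explicitly. Function names are hypothetical.

```python
import math


def sparse_categorical_cross_entropy(probs, true_class):
    """Per-sample loss: minus the log-probability assigned to the true class index."""
    return -math.log(probs[true_class])


def scale_gamma(X):
    """Assumed 'scale' heuristic: 1 / (n_features * variance of all entries of X)."""
    values = [v for row in X for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return 1.0 / (len(X[0]) * var)


def rbf_kernel(zi, zj, gamma):
    """K(zi, zj) = exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(zi, zj))
    return math.exp(-gamma * sq_dist)
```

With one-hot targets, the sum over classes in the loss reduces to the single term for the true class, which is why the sparse form only needs the class index.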
All models were evaluated under a leave-one-scene-out cross-validation protocol. In each fold, one complete scene was held out for testing and the remaining three scenes were used for training. This protocol evaluates generalization across recording sessions and avoids mixing temporally adjacent windows from the same scene between training and testing. Feature standardization was fitted exclusively on the training scenes of each fold and then applied to the held-out scene, preventing information leakage from the test scene. Performance was measured using overall accuracy and macro-averaged precision, recall, and F1-score, complemented by per-class F1-scores for Class 0 (no interaction), Class A (Bed A), and Class B (Bed B). In addition to the aggregated global metrics, per-scene results were reported as mean and standard deviation across the four held-out scenes.
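The evaluation protocol described above can be sketched as follows, assuming the windows are grouped per scene in a dictionary. The data layout and function names are hypothetical; the point of the sketch is that the standardization statistics are computed from the training scenes only and then reused unchanged on the held-out scene.

```python
def fit_standardizer(rows):
    """Per-feature mean and standard deviation, computed on training rows only."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # A constant feature would give a zero deviation; fall back to 1.0 in that case.
    stds = [(sum((r[j] - means[j]) ** 2 for r in rows) / n) ** 0.5 or 1.0
            for j in range(d)]
    return means, stds


def apply_standardizer(rows, means, stds):
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def leave_one_scene_out(scenes):
    """scenes: dict scene_id -> (rows, labels). Yields one fold per held-out scene."""
    for held_out in scenes:
        train_rows, train_labels = [], []
        for scene_id, (rows, labels) in scenes.items():
            if scene_id != held_out:
                train_rows.extend(rows)
                train_labels.extend(labels)
        means, stds = fit_standardizer(train_rows)
        test_rows, test_labels = scenes[held_out]
        yield (
            held_out,
            (apply_standardizer(train_rows, means, stds), train_labels),
            (apply_standardizer(test_rows, means, stds), test_labels),
        )
```

With four scenes this yields four folds, each training on three scenes and testing on the remaining one.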

4. Results

Following the pipeline described in Section 3.3, the four scenes produced 1,140 one-second labeled samples after temporal aggregation, distributed as 62/468/610 samples for classes 0, A, and B, respectively. These values correspond to the second-level representation reported in Table 2, not to the final windows used for model evaluation. The centered 11-second sliding-window strategy removes the first and last five seconds of each scene as possible window centers; therefore, 40 candidate centers are lost across the four scenes, resulting in 1,100 candidate windows. Of these, 26 windows were discarded because they did not contain at least one valid UWB measurement from each anchor within the segment. The final evaluation set therefore comprised 1,074 valid windows, distributed as 33/468/573 for classes 0, A, and B, respectively. Table 3 summarizes the global classification performance of the five evaluated models.
The Simple LSTM achieved the best overall performance, with an accuracy of 0.96 and a macro F1-score of 0.89. The strong per-class scores for Beds A and B (F1 of 0.96 and 0.98, respectively) indicate that the model reliably discriminates between the two beds when the caregiver is actively attending either one. The lower F1-score for Class 0 (0.75) reflects the inherent difficulty of the transition class: the caregiver is moving between beds or performing non-patient tasks, and the UWB signals are ambiguous during these brief intervals. The CNN+LSTM achieved comparable results (accuracy 0.95, macro F1 0.88), with a slight improvement in Class 0 recall relative to the Simple LSTM. XGBoost matched the Simple LSTM in accuracy (0.96) but obtained a substantially lower F1-score for Class 0 (0.35), indicating that the flattened temporal representation loses discriminative information for the minority transition class. The SVM baseline showed the weakest performance across all classes (macro F1 0.64, near-zero F1 for Class 0), suggesting that a fixed RBF kernel struggles with the complex non-stationary patterns of this multimodal stream. The performance gap between Simple LSTM and Stacked LSTM (macro F1: 0.89 vs. 0.78) suggests that a deeper recurrent architecture overfits the limited training data, and that a single recurrent layer is sufficient at the one-second resolution used here.
Table 4 reports the accuracy obtained in each leave-one-scene-out fold, where each scene is used once as the held-out test set. The Simple LSTM obtained the highest mean accuracy across the four folds and showed stable performance across scenes. CNN+LSTM and XGBoost achieved comparable mean accuracies, although their behavior differed across folds. CNN+LSTM was particularly stable in Scenes 1–3, whereas XGBoost performed strongly in Scenes 1 and 4 but dropped in Scene 2. The Stacked LSTM showed the largest degradation in Scene 2, suggesting that the deeper recurrent architecture was less robust under scene-specific changes. The SVM baseline exhibited the highest variability across scenes, indicating lower robustness to changes in the multimodal feature distribution.
The confusion matrices in Figure 5 provide a more detailed view of the classification behavior of each model. For the Simple LSTM model, shown separately in Figure 6, most errors are concentrated in the minority Class 0: 12 of the 33 transition windows were classified as Bed A, while none were classified as Bed B. In contrast, the attended-bed classes were highly stable, with only 16 Bed A windows classified as Bed B and 11 Bed B windows classified as Bed A. This behavior is important for the intended application because caregiver–patient interaction time is mainly estimated from the accumulated duration of classes A and B. Therefore, occasional errors in the transition class have a limited effect on per-bed interaction-time estimation, whereas direct confusions between Bed A and Bed B would be more critical. The Simple LSTM and CNN+LSTM models showed the lowest number of such direct bed-to-bed confusions, supporting their suitability for bedside traceability.
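The relationship between a confusion matrix and the reported accuracy, per-class F1, and macro F1 can be made concrete with a short sketch. The matrix below is a hypothetical toy example chosen to mimic a small minority class; it is not the confusion matrix of any model in this study.

```python
def accuracy(cm):
    """Fraction of windows on the diagonal. cm[i][j]: true class i predicted as j."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def per_class_f1(cm):
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    k = len(cm)
    scores = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp   # predicted c, true class differs
        fn = sum(cm[c]) - tp                        # true c, predicted otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of per-class F1, so the minority class counts equally."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


# Hypothetical toy matrix, rows and columns ordered as Class 0, Bed A, Bed B.
toy_cm = [
    [6, 4, 0],
    [1, 90, 9],
    [1, 5, 84],
]
print(round(accuracy(toy_cm), 3), round(macro_f1(toy_cm), 3))   # 0.9 0.83
```

In the toy example the ten Class 0 windows barely affect accuracy but pull the macro F1 well below it, which is the same pattern that separates accuracy from macro F1 for the models compared here.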

5. Discussion

The experimental results show that bed-level caregiver–patient traceability can be achieved with a minimal sensing configuration composed of a caregiver-worn smartwatch, two compact UWB anchors, and a local edge gateway. The proposed system does not attempt to reconstruct a full indoor trajectory; instead, it focuses on the more specific and clinically useful question of identifying which bed is being attended at each time window. This design choice reduces infrastructure requirements while preserving the information needed to estimate per-patient bedside interaction time in shared rooms.
The leave-one-scene-out evaluation is particularly relevant because each fold tests the models on a complete scene that was not observed during training. This is stricter than a random window-level split, where neighboring windows from the same recording could appear in both training and testing. Under this protocol, the Simple LSTM achieved the best global balance between accuracy, macro F1-score, and stability across scenes. Its mean accuracy across folds was 0.963 ± 0.031, and the global confusion matrix shows that direct confusions between Bed A and Bed B were limited. This is important because the primary downstream variable is not only the instantaneous class label, but also the accumulated time assigned to each patient bed.
The results also clarify the role of temporal modeling. The caregiver’s location and wrist motion are not independent from one second to the next: approaching a bed, performing bedside actions, and leaving the interaction area generate short temporal patterns in both MARG and UWB streams. Sequential models can exploit this structure directly, whereas classical models receive a flattened representation of the same window. XGBoost achieved high overall accuracy, but its much lower F1-score for Class 0 indicates that flattening the temporal context is less effective for modeling transitions. The SVM baseline showed the weakest global performance and the highest variability across scenes, suggesting that a fixed kernel is less suitable for the non-stationary multimodal patterns generated during bedside care.
The transition class deserves special attention. Class 0 represents short periods in which the caregiver is not clearly attending either bed, including movements between beds and non-care actions. These windows are both under-represented and intrinsically ambiguous: UWB distances may be similar for both anchors, while wrist movements are more variable than during stable bedside interaction. As a result, all models obtained lower performance for this class. However, this has a limited impact on the intended use case, because caregiver–patient interaction time is estimated mainly by accumulating windows classified as Bed A or Bed B. Direct bed-to-bed confusions are therefore more harmful than transition-to-bed errors. The Simple LSTM produced few Bed A/Bed B swaps, which supports its use as the preferred model for estimating attended-bed duration.
From an application perspective, the classification output can be transformed into interaction-time estimates by accumulating the duration of consecutive windows assigned to each bed. Since each labeled window corresponds to a one-second center instant, the predicted sequence provides a second-level trace of caregiver attention. This does not replace clinical judgment and should be interpreted as an automated estimate rather than an exact manual annotation. Nevertheless, it provides an objective and scalable basis for quantifying bedside presence, comparing workload between patients, and identifying interaction patterns that would be difficult to capture through manual observation alone. Compared with previous proximity-based approaches, the proposed system offers three practical advantages. First, patients do not need to wear any device, which improves acceptability and reduces infection-control concerns. Second, only two UWB anchors are required for a two-bed room, avoiding dense RTLS deployments. Third, all communication and processing are performed locally through the Raspberry Pi gateway, avoiding dependence on cloud infrastructure. These characteristics make the approach suitable for controlled hospital rooms and residential-care environments where privacy, simplicity, and low deployment burden are important.
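The conversion from a predicted label sequence to per-bed interaction time described in this section can be sketched as follows. The label values ("0", "A", "B"), the function name, and the notion of an "episode" as a run of consecutive identical labels are assumptions made for illustration; the text does not prescribe a specific aggregation rule.

```python
from itertools import groupby


def interaction_time(predictions, step_s=1.0, beds=("A", "B")):
    """Accumulate attended seconds and count contiguous episodes per bed.

    predictions: per-second predicted labels in temporal order.
    step_s: duration represented by each prediction (one second here).
    """
    totals = {bed: 0.0 for bed in beds}
    episodes = {bed: 0 for bed in beds}
    for label, run in groupby(predictions):
        if label in totals:                 # transition class "0" is ignored
            totals[label] += step_s * len(list(run))
            episodes[label] += 1
    return totals, episodes


# Hypothetical predicted sequence: 5 s at Bed A, 2 s transition, 4 s at Bed B, 3 s at Bed A.
predicted = ["A"] * 5 + ["0"] * 2 + ["B"] * 4 + ["A"] * 3
totals, episodes = interaction_time(predicted)
print(totals, episodes)   # {'A': 8.0, 'B': 4.0} {'A': 2, 'B': 1}
```

Because a window misclassified as the other bed both removes time from one patient and adds it to the other, this aggregation is more sensitive to Bed A/Bed B swaps than to errors involving the transition class.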
Several limitations should also be acknowledged. The dataset was collected with a single caregiver and four scenes, which limits generalization to different caregivers, care routines, and room layouts. The current system was validated in a two-bed configuration; rooms with more beds would require additional anchors and a revised classification setup. The ground truth was manually annotated through the smartwatch interface, which may introduce small temporal offsets at the beginning and end of interactions. Finally, although the classification results support interaction-time estimation, the present evaluation does not yet report a dedicated time-estimation error metric, such as mean absolute error in seconds per bed. Future work should therefore validate the full time-aggregation pipeline against manually annotated care episodes, include multiple caregivers, and evaluate the system in longer real-world deployments.

6. Conclusions

This paper has presented a non-invasive smart bedside sensing system for caregiver–patient traceability in shared hospital rooms. The system combines a commercial smartwatch worn by the caregiver, which provides 14-channel MARG data, with two compact UWB anchors placed at the headboard of each monitored bed, which provide bed-level proximity through Two-Way Ranging. It requires no device on the patient and no dedicated localization infrastructure beyond a Raspberry Pi edge gateway, keeping all communication and processing within the local network. Multimodal data were aggregated at one-second resolution, segmented using a past–delayed sliding-window strategy, and classified by five supervised models ranging from classical baselines (SVM, XGBoost) to sequential deep learning architectures (Simple LSTM, Stacked LSTM, CNN+LSTM).
Under a leave-one-scene-out cross-validation protocol, in which each recording session is held out once for testing, the Simple LSTM achieved the best overall behavior, with 96% global accuracy, a macro F1-score of 0.89, and a mean accuracy of 0.963 ± 0.031 across the four folds. Beyond aggregate accuracy, the analysis of the confusion matrices showed that the sequential models produced very few direct Bed A/Bed B confusions, concentrating their errors in the under-represented and intrinsically ambiguous transition class. This error structure is favorable for the intended application, since caregiver–patient interaction time is estimated by accumulating the windows attributed to each bed, and occasional transition errors have a limited effect on per-bed time estimates. The comparison between models further confirmed that temporal context spanning several seconds is essential: sequential architectures, which exploit the short motion and proximity patterns generated when approaching, attending, and leaving a bed, consistently outperformed the flattened-window classical baselines on the transition class.
From a practical standpoint, the proposed approach offers three advantages over previous proximity-based systems: patients wear no device, which improves acceptability and reduces infection-control concerns; only two anchors are needed for a two-bed room, avoiding dense RTLS deployments; and all data remain on a local edge gateway, avoiding dependence on cloud infrastructure. Together, these properties make the system suitable for privacy-sensitive clinical and residential-care environments where simplicity and low deployment burden are essential.
Future work will focus on closing the gap between classification and downstream time estimation by reporting per-bed mean absolute error in seconds against manually annotated care episodes; extending the dataset to multiple caregivers, larger bed configurations, and longer real-world deployments to strengthen generalization; and deploying on-device inference through TinyML frameworks to enable real-time, fully local bedside traceability. By turning wrist-worn and minimal UWB sensing into objective, scalable indicators of bedside attention, the proposed system contributes a practical building block toward automated care-workload assessment in shared hospital rooms.

Author Contributions

Conceptualization, A.P.-R. and J.M.-Q.; methodology, A.P.-R. and J.M.-Q.; software, M.Á.A.-M. and A.E.-E.; validation, A.P.-R., M.Á.A.-M. and J.M.-Q.; formal analysis, A.P.-R., M.Á.A.-M. and J.M.-Q.; investigation, A.P.-R., I.V.-L., M.C.-R., B.R.-M. and J.M.-Q.; resources, I.V.-L., M.C.-R., B.R.-M. and J.M.-Q.; data curation, A.P.-R., A.E.-E., M.C.-R. and B.R.-M.; writing—original draft preparation, A.P.-R., M.Á.A.-M. and J.M.-Q.; writing—review and editing, A.P.-R., A.E.-E., M.Á.A.-M., I.V.-L., M.C.-R., B.R.-M. and J.M.-Q.; visualization, A.P.-R. and M.Á.A.-M.; supervision, J.M.-Q.; project administration, A.P.-R. and J.M.-Q.; funding acquisition, I.V.-L. and J.M.-Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Ethical review and approval were waived for this study because no patients or vulnerable participants were involved, and data were collected in a controlled environment using non-clinical simulated care activities.

Data Availability Statement

The dataset and code used in this study are available upon reasonable request to the corresponding author.

Acknowledgments

The authors acknowledge the support of the project “HERA: Pilotaje de Herramienta de Evaluación y Respuesta Anticipada de cambios conductuales en el seguimiento del paciente con Demencia en el Hogar”, file number AP-0030-2025, led by the principal investigator Maria Isabel Valenzuela López at D.S.A.P. Granada.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cacchione, P.Z. World Health Organization leads the 2021 to 2030 decade of healthy ageing. Clin. Nurs. Res. 2022, 31, 3–4.
  2. Urwin, S.; Lau, Y.S.; Grande, G.; Sutton, M. The challenges of measuring informal care time: A review of the literature. PharmacoEconomics 2021, 39, 1209–1223.
  3. Westbrook, J.I.; Duffield, C.; Li, L.; Creswick, N.J. How much time do nurses have for patients? A longitudinal study quantifying hospital nurses’ patterns of task time distribution and interactions with health professionals. BMC Health Serv. Res. 2011, 11, 319.
  4. Ordóñez, F.J.; Roggen, D. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition. Sensors 2016, 16, 115.
  5. Polo-Rodríguez, A.; Romero-Sanchez, J.; Fernández-García, E.; Paloma-Castro, O.; Porcel-Gálvez, A.M.; Medina-Quero, J. Review on Internet of Things for innovation in nursing process—A PubMed-based search. In Proceedings of the 15th International Conference on Ubiquitous Computing & Ambient Intelligence (UCAmI 2023); Springer, 2023; Volume 835, Lecture Notes in Networks and Systems; pp. 57–70.
  6. Nweke, H.F.; Teh, Y.W.; Al-Garadi, M.A.; Alo, U.R. Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: State of the art and research challenges. Expert Syst. Appl. 2018, 105, 233–261.
  7. Kamel Boulos, M.N.; Berry, G. Real-time locating systems (RTLS) in healthcare: A condensed primer. Int. J. Health Geogr. 2012, 11, 25.
  8. Wichmann, J. Indoor positioning systems in hospitals: A scoping review. Digit. Health 2022, 8, 20552076221081696.
  9. Elsanhoury, M.; Makela, P.; Koljonen, J.; Valisuo, P.; Shamsuzzoha, A.; Mantere, T.; Elmusrati, M.; Kuusniemi, H. Precision positioning for smart logistics using ultra-wideband technology-based indoor navigation: A review. IEEE Access 2022, 10, 44413–44445.
  10. Polo-Rodríguez, A.; Anguita-Molina, M.Á.; Gil, D.; Romero-Sanchez, J.; Fernández, E.; Paloma-Castro, O.; Porcel-Gálvez, A.M.; Medina-Quero, J. Discovering social interactions between caregivers and frail individuals using indoor localization. In Proceedings of the International Conference on Ubiquitous Computing and Ambient Intelligence (UCAmI 2024); Springer, 2024; Volume 1212, Lecture Notes in Networks and Systems; pp. 319–331.
  11. van den Berg, B.; Spauwen, P. Measurement of informal care: An empirical study into the valid measurement of time spent on informal caregiving. Health Econ. 2006, 15, 447–460.
  12. Hendrich, A.; Chow, M.P.; Skierczynski, B.A.; Lu, Z. A 36-hospital time and motion study: How do medical-surgical nurses spend their time? Perm. J. 2008, 12, 25–34.
  13. Munyisia, E.N.; Yu, P.; Hailey, D. How nursing staff spend their time on activities in a nursing home: An observational study. J. Adv. Nurs. 2011, 67, 1908–1917.
  14. Cicirelli, G.; Marani, R.; Petitti, A.; Milella, A.; D’Orazio, T. Ambient Assisted Living: A review of technologies, methodologies and future perspectives for healthy aging of population. Sensors 2021, 21, 3549.
  15. Caballero, P.; Ortiz, G.; Medina-Bulo, I. Systematic literature review of ambient assisted living systems supported by the Internet of Things. Univers. Access Inf. Soc. 2024, 23, 1631–1656.
  16. Choi, Y.K.; Thompson, H.J.; Demiris, G. Use of an Internet-of-Things smart home system for healthy aging in older adults in residential settings: Pilot feasibility study. JMIR Aging 2020, 3, e21964.
  17. Sauzéon, H.; Edjolo, A.; Amieva, H.; Consel, C.; Pérès, K. Effectiveness of an Ambient Assisted Living platform for supporting aging in place of older adults with frailty: Protocol for a quasi-experimental study. JMIR Res. Protoc. 2022, 11, e33351.
  18. Girolami, M.; Mavilia, F.; Delmastro, F. Sensing social interactions through BLE beacons and commercial mobile devices. Pervasive Mob. Comput. 2020, 67, 101198.
  19. Baronti, P.; Barsocchi, P.; Chessa, S.; Crivello, A.; Girolami, M.; Mavilia, F.; Palumbo, F. Remote detection of social interactions in indoor environments through Bluetooth Low Energy beacons. J. Ambient Intell. Smart Environ. 2020, 12, 203–217.
  20. Garcia, C.; Inoue, S. Relabeling for Indoor Localization Using Stationary Beacons in Nursing Care Facilities. Sensors 2024, 24, 319.
  21. Polo-Rodríguez, A.; Anguita-Molina, M.A.; Rojas-Ruiz, I.; Medina-Quero, J. Multi-occupant tracking with radar and wearable devices for enhanced accuracy in indoor environments. Eng. Appl. Artif. Intell. 2025, 154, 110872.
  22. Anguita-Molina, M.A.; Soto-Hidalgo, J.M.; Medina-Quero, J.; Polo-Rodríguez, A. Evaluation of Edge-Based UWB Indoor Positioning Using Smartwatches and Embedded Anchors. In Proceedings of the UCAmI 2025, LNNS 1818; 2025; pp. 339–350.
  23. Anguita-Molina, M.A.; Cardoso, P.J.S.; Rodrigues, J.M.F.; Medina-Quero, J.; Polo-Rodríguez, A. Multioccupancy Activity Recognition Based on Deep Learning Models Fusing UWB Localization Heatmaps and Nearby-Sensor Interaction. IEEE Internet Things J. 2025, 12, 16037–16052.
  24. Google LLC. Google Pixel Watch 4 technical specifications. 2024. Accessed: May 2026. Available online: https://store.google.com/product/pixel_watch_4_specs.
  25. Qorvo Inc. DWM3001CDK ultra-wideband development kit datasheet. 2024. Accessed: May 2026. Available online: https://www.qorvo.com/products/p/DWM3001CDK.
Figure 1. Deployment of the proposed smart bedside sensing system. One compact UWB anchor (DWM3001CDK) is placed at the headboard of each monitored bed. The caregiver wears a Google Pixel Watch 4 and carries a UWB tag. A Raspberry Pi 4B acts as the edge gateway and Message Queuing Telemetry Transport (MQTT) broker. Patients carry no device.
Figure 2. Sensing hardware. (a) Google Pixel Watch 4 worn on the wrist alongside a Qorvo DWM3001CDK UWB development kit. (b) Smartwatch application displaying real-time UWB status, distance to the nearest anchor, estimated zone (Bed A), and acquisition cycle status.
Figure 3. Hospital deployment scenario. (a) Overview of the shared room with two hospital beds and clinical training mannequins. (b) Caregiver interacting at a patient bed while wearing the Google Pixel Watch 4. (c) Qorvo DWM3001CDK UWB anchor mounted at the headboard. (d) Smartwatch on the caregiver’s wrist. (e) Raspberry Pi 4B edge gateway.
Figure 4. Past–delayed sliding-window segmentation strategy (L = 11, D = 16 channels). Each window is centered on the instant to be labeled t, with p = 5 past seconds and d = 5 future (delayed) seconds. MARG channels (14) and UWB distance channels (2) are shown as rows. The class label (Bed A, Bed B, or None) is assigned at t.
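The segmentation in Figure 4 can be sketched as follows. This is a minimal illustration, assuming each scene has already been aggregated to one row of D = 16 channel values per second and that windows are built per scene so they never span two recordings. The function name `past_delayed_windows` and the toy data are hypothetical, and the criteria used in the paper to discard invalid windows are not reproduced here, so window counts need not match the reported ones.

```python
from typing import List, Sequence, Tuple

Window = List[List[float]]


def past_delayed_windows(
    samples: Sequence[Sequence[float]],
    labels: Sequence[str],
    p: int = 5,
    d: int = 5,
) -> List[Tuple[Window, str]]:
    """Return (window, label) pairs for one scene.

    Each window holds the p seconds before t, the second t itself, and the
    d seconds after t, so its length is L = p + 1 + d. The label is the one
    assigned at t. Instants too close to the scene boundaries are skipped.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    pairs = []
    for t in range(p, len(samples) - d):
        window = [list(row) for row in samples[t - p : t + d + 1]]
        pairs.append((window, labels[t]))
    return pairs


# Hypothetical toy scene: 20 seconds, 16 channels (14 MARG + 2 UWB distances).
toy_samples = [[float(t)] * 16 for t in range(20)]
toy_labels = ["A"] * 10 + ["0"] * 2 + ["B"] * 8
windows = past_delayed_windows(toy_samples, toy_labels)
first_window, first_label = windows[0]
print(len(windows), len(first_window), len(first_window[0]), first_label)
```

With p = d = 5, a 20-second toy scene yields 10 windows of 11 rows by 16 channels, and the first window takes the label of second t = 5. Because the window uses d future seconds, a label for instant t is only available d seconds later in a streaming setting.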
Figure 5. Global confusion matrices obtained by aggregating the predictions of the four leave-one-scene-out folds for each evaluated model. Rows correspond to the true class and columns to the predicted class. Classes are defined as follows: 0 = no-bed interaction or transition, A = interaction with Bed A, and B = interaction with Bed B. The matrices show that most classification errors occur in the minority transition class, whereas the attended-bed classes are generally well separated, especially for the sequential models.
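The aggregation described in the Figure 5 caption can be illustrated with a short sketch: per-fold confusion matrices are summed element-wise, and per-class and macro F1-scores are then derived from the global matrix. The counts below are hypothetical toy values for two folds, not the paper's data, and the helper names `aggregate` and `per_class_f1` are assumptions introduced for illustration only.

```python
from typing import Dict, List

Matrix = List[List[int]]
CLASSES = ["0", "A", "B"]  # 0 = no-bed interaction or transition


def aggregate(folds: List[Matrix]) -> Matrix:
    """Element-wise sum of confusion matrices (rows = true, cols = predicted)."""
    n = len(CLASSES)
    total = [[0] * n for _ in range(n)]
    for cm in folds:
        for i in range(n):
            for j in range(n):
                total[i][j] += cm[i][j]
    return total


def per_class_f1(cm: Matrix) -> Dict[str, float]:
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for k, name in enumerate(CLASSES):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(len(cm))) - tp  # predicted k, true other
        fn = sum(cm[k]) - tp                             # true k, predicted other
        denom = 2 * tp + fp + fn
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical per-fold matrices (NOT the values behind Figure 5).
fold_1 = [[2, 1, 0], [0, 10, 0], [0, 0, 8]]
fold_2 = [[1, 2, 0], [1, 9, 0], [0, 0, 12]]

total = aggregate([fold_1, fold_2])
f1 = per_class_f1(total)
macro_f1 = sum(f1.values()) / len(f1)
print(total, f1, round(macro_f1, 3))
```

In this toy case the minority class 0 has the lowest F1-score while Bed B is classified perfectly, which mirrors the qualitative pattern in the caption: macro F1 weights all three classes equally, so errors on the small transition class lower it far more than they lower accuracy.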
Figure 6. Global confusion matrix for the Simple LSTM model, obtained by aggregating the predictions across the four leave-one-scene-out folds. Rows correspond to the true class and columns to the predicted class. The model correctly classifies most Bed A and Bed B windows, while the main source of error is the minority transition class 0, which is frequently confused with Bed A.
Table 1. Comparison of representative related works on proximity-based caregiver–patient interaction sensing.
| Work | Technology | Environment | Bed-level | Patient device | Model |
|---|---|---|---|---|---|
| [7] | RTLS (IR/BLE) | Hospital | No | Yes | Rule-based |
| [10] | UWB | Residential | No | No | Threshold |
| [21] | UWB + mmWave | Residential | No | No | ConvLSTM |
| [22] | UWB (smartwatch) | Indoor | No | No | |
| [18] | BLE beacons | Office/lab | No | Yes | None |
| [20] | BLE | Nursing care | No | No | Random Forest |
| [8] | Various RTLS | Hospital | No | Yes | Review |
| This work | UWB + MARG | Hospital | Yes | No | LSTM/XGBoost |
Table 2. Dataset statistics for the hospital scenario after aggregation at one-second resolution. Class 0 = no-bed interaction; MARG sampled at 50 Hz; UWB samples per anchor reported separately. The class distribution corresponds to one-second labeled samples before centered-window generation.
| Scene | Duration (min) | 1-s labels (0 / A / B) | MARG samples | UWB samples (Anchor A) | UWB samples (Anchor B) |
|---|---|---|---|---|---|
| Scene 1 | 3.9 | 7 / 129 / 101 | 11,864 | 80 | 80 |
| Scene 2 | 5.2 | 46 / 91 / 177 | 15,768 | 90 | 92 |
| Scene 3 | 5.0 | 3 / 104 / 195 | 15,154 | 86 | 87 |
| Scene 4 | 4.8 | 6 / 144 / 137 | 14,384 | 100 | 100 |
| Total | 19.9 | 62 / 468 / 610 | 57,170 | 356 | 359 |
Table 3. Classification performance of the evaluated models in the hospital scenario. F1-0, F1-A, and F1-B denote per-class F1-scores for the no-interaction class, Bed A, and Bed B, respectively.
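| Model | Accuracy | Macro precision | Macro recall | Macro F1 | F1-0 | F1-A | F1-B |
|---|---|---|---|---|---|---|---|
| Simple LSTM | 0.96 | 0.95 | 0.86 | 0.89 | 0.75 | 0.96 | 0.98 |
| Stacked LSTM | 0.91 | 0.81 | 0.76 | 0.78 | 0.52 | 0.90 | 0.93 |
| CNN+LSTM | 0.95 | 0.89 | 0.88 | 0.88 | 0.74 | 0.95 | 0.97 |
| XGBoost | 0.96 | 0.85 | 0.73 | 0.76 | 0.35 | 0.95 | 0.99 |
| SVM | 0.89 | 0.81 | 0.63 | 0.64 | 0.11 | 0.89 | 0.92 |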
Table 4. Leave-one-scene-out accuracy by held-out scene. The last column reports the mean and standard deviation across the four folds.
| Model | Scene 1 | Scene 2 | Scene 3 | Scene 4 | Mean ± SD |
|---|---|---|---|---|---|
| Simple LSTM | 0.987 | 0.937 | 0.936 | 0.993 | 0.963 ± 0.031 |
| Stacked LSTM | 0.978 | 0.801 | 0.979 | 0.884 | 0.911 ± 0.085 |
| CNN+LSTM | 0.982 | 0.962 | 0.951 | 0.917 | 0.953 ± 0.027 |
| XGBoost | 0.982 | 0.902 | 0.961 | 0.993 | 0.960 ± 0.040 |
| SVM | 0.749 | 0.892 | 0.986 | 0.903 | 0.882 ± 0.098 |
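The summary column of Table 4 can be recomputed from the per-fold accuracies. The sketch below assumes the reported SD is the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes; this choice is an inference from the numbers rather than something stated in the table, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Per-fold accuracies copied from Table 4 (held-out Scenes 1 to 4).
fold_accuracy = {
    "Simple LSTM": [0.987, 0.937, 0.936, 0.993],
    "Stacked LSTM": [0.978, 0.801, 0.979, 0.884],
    "CNN+LSTM": [0.982, 0.962, 0.951, 0.917],
    "XGBoost": [0.982, 0.902, 0.961, 0.993],
    "SVM": [0.749, 0.892, 0.986, 0.903],
}

summary = {}
for model, scores in fold_accuracy.items():
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    summary[model] = (mean(scores), stdev(scores))
    print(f"{model}: {summary[model][0]:.3f} ± {summary[model][1]:.3f}")
```

For the Simple LSTM row this gives 0.963 ± 0.031, matching the table. The remaining rows agree to within about 0.001, a difference that is plausibly due to the table having been computed from unrounded fold accuracies. With only four folds the SD is a coarse estimate, so small differences between models in the mean should be read with that spread in mind.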
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2026 MDPI (Basel, Switzerland) unless otherwise stated