Figure 3.
PRISMA diagram of the systematic review
Table 3.
The final selection of articles for the review
| ID | Author | Year | Country | Database | Summary |
|---|---|---|---|---|---|
| 1 | Lim et al. | 2024 | Canada | WOS, PubMed | Feasibility of depth cameras & pressure pads as alternatives to force plates. |
| 2 | Wagner et al. | 2023 | Poland | PubMed | Depth-sensor gait methods compared. |
| 3 | Raza et al. | 2023 | Pakistan, Saudi Arabia | WOS, Scopus | AI for pose estimation in physiotherapy exercises. |
| 4 | Maskeliunas et al. | 2023 | Lithuania | WOS, Scopus | BiomacVR for posture & movement analysis in rehabilitation. |
| 5 | Lim et al. | 2023 | China | Scopus, PubMed | Adaptive Cobot system for assistive rehab training. |
| 6 | Khan et al. | 2023 | USA | WOS | Quantum neural network for post-stroke exercise assessment. |
| 7 | Bijalwan et al. | 2023 | India | WOS, Scopus | Automated system for upper limb exercise detection using an RGB-Depth camera. |
| 8 | Keller et al. | 2022 | USA | PubMed | Unsupervised ML for low back pain exercise strategies. |
| 9 | Zhao et al. | 2021 | USA, China | WOS, Scopus | Home TKR rehab system development. |
| 10 | Trinidad-Fernández et al. | 2021 | Spain, Belgium | PubMed | RGB-D camera validates motion capture in spondyloarthritis. |
| 11 | Hustinawaty et al. | 2021 | Indonesia | Scopus | Kinect SDK for a study of straight leg lift exercise. |
| 12 | Girase et al. | 2021 | USA | PubMed | Key factors identified for spine, hip, and knee assessment from sit-to-stand. |
| 13 | Çubukçu et al. | 2021 | Turkey | WOS, Scopus | Kinect-based mentor for shoulder injury telerehab. |
| 14 | Wei et al. | 2020 | USA | WOS, Scopus | Sensors and DL for automated balance assessment. |
| 15 | Uccheddu et al. | 2021 | Italy | WOS, Scopus | Hybrid approach for 3D pose estimation proposed. |
| 16 | Trinidad-Fernández et al. | 2020 | Belgium, Spain, Australia | PubMed | RGB-D camera kinematic assessment results. |
| 17 | Saratean et al. | 2020 | Romania | WOS, Scopus | Kinect-based physical therapy guidance system. |
| 18 | Garcia et al. | 2020 | Brazil | WOS, Scopus | RGB-D camera analysis of compensatory trunk movements. |
3.1. RQ 1: Sensor
Camera sensors are crucial to acquiring visual data for movement analysis and physical therapy. The type of sensor chosen directly affects the quality of the captured data, the performance of the deep learning model, and the design of the evaluation protocol. In this review, depth cameras accounted for 65.4% of the sensors used (see Figure 4). The common camera sensors are listed below:
Kinect series: RGB-D cameras introduced by Microsoft that obtain accurate depth and color image data through infrared ranging and color image acquisition. The main models are Kinect V1 and Kinect V2.
RealSense series: Intel’s RGB-D camera product line, which uses visual-inertial ranging technology to obtain high-quality depth and motion data.
Other RGB-D cameras: Besides the mainstream products above, third-party vendors offer RGB-D devices such as the Xtion Pro.
Ordinary RGB cameras: These capture only color image data and must be combined with depth estimation algorithms for data processing and analysis.
Across the 18 reviewed studies (see Table 8), the Kinect series is the most widely used sensor: 12 studies [8,21,22,25,26,27,28,29,31,32,33,37] adopt the Kinect V2 as the primary camera for data acquisition. The Kinect camera is recognized as the mainstream choice in this field because of its high accuracy and reliability. The Intel RealSense series has also seen some use: two papers [24,35] used the RealSense L515, D435i, and D415 models. RealSense cameras are technologically advanced, perform well, and are expected to be used more widely in the future.
Figure 4.
Sensor selection: depth-camera vs. other sensors.
In addition, three papers [23,30,36] used ordinary RGB cameras or other RGB-D cameras (e.g., Xtion Pro) to collect data. This approach has low hardware requirements but requires appropriate algorithms to process and analyze the data.
Kinect sensors are widely used because of their reliable performance and mature applications, while newer RGB-D cameras such as RealSense also show good prospects. Choosing the right sensor for the specific research objectives and application scenario is crucial to obtaining high-quality data, and the advantages and disadvantages of different sensors must be weighed to make full use of depth cameras in physical therapy.
3.2. RQ 2: Dataset
In computer vision-based physiotherapy movement assessment research, datasets are a key driver of progress. High-quality, diverse raw data and labels are the basis for developing effective deep learning models. These datasets also support the models’ ability to generalize across application scenarios and facilitate the establishment of standardized physiotherapy movement assessment methods. The reviewed studies used a variety of data types (see Figure 5):
RGB-D image/video data: Many studies [8,22,24,25,27,30,31,34,35] used color images and depth information captured by RGB-D cameras, usually as image sequences or videos. This raw data directly reflects the motion process and provides the basic input for subsequent motion detection and analysis.
Joint and skeletal data: Several studies [23,26,32,33,37] used the joint positions and skeletal information extracted by depth cameras to construct skeletal datasets. This structured data directly represents the key features of human movement and is used to model and analyze joint motion trajectories.
Combined datasets: Some studies [29,30,36] combine RGB-D data with auxiliary data from other sensors (e.g., IMU) to capture the motion process from different perspectives and obtain more comprehensive and accurate motion information.
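As a minimal sketch of how joint and skeletal data of this kind can be turned into a movement feature, the following Python snippet computes the angle at a joint from three 3D joint positions. The function name `joint_angle` and the hip, knee, and ankle coordinates are hypothetical and chosen only for illustration; the reviewed studies are not claimed to use this exact computation.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at joint b, between segments b->a and b->c.

    a, b, c are (x, y, z) joint positions, e.g. as reported by a skeleton tracker.
    """
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm_ba = math.sqrt(sum(x * x for x in ba))
    norm_bc = math.sqrt(sum(x * x for x in bc))
    if norm_ba == 0 or norm_bc == 0:
        raise ValueError("coincident joints: angle is undefined")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_ba * norm_bc)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical hip, knee, and ankle positions in metres (camera coordinates).
hip, knee, ankle = (0.0, 1.0, 2.0), (0.0, 0.5, 2.0), (0.0, 0.5, 2.5)
knee_angle = joint_angle(hip, knee, ankle)
```

Applying the function frame by frame to a recorded skeleton sequence would give a joint angle trajectory, which is one simple way to represent a joint motion over time.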
Figure 5.
Pie chart of percentage distribution of data types.
Regarding dataset construction (see Table 4), existing studies rely on public datasets, self-constructed datasets, or a fusion of multiple datasets. Most studies [8,21,22,24,25,28,30,31,32,33,34,36] collected the data themselves. Dataset sizes ranged from a handful to hundreds of participants; the largest, from Girase et al., contains 411 participants. In data collection and construction, researchers generally include people of different ages, genders, and health conditions. This diversity makes the physical therapy movement datasets more robust and improves model generalization.
In addition, some studies used existing publicly available datasets to accelerate model development with already labeled data. For example, Raza et al. used "Multi-Class Exercise Poses for Human Skeleton" 5, and Khan et al. used UI-PRMD [5]. In contrast, Bijalwan et al. fused multiple datasets, combining publicly available datasets (UTD-MHAD [38], mHealth [39], OU-ISIR [40], HAPT [41]) with self-collected data.
In general, self-constructed datasets dominate computer vision-based physical therapy exercise and assistance research. Researchers have constructed or utilized multiple datasets according to specific needs, covering different groups and sizes of participants and containing multimodal data from different sensor sources. These datasets provide a solid data foundation for deep learning model training and physiotherapy exercise analysis. Proper utilization and expansion of high-quality datasets will provide strong support for developing this field.
Table 4.
Summary of Data Types and Datasets in Physiotherapy Movement Assessment
| ID | Data Type | Dataset |
|---|---|---|
| 1 | Joint displacement data series | 10 non-disabled participants: 7 males, 3 females |
| 2 | RGB-D Images | 5 subjects: 2 males, 3 females |
| 3 | Skeleton Data | Multi-Class Exercise Poses for Human Skeleton |
| 4 | RGB-D Videos | 16 healthy subjects, 10 post-stroke patients |
| 5 | RGB-D Image | 5 healthy subjects |
| 6 | Joint-Skeletal | UI-PRMD |
| 7 | RGB-D | UTD-MHAD, mHealth, OU-ISIR, HAPT |
| 8 | RGB-D | 111 participants: back pain 43, control 26, surgery 4 |
| 9 | RGB-D & IMU | / |
| 10 | RGB-D Videos & IMU | 17 subjects: 54.35 (±11.75) years |
| 11 | RGB-D | 10 human objects |
| 12 | RGB-D Time Series | 3 patient groups and one control group: 78 control, 130 LBP, 90 hip, and 113 knee |
| 13 | Skeleton Data | 29 shoulder damaged volunteers: 18 males, 11 females |
| 14 | RGB-D Image | 41 subjects: 26 males, 15 females; 21 healthy subjects and 20 patients with PD |
| 15 | RGB-D Videos | / |
| 16 | RGB-D & IMU | 30 subjects: 18–65 years with non-specific lumbar pain |
| 17 | Skeleton Data | / |
| 18 | RGB-D | 14 volunteers: 9 range of movement capture tests, 5 trunk compensation tests |
3.3. RQ 3: Data Processing
Data processing plays a key role in depth camera-based physiotherapy movement analysis and directly affects the quality and precision of subsequent analyses. As Table 5 shows, the 18 reviewed studies use several major types of data processing methods and techniques. First, skeletal data extraction is the basis of most studies: Girase et al. and Çubukçu et al. use the Kinect 2 body tracking library and Kinect SDK 2.0, while Uccheddu et al. use the OpenPose library, to extract human skeletal data that provides structured input for subsequent analysis. Second, to ensure the consistency and comparability of the data, studies such as Wagner et al. and Khan et al. performed coordinate system transformation and alignment, which helps eliminate errors caused by different devices or shooting angles.
Data standardization and normalization is another common processing step, such as the min-max normalization used in [27], which removes the effect of differing scales so that features can be compared on the same magnitude. To improve data quality, some studies applied filtering and noise removal, such as the Kalman filter and low-pass Butterworth filter used in [28]. Feature extraction and selection also play an important role in machine learning pipelines [23,31]; these techniques help reduce data dimensionality and improve model efficiency and generalization.
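To make two of these steps concrete, the sketch below shows min-max normalization and a scalar Kalman filter applied to a one-dimensional joint coordinate series. It is an illustrative assumption, not the implementation of [27] or [28]: the function names, the random-walk state model, and the variance values `process_var` and `meas_var` are hypothetical choices.

```python
def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def kalman_filter_1d(measurements, process_var=1e-3, meas_var=1e-2):
    """Filter a noisy scalar series with a Kalman filter.

    Assumes a random-walk state model: the true value changes slowly
    (variance process_var per step) and each measurement carries noise
    of variance meas_var. Both values are illustrative, not tuned.
    """
    estimate, error = measurements[0], 1.0
    filtered = []
    for z in measurements:
        error += process_var                # predict: uncertainty grows
        gain = error / (error + meas_var)   # Kalman gain in (0, 1)
        estimate += gain * (z - estimate)   # update with the measurement
        error *= 1.0 - gain                 # uncertainty shrinks after update
        filtered.append(estimate)
    return filtered


# Hypothetical noisy vertical positions of one joint, in metres.
raw = [1.0, 1.2, 0.8, 1.1, 0.9]
filtered = kalman_filter_1d(raw)
normalized = min_max_normalize([2.0, 4.0, 6.0])
```

In this sketch the filter is applied before normalization so that outliers do not set the minimum and maximum; the reviewed studies do not specify an ordering, so this is a design choice rather than a reported practice.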
For studies involving multimodal data, data synchronization and fusion become key issues. Zhao et al. fuse accelerometer and gyroscope measurements, and Trinidad-Fernández et al. synchronize the recorded data via timestamps. In addition, to increase the diversity of training samples and improve model robustness, Wei et al. employ data augmentation, which is particularly useful in deep learning to alleviate the problem of insufficient data. Some studies, such as [26], also mention feature transformations, including dimensionality reduction and feature combination, which help extract more meaningful feature representations.
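Timestamp-based synchronization of camera frames with samples from another sensor can be sketched as a nearest-neighbor match. The example below is a minimal illustration under stated assumptions: the function name `match_nearest`, the millisecond timestamps, and the tolerance are hypothetical, and the reviewed studies do not describe their synchronization at this level of detail.

```python
from bisect import bisect_left


def match_nearest(frame_times, imu_times, tolerance):
    """For each camera frame timestamp, return the index of the nearest IMU
    sample, or None if no sample lies within the tolerance.

    imu_times must be sorted in ascending order. On a tie the earlier
    sample is chosen.
    """
    matches = []
    for t in frame_times:
        i = bisect_left(imu_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(imu_times)]
        best = min(candidates, key=lambda j: abs(imu_times[j] - t), default=None)
        if best is not None and abs(imu_times[best] - t) <= tolerance:
            matches.append(best)
        else:
            matches.append(None)
    return matches


# Hypothetical timestamps in milliseconds: a roughly 30 Hz camera and a 100 Hz IMU.
frame_times_ms = [0, 33, 66, 500]
imu_times_ms = [0, 10, 20, 30, 40, 50, 60, 70]
pairs = match_nearest(frame_times_ms, imu_times_ms, tolerance=10)
```

Frames without a sample inside the tolerance are marked `None` rather than paired with a distant sample, so that gaps in one stream do not silently corrupt the fused data.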
These diverse data processing methods cover the whole process from raw data acquisition to feature extraction, improving data quality and providing more reliable and effective inputs for subsequent algorithmic models. However, it should be noted that different studies used different combinations of processing methods according to their specific objectives and data characteristics. This diversity reflects the complexity and importance of data processing in physiotherapy exercise analysis.
Researchers may need to explore further advanced data processing techniques, such as automated feature engineering and more sophisticated multimodal data fusion methods, to cope with the increasingly complex demands of exercise analysis. Meanwhile, maximizing the extraction of useful information while maintaining the authenticity of the data is also a direction worthy of in-depth research. As technology advances, innovations in data processing methods will continue to drive depth camera-based physiotherapy movement analysis research toward higher accuracy and broader application areas.
Table 5.
Summary of Algorithm and Processing in Physiotherapy Movement Assessment
| ID | Algorithm | Processing |
|---|---|---|
| 1 | / | Joint Displacement Data. |
| 2 | Savitzky-Golay Filter | Transform the coordinate system using the KD, CH, and FV data processing methods. |
| 3 | RF, LR, GRU, LSTM, LogRF | MediaPipe Pose Marker, Feature Selection, and Hyperparameter Tuning. |
| 4 | ANN, DNN, CNN, CPM | Human skeletal movement was observed using visible information. |
| 5 | Imitation Learning for Adaptive Learning | / |
| 6 | Hybrid Quantum Neural Network | Align the length and center, and perform characteristic transformation. |
| 7 | HDL of CNN, RNN, CNN-GRU, and CNN-LSTM | Apply Min-Max normalization. |
| 8 | PCA, NLPCA, LR, Kaiser and Scree Plot Rules, Pattern Matching Statistics | Use Kalman filter, sequential second-order, and low pass Butterworth filtering. |
| 9 | / | Fuse accelerometer and gyroscope measurements. |
| 10 | Statistical Analysis | Synchronize the dataset with the timestamp and visualize it using OpenNI2, NiTE2, and MRPT. |
| 11 | Detecting and Tracking | Calibration, skeletalization process, and feature extraction. |
| 12 | SVM, RF, MLPs, CCNN, Semi-Supervised Learning, Unscented Kalman Filter | Estimate joint center positions using the standard Kinect 2 Body Tracking library. |
| 13 | Statistical Analysis | Use the Kinect SDK 2.0. |
| 14 | CNN, RF Classifier | Perform data augmentation. |
| 15 | OpenPose | Process video frames from the RGB sensor with the OpenPose library. |
| 16 | / | Synchronize and use OpenNI2 and NiTE2 to create a virtual skeleton representation. |
| 17 | Effort-Based Parameterization Method | / |
| 18 | PrimeSense | Use the Kinect SDK 2.0. |
3.4. RQ 4: Algorithm
In depth camera-based physiotherapy movement assessment research, the selection and design of algorithms directly affect the accuracy and efficiency of movement recognition, assessment, and analysis. Based on the reviewed literature, algorithms can be broadly categorized into three groups: traditional machine learning algorithms, deep learning algorithms, and dedicated algorithms for specific tasks (see Figure 6). Each group excels in different application scenarios, and together they contribute to the field’s rapid development (see Table 5).
Traditional machine learning algorithms have been widely used in several studies. For instance, Random Forest (RF) [42], Logistic Regression (LR) [43], Support Vector Machine (SVM) [44], and Principal Component Analysis (PCA) [45] are favored for their interpretability and computational efficiency. Raza et al. use RF, LR, and LSTM algorithms for human posture estimation, while PCA, nonlinear PCA (NLPCA), and LR are combined to identify movement strategies in patients with low back pain [28]. These methods perform well with structured skeletal data and provide reliable tools for clinical assessment, especially when features are well defined and datasets are small.
Figure 6.
Diagram of bars for the algorithm in current literature.
With the rapid development of deep learning techniques, more studies are using deep neural network models. Architectures such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory Networks (LSTMs) show significant advantages when dealing with complex time-series motion data. For example, a combination of Artificial Neural Networks (ANN), Deep Neural Networks (DNN), and Convolutional Pose Machine (CPM) achieves accurate analysis of human posture and motion [24], while a Hybrid Deep Learning (HDL) model combining CNN, RNN, CNN-GRU, and CNN-LSTM improves the detection and recognition accuracy of upper limb rehabilitation movements [27]. Another study uses a combination of CNN and RF classifiers to estimate the human body’s center of mass (CoM) [34]. These deep learning methods automatically extract deep features from raw data, significantly reducing the workload of manual feature engineering and improving model generalization, which makes them particularly suitable for processing large-scale and high-dimensional visual data.
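Sequence models such as RNNs and LSTMs are typically fed fixed-length segments of a recording. As a hedged illustration of that preparation step, the sketch below cuts a time series into overlapping windows. The function name `sliding_windows` and the window and stride values are hypothetical, and this is a common preprocessing pattern rather than a step reported by the reviewed studies.

```python
def sliding_windows(sequence, window, stride):
    """Split a sequence of frames into fixed-length, possibly overlapping windows.

    Trailing frames that do not fill a complete window are dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(sequence) - window
    return [sequence[s:s + window] for s in range(0, last_start + 1, stride)]


# Hypothetical recording of 10 frames; each integer stands in for one skeleton frame.
frames = list(range(10))
windows = sliding_windows(frames, window=4, stride=2)
```

A stride smaller than the window length produces overlapping segments, which yields more training samples from a short recording at the cost of correlated examples.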
In addition, some researchers have developed specialized algorithms for specific physiotherapy tasks. For example, imitation learning is used to achieve adaptive learning for multifunctional upper limb rehabilitation [25], while the Effort-Based Parameterization Method (EBPM) provides a theoretical basis for a home rehabilitation guidance system [37]. Though narrow in application, these task-oriented algorithms often provide precise and efficient solutions that reflect the researchers’ understanding of actual clinical needs.
Some studies combine several algorithms to exploit their respective advantages. For example, combining SVM, RF, multilayer perceptrons (MLPs), and cascaded convolutional neural networks (CCNNs) with semi-supervised learning and an unscented Kalman filter identifies the key factors of pathological movements [32]. Another approach integrates the KD, CH, and FV methods with the Zebris FDM platform to estimate gait parameters accurately [22]. This fusion strategy improves model performance and stability and offers new ideas for solving complex physiotherapy problems.
Data preprocessing and postprocessing play equally important roles in the application of algorithms. Techniques such as coordinate system transformation [22], feature selection and hyperparameter tuning [23], min-max normalization [27], and Kalman and low-pass Butterworth filtering [28] improve data quality and algorithm performance. The OpenPose library, for instance, is used to process video frames from RGB sensors for joint estimation [35]. These processing techniques complement the core algorithm, forming a complete analysis pipeline.
In general, depth camera-based movement analysis for physiotherapy shows a trend of diversification, specialization, and convergence in algorithm selection and design. Researchers flexibly use existing machine learning and deep learning algorithms and develop new solutions for specific application scenarios. This diversified algorithmic strategy has promoted technological advances and improvements in clinical practice.
However, several noteworthy issues and research directions remain. First, although deep learning algorithms perform well, their "black-box" nature may affect the interpretability and credibility of models in clinical practice. Balancing model performance and interpretability requires further exploration. Second, while specialized algorithms for specific tasks are effective, their generalizability has yet to be verified. Designing algorithms that are both specialized and flexible enough to adapt to different physiotherapy needs is a challenging research direction.
Furthermore, given the special characteristics of physiotherapy data (e.g., data privacy, high labeling cost), improving algorithm performance under limited data conditions through techniques such as transfer learning and few-shot learning is crucial. With the development of edge computing and IoT technology, designing lightweight and efficient algorithms that meet the demand for real-time processing and feedback is also worthy of in-depth research.
With continuous progress in AI technology, particularly federated learning, interpretable AI, and adaptive learning, more efficient, accurate and safe algorithms will emerge in the future. These advancements will further promote depth camera-based physiotherapy movement analysis technology, providing patients with more personalized, intelligent, and effective rehabilitation programs, ultimately achieving the goal of precision medicine and intelligent rehabilitation.
3.5. RQ 5: Feature
As shown in Table 6, current research in physiotherapy and rehabilitation integrates advanced technologies and methods. Computer vision technology, depth cameras, and other sensors have been pivotal in enhancing the effectiveness and accuracy of rehabilitation treatments. These technologies offer cost-effective training feedback, improve gait assessment, increase the accuracy of human posture estimation, enhance patient engagement and the precision of posture and movement analysis, enable personalized adaptive learning, accelerate post-stroke movement assessment, and support clinical decision-making. They are also used to develop home rehabilitation protocols and remote physiotherapy guidance systems, promoting continuous patient care and recovery. This systematic review categorizes and discusses the features of using depth camera sensors in physiotherapy movement assessment, focusing on the following aspects:
Integration of Depth Sensors with Other Technologies: Integrating depth cameras with other sensors, such as pressure mats, is an important line of research in physiotherapy movement assessment. For example, Lim et al. demonstrated efficient balance training feedback by combining depth cameras and pressure mats.
Gait and Posture Assessment: Gait and posture assessment are critical in physiotherapy, and several studies explore the application of depth sensors in these areas. For instance, Wagner et al. used depth sensors to improve gait assessment accuracy, enhancing diagnosis and treatment. Hustinawaty et al. applied Kinect SDK skeletonization to assess the straight leg raise accurately, aiding in lumbar condition diagnosis. Girase et al. used machine learning to automatically detect and classify pathological movements from sit-to-stand transitions.
Upper and Lower Limb Rehabilitation: Upper and lower limb rehabilitation is a key research direction in physiotherapy. Lim et al. explored a personalized adaptive learning system for upper limb rehabilitation, improving patient outcomes. Uccheddu et al. developed a method using RGB-D sensors for precise joint angle estimation in home rehabilitation. Saratean et al. implemented a Kinect-based remote physiotherapy guidance system, promoting continuous care.
Applications of Machine Learning and Deep Learning: Machine learning and deep learning are widely applied in physiotherapy assessments. Raza et al. improved human pose estimation using the LogRF and random forest algorithms. Khan et al. introduced a hybrid quantum neural network to enhance the speed and accuracy of post-stroke movement assessments. Keller et al. used unsupervised learning to analyze motion capture data, identifying movement strategies in low back pain patients. Bijalwan et al. combined deep learning models to enhance spatiotemporal feature modeling in stroke rehabilitation.
Home Rehabilitation and Remote Monitoring: Home rehabilitation and remote monitoring are current research hotspots. Zhao et al. proposed a home rehabilitation protocol for post-knee replacement using convenient technology. Çubukçu et al. developed a system for dynamic monitoring and correction of shoulder movements using Kinect. Garcia et al. used RGB-D cameras to analyze compensatory trunk movements, improving upper limb rehabilitation strategies.
Clinical Applications and Validation: Several studies validate the effectiveness of depth sensors in clinical settings. Trinidad-Fernández et al. accurately assessed movement limitations caused by spondyloarthritis using RGB-D cameras, supporting better clinical decision-making. In a separate study, Trinidad-Fernández et al. demonstrated the reliability and speed of kinematic assessments using RGB-D cameras in clinical environments.
Table 6.
Summary of the Research Feature
| ID | Feature |
|---|---|
| 1 | Demonstrates integration of depth cameras and pressure mats as cost-effective, accessible feedback mechanisms for balance training. |
| 2 | Advances gait assessment techniques using depth sensor data, improving diagnostic and treatment accuracy. |
| 3 | Uses the LogRF method and random forest algorithms for improved human pose estimation in physiotherapy. |
| 4 | Employs virtual reality to increase precision and engagement in posture and movement analysis during rehabilitation. |
| 5 | Features a personalized adaptive learning system for upper-limb rehabilitation, enhancing patient-specific outcomes. |
| 6 | Introduces a hybrid quantum neural network to enhance speed and accuracy in post-stroke exercise assessments. |
| 7 | Combines deep learning models to enhance modeling of spatio-temporal features in stroke rehabilitation. |
| 8 | Utilizes unsupervised machine learning to analyze movement capture data, identifying movement strategies in low back pain patients. |
| 9 | Proposes a protocol for home-based rehabilitation post-knee replacement using accessible technology. |
| 10 | Uses an RGB-D camera to accurately assess movement limitations in spondyloarthritis, supporting better clinical decisions. |
| 11 | Applies Kinect SDK skeletonization to accurately assess the straight leg raise, aiding lumbar condition diagnoses. |
| 12 | Automates detection and classification of pathologies from sit-to-stand movements using machine learning. |
| 13 | Develops a system using Kinect to monitor and correct shoulder exercises dynamically. |
| 14 | Integrates sensors and deep learning to enable real-time balance evaluations, enhancing therapy effectiveness. |
| 15 | Develops a hybrid method using RGB-D sensors for accurate joint angle estimation in home rehabilitation. |
| 16 | Validates the use of RGB-D cameras for reliable and responsive kinematic assessments in clinical settings. |
| 17 | Implements a Kinect-based system for remote physiotherapy coaching, facilitating continuous care. |
| 18 | Analyzes compensatory trunk movements with RGB-D cameras to refine upper limb rehabilitation strategies. |
In summary, computer vision-based physiotherapy movement assessment using depth camera sensors has shown diverse applications and significant advancements. These technologies show great potential for enhancing diagnostic accuracy, personalized rehabilitation, and patient engagement, and they pave the way for more effective and accessible physiotherapy solutions. The reviewed studies indicate that depth sensors are widely applied in gait and posture assessment and in upper and lower limb rehabilitation and that, combined with machine learning and deep learning, they have enabled progress in home rehabilitation and remote monitoring. The studies cover balance training, virtual reality integration, and home rehabilitation, providing real-time, accurate, and personalized feedback mechanisms that improve treatment outcomes and patient participation. Future research should focus on integrating these features into comprehensive systems to further enhance diagnostic accuracy, movement assessment speed, and home rehabilitation efficacy. In general, applying depth cameras and advanced algorithms brings innovative solutions to physical therapy and improves the efficiency and coverage of rehabilitation training.
3.6. RQ 6: Scenario
Depth camera-based movement analysis for physical therapy shows significant value and potential in three major scenarios (see Table 7): remote, clinical, and local (see Figure 7). In remote scenarios, the technology overcomes geographical limitations and enables patients to receive real-time rehabilitation guidance and assessment at home, improving the accessibility and continuity of rehabilitation services. In clinical scenarios, it provides medical professionals with accurate exercise data and objective assessment tools, which help them formulate personalized and efficient treatment plans. In local scenarios, such as at home or in community-based rehabilitation centers, the technology supports autonomous training and daily monitoring and enhances patients’ self-management ability. Integrating these three scenarios optimizes the allocation of rehabilitation resources and creates a multilevel rehabilitation care system, improving the overall effect of physical therapy and the patient experience. With the advancement of technology and clinical practice, this multi-scenario application mode is reshaping the traditional rehabilitation concept and moving physical therapy toward intelligence, personalization, and wider availability.
In a remote scenario, the main objective is to provide patients with a convenient home rehabilitation program. With the development of telemedicine technology, remote physical therapy movement assessment based on depth cameras has become a reality. For example, Maskeliunas et al. describe BiomacVR, a virtual reality (VR)-based rehabilitation system that combines a VR physical training monitoring environment with upper limb rehabilitation technology for precise interaction and improved patient engagement, and that is applied to a real-time physical therapy system for telerehabilitation. Lim et al. [25] propose an adaptive learning system based on imitation learning for multi-purpose upper extremity rehabilitation that allows patients to perform rehabilitation at home. Çubukçu et al. [33] develop a Kinect 2 sensor-based telerehabilitation system that observes and evaluates exercise in patients with shoulder impairments through a web application for communication between the patient and the therapist and a console application that helps the patient perform the exercise correctly. Saratean et al. propose an approach based on effort parameterization for monitoring a home rehabilitation system to ensure correctness and adherence to rehabilitation exercises.
In clinical scenarios, physiotherapy movement assessment research focuses on accurately analyzing and assessing patients’ motion status to provide key data support for clinical diagnosis and treatment. For example, Wagner et al. analyze patients’ gait accurately by estimating gait parameters, which assists clinical diagnosis and treatment planning. Artificial intelligence is also widely applied in this field: Maskeliunas et al. [24] use neural network algorithms to observe human skeletal motion through visible information, which allows accurate analysis of patient posture and movement patterns. In addition, Bijalwan et al. [27] apply deep learning to detect and recognize upper limb rehabilitation exercises, which helps clinicians assess patients’ rehabilitation progress. In disease-specific studies, such as [28], machine learning methods are applied to identify movement strategies in patients with low back pain, providing a scientific basis for clinical treatment programs. Trinidad-Fernández et al. validate and analyze patients’ trunk movement limitations by synchronizing and visualizing datasets, which helps clinicians gain insight into patients’ movement abilities and limitations. Hustinawaty et al. [31] combine calibration, skeletonization, and feature extraction with detection and tracking techniques to monitor and analyze key movements in the rehabilitation process, providing detailed movement data for clinical decision-making. Girase et al. [32] apply semi-supervised learning algorithms to joint center positions estimated with the standard Kinect 2 body tracking library, identifying and classifying critical factors of pathological movements and providing more accurate data support for rehabilitation treatment.
Table 7.
Summary of Scenarios in Physiotherapy Movement Assessment
| ID | Scenario | Objective |
|---|---|---|
| 1 | Local | Evaluate balance training effectiveness with depth cameras and pressure mats. |
| 2 | Local | Enhance gait analysis accuracy using new spatiotemporal methods and depth sensors. |
| 3 | Local | Improve exercise correction in physiotherapy with innovative pose estimation. |
| 4 | Remote, Clinical | Develop a VR system for precise human posture and motion analysis to boost rehabilitation engagement. |
| 5 | Remote | Build a personalized adaptive learning system with collaborative robots for upper-limb rehab. |
| 6 | Local | Improve post-stroke exercise assessments with a hybrid quantum neural network. |
| 7 | Clinical | Enhance upper extremity rehab post-stroke by modeling spatio-temporal features with deep learning. |
| 8 | Clinical | Discover low back pain strategies using unsupervised learning on motion data. |
| 9 | Local | Establish a comprehensive home protocol for post-knee replacement recovery. |
| 10 | Clinical | Validate RGB-D cameras for precise trunk movement analysis in spondyloarthritis. |
| 11 | Clinical, Local | Analyze straight leg raises accurately using Kinect SDK’s skeletonization. |
| 12 | Clinical | Automate diagnosis of spinal, hip, and knee pathologies from sit-to-stand movements. |
| 13 | Remote | Develop a Kinect-based system to monitor and correct shoulder rehab exercises. |
| 14 | Local | Develop an on-demand balance evaluation tool integrating sensors with deep learning. |
| 15 | Local | Merge 2D and 3D RGB-D data for precise joint angle estimation in in-home rehab. |
| 16 | Local | Validate the reliability and responsiveness of kinematic assessments with RGB-D cameras. |
| 17 | Remote | Implement a Kinect-based remote physiotherapy coaching system to ensure exercise adherence. |
| 18 | Clinical | Analyze compensatory trunk movements in upper limb rehab using RGB-D cameras. |
In local scenarios, physiotherapy exercise and assessment research focuses on using algorithms and data processing techniques to automate the evaluation of rehabilitation training and posture recognition, with the aim of improving rehabilitation outcomes and patient compliance. For example, Lim et al. provide real-time visual feedback to patients by acquiring and processing joint displacement data, helping them perform effective balance training at home. Raza et al. apply AI algorithms combined with MediaPipe pose labeling, feature selection, and hyper-parameter tuning to achieve high-precision estimation of human posture, which provides data support for rehabilitation training. Khan et al. [26] automate the evaluation of exercises using a neural network together with length and center alignment and feature transformation, improving the efficiency of rehabilitation training. For rehabilitation after specific surgeries, Zhao et al. [29] incorporate accelerometer and gyroscope measurements to evaluate home rehabilitation training after total knee replacement, which makes it possible to monitor patients’ rehabilitation progress in the home environment. Wei et al. [34] estimate the patient’s center of mass position with the help of data augmentation techniques, providing a basis for balance training and assessment. Uccheddu et al. [35] estimate joint positions by processing video frame data from RGB sensors, supporting motion analysis. In addition, Trinidad-Fernández et al. [36] create virtual skeletal representations to assess patients’ ability to perform functional tasks, extending the scope of local rehabilitation assessment. Together, these studies indicate that rehabilitation training can be monitored and evaluated in local scenarios with such algorithms and data processing techniques, which may improve rehabilitation outcomes and patient compliance.
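Several of the studies above work from estimated 3D joint positions, and a joint angle is one common quantity derived from them. The following is a minimal sketch of that step, not the method of any reviewed study: the function name `joint_angle_deg` and the sample coordinates are hypothetical, and the inputs are assumed to be (x, y, z) joint positions such as those reported by a depth-camera body tracker.

```python
import math


def joint_angle_deg(proximal, joint, distal):
    """Angle in degrees at `joint` between the segments joint->proximal and joint->distal.

    Each argument is an (x, y, z) position; the units cancel out.
    Hypothetical helper for illustration only.
    """
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    norm_u = math.sqrt(sum(c * c for c in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("a segment has zero length; joints coincide")
    cos_theta = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Made-up coordinates: hip, knee, and ankle on one vertical line (a fully extended leg).
hip, knee, ankle = (0.0, 1.0, 2.0), (0.0, 0.5, 2.0), (0.0, 0.0, 2.0)
print(joint_angle_deg(hip, knee, ankle))
```

In practice, tracked joint positions are noisy, so an angle computed per frame would usually be smoothed over time before it is compared with a reference movement.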
3.7. RQ 7: Target
In physiotherapy movement assessment, the selection of an appropriate target for the study is critical. This selection is directly related to the type of visual data to be captured and its depth, which affects the design and application of deep learning models. By carefully selecting targets for the human body, researchers can ensure that the acquired motion data is pertinent and complete, providing high-quality input for movement recognition and assessment.
Figure 8.
Diagram of bars for the studied body parts.
As shown in Figure 8, researchers generally focus on the main body regions, such as the entire body, the upper limb, and the lower limb, as well as specific joints, such as the knee, ankle, and shoulder (see Table 8). For example, of the 18 articles, five [21, 23, 26, 34, 37] study the recognition and evaluation of full-body motion, focusing on how to capture body motion data using a depth camera and analyze it using deep learning models.
The upper limb is also a key target, with four articles [8, 24, 25, 27] focusing on recognizing and evaluating upper limb movements. These studies capture motion data at the arm and elbow joints to guide upper limb rehabilitation.
The lower limbs, including the foot, ankle, knee, and hip, have received similarly extensive attention. Six publications [22, 29, 31, 32, 35, 36] analyze them with the aim of providing assessment and guidance for lower limb rehabilitation. Of particular note, motion analysis of the lower back and trunk, whose motor status is critical to evaluating general physical mobility, is examined explicitly in three publications [28, 30, 32].
In summary, the existing literature covers movement recognition and assessment of the full body, the upper and lower limbs, and key joints. A reasonable selection of the study target is decisive for designing efficient data acquisition schemes, building accurate deep learning models, and deriving results from targeted movement analysis. It also helps ensure the reliability of the study results and the integrity of the acquired motion data, and thus makes fuller use of the potential of depth cameras and deep learning techniques in physiotherapy.
Table 8.
Summary of Target and Sensor in Physiotherapy Movement Assessment
| ID | Target | Sensor |
|---|---|---|
| 1 | body | Kinect V2, Pressure Mat |
| 2 | foot, knee, ankle | Kinect V2 |
| 3 | body | Ordinary Camera |
| 4 | upper limb | Intel RealSense L515/D435i, HTC Vive VR Equipment |
| 5 | upper limb | Kinect Camera, Cobot, Force/Torque Sensor |
| 6 | body | Kinect |
| 7 | upper limb | Kinect V2 |
| 8 | low-back | Kinect V2 |
| 9 | knee | Kinect V2, IMU (Shimmer) |
| 10 | trunk | RGB-D Camera, IMU (MP67B) |
| 11 | leg | Kinect |
| 12 | low-back, hip, knee | Kinect V2 |
| 13 | shoulder | Kinect V2 |
| 14 | body | Kinect, WBB |
| 15 | hip, knee, ankle | Intel RealSense D415 |
| 16 | low-back | RGB-D Camera (Xtion Pro), IMU (MP67B) |
| 17 | body | Kinect for Xbox 360 |
| 18 | upper limb | Kinect V2 |
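As an illustration of how the Target column of Table 8 can be tallied by body region, the following sketch counts the articles that touch each region. The `REGION` grouping (for example, placing the shoulder under the upper limb) is an assumption made here, not a definition from the review, and an article that names several targets is counted once in each region it touches. Counts obtained this way therefore need not match the citation-based counts given in the text.

```python
from collections import Counter

# Target column of Table 8, keyed by article ID (copied from the table).
TARGETS = {
    1: "body", 2: "foot, knee, ankle", 3: "body", 4: "upper limb",
    5: "upper limb", 6: "body", 7: "upper limb", 8: "low-back",
    9: "knee", 10: "trunk", 11: "leg", 12: "low-back, hip, knee",
    13: "shoulder", 14: "body", 15: "hip, knee, ankle", 16: "low-back",
    17: "body", 18: "upper limb",
}

# Assumed grouping of individual targets into regions (hypothetical, for illustration).
REGION = {
    "body": "full body",
    "upper limb": "upper limb", "shoulder": "upper limb",
    "foot": "lower limb", "ankle": "lower limb", "knee": "lower limb",
    "hip": "lower limb", "leg": "lower limb",
    "low-back": "trunk/low back", "trunk": "trunk/low back",
}


def regions_for(target_cell):
    """Return the set of regions named in one Target cell."""
    return {REGION[part.strip()] for part in target_cell.split(",")}


def tally_regions(targets):
    """Count the articles touching each region; one article may count in several."""
    counts = Counter()
    for cell in targets.values():
        counts.update(regions_for(cell))
    return counts


counts = tally_regions(TARGETS)
for region, n in counts.most_common():
    print(region, n)
```

Changing the `REGION` mapping, for instance by treating the shoulder as its own category, changes the tallies, which is why the grouping should be stated whenever such counts are reported.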