1. Introduction
Stress in humans is related to mental health and well-being [1]. It is the biological response to a situation such as a threat, challenge, or physical and psychological barrier [2]. The sympathetic nervous system (SNS) and the parasympathetic nervous system (PNS) are two components of the autonomic nervous system (ANS) that directly affect how the body reacts to stress [3,4]. In highly stressful events, the SNS executes the fight or flight survival response. As a result, the body redirects its efforts toward fighting off threats. Because stress is subjective, identifying and monitoring the onset, duration, and severity of stressful events is challenging. This is especially true in workplace situations [5], where people often make a deliberate choice to ignore stress for professional gain. Recent studies have shown an increase in stress levels in the office environment [6]. Due to the plasticity of the brain, chronic or persistent stress has been shown to increase the volume of the amygdala, a structure within the limbic system that defines and regulates emotions, stores emotional memories, and, most importantly, executes the fight or flight response [7]. Similarly, chronic stress is associated with a reduction in the mass of the prefrontal cortex [8], which is used to intelligently regulate thoughts, actions, and emotions.
Recent research in the field has introduced various sensor-based solutions for stress detection [4,9,10]. Although some of these solutions use only a single type of sensor, others employ multimodal sensing. Traditionally, electrocardiography (ECG) has been used to measure heart rate variability (HRV) for stress detection [11]. Biomarkers such as galvanic skin response (GSR), electrodermal activity (EDA), respiration, and electromyography (EMG), all captured with dedicated sensing devices, are increasingly recognized for assessing affective states and stress levels [12,13,14]. While these traditional sensor types are considered the gold standard and provide excellent opportunities for the measurement of stress-related biomarkers, they are difficult to use in practice because experiments can only be carried out in a designated, equipped setting. The focus of research is shifting to developing simpler and more convenient sensing solutions that are applicable to everyday life to measure physiological parameters. Recent advances in technology have led to significant developments in wearable and personal sensing devices with applications in healthcare, for example, the use of a wearable device to capture physiological data for health monitoring [15,16,17,18,19,20]. These devices include chest bands [15,16,21,22] and portable ECG devices [17,23]. HRV parameters can be measured using wristbands such as the Empatica E4 wristband [18,24], Microsoft Band 2 [19,25], Polar watch [20,26], and Fitbit watch [20,26], among others. Researchers analyse personal data from these devices to provide relevant insights into the individual’s physical and health status. Although these devices show promise and provide a non-intrusive means of acquiring data for stress detection models, a major limitation is their size, which makes them uncomfortable in practical use [27].
In contrast, remote photoplethysmography (rPPG) measures Blood Volume Pulse (BVP) using a camera, eliminating the need for sensor attachments [28,29]. By extracting skin pixels from facial data captured by the camera, rPPG technology utilises changes in skin colour corresponding to heartbeat to obtain the BVP signal [28,30,31,32]. This method simplifies the measurement, reduces sensor complexity, and avoids attachment-related problems. Furthermore, rPPG can be used to capture HRV measures for analysis, especially in healthcare applications. The widespread availability of cameras in the form of webcams or smartphones makes rPPG technology easily accessible to anyone. Due to its advantages, rPPG finds applications in healthcare, fitness and forensic science. Integrating rPPG technology into smart mirrors or smartphones increases its potential as a professional health indicator. Although still in an early stage, rPPG-based non-contact affective computing has become a growing area of research in recent years, and it could substantially improve real-time human-computer interaction for stress detection. This paper explores the feasibility of end-to-end methods for recognising stress by proposing an rPPG-based stress detection system that leverages non-contact physiological techniques, facilitating the continuous, pervasive monitoring of long-term biomedical signals. The contributions made in this paper are as follows:
A novel system leveraging non-contact and physiological techniques is proposed, enabling the continuous monitoring of pervasive biomedical signals for long-term stress detection.
Hybrid Deep Learning (DL) networks and models for rPPG signal reconstruction and Heart Rate (HR) estimation are introduced to improve accuracy and efficiency in stress detection, reaching up to 95.83% accuracy on the UBFC-Phys dataset.
Extensive experiments and empirical evaluations of DL models for stress detection provide valuable insights and comparisons.
The remainder of this paper is structured as follows. Section 2 presents a comprehensive literature review of the existing approaches, while Section 3 introduces the methodology, collection protocol and preprocessing steps. In Section 4, the experimental results are discussed, while the conclusion and future work plan are outlined in Section 5.
2. Related Work
The term stress was initially introduced into medical terminology in 1936, when it was defined as a syndrome produced by diverse nocuous agents that seriously threaten homeostasis [33]. Selye’s experiments demonstrated that prolonged exposure to severe stress could lead to disease and tissue damage [34]. Recently, research on stress, its causes, and implications has gained traction [4,9,10,12,13,14]. Stress has been defined as a complex interactional phenomenon, arising when a situation is deemed important, carries the possibility of damage, and requires psychological, physiological, and/or behavioural actions [4,9,10]. Understanding stress involves distinguishing between stressors, stress responses, and stress biomarkers. Stressors are stimuli that disrupt normal activity, stress responses are symptoms triggered by stressors, and biomarkers reflect interactions between a biological system and potential hazards [3,4,9,10]. The human body responds to stressors through mechanisms such as the hypothalamic-pituitary-adrenal (HPA) axis, ANS, and the immune system [35]. The HPA axis releases hormones, including cortisol, in response to stressors, initiating the "fight or flight" response, which triggers physiological reactions from the ANS: SNS activity increases and PNS activity decreases [3,4]. Cortisol levels and other physiological measures such as body temperature, respiration rate, pulse rate, HRV, and blood pressure (BP) have been identified as standard stress biomarkers [15,16,17,21,22,23]. Methods for stress detection include questionnaires, ECG, electroencephalogram (EEG), BP measurement using an arm cuff, salivary cortisol sampling, and other biomarkers from blood tests [36,37,38]. Self-reporting tools such as the Perceived Stress Scale and Depression Anxiety Stress Scale are widely used to measure perceived stress, but have limitations such as biased responses and subjectivity [39]. ECG measures changes in heart rhythm due to emotional experiences, providing information about HRV, but usually requires a visit to a medical facility. EEG captures electrical signals in the brain, correlating brain waves (beta and alpha) to stress, but conventional EEG machines are impractical for managing daily stress [40,41]. Biomarkers such as cortisol in saliva and hair samples are associated with chronic stress, but sampling them is invasive and time-consuming. Blood pressure measured with a sphygmomanometer is accurate, but requires a trained professional [36,37,38]. Ambulatory Blood Pressure Measurement (ABPM) devices offer home monitoring, but lack widespread validation and can be influenced by factors other than stress [42]. While traditional sensor types are acknowledged as the gold standard, offering excellent opportunities for measuring stress-related biomarkers, their practical use in everyday situations poses a significant challenge. Emerging technologies have focused on developing simpler and more convenient sensing solutions applicable to daily life to measure physiological biomarkers. Wearable and personal sensing devices, such as chest bands, wrist bracelets, and portable ECG devices [15,18,21,24], have played a pivotal role in this evolution.
Conventional approaches to stress detection have drawbacks that are not in line with modern lifestyles and real-time monitoring. These methods can be invasive, prone to bias, and costly, and they often require time-consuming travel to clinical settings. Over the past two decades, there has been a noticeable shift towards technology-driven approaches for more efficient, cost-effective, and less intrusive stress measurement compatible with modern lifestyles. Wearable devices, mobile applications, and Machine Learning (ML) algorithms have revolutionised stress detection and measurement. One approach is measuring HRV using wearable devices such as smartwatches, fitness trackers, and chest straps, allowing continuous and long-term monitoring of stress levels [16,17,20,23,26]. As HRV measures are typically nonlinear, ML algorithms and other statistical data-driven methods such as the Modified Varying Index Coefficient Autoregression Model (MVICAR) [43] can be applied in stress detection systems. ML algorithms have enabled accurate and efficient HRV-based stress detection and classification systems [29,44,45,46,47]. EDA, which reflects the electrical activity of sweat glands, is another measure that can be monitored with wearable devices, providing continuous and real-time monitoring of stress levels. Mobile applications using EDA-based biofeedback help individuals manage stress by providing real-time feedback and stress reduction techniques [16,25]. However, EDA measurement is sensitive to environmental factors, skin conditions and medications, which affects its precision.
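To make the HRV measures discussed above concrete, the following is a minimal sketch of common time-domain HRV features computed from inter-beat intervals. The function name `hrv_time_domain` and the input values are illustrative assumptions; the sketch is not taken from the paper or from any of the cited systems.

```python
import numpy as np

def hrv_time_domain(ibi_ms):
    """Return common time-domain HRV features from inter-beat intervals (ms).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    ibi = np.asarray(ibi_ms, dtype=float)
    if ibi.size < 3:
        raise ValueError("need at least three inter-beat intervals")
    diffs = np.diff(ibi)  # successive differences between neighbouring beats
    return {
        "mean_hr_bpm": 60000.0 / ibi.mean(),         # average heart rate
        "sdnn_ms": ibi.std(ddof=1),                  # overall variability
        "rmssd_ms": np.sqrt(np.mean(diffs ** 2)),    # short-term variability
        "pnn50": np.mean(np.abs(diffs) > 50.0),      # fraction of jumps > 50 ms
    }

# Illustrative values only, not data from the paper.
features = hrv_time_domain([800, 810, 790, 805, 795])
```

Features of this kind could then serve as inputs to an ML classifier; which features the cited works actually use is not specified here.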
The COVID-19 pandemic has stimulated interest in remote healthcare, leading to research using cameras for the estimation of rPPG signals and real-time monitoring, addressing the need for non-invasive, contactless and accessible methods for stress assessment [48,49]. rPPG offers a non-invasive means of measuring BVP remotely. This approach requires only a camera and an ambient light source. With this, HRV measures, pulse rate and breathing rate can be measured using an everyday camera for facial video analysis to remotely detect and monitor stress [28,30,31,32]. A growing number of research papers have explored this approach. For example, Benezeth et al. [46] proposed an rPPG-based algorithm that estimates HRV using a simple camera, showing a strong correlation between the HRV features and different emotional states. Similarly, Sabour et al. [29] proposed an rPPG-based stress estimation system with an accuracy of 85.48%. Other works on the use of rPPG are encouraging, indicating that non-contact measures of some human physiological parameters (e.g., breathing rate (BR) and Heart Rate (HR)) have great potential for various applications, such as health monitoring [47,50] and affective computing [51,52,53]. While these contributions are noteworthy, this paper advances the field by introducing Hybrid Deep Learning (DL) networks and models for rPPG signal reconstruction and HR estimation. This approach brings about a substantial improvement in accuracy and efficiency in stress detection, achieving up to 95.83% accuracy on the UBFC-Phys dataset. The integration of Hybrid DL networks offers enhanced capabilities for signal reconstruction and stress classification. Considering these, rPPG is well-suited for both business and everyday applications and has the significant advantage of estimating, without contact, cardiac measures that conventionally require ECG or photoplethysmography (PPG).
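The camera-based principle described above can be sketched as follows. This is a simplified illustration that assumes the per-frame mean green value of a facial skin region has already been extracted; it uses a plain green-channel spectrum peak rather than the CHROM method or the pyVHR toolbox used in this paper, and the function name `estimate_bpm`, the band limits, and the synthetic trace are assumptions made for the example.

```python
import numpy as np

def estimate_bpm(green_means, fps, low_hz=0.7, high_hz=4.0):
    """Estimate pulse rate (BPM) from per-frame mean green values of a skin region.

    Hypothetical helper: a green-channel sketch, not the paper's CHROM pipeline.
    """
    x = np.asarray(green_means, dtype=float)
    x = x - x.mean()                       # remove the constant skin-tone offset
    x = x * np.hanning(x.size)             # taper to reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)
    band = (freqs >= low_hz) & (freqs <= high_hz)  # assumed plausible pulse band
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz

# Synthetic trace standing in for real video: a 1.2 Hz pulse plus noise.
fps = 30.0
t = np.arange(600) / fps
rng = np.random.default_rng(0)
trace = 120.0 + 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)
bpm = estimate_bpm(trace, fps)
```

On real video the trace is far noisier than this synthetic one, because motion and lighting changes dominate the colour signal, which is why more elaborate methods and learned signal reconstruction are of interest.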
Wearable and contactless devices offer promising alternatives for stress measurement, providing convenient and non-invasive methods for continuous monitoring. However, the quality and accuracy of the data generated by these devices can vary. A major limitation to adopting rPPG is its lower signal-to-noise ratio, which requires advanced signal processing. Many articles lack peer review and validation in clinical settings, raising concerns about the reliability of data. Wearable devices can be sensitive to factors such as movement, heat, and transpiration, leading to inaccurate measurements. Poor ease of use, especially during sleep or physical activities, is another major limitation. Individuals with skin sensitivities, allergies, or specific health conditions may also find wearing these devices intolerable.
5. Conclusion and Future Work
This paper has established a framework for remote stress detection through the analysis of physiological signals derived from facial videos. The primary goal was to identify an advanced DL model for stress classification that surpasses the capabilities of traditional ML techniques. The adoption of three DL methods (LSTM, GRU, and CNN) and their refinement through empirical optimization yielded 95.83% accuracy in classifying stress from rPPG signals. The computational efficiency of the best-performing DL model, 1D-CNNv1, is compatible with deploying the framework on edge devices. Among the augmentation settings explored, linear interpolation and no augmentation produced the most promising outcomes. The proposed methodology holds significant potential to influence stress-related policies, practices, and management, potentially fostering increased user engagement with stress detection tools. However, it is crucial to acknowledge major limitations of the rPPG approach: privacy concerns stemming from the use of cameras, and the diversity of the participants. The privacy issue requires a careful balance between the potential advantages of the approach and the preservation of individual privacy rights, so the insights provided by this approach should be accompanied by stringent privacy measures that ensure user consent is sought and respected throughout the stress detection process. Future work will focus on improving signal extraction through alternative physiological sensing tools and optimising parameters in existing toolboxes.
Exploring additional augmentation techniques and advancing DL methods, particularly 1D-CNN, are promising paths for further enhancement. Rigorous validation through cross-validation and testing on diverse datasets is paramount to assess model robustness and ensure generalisation across various scenarios. Furthermore, future investigations could also consider the potential influence of participant ethnicity on model accuracy, recognising the importance of addressing diversity in the dataset and its implications for the broader applicability of the stress detection framework.
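The two augmentation ideas named in this paper, linear interpolation and white noise, can be sketched as follows. The exact procedure used in the experiments is not described in this section, so the function names, the upsampling factor, and the noise level below are assumptions for illustration only.

```python
import numpy as np

def interpolate_series(bpm, factor=2):
    """Upsample a BPM series by linear interpolation between neighbouring samples.

    Hypothetical helper; the factor is an assumed parameter.
    """
    bpm = np.asarray(bpm, dtype=float)
    old_idx = np.arange(bpm.size)
    new_idx = np.linspace(0, bpm.size - 1, (bpm.size - 1) * factor + 1)
    return np.interp(new_idx, old_idx, bpm)

def add_white_noise(bpm, sigma=1.0, seed=0):
    """Return a copy of the series with zero-mean Gaussian noise added."""
    bpm = np.asarray(bpm, dtype=float)
    rng = np.random.default_rng(seed)
    return bpm + rng.normal(0.0, sigma, size=bpm.shape)

# Illustrative values only, not data from the paper.
series = [70.0, 72.0, 76.0]
dense = interpolate_series(series)          # 5 samples: originals plus midpoints
noisy = add_white_noise(series, sigma=0.5)  # same length, perturbed values
```

Interpolation keeps the original samples and inserts values between them, whereas white noise changes every sample, which may help explain why the two settings can affect a classifier differently.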
Author Contributions
Conceptualisation, L.F., P.M. and I.K.I.; Methodology, L.F.; Software, L.F.; Validation, L.F., P.M. and I.K.I.; Formal analysis, L.F., P.M. and I.K.I.; Investigation, L.F.; Resources, L.F. and P.M.; Writing (original draft preparation), L.F.; Writing (review and editing), L.F., P.M., D.V., S.Y., J.B., and I.K.I.; Visualisation, L.F. and P.M.; Supervision, I.K.I.
Figure 1.
Stress Detection Framework – The video frames serve as inputs to the pyVHR toolbox enabling the extraction of rPPG signals BPM from facial regions within the frames. The derived BPM signals are subsequently channelled through DL models (LSTM, GRU, and 1D-CNN), culminating in stress classification outcomes.
Figure 2.
GT BVP signals behaviour during no stress task (T1) and stress task (T2) of subjects s1 to s4
Figure 3.
1D 3x CNN-2x MLP architecture – labelled "1D-CNNv2". Image from the author.
Figure 4.
Graphs depicting the Time Domain (TD) and Frequency Domain (FD) representations of the GT BVP signals for Subject 1 during tasks T1 and T2
Figure 5.
Plot of Estimated BPM extracted from video T1 of subject 1, using the method CuPy CHROM, before and after augmentation using linear interpolation
Figure 6.
Plot of Estimated BPM extracted from videos T1 of subject 1, using the method CuPy CHROM, before and after augmentation using white noise
Figure 7.
Validation Loss and Train & Accuracy curves of the GT-1D-CNNv2 model
Figure 8.
Plot of Estimated BPM extracted from videos T1 of subject 1, using the method CuPy CHROM, before and after augmentation using white noise
Figure 9.
Confusion matrix showing performance across different models
Table 1.
Parameters and methods used for rPPG with pyVHR toolbox
Table 2.
DL methods implemented
Table 3.
GT-PPG DL models’ results
Table 4.
Comparison of different papers’ results on the UBFC-Phys dataset
Table 5.
Best DL methods results from the rPPG data
Table 6.
Overfitted results of the rPPG data