1. Introduction
In recent decades, loneliness has become a widespread and severe problem [
1]. According to Perlman’s definition, loneliness occurs when a person perceives a deficiency in their social relationships, in terms of either quality or quantity [
2]. Although occasional feelings of loneliness are common, they become a serious problem when they develop into a persistent condition [
3]. Overall, loneliness is a deeply personal and emotional experience, characterized by dissatisfaction with one’s social interactions and connections. It is important to note that loneliness is subjective; one person may be alone and feel content, while another may have a large social network and still feel lonely. Loneliness can negatively affect an individual’s physical, emotional, and mental health [
4,
5]. For example, research has shown that feeling lonely is linked to higher blood pressure [
4], a greater risk of developing cardiovascular diseases [
5] and a rise in symptoms of depression [
6]. While loneliness is often perceived as a singular emotional state, it is, in fact, a multifaceted experience [
7].
In a clinical setting, loneliness is usually assessed using the standard scales developed and validated by clinicians and researchers over the decades for this specific purpose [
12,
13]. These scales explore the multidimensional aspects of loneliness using multiple questions. One of the most widely used is the University of California, Los Angeles (UCLA) Loneliness Scale [14], which is designed to categorize loneliness into two main categories: social loneliness and emotional loneliness. The social dimension of loneliness refers to the subjective experience of feeling socially disconnected and lacking a sense of belonging [15]. This is commonly seen when an individual has fewer personal relationships or a smaller social circle than they would like. The emotional dimension, in contrast, emphasizes the lack of a deep, personal connection with others, which can be felt strongly even in the physical presence of others [
16]. Studies have indicated that experiences of social and emotional loneliness do not always coincide. For instance, a person might have a robust peer network yet miss having a close friend, suggesting they feel emotional loneliness without necessarily feeling socially isolated [
17,
18]. This distinction is critical for a nuanced understanding of loneliness, as social and emotional loneliness may arise from different circumstances, have distinct physiological and psychological effects, and therefore may require different intervention strategies [
19]. Specifically, emotional loneliness has been linked with emotional challenges, including negative self-views, introverted behaviour, and distorted views of relationships [
20]. While prior research has identified these distinct types, to the best of our knowledge, no study has investigated how they manifest in behavioural patterns derived from passively sensed data within the student population specifically.
College students represent a particularly salient group for the study of loneliness, given the unique set of challenges and transitions they encounter during this pivotal life stage. As they navigate the transition from adolescence to adulthood, students are often challenged by a new social environment, frequently far from their family support and their established social networks [
21]. This is compounded by academic stress, the pressure to conform to new social norms, and the difficulty of establishing a sense of identity and belonging in an unfamiliar setting. Prior research has shed light on the prevalence of loneliness among this demographic; for instance, an extensive survey of around 33,000 college students found that about two-thirds had difficulties with loneliness and a sense of isolation [
22]. This is alarming, considering that such levels of loneliness have been correlated with a range of negative mental health outcomes, including increased rates of depression, anxiety, and stress. Therefore, identifying behavioural markers of loneliness in the college student experience is a crucial step towards developing more effective interventions. These efforts not only hold the potential to improve student well-being but also have broader implications for public health and educational systems.
In response to the need to better understand and address loneliness among college students, the collection of behavioural data through passive sensing holds promise for detecting loneliness and contributing to a deeper understanding of this experience. Utilizing sensors embedded in ubiquitous devices such as smartphones and wearables such as a fitness tracker or smartwatch, passive sensing data collection is non-intrusive and continuous, thereby offering a comprehensive and objective snapshot of an individual’s daily life [
23]. These data can include physical activity levels, sleep patterns, smartphone usage, social interaction as measured through call and message logs, and physical proximity to others as measured through Bluetooth device encounters. Unlike self-reported surveys and questionnaires, which can be subject to recall bias and social desirability bias, passively sensed data provide a more objective view of an individual’s behaviour and well-being. In the context of loneliness, this technology has the potential to uncover subtle but telling signs of both social and emotional loneliness. Therefore, passive sensing is a promising tool, not only for detecting loneliness in a timely and precise manner but also for shedding light on the nuanced ways in which loneliness manifests in daily life.
The objective of this study is to investigate social and emotional loneliness using passively sensed data for a student population. Our goal is to determine the subtle differences in data from individuals who are socially and emotionally lonely, evaluate the predictive power of behavioural features in categorizing these loneliness types, and find the behavioural features that are most important for differentiating between these two types of loneliness. The key research questions are:
Can behavioural features extracted from passively sensed data differentiate between socially and emotionally lonely students?
Can behavioural features help to classify social and emotional loneliness?
What behavioural features are most important for predictive models in classifying loneliness and its types?
2. Methods
2.1. Dataset
We used an existing dataset of passively sensed data collected by the University of Washington. The data was collected during the Spring quarter of 2019, over 10 weeks from March to June [
24]. Data was collected from 218 undergraduate students who were recruited via email and social media invitations. Detailed demographic information about the participants is provided in
Table 1.
The AWARE smartphone application was used for data collection, which works in the background without requiring any user interaction [
25]. Additionally, Fitbit wearable fitness trackers were used to collect data on sleep and physical activity. The dataset includes data from multiple sensors, covering Bluetooth, Wi-Fi, location, phone usage, call activity, physical activity, and sleep patterns. The study was approved by the University of Washington’s Institutional Review Board (IRB number: STUDY00003244), and all participants provided informed consent. Data confidentiality was ensured through strict adherence to anonymization protocols, with access to identifiable information restricted to the core data team. Additionally, data from participants who chose to withdraw from the study were promptly removed from the dataset.
2.2. Data Preprocessing
We included only those students who completed the post-study loneliness questionnaire, which left 205 of the 218 students. The reason for selecting post-study completion was to assess the effects of loneliness throughout the study period, which would provide a comprehensive view of each participant’s experience. We then identified and removed outliers using the z-score method. To handle missing values, we filled in missing data for each student using the median value for continuous features specific to a session. For categorical data, we used the mode value of a particular feature for that session. Categorical features were one-hot encoded to obtain a numerical representation. Numerical data were normalized using min-max scaling, which adjusted each value to a range between 0 and 1.
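The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the column names, the `pid` grouping column, the z-score threshold, the choice to mask outlying values rather than drop rows, and the use of the participant as the imputation unit are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def preprocess(df, numeric_cols, categorical_cols, group_col="pid", z_thresh=3.0):
    """Illustrative preprocessing: outlier masking, imputation, encoding, scaling."""
    out = df.copy()
    out[numeric_cols] = out[numeric_cols].astype(float)

    # 1. Z-score outliers: values with |z| > z_thresh are treated as missing
    #    (assumption: outlying values are masked, not whole rows dropped).
    for col in numeric_cols:
        std = out[col].std(ddof=0)
        if std > 0:
            z = (out[col] - out[col].mean()) / std
            out[col] = out[col].mask(z.abs() > z_thresh)

    # 2. Impute within each participant: median for numeric, mode for categorical.
    for col in numeric_cols:
        out[col] = out.groupby(group_col)[col].transform(
            lambda s: s.fillna(s.median())
        )
    for col in categorical_cols:
        out[col] = out.groupby(group_col)[col].transform(
            lambda s: s.fillna(s.mode().iloc[0]) if not s.mode().empty else s
        )

    # 3. One-hot encode categorical features as 0/1 integer columns.
    out = pd.get_dummies(out, columns=categorical_cols, dtype=int)

    # 4. Min-max scale numeric features to the range [0, 1].
    for col in numeric_cols:
        lo, hi = out[col].min(), out[col].max()
        out[col] = (out[col] - lo) / (hi - lo) if hi > lo else 0.0
    return out


# Hypothetical toy data: two participants, one numeric and one categorical feature.
demo = pd.DataFrame({
    "pid": ["a"] * 4 + ["b"] * 4,
    "steps": [1, 2, 3, np.nan, 2, 3, 4, 100],
    "device": ["ios", "ios", None, "ios", "android", "android", "android", None],
})
clean = preprocess(demo, ["steps"], ["device"], z_thresh=2.0)
```

In this toy example the value 100 is masked as an outlier and then imputed with the participant's median, so the scaled `steps` column contains no missing values.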
Behavioural features in the dataset were extracted using the Reproducible Analysis Pipeline for Data Streams (RAPIDS) [
26]. The dataset provides day-level and segmented daily interval features spanning from 12:00 am to 11:59 pm. These intervals were further subdivided into distinct time segments: morning (6 am - 12 pm), afternoon (12 pm - 6 pm), evening (6 pm - 12 am), and night (12 am - 6 am). The division into specific time segments serves an important role in capturing nuanced behavioural patterns, as individuals tend to engage in distinct activities during different parts of the day. These patterns, such as routines and variability, were measured using metrics such as counts, standard deviations, and entropy. A detailed overview of these features is provided in the RAPIDS documentation [
26,
27]. A total of 403 features were extracted for each participant. All sensor features were extracted for 5 time segments (day, morning, afternoon, evening, and night) except for sleep features. After preprocessing, the dataset consists of 14,350 samples, with each sample representing one day of data for a participant. For a summarized form of the extracted features, please refer to
Table 2. A sample of the feature matrix scheme is provided in
Figure 1.
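The time segmentation described above can be illustrated with a small sketch. The function and constant names are hypothetical and are not taken from RAPIDS; only the segment boundaries come from the text.

```python
from datetime import datetime

# Segment boundaries as [start_hour, end_hour), following the text above.
SEGMENTS = {
    "night": (0, 6),
    "morning": (6, 12),
    "afternoon": (12, 18),
    "evening": (18, 24),
}


def time_segment(ts: datetime) -> str:
    """Return the name of the daily time segment that contains `ts`."""
    for name, (start, end) in SEGMENTS.items():
        if start <= ts.hour < end:
            return name
    raise ValueError(f"hour out of range: {ts.hour}")
```

Each sensor reading would be assigned to one of the four segments in this way, in addition to contributing to the whole-day features.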
2.3. Categorizing Social and Emotional Loneliness
In the dataset, loneliness was assessed using a revised 10-item UCLA Loneliness Scale [
28]. Each item uses a 4-point response scale, where 1 represents ’never’, 2 ’rarely’, 3 ’sometimes’, and 4 ’often’. With 10 items, the total score therefore ranges from 10 (a response of 1 on every item) to 40 (a response of 4 on every item). For our research purposes, we divided the items of the scale into two distinct categories, social and emotional loneliness, based on the criteria proposed in [
29,
30]. Items in the scale that are related to feeling isolated, feeling left out, lack of companionship, and social interactions are classified under the social loneliness category, while the items related to emotional disconnect, like not feeling close to others, and no one truly understanding me, are put under the emotional loneliness category. Each category consisted of 5 items from the original scale presented in
Table 3. Some items in the scale, denoted by (R), are reverse-scored. This means that for these items, the scoring is inverted: a response of 1 is scored as 4, 2 as 3, 3 as 2, and 4 as 1. This reverse scoring ensures that higher scores consistently indicate greater levels of loneliness across all items.
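As a minimal sketch of the scoring rule just described, the following computes a subscale score with reverse scoring. The item numbers and the set of reversed items used in the example are hypothetical placeholders, not the actual assignment from Table 3.

```python
def reverse_score(response, low=1, high=4):
    """Invert a response on a low..high scale (1 -> 4, 2 -> 3, 3 -> 2, 4 -> 1)."""
    if not low <= response <= high:
        raise ValueError(f"response {response} outside {low}..{high}")
    return low + high - response


def subscale_score(responses, items, reversed_items=frozenset()):
    """Sum the responses for `items`, reverse-scoring those in `reversed_items`.

    `responses` maps an item number to a response between 1 and 4.
    """
    return sum(
        reverse_score(responses[i]) if i in reversed_items else responses[i]
        for i in items
    )


# Hypothetical example: five items, of which item 3 is reverse-scored.
example = {1: 2, 2: 2, 3: 4, 4: 2, 5: 2}
example_score = subscale_score(example, items=[1, 2, 3, 4, 5], reversed_items={3})
```

Here the response of 4 on the reversed item contributes 1, so the example subscale score is 9.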
There is no universally accepted threshold in the literature for determining loneliness cut-off scores to divide into low or high loneliness categories. Many studies have proposed their own cutoff scores such as one by Cacioppo et al. [
5]. In our study, we consider scoring in two dimensions: social loneliness (referred to as social_score) and emotional loneliness (referred to as emotional_score), each ranging from 5 to 20. A cumulative score of 10 is achieved if a participant selected ’rarely’ for all 5 questions, which represents occasional feelings of social or emotional loneliness. Hence, we chose 10 as our cut-off score for each subscale. To categorize loneliness levels, we used the following approach:
Participants with a social_score above 10 and an emotional_score of 10 or below were labeled as ’socially lonely’.
Participants with an emotional_score above 10 and a social_score of 10 or below were labeled as ’emotionally lonely’.
Participants scoring above 10 on both scales were considered ’both socially and emotionally lonely’.
Finally, those scoring 10 or below on both scales were categorized as ’not lonely’.
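The four labeling rules above can be written as a short function. The function name and the `cutoff` parameter are illustrative; the cut-off of 10 and the label strings follow the text.

```python
def categorize_loneliness(social_score, emotional_score, cutoff=10):
    """Map the two subscale scores (each 5..20) to one of four labels."""
    socially_lonely = social_score > cutoff
    emotionally_lonely = emotional_score > cutoff
    if socially_lonely and emotionally_lonely:
        return "both socially and emotionally lonely"
    if socially_lonely:
        return "socially lonely"
    if emotionally_lonely:
        return "emotionally lonely"
    return "not lonely"
```

A participant who scores exactly 10 on a subscale is treated as not lonely on that dimension, since only scores above the cut-off count.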
We first applied some basic statistics to analyze the different aspects of loneliness among the students in the dataset. This involved calculating the mean, median, first quartile (Q1), third quartile (Q3), and standard deviation (SD) for the four distinct categories: ’socially lonely’, ’emotionally lonely’, ’both socially and emotionally lonely’, and ’not lonely’. This provided an initial overview of the overall distribution and central tendency of loneliness within the dataset.
2.4. Differentiating Social and Emotional Loneliness
We conducted a statistical analysis using behavioural features to address our first research question about differentiating between social and emotional loneliness. Before applying statistical analysis, we used Mutual Information (MI) for feature selection to identify the most relevant behavioural features for differentiating between social and emotional loneliness. We selected features whose cumulative MI reached a threshold of 95% of the total MI across all features.
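One way to read the cumulative MI criterion is sketched below: features are ranked by MI and kept until their running sum reaches 95% of the total. This is an assumption about the procedure rather than a confirmed implementation; the helper names are hypothetical, and scikit-learn's `mutual_info_classif` is used as the MI estimator.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif


def select_by_cumulative_share(scores, threshold=0.95):
    """Return indices of the top-ranked scores whose sum first reaches
    `threshold` of the total score."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # highest score first
    total = scores.sum()
    if total <= 0:
        return order  # no information anywhere: keep everything
    cumulative = np.cumsum(scores[order]) / total
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return order[: min(k, scores.size)]


def select_features_by_mi(X, y, threshold=0.95, random_state=0):
    """Estimate MI between each feature and the label, then apply the
    cumulative-share rule."""
    mi = mutual_info_classif(X, y, random_state=random_state)
    return select_by_cumulative_share(mi, threshold)
```

For example, with scores of 0.6, 0.3, 0.08, and 0.02, the first three features are kept because their cumulative share (98%) is the first to reach the 95% threshold.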
We then checked the normality of the distribution of behavioural features for both socially and emotionally lonely student groups. This was conducted using the Shapiro-Wilk test [
31], which is a robust method for testing normality and is particularly effective for small sample sizes like ours (24 participants in the socially lonely group and 19 in the emotionally lonely group). The normality check is important because different statistical tests are suitable for different types of data distributions. Parametric tests, such as t-tests or ANOVA, assume that the data follow a normal distribution. If this assumption is violated, non-parametric tests, including the Mann-Whitney U test, are more appropriate because they do not rely on the normality assumption. Checking normality with the Shapiro-Wilk test therefore guided our choice of subsequent statistical tests for identifying differences between the socially and emotionally lonely groups.
Given that the Shapiro-Wilk test indicated a non-normal distribution for the majority of features, we used a two-sided Mann-Whitney U test to compare the feature distributions between the socially and emotionally lonely groups. This non-parametric test was selected for its ability to compare two independent samples without assuming a normal distribution. The goal was to determine whether there were statistically significant differences in the feature profiles between the two groups. For each feature, the null hypothesis was that its distribution did not differ between the two groups, while the alternative hypothesis was that it did.
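For a single feature, the normality check followed by the two-sided Mann-Whitney U test could look like the sketch below, which uses SciPy. The function name, the returned dictionary, and the significance level parameter are illustrative choices, not details from the study.

```python
import numpy as np
from scipy.stats import mannwhitneyu, shapiro


def compare_feature(sl_values, el_values, alpha=0.05):
    """Test one feature for a difference between the SL and EL groups."""
    sl = np.asarray(sl_values, dtype=float)
    el = np.asarray(el_values, dtype=float)

    # Shapiro-Wilk: a small p-value suggests a departure from normality.
    _, p_sl = shapiro(sl)
    _, p_el = shapiro(el)
    both_normal = bool(p_sl > alpha and p_el > alpha)

    # Two-sided Mann-Whitney U test, which does not assume normality.
    u_stat, p_value = mannwhitneyu(sl, el, alternative="two-sided")
    return {
        "both_normal": both_normal,
        "u": float(u_stat),
        "p": float(p_value),
        "significant": bool(p_value < alpha),
    }
```

In practice this would be applied to each selected feature in turn, with group sizes of 24 and 19 as reported above.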
To quantify the magnitude of the differences observed between the socially and emotionally lonely groups, we calculated effect sizes using bootstrapping. Bootstrapping is a resampling technique that repeatedly samples with replacement from the original dataset to create multiple simulated datasets, which allows the sampling distribution of a statistic to be estimated; this is particularly useful when working with smaller sample sizes. In our case, the observations were resampled with replacement 10,000 times to estimate the distribution of the effect size. We selected Cohen’s d as our effect size metric. Cohen’s d is a standardized measure of the difference between two group means, expressed in units of standard deviations. This standardization allows meaningful comparisons across variables with different scales, which makes it suitable for our analysis of diverse features. Alongside the point estimate, a 95% confidence interval was computed from the bootstrap distribution to quantify the precision of the effect size.
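A bootstrap estimate of Cohen's d with a confidence interval can be sketched as follows. Several details are assumptions, since the text does not specify them: the pooled standard deviation in the denominator, independent resampling within each group, and a percentile confidence interval.

```python
import numpy as np


def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation (an assumed variant)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


def bootstrap_cohens_d(a, b, n_boot=10_000, ci=0.95, seed=0):
    """Return (point estimate, CI lower bound, CI upper bound) for Cohen's d."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        boots[i] = cohens_d(
            rng.choice(a, size=a.size, replace=True),
            rng.choice(b, size=b.size, replace=True),
        )
    tail = (1.0 - ci) / 2.0 * 100.0
    lower, upper = np.percentile(boots, [tail, 100.0 - tail])
    return cohens_d(a, b), float(lower), float(upper)
```

With the first argument holding the SL group and the second the EL group, a negative d means that the SL group has the lower mean, which matches the sign convention of the effect sizes reported in the results.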
2.5. Predictive Modeling for Loneliness Classification
We trained multiple ML multi-class classification algorithms to address our second research question regarding the power of behavioural features in classifying loneliness types. The dataset has instances labeled as ’socially lonely’, ’emotionally lonely’, ’both lonely’, or ’not lonely’. We selected four widely used machine learning algorithms for our classification task: Support Vector Machine (SVM), XGBoost, Random Forest, and K-Nearest Neighbors (KNN). We chose these ML algorithms because they are widely used in supervised classification, easy to train, and interpretable [
32]. Moreover, these ML algorithms have also been used in similar previous studies [
33,
34,
35].
We used nested cross-validation for model evaluation and selection. In the outer loop, we used leave-one-subject-out cross-validation to assess generalization. Within each iteration of the outer loop, we used an inner loop of stratified three-fold cross-validation for hyperparameter tuning using GridSearchCV. Preprocessing steps were conducted within each inner fold before model training and evaluation, including feature scaling, handling missing values, feature selection using MI, and handling class imbalance with SMOTE. We used the macro-averaged F1 score as the evaluation metric for model selection in cross-validation.
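The nested scheme above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the study's code: a Random Forest stands in for all four classifiers, a fixed number of MI-ranked features is kept, and `class_weight="balanced"` is swapped in for SMOTE, which would require the separate imbalanced-learn package and its own pipeline class. Placing all preprocessing inside the pipeline means it is refit on the training portion of every fold.

```python
from functools import partial

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.impute import SimpleImputer
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, LeaveOneGroupOut, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


def nested_loso_cv(X, y, groups, param_grid, k_features=5, seed=0):
    """Leave-one-subject-out outer loop with a stratified 3-fold inner search.

    `groups` holds one participant id per row, so every outer test fold
    contains all the days of exactly one participant.
    """
    pipe = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", MinMaxScaler()),
        ("select", SelectKBest(partial(mutual_info_classif, random_state=seed),
                               k=k_features)),
        # Assumption: class weighting replaces SMOTE in this sketch.
        ("clf", RandomForestClassifier(n_estimators=25, class_weight="balanced",
                                       random_state=seed)),
    ])
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    y_pred = np.empty_like(y)
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        search = GridSearchCV(pipe, param_grid, scoring="f1_macro", cv=inner)
        search.fit(X[train_idx], y[train_idx])
        y_pred[test_idx] = search.predict(X[test_idx])
    # Macro F1 over the pooled out-of-subject predictions.
    return f1_score(y, y_pred, average="macro"), y_pred
```

A call might pass a grid such as `{"clf__max_depth": [2, None]}`; the grid keys follow scikit-learn's `step__parameter` naming for pipelines.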
For baseline comparisons, we considered the majority class, random weighted classifier, and decision tree models, trained on the original, imbalanced dataset to provide a fair comparison. The majority class model always predicts the most frequent class in the dataset and serves as a naive baseline. The random weighted classifier makes predictions randomly. The decision tree model makes classifications based on a series of feature-based questions and serves as a middle ground between the simplest baselines and our more complex models. The primary metrics used to assess the performance of each model were accuracy, precision, sensitivity, and F1 score. These metrics, which provide a detailed view of the model’s performance, were calculated for each loneliness category. Accuracy provided a global view of model performance, while precision, sensitivity, and the F1 score provided insights into the model’s ability to predict each specific class.
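The three baselines and the per-class metrics can be sketched with scikit-learn as shown below. Mapping the random weighted classifier to `DummyClassifier(strategy="stratified")`, which samples labels in proportion to the training class frequencies, is an assumption about what that baseline does; the function and key names are illustrative.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.tree import DecisionTreeClassifier


def evaluate_baselines(X_train, y_train, X_test, y_test, seed=0):
    """Fit the three baselines and report accuracy plus per-class metrics."""
    models = {
        "majority_class": DummyClassifier(strategy="most_frequent"),
        # Assumption: "random weighted" means sampling by class frequency.
        "random_weighted": DummyClassifier(strategy="stratified", random_state=seed),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
    }
    labels = np.unique(y_train)
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        precision, sensitivity, f1, _ = precision_recall_fscore_support(
            y_test, pred, labels=labels, average=None, zero_division=0
        )
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision,      # one value per class
            "sensitivity": sensitivity,  # recall, one value per class
            "f1": f1,                    # one value per class
        }
    return results
```

The majority class baseline makes the effect of class imbalance visible: its accuracy equals the share of the most frequent class, while its sensitivity is zero for every other class.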
2.6. Feature Importance Analysis
To address our third research question, which concerns the features that matter most to the predictive models in distinguishing between the different loneliness types, we used SHapley Additive exPlanations (SHAP) values for the XGBoost and Random Forest models [
36]. We chose these two models specifically because they provided the best results in our predictive classification task.
SHAP values measure the impact of each feature on the model’s output for classification. In our multi-class context, SHAP computes values for each class separately to compute feature importance for each loneliness type. Once the classification models were trained, we used the SHAP library to compute the SHAP values for each feature across all data points. This process outputs a SHAP value for each feature for each prediction and each class, indicating the feature’s impact on the model’s output for classification. We averaged the absolute SHAP values for each feature within each class to analyze the importance of class-specific features. Averaging the absolute SHAP values helped us to see the overall influence of each feature on the model’s predictions to provide a clear and robust measure of feature importance within each class. We then ranked the features based on their average absolute SHAP values within each class to identify the most important behavioural indicators for each loneliness category.
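The aggregation step described above, averaging absolute SHAP values per class and ranking features, can be sketched with NumPy alone. The sketch assumes the SHAP values have already been computed and arranged as an array of shape (samples, features, classes); the layout returned by the SHAP library differs between versions, so that arrangement and the function name are assumptions.

```python
import numpy as np


def rank_features_by_class(shap_values, feature_names, class_names, top_k=3):
    """Rank features per class by their mean absolute SHAP value.

    `shap_values` has shape (n_samples, n_features, n_classes).
    """
    shap_values = np.asarray(shap_values, dtype=float)
    # Average the magnitude over samples: shape (n_features, n_classes).
    mean_abs = np.abs(shap_values).mean(axis=0)
    rankings = {}
    for c, class_name in enumerate(class_names):
        order = np.argsort(mean_abs[:, c])[::-1][:top_k]
        rankings[class_name] = [
            (feature_names[i], float(mean_abs[i, c])) for i in order
        ]
    return rankings
```

Taking the absolute value before averaging prevents positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still ranks as important.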
3. Results
3.1. Overview of Loneliness in Participants
For the specific dimensions of loneliness, the mean social loneliness score was 10.93, with a median of 11, an interquartile range from 9 (Q1) to 13 (Q3), and a standard deviation of 2.736. Emotional loneliness scores had a mean of 10.71, a median of 11, and the same interquartile range (Q1: 9, Q3: 13), but a slightly higher standard deviation of 2.905. These statistics include all students to capture the full range of social and emotional loneliness within the dataset.
When classifying participants based on their social and emotional loneliness scores, we found a slightly larger subset experiencing only social loneliness (11.71%, 24 out of 205) than only emotional loneliness (9.27%, 19 out of 205). However, a substantial portion (42.44%, 87 out of 205) reported feeling both socially and emotionally lonely, which highlights the interconnectedness of these experiences. The remaining 36.59% (75 out of 205) were classified as feeling neither socially nor emotionally lonely.
3.2. Statistical Differences for Social and Emotional Loneliness
To determine whether behavioural features could differentiate between social and emotional loneliness, we conducted a two-sided Mann-Whitney U test on the non-normally distributed data (as confirmed by the Shapiro-Wilk test).
Table 4 presents only the statistically significant (p < 0.05) features, together with the mean differences and effect sizes for the socially and emotionally lonely groups. We interpret the magnitude of effect sizes (Cohen’s d) using the common convention, where values of 0.2, 0.5, and 0.8 are considered thresholds for small, medium, and large effects, respectively [
37]. Below are the key results.
Our analysis of location-based features found significant differences between socially lonely (SL) and emotionally lonely (EL) groups. The log-transformed location variance in the evening, which represents the variability in a participant’s geographic position, was lower for the SL group (M = 2.301) than for the EL group (M = 3.751), with a medium effect size of -0.715. This indicates that emotionally lonely individuals show more varied movement patterns in the evening. On a daily basis, the SL group visited fewer significant places, defined as distinct locations (M = 1.504), than the EL group (M = 2.167). The number of transitions between these significant locations was also lower for the SL group (M = 5.463) than for the EL group (M = 7.374), with a medium-to-large effect size of -0.780, indicating less movement between key locations for socially lonely individuals. Interestingly, the normalized location entropy, which measures the evenness of time distribution across significant locations, was higher for the SL group (M = 0.451) than for the EL group (M = 0.323). This indicates that while socially lonely individuals visit fewer locations, they tend to distribute their time more evenly across these places.
Analysis of phone usage patterns also found significant differences between the two groups. The EL group showed higher overall phone engagement with a total daily unlock duration of 495.535 minutes compared to 400.204 minutes for the SL group (d = -0.535). The EL group took longer to first use their phone after waking, averaging 45.067 minutes versus 28.745 minutes for the SL group with an effect size of -0.578. Another difference was in the maximum duration of a single unlock episode, with the EL group spending up to 18.073 minutes compared to 7.683 minutes for the SL group. Other metrics, including the frequency of phone unlocks, also showed higher values for the EL group.
Analysis of Bluetooth-related patterns also found significant differences between the groups. The EL group encountered more unique devices daily (M = 5.516, 95) than the SL group (M = 3.701), which indicates increased social proximity or time spent in populated areas. Step count analysis showed that the EL group had higher average daily steps (M = 5300.745) than the SL group (M = 4800.335), with a medium-to-large effect size (d = -0.754). Sleep patterns differed as well, with the SL group sleeping longer on average (M = 510.047 minutes) than the EL group (M = 407.731 minutes). However, the EL group spent more time awake in bed (M = 88.385 minutes) than the SL group (M = 60.320 minutes), which might indicate sleep quality issues.
3.3. Predictive Power of Behavioural Features in Loneliness Categories
Table 5 presents the predictive performance of different ML classifiers for categorizing loneliness into ’socially lonely’, ’emotionally lonely’, ’both lonely’, and ’not lonely’. We compared these classifiers against three baseline models: Majority Class (BL1: MC), Decision Tree (BL2: DT), and Random Weighted Classifier (BL3: RWC). These baseline models were chosen because they provide simple, interpretable benchmarks to evaluate the performance of more complex classifiers.
The XGBoost model achieved the highest overall accuracy of 78.48%. It performed best in classifying the ’Both Lonely’ category, with an F1-score of 85.44%, and showed strong performance in the ’Emotionally Lonely’ category, with an F1-score of 70.49%. The XGBoost model showed the best overall balance and highest metrics across all classes. The Random Forest model followed with an accuracy of 75.58%, showing strong performance with an F1-score of 82.41% for the ’Both Lonely’ class and 72.76% for the ’Not Lonely’ class. The model provided a good balance of precision and sensitivity across all categories. Support Vector Machine (SVM) performed well with 70.10% accuracy and made strong predictions in the ’Both Lonely’ and ’Not Lonely’ categories. K-Nearest Neighbors (KNN) showed balanced performance with 65.53% accuracy.
While F1-scores provide a balanced performance measure, investigating precision and sensitivity separately can provide important observations regarding these models. For instance, the XGBoost model showed high precision (88.07%) in detecting students who are ’Both Lonely’, which shows a low false positive rate for this category. However, its sensitivity for ’Socially Lonely’ students was lower (58.59%), which shows some challenges in identifying all cases in this category. The Random Forest model showed a more balanced precision-sensitivity trade-off for the ’Not Lonely’ category (precision: 75.94%, sensitivity: 70.86%) which shows consistent performance in both correctly identifying and capturing instances of this class. These nuances in precision and sensitivity across different loneliness categories highlight the varying challenges in accurately classifying each type of loneliness.
3.4. Important Features for Loneliness Classification
Figure 2 and
Figure 3 present the mean absolute SHAP values for different features across the four loneliness categories: Socially Lonely (SL), Emotionally Lonely (EL), Both Lonely (BL), and Not Lonely (NL). The x-axis represents the normalised mean absolute SHAP value for each feature, which shows the average magnitude of that feature’s contribution to the model’s output. SHAP values represent the impact of a feature on the model’s prediction. The magnitude of a SHAP value shows the feature’s importance for a particular prediction. In our figures, larger mean absolute SHAP values indicate a stronger average influence of that feature on the model’s decisions.
Both XGBoost and Random Forest models identified location-based and phone usage features as highly influential in distinguishing between loneliness categories. However, there were some differences in the relative importance of specific features between the two models. The most influential features in the XGBoost model were the maximum duration of phone usage and the maximum length of stay at location clusters. These features showed a strong impact across all loneliness categories, but the strongest effect was on the ’Both Lonely’ category. Sleep-related features such as average duration awake also showed some importance, specifically for the ’Not Lonely’ category.
In our Random Forest model, the SHAP value analysis showed a similar pattern. Location-based features such as variance and entropy were highly influential and had the greatest impact on almost all categories. The maximum duration of phone usage was also influential. Interestingly, Bluetooth-related features appeared more important in the Random Forest model than in XGBoost.
4. Discussion
The main aim of this study was to explore whether behavioural features from passively sensed data could distinguish between social and emotional loneliness and classify the types of loneliness using machine learning models. The analysis found statistically significant differences in the behavioural markers for social and emotional loneliness groups. The observed differences in behavioural features between the Socially Lonely (SL) and Emotionally Lonely (EL) groups provide evidence that passive sensing can capture distinct patterns associated with these two forms of loneliness.
Location-based behaviours showed significant differences between the two groups. The EL group showed higher location variance, especially in the evening, visited more significant places, and had more location transitions compared to the SL group. In contrast, the SL group’s lower mobility and fewer significant places visited may indicate a lack of interest or opportunity to participate in social interactions, which is a sign of social loneliness [
38]. This behaviour might be linked to the Introversion-Extraversion Dimension of personality. Socially lonely individuals may lean towards introversion and prefer solitary activities and less social stimulation [
39]. There could also be other factors behind this behaviour, such as a lack of social support, which could discourage them from exploring new environments or engaging in social activities. Interestingly, while the SL group visited fewer locations, they showed higher location entropy, which indicates that their time was more evenly distributed across the places they visited. This might indicate a preference for familiar or comfortable environments, possibly as a way to manage feelings of social disconnection. This difference between the two groups shows the nuanced ways in which loneliness manifests. However, it is important to note that the current data cannot establish a causal relationship. While other studies provide further information about the reasons behind loneliness-related behaviours [40, 41, 42, 43], a deeper investigation is still needed, for example by combining qualitative methods with behavioural features, to gain a more nuanced understanding of the complex relationship between these two loneliness types and daily life behaviours.
The findings from phone usage patterns also show differences between the two types of loneliness. The EL group uses their phones more often and for longer periods, which may reflect a preference for digital communication over face-to-face interactions. This could be explained by the displacement hypothesis, according to which increased digital engagement takes the place of in-person social interactions and could lead to increased feelings of isolation or loneliness [44]. The SL group uses their phones less, possibly because they have a different social situation or do not feel the need for digital contact.
The Bluetooth data shows more unique device encounters for the EL group, which supports the location data in suggesting that emotionally lonely individuals might seek out more populated areas. According to existing research, this behaviour aligns with the emotional loneliness construct, where individuals might feel lonely despite being in social settings [
45]. Significant differences in physical activity and sleep patterns between the two groups provide additional information about the behavioural impacts of loneliness. The SL group is less physically active, which might be either a cause or a result of social withdrawal. They also sleep more, which could have several explanations, such as depressive symptoms or low activity during the day.
These findings have important implications for understanding the complex nature of loneliness and its impact on student behaviour and well-being. They also show that passive sensing can differentiate between social and emotional loneliness and identify the behavioural patterns associated with each. The findings can be used for targeted interventions, such as promoting social engagement for socially lonely students or addressing emotional needs and connectivity for emotionally lonely students. To maximize the effectiveness of such interventions, it is important to first gain a deeper understanding of the motivations and experiences behind the behavioural differences that we observed. Qualitative research could explore these behaviours to help us understand why individuals engage in specific patterns and how they perceive the connection between their behaviours and feelings of loneliness.
The machine learning models trained to classify loneliness types demonstrated the predictive capability of behavioural features extracted from passively sensed data. The best-performing model was XGBoost, with an overall accuracy of 78.48%. This highlights the value of ensemble-based techniques for handling complex sensor datasets in loneliness prediction. Moreover, XGBoost’s F1 score for the ‘Both Lonely’ category is higher than that of the other models. This suggests the overall robustness of XGBoost for classifying social and emotional loneliness.
The varying precision and sensitivity scores across models and loneliness categories highlight the challenges of classifying loneliness types from behavioural data. For instance, XGBoost’s high precision (88.07%) for the ‘Both Lonely’ category indicates that when the model predicts this category, it is highly likely to be correct. This suggests that students experiencing both social and emotional loneliness may exhibit more distinct behavioural patterns that are easier for the model to identify. On the other hand, the lower sensitivity for the ‘Socially Lonely’ category across models indicates that behaviours associated with social loneliness might be more subtle or varied, making it challenging for models to identify all instances. Social loneliness may therefore manifest in more diverse behaviours and overlap with patterns seen in other loneliness categories, which makes it difficult for models to identify correctly.
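To make the distinction between precision and sensitivity concrete, the following minimal sketch computes both (plus F1) per class from true and predicted labels. The label codes, the helper name `per_class_metrics`, and the toy predictions are hypothetical and are not the study's results; they only reproduce the qualitative pattern described above, where one class is predicted precisely while another is often missed.

```python
def per_class_metrics(y_true, y_pred, labels):
    """Return precision, sensitivity (recall) and F1 for each class label."""
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[label] = {
            "precision": precision,
            "sensitivity": sensitivity,
            "f1": f1,
        }
    return metrics


# Hypothetical label codes: NL = Not Lonely, SL = Socially Lonely,
# EL = Emotionally Lonely, BL = Both Lonely. Toy data for illustration only.
labels = ["NL", "SL", "EL", "BL"]
y_true = ["BL", "BL", "BL", "BL", "SL", "SL", "SL", "SL", "EL", "EL", "NL", "NL"]
y_pred = ["BL", "BL", "BL", "EL", "SL", "EL", "NL", "EL", "EL", "EL", "NL", "NL"]

metrics = per_class_metrics(y_true, y_pred, labels)
for label in labels:
    print(label, metrics[label])
```

In this toy example every 'BL' prediction is correct (high precision), while most true 'SL' cases are assigned to other classes (low sensitivity), even though the few 'SL' predictions that are made are correct.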
The feature importance analysis for the XGBoost and Random Forest models provides important findings about the behavioural features that distinguish between different types of loneliness. The consistent importance of location-based and phone usage features across both models highlights their significance in understanding and categorizing loneliness. This indicates that mobility and phone usage patterns play a significant role in differentiating loneliness types. Interestingly, our analysis found lower SHAP scores for features distinguishing between the SL and EL groups than for those differentiating the ‘Both Lonely’ and ‘Not Lonely’ categories. There could be several reasons for this observation. SL and EL might share similar behavioural patterns in some aspects, which makes it challenging for the models to distinguish between them based solely on passive sensing data. In other words, the behavioural manifestations of social versus emotional loneliness might be more nuanced and less pronounced in passive sensing data than the clear differences between being lonely (in any form) and not lonely. This could be because students experiencing both types of loneliness show clearer or more consistent behaviours, while those who are not lonely have significantly different patterns of movement, phone usage, and sleep.
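As background for reading SHAP scores, the sketch below computes exact Shapley values for a tiny hypothetical scoring function by enumerating all feature subsets. It is not the tree-based algorithm typically used with XGBoost or Random Forest models, and it assumes a single baseline instance stands in for "missing" features; the feature names, weights, and values are invented for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for one instance.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch; other value functions are possible).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of coalitions of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_score(features):
    """Hypothetical linear score; the weights are illustrative only."""
    location_variance, phone_minutes, sleep_hours = features
    return 0.5 * location_variance + 0.01 * phone_minutes - 0.2 * sleep_hours


x = [2.0, 300.0, 6.0]          # hypothetical student
baseline = [1.0, 200.0, 8.0]   # hypothetical reference student
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The attributions sum to the difference between the score for `x` and the score for the baseline, which is why small SHAP values for a pair of classes suggest that no single feature moves the prediction much between them.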
5. Conclusions
In this study, we explored the multifaceted nature of loneliness among students, focusing specifically on the distinction between social and emotional loneliness using passively sensed data. Our research addressed three key questions that provide insight into the behavioural manifestations of different types of loneliness. First, our analysis found that behavioural features extracted from passively sensed data can differentiate between socially and emotionally lonely students. Statistical tests found significant differences in various behavioural features between these two groups. Socially lonely individuals showed less variance in their locations than emotionally lonely individuals. Additionally, socially lonely students had shorter overall phone usage duration, fewer Bluetooth scans, and fewer steps than emotionally lonely students. Second, we showed the considerable predictive power of behavioural features in classifying social and emotional loneliness. The XGBoost model achieved the highest overall accuracy (78.48%) and a high F1 score across all types of loneliness. This shows the potential of behavioural features extracted through passive sensing for identifying and differentiating between types of loneliness. Third, we identified the most important behavioural features for predictive models in classifying loneliness and its types. Our analysis identified phone usage and location-based features as critical in distinguishing between loneliness categories in both the XGBoost and Random Forest models. The findings highlighted that phone usage duration, location variance, and sleep-related features were particularly significant in differentiating socially lonely, emotionally lonely, and both socially and emotionally lonely individuals from those not experiencing loneliness. Despite these significant findings, our study has some limitations that could be addressed in future research.
The generalizability of our findings is limited by the size and diversity of the dataset. The participants, who are college students, form a homogeneous group with possibly similar daily routines and psychological challenges. The dataset that we used does not capture all the characteristics associated with loneliness. Personal relationships [
46], age [
47], major life events [
48], and mental health history [
49] all significantly influence an individual’s experience of loneliness. Future research should aim to include more diverse populations with richer demographic data to broaden the applicability of these findings.
Author Contributions
Conceptualization, M.M.Q., E.Z., E.B.W. and D.P.; methodology, M.M.Q. and E.Z.; software, M.M.Q.; validation, M.M.Q., E.Z., and D.P.; formal analysis, M.M.Q.; investigation, M.M.Q.; resources, M.M.Q. and E.Z.; data curation, M.M.Q.; writing—original draft preparation, M.M.Q.; writing—review and editing, E.Z., E.B.W. and D.P.; visualization, M.M.Q.; supervision, D.P. and E.B.W.; project administration, D.P. and E.B.W.; funding acquisition, D.P. All authors have read and agreed to the published version of the manuscript.
Funding
This publication has emanated from research supported in part by a Grant from Science Foundation Ireland under Grant number 18/CRT/6222.
Data Availability Statement
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| EL | Emotionally Lonely |
| MI | Mutual Information |
| RAPIDS | Reproducible Analysis Pipeline for Data Streams |
| SHAP | Shapley Additive exPlanations |
| SL | Socially Lonely |
| SMOTE | Synthetic Minority Oversampling Technique |
| UCLA | University of California Los Angeles |
References
- U.S. Surgeon General’s Advisory. Our Epidemic of Loneliness and Isolation. Available online: https://www.hhs.gov/sites/default/files/surgeon-general-social-connection-advisory.pdf.
- Campbell, M. Loneliness, social anxiety and bullying victimization in young people: A literature review. Psychology and Education 2013, 50, 1–10.
- Martín-María, N.; Caballero, F.F.; Miret, M.; Tyrovolas, S.; Haro, J.M.; Ayuso-Mateos, J.L.; Chatterji, S. Differential impact of transient and chronic loneliness on health status: A longitudinal study. Psychology & Health 2020, 35, 177–195.
- Hawkley, L.C.; Thisted, R.A.; Masi, C.M.; Cacioppo, J.T. Loneliness predicts increased blood pressure: 5-year cross-lagged analyses in middle-aged and older adults. Psychology and Aging 2010, 25, 132.
- Cacioppo, J.T.; Cacioppo, S. Social relationships and health: The toxic effects of perceived social isolation. Social and Personality Psychology Compass 2014, 8, 58–72.
- Wei, M.; Russell, D.W.; Zakalik, R.A. Adult attachment, social self-efficacy, self-disclosure, loneliness, and subsequent depression for freshman college students: A longitudinal study. Journal of Counseling Psychology 2005, 52, 602.
- Weiss, R.S. The experience of emotional and social isolation; The MIT Press: Cambridge, MA, USA, 1973.
- Maes, M.; Vanhalst, J.; Van den Noortgate, W.; Goossens, L. Intimate and relational loneliness in adolescence. Journal of Child and Family Studies 2017, 26, 2059–2069.
- Qualter, P.; Munn, P. The separateness of social and emotional loneliness in childhood. Journal of Child Psychology and Psychiatry 2002, 43, 233–244.
- Lasgaard, M.; Goossens, L.; Bramsen, R.H.; Trillingsgaard, T.; Elklit, A. Different sources of loneliness are associated with different forms of psychopathology in adolescence. Journal of Research in Personality 2011, 45, 233–237.
- Rokach, A. The psychological journey to and from loneliness: Development, causes, and effects of social and emotional isolation; Academic Press: Cambridge, MA, USA, 2019.
- Heinrich, L.M.; Gullone, E. The clinical significance of loneliness: A literature review. Clinical Psychology Review 2006, 26, 695–718.
- Russell, D. The measurement of loneliness. In Loneliness: A Sourcebook of Current Theory, Research and Therapy; Peplau, L.A., Perlman, D., Eds.; John Wiley & Sons: New York, NY, USA, 1982; pp. 81–104.
- Knight, R.G.; Chisholm, B.J.; Marsh, N.V.; Godfrey, H.P.D. Some normative, reliability, and factor analytic data for the revised UCLA Loneliness Scale. Journal of Clinical Psychology 1988, 44, 203–206.
- Russell, D.; Cutrona, C.E.; Rose, J.; Yurko, K. Social and emotional loneliness: An examination of Weiss’s typology of loneliness. Journal of Personality and Social Psychology 1984, 46, 1313.
- Vaux, A. Social and emotional loneliness: The role of social and personal characteristics. Personality and Social Psychology Bulletin 1988, 14, 722–734.
- Maes, M.; Vanhalst, J.; Van den Noortgate, W.; Goossens, L. Intimate and relational loneliness in adolescence. Journal of Child and Family Studies 2017, 26, 2059–2069.
- Qualter, P.; Munn, P. The separateness of social and emotional loneliness in childhood. Journal of Child Psychology and Psychiatry 2002, 43, 233–244.
- Lasgaard, M.; Goossens, L.; Bramsen, R.H.; Trillingsgaard, T.; Elklit, A. Different sources of loneliness are associated with different forms of psychopathology in adolescence. Journal of Research in Personality 2011, 45, 233–237.
- Rokach, A. The psychological journey to and from loneliness: Development, causes, and effects of social and emotional isolation; Academic Press: Cambridge, MA, USA, 2019.
- Von Soest, T.; Luhmann, M.; Hansen, T.; Gerstorf, D. Development of loneliness in midlife and old age: Its nature and correlates. Journal of Personality and Social Psychology 2020, 118, 388.
- McAlpine, K.J. Depression, anxiety, loneliness are peaking in college students. The Brink 2021, 17.
- Torous, J.; Onnela, J.P.; Keshavan, M. New dimensions and new tools to realize the potential of RDoC: Digital phenotyping via smartphones and connected devices. Translational Psychiatry 2017, 7, e1053.
- Xu, X.; Zhang, H.; Sefidgar, Y.; Ren, Y.; Liu, X.; Seo, W.; Brown, J.; Kuehn, K.; Merrill, M.; Nurius, P.; et al. GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behaviour Modeling Generalization. Advances in Neural Information Processing Systems 2022, 35, 24655–24692.
- Ferreira, D.; Kostakos, V.; Dey, A.K. AWARE: Mobile context instrumentation framework. Frontiers in ICT 2015, 2, 6.
- Vega, J.; Li, M.; Aguillera, K.; Goel, N.; Joshi, E.; Durica, K.C.; Kunta, A.R.; Low, C.A. RAPIDS: Reproducible analysis pipeline for data streams collected with mobile devices. Journal of Medical Internet Research Preprints. Available online:. 2020. (accessed on 18 August 2020).
- RAPIDS. RAPIDS [Online]. 2021. Available online: https://www.rapids.science (accessed on 21 February 2021).
- Knight, R.G.; Chisholm, B.J.; Marsh, N.V.; Godfrey, H.P.D. Some normative, reliability, and factor analytic data for the revised UCLA Loneliness Scale. Journal of Clinical Psychology 1988, 44, 203–206.
- Maes, M.; Qualter, P.; Lodder, G.M.A.; Mund, M. How (not) to measure loneliness: A review of the eight most commonly used scales. International Journal of Environmental Research and Public Health 2022, 19, 10816.
- Borges, Á.; Prieto, P.; Ricchetti, G.; Hernández-Jorge, C.; Rodríguez-Naveiras, E. Validación cruzada de la factorización del Test UCLA de Soledad. Psicothema 2008, 20, 924–927.
- Royston, J.P. Some techniques for assessing multivariate normality based on the Shapiro-Wilk W. Journal of the Royal Statistical Society Series C: Applied Statistics 1983, 32, 121–133.
- Garcia-Ceja, E.; Riegler, M.; Nordgreen, T.; Jakobsen, P.; Oedegaard, K.J.; Tørresen, J. Mental health monitoring with multimodal sensing and machine learning: A survey. Pervasive and Mobile Computing 2018, 51, 1–26.
- Wu, C.; Barczyk, A.N.; Craddock, R.C.; Harari, G.M.; Thomaz, E.; Shumake, J.D.; Beevers, C.G.; Gosling, S.D.; Schnyer, D.M. Improving prediction of real-time loneliness and companionship type using geosocial features of personal smartphone data. Smart Health 2021, 20, 100180.
- Jacobson, N.C.; Summers, B.; Wilhelm, S. Digital biomarkers of social anxiety severity: Digital phenotyping using passive smartphone sensors. Journal of Medical Internet Research 2020, 22, e16875.
- Elmer, T.; Lodder, G. Modeling social interaction dynamics measured with smartphone sensors: An ambulatory assessment study on social interactions and loneliness. Journal of Social and Personal Relationships 2023, 40, 654–669.
- Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence 2020, 2, 56–67.
- Cohen, J. Statistical Power Analysis for the Behavioral Sciences; Routledge: New York, NY, USA, 2013.
- Reis, H.T. Gender effects in social participation: Intimacy, loneliness, and the conduct of social interaction. In The Emerging Field of Personal Relationships; Routledge: New York, NY, USA, 2021; pp. 91–105.
- Burger, J.M. Individual differences in preference for solitude. Journal of Research in Personality 1995, 29, 85–108.
- Diehl, K.; Jansen, C.; Ishchanova, K.; Hilger-Kolb, J. Loneliness at universities: Determinants of emotional and social loneliness among students. International Journal of Environmental Research and Public Health 2018, 15, 1865.
- Labrague, L.J.; De los Santos, J.A.A.; Falguera, C. Social and emotional loneliness among college students during the COVID-19 pandemic: The predictive role of coping behaviours, social support, and personal resilience. Unpublished work. 2021.
- Martín-Rodríguez, A.; Tornero-Aguilera, J.F.; López-Pérez, P.J.; Clemente-Suárez, V.J. The effect of loneliness in psychological and behavioral profile among high school students in Spain. Sustainability 2021, 14, 168.
- Hemberg, J.; Östman, L.; Korzhina, Y.; Groundstroem, H.; Nyström, L.; Nyman-Kurkiala, P. Loneliness as experienced by adolescents and young adults: An explorative qualitative study. International Journal of Adolescence and Youth 2022, 27, 362–384.
- Valkenburg, P.M.; Peter, J. Online communication and adolescent well-being: Testing the stimulation versus the displacement hypothesis. Journal of Computer-Mediated Communication 2007, 12, 1169–1182.
- Weiss, R. Loneliness: The Experience of Emotional and Social Isolation; MIT Press: Cambridge, MA, USA, 1975.
- de Jong-Gierveld, J. Personal relationships, social support, and loneliness. Journal of Social and Personal Relationships 1989, 6, 197–221.
- Shovestul, B.; Han, J.; Germine, L.; Dodell-Feder, D. Risk factors for loneliness: The high relative importance of age versus other factors. PLoS ONE 2020, 15, e0229087.
- Lim, M.H.; Eres, R.; Vasan, S. Understanding loneliness in the twenty-first century: An update on correlates, risk factors, and potential solutions. Social Psychiatry and Psychiatric Epidemiology 2020, 55, 793–810.
- Hayes, S.; Carlyle, M.; Haslam, S.A.; Haslam, C.; Dingle, G. Exploring links between social identity, emotion regulation, and loneliness in those with and without a history of mental illness. British Journal of Clinical Psychology 2022, 61, 701–734.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).