Sleep Depth and Patterns of In-Flight Sleep Behavior Across Time During Global Pandemic Conditions

Fatigue risk to commercial pilots operating under global pandemic conditions had not been investigated prior to COVID-19. Examining how pilots slept during COVID-19 pandemic-specific flights can provide a precedent for estimating fatigue risk for future public health emergencies. Twenty (n=20) pilots flying across five COVID-19 humanitarian missions between Brazil and China wore a sleep-tracking device (the Zulu watch), which has been validated for the estimation of sleep timing (sleep onset and offset), duration, efficiency, and sleep depth (Wake, Interrupted, Light, or Deep Sleep) throughout the mission period. Pilots also reported sleep timing, duration and subjective quality of their in-flight rest periods using a sleep diary. To our knowledge, this is the first report of commercial pilot sleep behavior during ultra-long-range operations under COVID-19 pandemic conditions. Moreover, these analyses provide an estimate of sleep depth during in-flight sleep, which has not been reported previously in the literature.


Introduction
The COVID-19 pandemic resulted in unprecedented changes to commercial aviation operations. Among the unknown risks to pilots was how changes to operations would affect their fatigue risk. Pilots operating routes that are long-haul (LH), defined as flight duty periods (FDPs) longer than 6 hours, or ultra-long-range (ULR), defined as FDPs longer than 12 hours, routinely suffer from fatigue due to sleep disruption [1][2][3]. Schedule mitigation and in-flight napping are fatigue countermeasures used to reduce sleep pressure during LH/ULR flights, and aviation is one of the most regulated industries with regard to fatigue [4][5][6][7][8][9]. Maintaining a homebase time zone schedule during ULR rosters may help pilots avoid fatigue related to circadian misalignment or jet lag [2]. However, local environmental and social time cues strongly influence even pilots who have been instructed to retain a homebase time schedule [2].
Several studies have also shown that pilot sleep quality is diminished during layovers and in-flight rest periods [3,5,10]. Diminished sleep quality not only reduces total sleep duration, but is less restorative than sleep of equal duration in a bedroom environment [3]. The restorative value of sleep, as estimated through subjective fatigue and objective performance, is related to sleep depth--namely, slow wave sleep (SWS) and rapid eye movement (REM) sleep stages [11,12]. Polysomnography (PSG) is the gold standard to assess sleep stages, but is considered an impractical method for collecting sleep information in operational environments [13]. Reliable estimation of sleep depth during LH/ULR rosters, particularly during in-flight sleep, is therefore an important step forward toward understanding the quality of sleep across aviation operations.
Pilot fatigue constitutes a well-anticipated threat to aviation safety, but the estimation of fatigue risk and sleep behavior for specific flight rosters has been informed by scientific examination of fatigue factors during actual or simulated aviation operations. Relying on knowledge garnered from previous research has never been a viable option for the estimation of sleep depth in aviation, and it was not an option for any assessment of fatigue risk in the face of the unprecedented onset of the COVID-19 pandemic. Operators adapted to pandemic conditions as best as possible given the limited information and tools available to predict fatigue. Brazil-based Azul Airlines, for example, estimated pilot fatigue using a biomathematical model prior to conducting 5 separate humanitarian missions to China between May and July of 2020 [14]. During missions, pilots wore a validated wrist actigraph (the Zulu watch, Institutes for Behavior Resources, Baltimore, MD [15]) and reported the sleep duration and quality of their in-flight rest periods using a sleep diary. Each mission consisted of 4 flight legs, each between 11 and 15 hours long, going from: 1) Brazil to a layover destination in Europe, 2) the layover destination to a destination in China to pick up COVID-19 relief supplies, 3) China to a return layover destination, and finally, 4) the return layover destination to the home airport in Brazil. Pilots were each provided a 9-hour rest opportunity per FDP, and were instructed to remain on a homebase schedule, i.e., west Brazilian local time (UTC-5).
The Zulu watch is a commercial sleep tracker designed for use in operational environments which has been validated against PSG and actigraphy for sleep-wake determination and against PSG for the estimation of sleep depth [15]. Two-minute epochs within sleep events which are recorded by the Zulu watch are categorized as either interrupted sleep, light sleep, or deep sleep. It should be noted that the Zulu watch estimates sleep depth based on wrist movement using a tri-axial accelerometer and on-wrist detection using a galvanic sensor. Specific differences between NREM-REM sleep stages cannot be estimated by accelerometry alone, but wrist movement can identify bouts of immobility which are known to correspond to periods of restful sleep that could include NREM and REM [16,17]. Previous studies have compared sleep scoring in commercial wearables or mobile apps against PSG under the assumption that sleep stages N1 and N2 are comparable to light sleep, SWS is comparable to deep sleep, and REM is its own category [18][19][20][21]. The Zulu watch does not include a category for REM sleep.
The goals of the current analyses are three-fold. The first goal is to describe observed sleep behavior during pandemic-specific ULR flight conditions and thus establish an expectation of rest patterns in the hopefully-unlikely event of future global public health emergencies. The second goal is to describe patterns of sleep behavior and sleep depth across ULR in-flight and layover sleep events with respect to pilots' flight schedules and local night as a precedent for future investigations. The third goal is to evaluate the accuracy of Zulu watch measures of sleep duration and sleep timing estimation in operations compared against sleep diary. The Zulu watch has been validated in the laboratory, but the true test of its utility is a) the ability to accurately measure sleep timing and duration compared against self-report in real-world operations and b) the ability of Zulu watch measurements to inform assumptions about sleep behavior in real-world situations such as ULR flight rosters. Agreement between Zulu watch measures of sleep timing (i.e., sleep onset time and offset time) and sleep duration compared to sleep diary report of sleep timing and duration during FDP in-flight sleep was examined using Pearson's r correlation, paired samples t-tests, and Bland-Altman plots. Taking these three study goals together, this paper constitutes the first report of sleep behavior and estimation of sleep depth during in-flight sleep for commercial pilots flying ULR missions across multiple time zones under global pandemic conditions.

Pilot Participation
In total, 40 pilots flew between Brazil and China between May and July 2020 for Azul's humanitarian missions. Missions ranged from 96 hours to 132 hours in length. Each mission consisted of 4 FDPs which ranged in length from 11 to 14 hours each, 1 turnaround period in China, which lasted between 3 and 6 hours, and 2 layover periods in Europe which lasted between 20 and 41 hours. Thirty-two (32) out of 40 (80%) pilots crewing a COVID-19 humanitarian mission completed the sleep diary and 22 out of 40 (55%) wore a Zulu watch between May and June 2020. Twenty (20; 50%) pilots completed both the sleep diary and wore the Zulu watch. Only pilots who both completed the sleep diary and provided Zulu watch data (N=20) have been included in these analyses. Fifteen (N=15) participants provided Zulu and diary data for all 4 flight legs; N=3 participants provided Zulu and sleep diary data for 3 out of the 4 flight legs; N=1 participant provided Zulu watch data for all FDPs, but only completed the sleep diary for 3 out of 4 FDPs; and N=1 participant completed the sleep diary for all 4 FDPs, but only wore the Zulu watch for 3 out of the 4 flight legs.

Sleep Timing and Sleep Duration Across Mission Hours
Pilots reported between 0 and 3 separate sleep events per FDP by diary. In comparison, Zulu watches recorded between 0 and 6 sleep events for the same FDP. There was only one instance in which a pilot did not report any sleep and no sleep event was recorded by the Zulu watch during a flight leg. For FDPs during which sleep occurred, sleep duration ranged between 30 and 520 minutes as reported by diary and between 20 and 518 minutes for Zulu watch. Sleep occurred across all 24 hours of the day. Figure 1 depicts each pilot participant's sleep behavior with respect to mission FDPs and local night across all hours of the missions. Time is reported in hours elapsed since mission start rather than in base, local, or GMT time to avoid confusion about sleep behavior as pilots circumnavigated the globe.
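The elapsed-hours time axis described above can be illustrated with a minimal sketch. The function names, the example timestamps, and the example UTC offsets are hypothetical and are not taken from the study data; only the mm/dd/yyyy hh:mm timestamp format and the UTC-5 homebase offset come from the text.

```python
from datetime import datetime, timedelta, timezone

FMT = "%m/%d/%Y %H:%M"  # mm/dd/yyyy hh:mm, the format reported for both data sources


def to_utc(timestamp, utc_offset_hours):
    """Attach a fixed UTC offset to a naive timestamp string and convert to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.strptime(timestamp, FMT).replace(tzinfo=tz).astimezone(timezone.utc)


def hours_since_mission_start(event, event_offset, mission_start, start_offset=-5):
    """Hours elapsed between mission start and an event.

    start_offset defaults to UTC-5, the homebase offset given in the text.
    Both times are converted to UTC first, so the result does not depend on
    which clock (base, local, or GMT) each timestamp was recorded in.
    """
    delta = to_utc(event, event_offset) - to_utc(mission_start, start_offset)
    return delta.total_seconds() / 3600


# Illustrative values only: a sleep onset logged on a UTC+2 layover clock,
# for a mission that started at 08:00 homebase time the previous day.
elapsed = hours_since_mission_start("05/21/2020 03:00", 2, "05/20/2020 08:00")
```

Expressing every event on a single elapsed-time axis in this way avoids ambiguity when sleep events recorded on different clocks are plotted together.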
Pilot sleep opportunities during FDPs were determined in-flight by the crew. Sleep opportunities were decided ad libitum by pilots during layovers, and pilots were instructed not to sleep during the turnaround periods in China. Pilots were confined to the aircraft during turnaround in China, but their activities were not restricted during layover periods. The timing of pilot sleep with respect to the end of the previous FDP or the start of the subsequent FDP ranged from 0 minutes to 2527 minutes (approximately 42 hours). In contrast, pilot sleep began, on average, 50±70 minutes after the onset of local night. There were no differences between sleep onset with regard to local night depending on whether pilots were sleeping in-flight or during a layover period (t=0.19, p=0.85). Sleep duration was not reported during layover or turnaround.

Sleep Quality and Sleep Depth Across Mission
Distribution of sleep quality across categories (Excellent, Good, Fair, and Poor) is depicted in Figure

Agreement between Zulu Watch and Diary Measurements of Sleep
Average sleep duration per sleep event was 281±126 minutes by sleep diary compared to 204±134 minutes for Zulu watches. Diary sleep duration was positively correlated with Zulu sleep duration (r=0.63, p<0.05) and TST (r=0.75, p<0.05), but paired samples t-tests showed that diary reports of sleep duration were significantly higher than Zulu watch sleep duration (t=5.24, p≤0.001) or TST (t=6.49, p≤0.001). Sleep onset times were positively correlated between Zulu watch and diary (r=0.74, p≤0.001), as were times of final awakening (r=0.62, p≤0.001), diary and Zulu sleep duration (r=0.75, p≤0.001), and diary sleep duration and Zulu TST (r=0.63, p≤0.001). Sleep onset time and time of final awakening were not significantly different between Zulu watch and diary (all p>0.05).

Discussion
It must first be mentioned that the circumstances of the 5 ULR flights profiled in this manuscript are exceptional. The purpose of Azul's humanitarian missions was to bring respirators, COVID rapid tests, and medical supplies from mainland China back to Brazil. The humanitarian goal of these missions served as a uniquely motivating factor for each pilot who participated in the missions. Azul Airlines had not previously conducted flights to China, and the pilots were unfamiliar with the destination airports within China. While the missions were conducted by commercial airline pilots on a commercial aircraft, there were no passengers or cabin crew aboard. Pilots were permitted to sleep in either crew rest facilities or in the business class section, per their preference. Moreover, pilots were restricted from leaving the aircraft while in China, and while they were permitted to move freely during layovers in Europe, shutdowns related to COVID-19 most likely limited the availability of social activities. Because these ULR flights were conducted under unprecedented global pandemic conditions, pilot behavior may not generalize to all ULR operations.
To our knowledge, this is the first report of pilot sleep behavior and sleep depth estimation during ULR operation under COVID-19 global pandemic conditions. Despite the instruction to remain on homebase time, the logistical necessity of coordinating in-flight sleep opportunities with co-pilots, and severe limitations to social time cues, pilots tended to initiate sleep within an hour of the onset of local night. The clustering of pilot sleep around local night can most clearly be seen in Figure 1a. The timing of pilot sleep with respect to FDPs was vastly more variable, occurring anywhere from 0 to 42 hours apart. These findings indicate that environmental light cues may influence sleep behavior over the course of ULR transmeridian travel over and above logistical considerations such as the timing of work or adherence to a home schedule.
Understanding the quality of pilot sleep during ULR operations is important for the mitigation of fatigue. There is a lack of previous data examining sleep quality in the context of aviation or in-flight sleep. In this study, neither subjective nor objective sleep quality changed significantly over the course of the mission. Sleep efficiency remained in a normal range (above 80%) throughout all FDPs and layovers, and pilots largely rated their sleep as "good" or "fair". Estimation of sleep depth remained consistent as well, with the majority of TST being spent in "deep sleep". However, actigraphy devices typically have low specificity [22][23][24], meaning that they often fail to detect awakenings during sleep intervals. The specificity of the Zulu watch to identify awakenings during a sleep interval compared to PSG under laboratory conditions is 26% [15]. Moreover, while sleep depth estimation by the Zulu watch has been tested against gold-standard PSG under laboratory conditions, no investigations of sleep depth or sleep architecture have ever been conducted during in-flight sleep. While these data represent a step towards understanding the impact of ULR travel on sleep quality, the limitations of the technology must be acknowledged. Extensive future research and advancements to device specificity will be required in order to determine whether sleep quality is truly resilient to transmeridian travel.
Another aim of the current analyses was to evaluate agreement between sleep diary and Zulu watch measures of sleep timing and duration. Zulu watch measures of sleep were strongly correlated with diary measures. Time of sleep onset and final awakening were very similar between diary and Zulu watches. However, pilots consistently reported longer sleep duration than was recorded by the Zulu watch. Pilots only reported in-flight sleep, so we could not test agreement between diary and the Zulu watch before or after the mission or during layover periods. It is possible that turbulence or background movement of the airplane in flight could falsely register an awakening on the Zulu watch. However, considering the low specificity of the Zulu watch and actigraphy devices in general, this possibility is not highly likely.
In some ways, the testing of Zulu watch measures of sleep against diary was akin to comparing apples and oranges. The Zulu watch considered periods of awakening as the termination of a sleep episode, while pilots may have reported the total amount of time during which they attempted sleep, regardless of whether any sleep occurred. For this reason, multiple Zulu watch sleep intervals occurred over the course of one diary entry. Despite our best efforts to objectively compare the two measures, researcher bias may have influenced the results.
Validation testing methodology has been established for a laboratory environment [25], but there is little guidance for what constitutes proper validation of sleep measurement in a real-world environment. Previous studies have compared self-report to actigraphy [26][27][28], but it is impossible to say whether the subjective assessment of sleep or the objective measurement is more representative of actual sleep under the circumstances of testing in the field.

Participants
Participants were recruited through the Azul Airlines Human Factors Safety Department. Participants provided written informed consent for their participation. All pilots crewing the missions were considered eligible for participation regardless of gender, ethnicity, age (over 18), sleep habits, or health status.

Procedures
Mission flights were designed to be carried out with 2 relay crews consisting of 8 pilots. There were 4 flight legs to each mission: 1) Brazil to a European layover destination; 2) layover to China; 3) China to a return layover destination; and 4) layover to Brazil. Each flight leg was approximately 12 hours and the planned available rest time for each crew member per stage was approximately 9 hours. The crews were organized so that all pilots would be available to work during any flight leg and that no one pilot would need to fly extra time. In-flight rest periods were freely chosen by the crew during the mission. Aircrew were instructed to remain on home base Brazilian time throughout the mission.
Pilots were assigned the Zulu watch (Institutes for Behavior Resources, Inc. [15]) in May 2020 prior to COVID-19 support missions and wore the watches continuously until the completion of their mission (between May and July, 2020). Crews returned the watch to airline researchers directly upon returning to Brazil from their mission. Data were downloaded by airline researchers using the Zulu Data Extraction application (Institutes for Behavior Resources, Version 2.0). Pilots completed a sleep diary during FDPs.

Zulu Watch
The Zulu watch hardware device collects activity data in 2-min epochs and automatically scores sleep duration and sleep efficiency (SE) on-wrist based on a proprietary algorithm for sleep-wake determination. Devices were programmed to detect multiple sleep intervals per day and can detect sleep intervals which are as short as 20 minutes in duration. Data were then exported as one summary file of all scored sleep interval information and as multiple 2-min epoch-by-epoch (EBE) data files for each day during the mission study period. Zulu watch scored sleep interval summary files included sleep onset time and sleep offset time reported as mm/dd/yyyy hh:mm, sleep duration in minutes, and SE as a percentage for any events determined to be a sleep interval by the Zulu watch.
Epoch data are scored as on-wrist "On" or off-wrist "Off." Epochs are scored in a separate data column as 0 for periods of wake, 1 for restless or interrupted sleep, 2 for light sleep, and 3 for deep sleep. The Zulu watch uses a proprietary algorithm to estimate sleep depth using only motion and on-wrist detection and cannot differentiate between sleep stages. Zulu watch sleep-depth scoring should be considered an estimation of locomotor inactivity rather than an estimate of neurophysiological sleep architecture.
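As an illustration of how the epoch codes described above could be summarized into sleep depth percentages, the following is a minimal sketch. The function and variable names are hypothetical, and the choice to express percentages over sleep epochs only (codes 1 to 3) is an assumption rather than the documented method; the proprietary scoring itself is not reproduced here.

```python
from collections import Counter

EPOCH_MINUTES = 2  # activity is collected in 2-min epochs
DEPTH_LABELS = {0: "wake", 1: "interrupted", 2: "light", 3: "deep"}


def summarize_depth(epoch_scores):
    """Return minutes and percentage of sleep per depth category.

    epoch_scores: list of integer codes (0-3) for the on-wrist epochs of one
    sleep interval. Percentages use sleep epochs (codes 1-3) as the
    denominator, which is an assumption made for this sketch.
    """
    counts = Counter(epoch_scores)
    unknown = set(counts) - set(DEPTH_LABELS)
    if unknown:
        raise ValueError(f"unexpected epoch codes: {sorted(unknown)}")

    sleep_epochs = sum(counts[code] for code in (1, 2, 3))
    summary = {}
    for code, label in DEPTH_LABELS.items():
        minutes = counts[code] * EPOCH_MINUTES
        if code == 0 or sleep_epochs == 0:
            percent = None  # wake is not expressed as a share of sleep
        else:
            percent = 100.0 * counts[code] / sleep_epochs
        summary[label] = {"minutes": minutes, "percent_of_sleep": percent}
    return summary


# Illustrative epoch sequence, not study data.
example = summarize_depth([3, 3, 3, 3, 2, 2, 1, 0, 3, 3])
```

Because the codes reflect locomotor inactivity, percentages computed this way describe movement-based depth categories and should not be read as PSG sleep stages.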

Sleep Diary
The pilots reported the start time and end time (as mm/dd/yyyy hh:mm), sleep duration in hours and minutes, and categorical subjective quality of any sleep intervals occurring during FDPs. Subjective sleep quality was rated on a 4-point scale as either Poor, Fair, Good, or Excellent by pilots. Pilots were not asked to complete the sleep diary during layovers or ground time in China. All times were reported in Brazilian time.

Reformatting Data for Consistency between Zulu Watch and Diary Measurements
The Zulu watch automatically determines sleep onset and sleep offset regardless of whether the wearer is still attempting sleep. For this reason, while the Zulu watch can provide a measure of sleep duration similar to total sleep time (TST), it does not provide an estimate of time in bed (TIB). Conversely, pilots reported the amount of time that they dedicated to sleep, which more closely resembles a measurement of TIB. However, the term "time in bed; TIB" cannot be considered an accurate description of sleep opportunities in the current analyses since none of the sleep events reported in this manuscript occurred in a bed or bedroom environment. Because of the constitutional difference in data reporting, multiple Zulu watch sleep events occurred over the course of a single diary-reported event. In order to most accurately compare Zulu watch measurements against diary, all minutes of sleep duration recorded by Zulu watch within proximity to 1 diary-reported sleep event were summed. Sleep onset time and sleep offset time were selected from the earliest occurring Zulu watch sleep interval and last occurring Zulu watch sleep interval data, respectively. An estimate of time dedicated to sleep as measured by the Zulu watch was computed as the minutes occurring between the earliest-occurring Zulu watch sleep onset time and the last-occurring sleep offset time for comparison against diary sleep duration. For the purposes of these analyses, sleep duration will refer to time dedicated to sleep (a proxy for TIB), and TST will refer to the time recorded as sleep by the Zulu watch in minutes.
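The aggregation described above can be sketched as follows. This is an illustrative reconstruction, not the procedure actually used: the function names are hypothetical, and the proximity tolerance (window_min) is an assumed parameter because the manuscript does not state how "within proximity" was operationalized.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%Y %H:%M"  # mm/dd/yyyy hh:mm, as reported by both sources


def parse(timestamp):
    return datetime.strptime(timestamp, FMT)


def match_zulu_to_diary(diary_start, diary_end, zulu_intervals, window_min=60):
    """Aggregate Zulu watch intervals that fall near one diary-reported event.

    zulu_intervals: list of (onset, offset) timestamp strings.
    window_min: hypothetical tolerance in minutes around the diary event.
    Returns None when no Zulu interval falls inside the window.
    """
    earliest_allowed = parse(diary_start) - timedelta(minutes=window_min)
    latest_allowed = parse(diary_end) + timedelta(minutes=window_min)

    matched = []
    for onset, offset in zulu_intervals:
        on, off = parse(onset), parse(offset)
        if on >= earliest_allowed and off <= latest_allowed:
            matched.append((on, off))
    if not matched:
        return None

    first_onset = min(on for on, _ in matched)
    last_offset = max(off for _, off in matched)
    # TST: summed minutes scored as sleep intervals by the watch.
    tst = sum((off - on).total_seconds() for on, off in matched) / 60
    # Sleep duration: time dedicated to sleep, earliest onset to last offset.
    duration = (last_offset - first_onset).total_seconds() / 60
    return {
        "onset": first_onset,
        "offset": last_offset,
        "sleep_duration_min": duration,
        "tst_min": tst,
        "n_intervals": len(matched),
    }


# Illustrative values only, not study data.
result = match_zulu_to_diary(
    "05/20/2020 01:00",
    "05/20/2020 06:00",
    [
        ("05/20/2020 01:10", "05/20/2020 02:40"),
        ("05/20/2020 03:00", "05/20/2020 05:50"),
        ("05/20/2020 12:00", "05/20/2020 12:30"),
    ],
)
```

In this example two watch intervals fall within the window of the diary event and are combined, while the midday interval is left out; the gap between the two matched intervals counts toward sleep duration but not toward TST.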

Data Analysis
All statistics were computed using Excel 2013, STATA version 15, and RStudio version 1.3.959. Sleep duration is defined as time (in minutes) dedicated to sleeping based on Zulu watch or diary. Total sleep duration per flight leg was computed by summing all minutes of sleep recorded or reported occurring during each flight leg. All time data were converted to west Brazilian homebase time zone (UTC-5) for consistency. Sleep onset and offset times as reported by Zulu and sleep diary were converted from UTC-5 date time format to numeric values for statistical analysis. Distance between sleep onset and FDPs or local night was computed by subtracting the sleep start time from the end time of the previous FDP or the start time of the subsequent FDP, or by subtracting sleep start time from the start time of local night. Local night start times were extracted from the Sleep, Activity, Fatigue, and Task Effectiveness Fatigue Avoidance Scheduling Tool (SAFTE-FAST) biomathematical modeling software. Differences between sleep distance from night by FDP versus layover were examined using Student's t-test. Differences in sleep quality ratings and Zulu watch sleep depth percentages across flight legs and between missions were compared using repeated measures mixed model analysis. Paired samples t-tests were run to compare differences between Zulu watch and diary-reported measures of in-flight sleep. Mean difference scores were additionally computed between sleep onset time, sleep offset time and sleep duration. Bland-Altman plots examined the mean difference between measures of sleep and single sample t-tests were conducted to determine if a statistically significant difference existed between mean difference scores. Limits of agreement were computed (mean difference ± 1.96 SD) to indicate the range in which the differences between the two measures would occur with 95% probability [29].
The strength of the association between Zulu watch and diary report for measures of sleep onset, offset, and duration was calculated using Pearson correlation coefficients.
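The agreement statistics named in this section can be illustrated with a short sketch using only the Python standard library. It is not the analysis code used for this study (which was run in Excel, STATA, and RStudio); the function name and the example values are hypothetical. It computes the mean difference, the Bland-Altman limits of agreement (mean difference ± 1.96 SD), the paired-samples t statistic, and Pearson's r for two paired lists of measurements.

```python
import math
import statistics


def agreement(diary, zulu):
    """Agreement statistics for paired diary and Zulu watch measurements."""
    if len(diary) != len(zulu) or len(diary) < 3:
        raise ValueError("need at least 3 paired observations")
    n = len(diary)

    # Difference scores (diary minus Zulu) drive both Bland-Altman and the t-test.
    diffs = [d - z for d, z in zip(diary, zulu)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")

    # Paired-samples t statistic with n - 1 degrees of freedom.
    t_stat = mean_diff / (sd_diff / math.sqrt(n))

    # Pearson correlation coefficient between the two measures.
    mean_d, mean_z = statistics.mean(diary), statistics.mean(zulu)
    cross = sum((d - mean_d) * (z - mean_z) for d, z in zip(diary, zulu))
    ss_d = sum((d - mean_d) ** 2 for d in diary)
    ss_z = sum((z - mean_z) ** 2 for z in zulu)
    r = cross / math.sqrt(ss_d * ss_z)

    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "loa_lower": mean_diff - 1.96 * sd_diff,
        "loa_upper": mean_diff + 1.96 * sd_diff,
        "t": t_stat,
        "df": n - 1,
        "pearson_r": r,
    }


# Illustrative sleep durations in minutes, not study data.
stats = agreement([300, 240, 360, 180, 420], [250, 200, 300, 170, 330])
```

A high correlation together with a mean difference far from zero, as in this toy example, shows why correlation and difference-based tests are reported side by side: two measures can track each other closely while one is systematically larger.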

Conclusions
To our knowledge, this is the first report of sleep behavior and sleep depth estimation in pilots operating ULR flights during global pandemic conditions. Pilots tended to sleep during local night despite attempting to adhere to a homebase time schedule and having to coordinate sleep opportunities with their co-pilots. Subjective sleep quality, SE, and percentage of interrupted, light, and deep sleep remained consistent across the missions, and were not indicative of diminished sleep quality. Zulu watch and diary measures of sleep were similar, but pilots reported longer sleep duration than was measured by the Zulu watch. These analyses can help inform the management of fatigue risk in the planning of future ULR flights or pandemic flight conditions.
by Salus Institutional Review Board for the Institutes for Behavior Resources, Inc. (Protocol Number Azul2020; 03/08/2021).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.