Explainable Convolutional Neural Network to Investigate the Age-Related Changes in Multi-Order Functional Connectivity

Abstract: Functional connectivity (FC) is a potential candidate for increasing the performance of brain-computer interfaces (BCIs) in the elderly because of its compensatory role in neural circuits. However, FC is difficult to decode with current machine learning techniques because its physiological basis is poorly understood. To investigate the suitability of FC in BCI for the elderly, we propose decoding lower- and higher-order FCs using a convolutional neural network (CNN) in six cognitive-motor tasks. The layer-wise relevance propagation (LRP) method describes how age-related changes in FCs impact BCI applications for the elderly compared to younger adults. Seventeen younger (24.5 ± 2.7 years) and twelve older (72.5 ± 3.2 years) adults were recruited to perform tasks related to hand-force control with or without mental calculation. The CNN yielded a six-class classification accuracy of 75.3% in the elderly, exceeding the 70.7% accuracy for the younger adults. In the elderly, the proposed method achieved a relative increase in classification accuracy of 88.3% over the filter-bank common spatial pattern (FBCSP). LRP results revealed that both lower- and higher-order FCs were dominantly overactivated in the prefrontal lobe depending on task type. These findings suggest a promising application of multi-order FC with deep learning on BCI systems for the elderly.


Introduction
With advancements in science and medical technologies, the average life span of humans has gradually increased [29]. Therefore, there is a growing need for brain-computer interface (BCI) systems for healthy elderly persons going through nonpathological physical and cognitive declines [30,52]. BCI systems connect the brain to a computer, allowing the user to enhance their life [23,51]. As machine learning and intelligent robotic technology advance, the range of BCI applications is growing. The development of a hybrid BCI, such as simultaneous use of near-infrared spectroscopy (NIRS) [69] and electroencephalogram (EEG) system, further increases the potential of BCI applications to real life situations [21]. Recently, attention has turned to the use of BCI systems to enhance the health care [25] and comfortable living of the elderly [26]. Recent studies found that BCI can be a useful supplement for older people in mitigating their physical [20], cognitive [6], and mental health declines [64].
However, the applications of BCI for the elderly are highly limited compared with those for younger adults due to unrevealed effects of aging on the functional measures of the brain [59]. The neuronal population continues to change after the brain is developed across the life span, leading to distinctive changes in brain functions. There is evidence of an age-related decrease in executive functions [36], processing speed [48], and the inhibition of unnecessary cognitive processes [19]. Most of the age-related functional losses are comprehensible as a result of reduced brain activity caused by structural changes such as atrophy or dedifferentiation [40]. Aging leads to shrinkage of the brain due to the decrease in gray and white matter volumes [24]. This volumetric change is especially prominent in the frontal cortex [40]. The aging brain sometimes recruits brain regions in a nonselective manner, an indication of age-related dedifferentiation [31], which causes disinhibition and attentional deficiency by reducing the selectivity and specificity of neural systems [2]. These negative changes in the brain decrease classification performance in BCI research for the elderly compared to their younger counterparts [8].
However, recent findings indicate that the aging brain sometimes works harder than a younger brain. Age-related overactivation is found in various tasks and brain regions as a compensation mechanism for neurocognitive decline of the neural system [42]. In many cases, it is accompanied by similar or better performances in the elderly compared with their younger counterparts [16,44]. Even when the elderly show poorer performances than young adults, age-related overactivation is positively correlated with behavior performance in the older group. These results are evidence of the fact that the aging brain recruits additional brain resources to complement neurophysiological decline [42].
In particular, due to the reduced neural resources in the aging brain, the network of neuronal populations is increased to process reduced neural information more efficiently. The compensation-related utilization of neural circuits hypothesis (CRUNCH) suggests that neural circuits need to be activated more to compensate for the processing inefficiency of the aging brain by recruiting more neural resources [7]. Functional connectivity (FC) shows overactivation in the aging brain during task performance [34,37]. The compensatory increase in FC differs from dedifferentiation in that, while the former is beneficial for neural decline, the latter is associated with degradation from the optimal state of neurological specialization [5,41]. The functional network at the frontal area especially shows greater compensatory overactivation than any other brain regions to, presumably, compensate for structural atrophy [58]. The critical role of the FC in the aging brain suggests that FC can be an optimal feature in BCI for older people.
However, FC association with a task is not a popular feature in BCI applications because it is high-dimensional, and the way it is affected by various conditions such as age, task, and stimulation is not fully understood. To classify FC, previous studies attempted to extract features for a linear classifier [32] or to use network measures characterizing the properties of a connectivity map, such as the clustering coefficient [11]. However, feature extraction requires prior knowledge of the input data and often removes relevant information. The use of network measures based on graph theory is dependent on the choice of measures, causes loss of information, and is hard to interpret because it addresses properties of the network, not the network itself.
The deep learning technique can be used to decode connectivity maps because the technique has been introduced in neuroscience to solve the individuality, diversity, and unpredictability of brain signals [54]. A convolutional neural network (CNN) is one of the most effective deep learning techniques for conducting automatic feature extraction and classification for high-dimensional input data. It can decode an individual FC pattern without manual feature extraction by processing the network input itself regardless of the subject's age. CNN has recently been used for EEG decoding [28,46,50], but it has mostly been focused on younger adults based on temporal, spatial, or spectral information of neural activity.
In this study, we propose to decode FC in six-class classification using CNN for the elderly in addition to young adults. The primary reason for using FC for BCI in this study, rather than more commonly used brain features, is that neural circuits show compensatory overactivation in the aging brain. The proposed method benefits from the compensatory role of FC, which provides advantages for the implementation of BCI with older people relative to previous methods. To verify the hypothesis, lower- and higher-order FCs were measured to obtain functional information at different levels of neural circuits.
Multi-order FCs were estimated from five frequency bands of EEG signals during six cognitive-motor tasks. There are two main purposes of this study. First, we compare classification performances by the proposed method in the elderly to those in younger adults. Second, we analyze the effects of age-related changes in multi-order FCs on classification performance in younger and elderly groups using layer-wise relevance propagation (LRP). This study suggests an FC-based BCI system using an explainable deep learning technique, which is particularly advantageous in research regarding aging.

Participants
We recruited twelve elderly people (mean age: 72.5±3.2, two males and ten females) and seventeen younger adults (mean age: 24.5±2.7, nine males and eight females). Both groups consisted of right-handed people. No subject had any history of neurological or pathological disease, and all had normal or corrected-to-normal vision. The Korean Mini-Mental State Examination (K-MMSE) was conducted for the elderly to verify whether they had normal cognitive ability [18]. The elderly with a K-MMSE score of 24 or above were allowed to participate in this experiment. Therefore, all the older participants were determined to be experiencing nonpathological normal aging. All the subjects gave written informed consent, which was approved by the Korea University Institutional Review Board (KUIRB) (NO. 17-126-A-2).

Apparatus
We used a research-grade wireless dry EEG device (DSI-24, WEARABLE Sensing, San Diego, USA) for data acquisition in a comfortable environment. The equipment used hair brush-type dry electrodes without gel, providing signal quality comparable to that of wet electrodes [49]. The equipment installation was completed within five minutes. We adopted dry EEG because it is robust to motion artifacts owing to its joint-fastening mechanical structure, which enables the frame to adjust to the shape and size of the subject's head. We used nineteen electrodes located according to the international 10-20 system of electrode placement: Fp1, Fp2, Fz, F3, F4, F7, F8, Cz, C3, C4, T3, T4, T5, T6, Pz, P3, P4, O1, and O2. The ground channel was at Fpz, and the reference channel was at Pz. We measured EEG signals with a sampling rate of 300 Hz.
Hand pressing force was also measured by a laboratory-made system. The system consisted of eight piezoelectric sensors (Model 208M192, PCB Piezoelectric, Inc., New York, USA) inside an aluminum frame for the fingertips of both hands, excluding the thumbs. The positions of the sensors were adjustable to the shape of each subject's hand. The forearms were supported by a soft buffer and fixed to it to steady the arms of each subject. Analog pressure signals obtained from the sensors were then digitized and recorded in LabVIEW.

Experimental paradigm
The whole experiment consisted of 78 trials. Half of the trials were categorized as the performance of a single task (motor task without mental arithmetic), and the rest were categorized as dual tasks (motor task with mental arithmetic). Single and dual tasks were then subcategorized as left, right, or both hands depending on which hands would be used to perform each task. Figure 1 shows the whole experimental procedure. Before the experiment, we measured the subject's pre-bimanual maximum voluntary force (pre-BMVF) during the setup of the EEG device. During the pre-BMVF measurement, the subject pressed with both hands simultaneously three times for 5 s, including one practice trial. The BMVF value was used to set the amplitude of the target force according to the subject's force-production ability. There were two monitors to display the instruction and motor task feedback.
A trial consisted of cue, execution, and rest periods. In the cue period, the upper monitor showed which hand to use (e.g., "right hand") in a single task for 4 s. In the dual task, the monitor showed an arbitrary three-digit number (e.g., "768") and which hand to use (e.g., "right hand") simultaneously for 4 s. In the execution period, the upper monitor was turned off and the lower one was turned on to show the target force line and the force-feedback line. The execution period was 20 s. In the single task, the subject conducted an isometric force control by pressing their fingertips with their left and/or right hand(s). The target force was shown on the monitor by a yellow line shaped as a sine wave with a frequency of 1 Hz, an amplitude of 5% of BMVF, and an average of 15% of BMVF. The subject was instructed to fit the white line (force production) to the yellow line (target force) by evenly changing the pressure of the fingertips. In the dual task, the same motor task was performed simultaneously with the mental arithmetic task. For the mental arithmetic task, subjects subtracted ten from a given three-digit number sequentially. Subtraction was done only in the mind. There were six tasks in total and three motor conditions for each single and dual task. The six tasks were referred to as {Single both, Single right, Single left, Dual both, Dual right, Dual left}. After carrying out the task, the subject rested for 14 s with no conscious thoughts or additional motions, except for giving the final answer of the mental arithmetic in the dual task. The final answer to the mental calculation was recorded manually to measure calculation speed, maintain the subjects' concentration, and provide motivation for the subjects.

Preprocessing
The measured EEG signals were downsampled from 300 Hz to 250 Hz to reduce memory usage. The continuous signals were divided into five frequency ranges: delta (1-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), beta, and gamma. They were segmented from 0 s to 20 s beginning at the start point of the task execution. The baseline was corrected using the signals in the time interval between -3 s and 0 s.
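The segmentation and baseline correction can be sketched as follows. This is only an illustration: the text does not state how the baseline was applied, so subtracting the mean of the -3 s to 0 s interval is an assumption, and the function name `epoch_with_baseline` is hypothetical.

```python
FS = 250          # sampling rate after downsampling (Hz), from the text
BASELINE_S = 3    # baseline interval: -3 s to 0 s
EXECUTION_S = 20  # analysed interval: 0 s to 20 s of task execution


def epoch_with_baseline(signal, onset):
    """Cut one channel into a 20 s epoch and subtract the baseline mean.

    signal: list of samples for one channel (already band-pass filtered)
    onset:  sample index at which task execution starts
    Assumption: baseline correction = subtraction of the baseline mean.
    """
    baseline = signal[onset - BASELINE_S * FS:onset]
    epoch = signal[onset:onset + EXECUTION_S * FS]
    baseline_mean = sum(baseline) / len(baseline)
    return [value - baseline_mean for value in epoch]


# Toy signal: a constant baseline of 2.0 followed by a constant 5.0.
toy = [2.0] * (BASELINE_S * FS) + [5.0] * (EXECUTION_S * FS)
corrected = epoch_with_baseline(toy, onset=BASELINE_S * FS)
```

At 250 Hz, the baseline spans 750 samples and each epoch spans 5000 samples.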

Estimation of low- and high-order functional connectivity values
Low- and high-order functional connectivity (LoFC/HiFC) values were estimated from the preprocessed segments at the five frequency bands. We chose correlation as a measure of FC rather than other analytic techniques such as coherence, phase locking value, and phase lag index [57]. Such methods can measure EEG connectivity considering nonlinearity and are less sensitive to volume conduction than correlation [47]. However, they have higher computational complexity and are less straightforward than the correlation method. The correlation method could be better for BCI applications because it is easy to implement and interpret. By estimating both LoFC and HiFC simultaneously under the matrix-variate normal distribution (MVND) assumption, we can eliminate transient correlations between time series caused by artifacts and introduce some nonlinearity into the measurement of connectivity. Figure 2 shows the procedure of multi-order FC estimation from top to bottom. Each rectangle represents a data matrix. ch and SR represent the number of electrodes and the sampling rate, respectively. The rectangles with wave and plaid patterns represent EEG time series and the correlation matrix, respectively. FC estimation consists of two steps designed for the cropped strategy [50] and MVND [67], each of which requires segmentation. In the first step, a trial is segmented by a sliding window with a length of 5 s and a 4.5 s overlap with the neighboring window. This 5 s segment is called a 'crop'. The number of crops is 31 per trial. In the second step, each crop is segmented again into sections to collect samples of the matrix-variate distribution. The time window had a length of one second and slid in steps of 0.2 s, resulting in 21 sections per crop.
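The crop and section counts quoted above follow from simple sliding-window arithmetic, which the short check below reproduces. The helper name `n_windows` is illustrative and not taken from the original work.

```python
def n_windows(total_s, window_s, step_s):
    """Number of full windows of length window_s that fit in total_s
    when the window advances by step_s each time."""
    # Work in integer milliseconds to avoid floating-point rounding issues.
    total, window, step = (round(v * 1000) for v in (total_s, window_s, step_s))
    return (total - window) // step + 1


# 20 s trial, 5 s crops with 4.5 s overlap (i.e., a 0.5 s step)
crops_per_trial = n_windows(20, 5, 0.5)
# 5 s crop, 1 s sections sliding by 0.2 s
sections_per_crop = n_windows(5, 1, 0.2)
```

Under these settings the function gives 31 crops per trial and 21 sections per crop, matching the values in the text.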
A correlation matrix $C$ was then calculated using the one-second EEG signals in a section. The $(i, j)$th component of $C$ is $c_{ij}$, representing the Pearson correlation coefficient between time series $x_i$ and $x_j$. $x_i$ and $x_j$ represent the segmented EEG time series in a section at the $i$th and $j$th electrodes, respectively. As a result, 21 correlation matrices are given per crop. We assume that the correlation matrices follow MVND, as $C \sim \mathcal{MN}(M, \Omega \otimes \Sigma)$ and $C \in \mathbb{R}^{ch \times ch}$, where $M$, $\Omega$, and $\Sigma$ represent the mean, variance among rows, and variance among columns, respectively [17]. The covariance $\Psi \in \mathbb{R}^{ch^2 \times ch^2}$ of the matrix-variate distribution can be described as $\Omega \otimes \Sigma$, where $\otimes$ denotes the Kronecker product. We can assume $\Omega = \Sigma$ because $C$ is symmetric. $M$ and $\Omega$, instead of $M$ and $\Psi$, are used to represent LoFC and HiFC, respectively, because the size of $\Psi$ is too large [67]. $M$ and $\Omega$ are calculated based on a maximum likelihood estimation [14,65]. Therefore, we obtained ten FC matrices in one time-crop because LoFC and HiFC were calculated in five frequency ranges.
Figure 3 shows the architecture of the CNN for the classification of a three-dimensional input consisting of low- and high-order FCs at five frequency bands. The cuboids and small spheres represent the input/output feature map and data point in each stage, respectively. The rectangles with solid lines covering data points show the convolutional kernel inside the feature map. Dashed squares cover the pooling range. The whole procedure was divided into two phases. The first phase begins with the input layer. The size of the input variable is [number of electrodes × number of electrodes × 10], which has a depth of ten, consisting of [LoFC δ, HiFC δ, LoFC θ, HiFC θ, LoFC α, … HiFC γ]. To avoid confusion with the EEG channel, rather than using the term channel, we used the term depth, which is normally used in image processing studies. The subscript characters in LoFC and HiFC represent the frequency bands.
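As a rough illustration of the multi-order FC estimation described above, the sketch below computes the section-wise correlation matrices of one crop, takes their sample mean as LoFC, and forms a simplified row-covariance estimate as HiFC. The HiFC line is an assumption: it corresponds to a single maximum-likelihood update with the column covariance fixed at the identity, not necessarily the estimator of [14,65,67]. The function name `multi_order_fc` and the random test data are hypothetical.

```python
import numpy as np


def multi_order_fc(crop, fs=250, win_s=1.0, step_s=0.2):
    """Estimate LoFC and a simplified HiFC from one crop.

    crop: array of shape (channels, samples) for one 5 s crop in one band.
    """
    win, step = int(round(win_s * fs)), int(round(step_s * fs))
    starts = range(0, crop.shape[1] - win + 1, step)
    # One Pearson correlation matrix per one-second section.
    corrs = np.stack([np.corrcoef(crop[:, s:s + win]) for s in starts])
    # LoFC: the sample mean, which is the maximum-likelihood estimate of M.
    lofc = corrs.mean(axis=0)
    # HiFC (simplified assumption): row covariance with the column
    # covariance fixed at identity, i.e. sum_k D_k D_k^T / (n * ch).
    dev = corrs - lofc
    n, ch, _ = corrs.shape
    hifc = np.einsum("kij,klj->il", dev, dev) / (n * ch)
    return lofc, hifc, corrs.shape[0]


rng = np.random.default_rng(0)
lofc, hifc, n_sections = multi_order_fc(rng.standard_normal((19, 1250)))
```

Repeating this for the five frequency bands and stacking the resulting LoFC and HiFC matrices along the depth axis would yield the 19 × 19 × 10 network input.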
The first convolution is conducted using ten 2D convolutional filters with a size of 18 × 1 . The filter length was calculated as [number of electrodes − 1], outputting ten feature maps with a size of 2 × 19. The exponential linear unit (ELU) was applied to the feature map. ELU outperforms the rectified linear unit (ReLU) in deep learning for EEG signals [50]. Then, ten 2D convolutional filters with a size of 1 × 18 were applied in the second convolutional stage. ELU was also applied for nonlinearity. A max pooling layer was applied with a pooling size of 2 × 2 to reduce dimensions. The length of the convolutional kernel was defined as [number of electrodes − 1] because this number yielded the best performance in our preliminary study. It was better for the filter length to be less than the number of electrodes, considering the regional similarity of the FC distribution. However, if the length was too short, the number of layers increased. Many layers increase computational cost and time.
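The feature-map sizes in the first phase can be traced with basic shape arithmetic. The sketch below assumes stride-1 convolutions without padding (which is what the stated 2 × 19 output implies) and non-overlapping 2 × 2 pooling; the function names are illustrative.

```python
def conv_out(size, kernel):
    """Output length of a stride-1 convolution without padding."""
    return size - kernel + 1


def trace_shapes(n_electrodes=19, depth=10):
    """Return (stage, (height, width, depth)) for each stage of the network."""
    k = n_electrodes - 1                  # kernel length: electrodes - 1
    h, w, d = n_electrodes, n_electrodes, depth
    shapes = [("input", (h, w, d))]
    h, d = conv_out(h, k), 10             # ten 18 x 1 filters
    shapes.append(("conv 18x1", (h, w, d)))
    w = conv_out(w, k)                    # ten 1 x 18 filters
    shapes.append(("conv 1x18", (h, w, d)))
    h, w = h // 2, w // 2                 # 2 x 2 max pooling (assumed stride 2)
    shapes.append(("maxpool 2x2", (h, w, d)))
    d = 6                                 # six pointwise (1 x 1) filters
    shapes.append(("pointwise 1x1", (h, w, d)))
    return shapes


shapes = dict(trace_shapes())
```

With nineteen electrodes, the two long convolutions reduce the 19 × 19 map to 2 × 2, so a single pooling step leaves one value per feature map before the pointwise stage.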

Explainable CNN
In the second phase, six pointwise convolutional filters were applied to the output of the previous phase [9]. Pointwise convolution applies a 1 × 1 kernel along the depth direction and can therefore combine information across feature maps [28]. Here, the final pointwise convolution combined the filtered FC information from the different feature maps. A linear unit was applied as the activation function following the pointwise convolution. The output of the pointwise convolutional layer was then flattened, fully connected, and passed to a softmax function. This fully connected layer was accompanied by dropout to prevent overfitting [53]. The dropout probability was 0.5. The softmax function converts the classification output to probability values, which indicate the probability that the input variable belongs to each class. The class of the input variable can be estimated by finding the maximum value of the output of the softmax function. In a ten-fold cross-validation of each subject, eight folds were used as a training set, and the rest were used for validation and test sets. The network was trained with stochastic gradient descent with a batch size of 64 and a constant momentum of 0.9 for 300 epochs. The initial learning rate was 0.005.
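The final decision step, softmax followed by selection of the most probable class, can be illustrated as follows. The ordering of the class names in `TASKS` and the example scores are assumptions made for the illustration only.

```python
import math

# Class order is an assumption for this illustration.
TASKS = ["Single both", "Single right", "Single left",
         "Dual both", "Dual right", "Dual left"]


def softmax(scores):
    """Convert raw output scores into probabilities that sum to one."""
    peak = max(scores)                      # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the most probable task and the full probability vector."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return TASKS[best], probs


label, probs = predict([0.1, 2.0, -1.0, 0.3, 0.0, 0.5])
```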
After learning by forward propagation, LRP decomposes the classification output in terms of the relevance values representing the contribution of input data by backpropagation [1]. The relevance value $R_i^{(l)}$ of the $i$th neuron at layer $l$ can be calculated using a local redistribution rule:
$$R_i^{(l)} = \sum_j \frac{x_i^{(l)} \, w_{i \to j}^{(l \to l+1)}}{z_j} R_j^{(l+1)} \quad (1)$$
where $x_i^{(l)}$ and $w_{i \to j}^{(l \to l+1)}$ represent the input of the $i$th neuron at the $l$th layer and the weight connecting the $i$th neuron at the $l$th layer to the $j$th neuron at the $(l+1)$th layer, respectively. $z_j$ is the output of the $j$th neuron. The total amount of relevance in a layer is conserved [1].
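A minimal sketch of the local redistribution rule for one fully connected layer is given below. It assumes the basic LRP rule without bias terms or a stabilizing constant, so that relevance is conserved exactly; the weights and relevance values are made-up numbers, and `lrp_dense` is a hypothetical name.

```python
def lrp_dense(x, weights, relevance_out):
    """Redistribute relevance from layer l+1 back to layer l.

    x:             inputs of layer l, length n_in
    weights:       weights[i][j] connects input neuron i to output neuron j
    relevance_out: relevance of the neurons at layer l+1, length n_out
    Assumption: no bias and no stabilizer, so every z_j must be non-zero.
    """
    n_in, n_out = len(x), len(relevance_out)
    # z_j: weighted sum arriving at output neuron j
    z = [sum(x[i] * weights[i][j] for i in range(n_in)) for j in range(n_out)]
    # Each input receives the share of R_j proportional to its contribution.
    return [
        sum(x[i] * weights[i][j] / z[j] * relevance_out[j] for j in range(n_out))
        for i in range(n_in)
    ]


x = [1.0, 2.0, 0.5]
W = [[0.2, -0.1], [0.4, 0.3], [-0.6, 0.8]]
R_out = [0.7, 0.3]
R_in = lrp_dense(x, W, R_out)
```

Summing `R_in` returns the same total as `R_out`, which is the conservation property mentioned in the text.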

Filter bank common spatial pattern
Filter bank common spatial pattern (FBCSP) was adopted for comparison with the performance of CNN classification using FC. FBCSP achieved the best performance in decoding oscillatory activity from EEG signals in the BCI competition [56]. The FBCSP algorithm computes spatial filters to extract discriminatory information from two classes in a supervised manner. We applied the FBCSP as described in [38] after bandpass filtering EEG signals into delta (1-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), beta, and gamma ranges. The FBCSP is basically a binary classification method. The six-class problem was solved by dividing it into a two-class problem, handled by a classifier $C_{2\text{-class}}$ dealing with (single, dual), and a three-class problem, handled by a classifier $C_{3\text{-class}}$ dealing with (both, right, left), using multilabel transformation in FBCSP. $C_{2\text{-class}}$ solved the binary classification problem using FBCSP features with the naive Bayes (NB) classifier. $C_{3\text{-class}}$ used the combination of three different classifiers for a multiclass problem using a one-vs-one voting strategy [63]. In a testing set, the trained $C_{2\text{-class}}$ and $C_{3\text{-class}}$ estimated the cognitive label (dual or single) and hand label (both hands, right hand, or left hand) of an unseen sample, respectively. Then, the results were combined to predict the label of the input variable among six classes.
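The way the two label predictions are merged into one of six classes can be sketched as below. The pairwise classifier outputs are hard-coded stand-ins, and the function names are hypothetical; no actual FBCSP feature extraction is performed here.

```python
from collections import Counter


def one_vs_one_vote(pairwise_predictions):
    """Majority vote over the outputs of the pairwise (one-vs-one) classifiers.

    Ties are resolved in favour of the label seen first (an assumption).
    """
    return Counter(pairwise_predictions).most_common(1)[0][0]


def combine_labels(cognitive, hand):
    """Merge the cognitive label and the hand label into a six-class label."""
    return f"{cognitive} {hand}"


# Stand-in outputs of the three pairwise hand classifiers:
# both-vs-right -> "right", right-vs-left -> "right", both-vs-left -> "left"
hand = one_vs_one_vote(["right", "right", "left"])
task = combine_labels("Dual", hand)
```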

Analysis of LRP-derived relevance value by brain hemispheres and regions
We investigated the age difference in LoFC and HiFC within and across hemispheres, and within and across regions depending on the task and frequency bands. First, we investigated the age-related overactivation within a hemisphere and across the left and right hemispheres. Functional connectivity between the electrodes located in the same hemisphere (electrodes in the left hemisphere: Fp1, Fz, F3, F7, Cz, C3, T3, Pz, T5, P3, O1, and electrodes in the right hemisphere: Fp2, Fz, F4, F8, Cz, C4, T4, Pz, T6, P4, O2) is called 'intra-hemispheric FC'. The relevance value of the connectivity between the electrodes in the left and right hemispheres represents an 'inter-hemispheric FC' activation. Second, we divided brain regions to examine the effect of age on the topology of connectivity activation. EEG channels were divided into seven brain areas: PF (Fp1, Fp2), F (F7, F3, Fz, F4, F8), C (C3, Cz, C4), LT (T3, T5), RT (T4, T6), P (P3, Pz, P4) and O (O1, O2). After the EEG channels were divided into hemispheres or regions, the estimated functional connectivity derived from relevance at each electrode was averaged under each condition.
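The regional averaging can be illustrated with the sketch below, which groups the nineteen electrodes into the seven regions listed above and averages a pairwise relevance value within or between regions. The dictionary-based data layout and the function name `regional_mean` are assumptions for the illustration.

```python
REGIONS = {
    "PF": ["Fp1", "Fp2"],
    "F": ["F7", "F3", "Fz", "F4", "F8"],
    "C": ["C3", "Cz", "C4"],
    "LT": ["T3", "T5"],
    "RT": ["T4", "T6"],
    "P": ["P3", "Pz", "P4"],
    "O": ["O1", "O2"],
}


def regional_mean(relevance, region_a, region_b):
    """Average relevance over all electrode pairs linking two regions.

    relevance: dict mapping frozenset({electrode_1, electrode_2}) to a value.
    If both regions are the same, only distinct pairs inside it are used.
    """
    if region_a == region_b:
        els = REGIONS[region_a]
        pairs = [(a, b) for i, a in enumerate(els) for b in els[i + 1:]]
    else:
        pairs = [(a, b) for a in REGIONS[region_a] for b in REGIONS[region_b]]
    values = [relevance[frozenset(pair)] for pair in pairs]
    return sum(values) / len(values)


# Toy relevance map: every pair has relevance 1.0 except Fp1-Fp2.
electrodes = [e for group in REGIONS.values() for e in group]
toy = {frozenset((a, b)): 1.0
       for i, a in enumerate(electrodes) for b in electrodes[i + 1:]}
toy[frozenset(("Fp1", "Fp2"))] = 3.0
intra_pf = regional_mean(toy, "PF", "PF")
inter_pf_o = regional_mean(toy, "PF", "O")
```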
When investigating inter-regional connectivity, we used graph theory to quantify node connections and to compare the inter-regional connectivity between the younger and older groups. In graph theory, networks can be represented as a graph consisting of nodes (points) connected by edges (lines) [13]. In our study, nodes represent the brain regions, and edges represent FCs that differed with age (two-sample t-test, p<0.05). Based on graph theory, we calculated the degree of each node and the average over all nodal degrees to obtain a numerical representation of the age-dependent changes in inter-regional networks. The degree of a node represents the number of edges incident on the node. The average degree of a network is large when the network is highly connected. The degree $k_i$ of the $i$th node ($i = 1, 2, \dots, N$) can be calculated by
$$k_i = \sum_{j=1}^{N} a_{ij} \quad (2)$$
where $A = [a_{ij}]$ is the adjacency matrix containing binary values for network connectivity and $N$ is the number of brain regions, which was set to seven. Then, the average degree of the network is given by
$$K = \langle k \rangle = \frac{1}{N} \sum_{i,j=1}^{N} a_{ij} \quad (3)$$
where the notation $\langle k \rangle$ indicates the mean value of $k_i$.
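The nodal degree and the average degree can be computed directly from a binary adjacency matrix, as in the small example below. The three-node matrix is a toy example rather than data from the study.

```python
def node_degrees(adjacency):
    """Degree of each node: number of edges incident on it (row sums)."""
    return [sum(row) for row in adjacency]


def average_degree(adjacency):
    """Mean of the nodal degrees over all nodes in the network."""
    degrees = node_degrees(adjacency)
    return sum(degrees) / len(degrees)


# Toy network with three nodes: node 0 is linked to nodes 1 and 2.
A = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
k = node_degrees(A)
K = average_degree(A)
```

In the study the matrix would be 7 × 7, with an entry of one wherever the FC between two regions differed significantly between age groups.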

Behavior results
We obtained behavior performances related to hand force control and mental arithmetic during six cognitive-motor tasks in the younger and elderly groups. The root-mean-square error (RMSE) of hand force was calculated for each subject as the square root of the mean squared deviation of hand force from the target force. It was calculated from the hand force measured between 5 s and 19 s after the start of the task to eliminate task-independent variations. The RMSE indicates how accurately the subject produced the hand force. The performance of mental calculations was evaluated by the subtraction speed, computed as [Target number − Answered number]/10, where the Target number is the number given on the screen at the beginning of each task and the Answered number is the number received from the subject at the end of each task. Table 1 shows the RMSE and subtraction speed in the younger and elderly groups. The values are shown as the mean value ± standard error (SE). We found that only the RMSE, not the subtraction speed, changed with age in all task types. RMSE was significantly larger in the elderly subjects than in the younger subjects (two-sample t-test, p<0.05) in all tasks. However, there was no difference between the younger adult and elderly groups in the performance of mental arithmetic (two-sample t-test, p>0.05).
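The two behavioral measures, force RMSE and subtraction speed, can be sketched as follows. The target-force parameters (1 Hz sine, amplitude 5% and mean 15% of BMVF) and the 5 s to 19 s window come from the text; the sampling grid, the BMVF value of 100, and the function names are assumptions for illustration.

```python
import math


def target_force(t, bmvf, freq_hz=1.0):
    """Sinusoidal target: mean 15% of BMVF, amplitude 5% of BMVF."""
    return 0.15 * bmvf + 0.05 * bmvf * math.sin(2 * math.pi * freq_hz * t)


def force_rmse(times, forces, bmvf, t_start=5.0, t_end=19.0):
    """RMSE between produced and target force within the analysis window."""
    squared = [(f - target_force(t, bmvf)) ** 2
               for t, f in zip(times, forces) if t_start <= t <= t_end]
    return math.sqrt(sum(squared) / len(squared))


def subtraction_speed(target_number, answered_number):
    """Number of subtractions by ten completed during the task."""
    return (target_number - answered_number) / 10


# Toy trial sampled at 100 Hz with a constant offset of 2 force units.
times = [i / 100 for i in range(2000)]
forces = [target_force(t, 100.0) + 2.0 for t in times]
rmse = force_rmse(times, forces, 100.0)
speed = subtraction_speed(768, 668)
```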

Classification accuracy: CNN results
We classified the combination of LoFC and HiFC into six cognitive-motor tasks for each subject in the younger and older groups using CNN. The six tasks were divided into three single and three dual tasks. Each single or dual task consisted of three motor conditions. The rate of correctly classified samples was calculated as $[\sum_{i=1}^{N} \delta(y_i, \hat{y}_i)]/N$, where $N$, $y_i$, and $\hat{y}_i$ represent the number of samples, the truth label assigned to the $i$th sample, and the predicted label of the $i$th sample, respectively, and $\delta(y_i, \hat{y}_i)$ equals one when the two labels match and zero otherwise. The overall accuracy (OA) was obtained as the ratio of the number of correctly classified samples to the number of total samples. CNN classified the multi-order FC input more accurately in the elderly group than in the younger group. Especially in the Single both, Single left, and Dual both tasks, the correctly classified rates were higher in the older adult group than in the younger adult group (two-sample t-test, p<0.05). The OA was 75.3% in the elderly group, which was 4.6 percentage points higher than the accuracy of 70.7% in the younger group (two-sample t-test, p<0.05). We also calculated Cohen's kappa coefficient to obtain unbiased classification accuracy in the multi-class problem. The kappa value is an evaluation metric for multi-class classification problems that accounts for the chance level, which depends on the number of classes [10]. The kappa coefficient of 0.65±0.034 in the younger group was lower than that in the elderly group (0.70±0.031) (two-sample t-test, p<0.05) for the six-class classification. Classification results are considered to be at an acceptable level when Cohen's kappa value is larger than 0.61, regardless of the number of classes [27].
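Overall accuracy and Cohen's kappa can both be derived from a confusion matrix, as sketched below. The 2 × 2 matrix is a toy example chosen so that the result is easy to verify by hand; it is not data from the study.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohens_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(row[j] for row in cm) for j in range(len(cm))]
    p_chance = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Rows: true class, columns: predicted class.
cm = [
    [20, 5],
    [10, 15],
]
oa = overall_accuracy(cm)
kappa = cohens_kappa(cm)
```

Here the observed agreement is 0.7 and the chance agreement is 0.5, giving a kappa of 0.4.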
We verified that the CNN using multi-order FC yields higher classification accuracy on the six-class classification problem than the FBCSP. The FBCSP yielded significantly lower classification accuracy than the CNN using multi-order FC in young adults (two-sample t-test, p<0.05, CNN: 70.7%, FBCSP: 42.3%) and in the elderly (two-sample t-test, p<0.05, CNN: 75.3%, FBCSP: 40.0%). Relative to the accuracy obtained with the FBCSP, the proposed method increased classification accuracy by 67.1% in younger adults and 88.3% in the elderly.
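The reported gains are relative increases over the FBCSP accuracy, not differences in percentage points. The short calculation below recomputes them from the rounded accuracies quoted above; small deviations from the reported values are expected because the inputs are rounded.

```python
def relative_gain(new, baseline):
    """Relative increase of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


young_gain = relative_gain(70.7, 42.3)    # CNN vs. FBCSP, younger adults
elderly_gain = relative_gain(75.3, 40.0)  # CNN vs. FBCSP, elderly
```

This gives roughly 67.1% for the younger adults and about 88.25% for the elderly, consistent with the reported 67.1% and 88.3% up to rounding.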

Relationship between classification accuracies and behavior performances
We calculated the relationship between behavior performance and classification accuracy in the younger and older groups. Cohen's kappa coefficient was used to represent the classification performance for the evaluation of the behavior-accuracy relationship. Pearson's correlation coefficient was calculated between the RMSE and classification accuracy in the single and dual tasks, and between the calculation speed and classification accuracy in the dual task. For the single task, the RMSE is not correlated with the classification accuracy in the younger (Pearson's correlation, r=-0.00, p>0.05) and older (Pearson's correlation, r=-0.27, p>0.05) populations. The dual task also showed no correlation between the RMSE and classification accuracy in the younger (Pearson's correlation, r=0.01, p>0.05) and older (Pearson's correlation, r=-0.10, p>0.05) groups. A positive correlation between the speed of mental subtraction and the kappa coefficient was found in the elderly group (Pearson's correlation, r=0.60, p<0.05) but not in the younger adults (Pearson's correlation, r=-0.12, p>0.05).
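For reference, Pearson's correlation coefficient between a behavioral measure and the kappa coefficient can be computed as in the sketch below. The input vectors are toy values used only to check the function, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly linear
r_negative = pearson_r([1, 2, 3], [3, 2, 1])         # perfectly inverse
```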

LRP results
We found age differences in LoFC and HiFC within and across brain regions depending on the task and frequency bands. The LRP-derived relevance values were higher in the elderly group than in the younger group, which reflects the age-related FC overactivation. When the relevance was averaged over all the tasks and brain regions, the relevance value of the elderly group was greater than that of the younger adults by 1.26 times in LoFC (two-sample t-test, p<0.05) and 1.44 times in HiFC (two-sample t-test, p<0.05). In HiFC, the relevance of the older group was 1.23 times greater than that of the younger group (two-sample t-test, p<0.05).

Comparison of LRP-derived relevance in intra- and inter-hemispheric FC between age groups
We investigated the age-related overactivation for intra- and inter-hemispheric FC. Figure 5 shows the bar graphs for the LRP-derived relevance values of intra- and inter-hemispheric FC in LoFC and HiFC for all tasks in the five frequency bands. The asterisks show the significant difference in the relevance values between the younger and elderly groups (two-sample t-test, p<0.05). Intra-hemispheric FC showed a higher age-related overactivation than inter-hemispheric FC. The overactivation of the intra-hemispheric LoFC for the elderly group was observed in the frequency range above the theta range, mostly in the left hemisphere. In the frequency ranges above beta, the intra-hemispheric network in LoFC increased with age in both hemispheres. HiFC showed age-related overactivation in the beta and gamma bands. Within each age group, there was no difference in intra-hemispheric FCs between the left and right hemispheres (Wilcoxon's signed rank test, p>0.05). However, the intra-hemispheric FC was higher than the inter-hemispheric FC in both groups (Wilcoxon's signed rank test, p<0.05).
We also investigated the task dependency of the age-related overactivation in intra- and inter-hemispheric connectivities. Figure 6 shows the ratio of FC relevance between the older group and the younger group. The asterisks indicate that the relevance of FC in the elderly was significantly higher than that in the younger adults (two-sample t-test, p<0.05). Activations of LoFC and HiFC were higher in the elderly group than in the younger group for the Single both, Single left, and Dual both tasks (two-sample t-test, p<0.05).

Comparison of LRP-derived relevance in intra- and inter-regional FCs between age groups
The regional contributions of FC to the six cognitive-motor tasks were compared between younger and older groups. Figure 7 and Figure 8 show the bar graphs of the intraregional relevance of LoFC and HiFC, respectively. Gray and black bars represent the younger and older groups, respectively. The asterisks represent the difference in the intraregional relevance between the younger and older groups (two-sample t-test, p<0.05). Task-specific overactivation occurred in more regions in the single tasks than in the dual tasks for both LoFC and HiFC. In the dual tasks, age-related overactivation in LoFC and HiFC was observed when both hands were used. Among the regions, the prefrontal (PFC) cortex shows the largest age-related difference in intra-regional FC activation in LoFC and HiFC. Regarding the degree of overactivation in PFC, LoFC related more to the single task while HiFC related more to bimanual coordination. Table 2 shows the degree of node ( ) in the seven brain regions and the average nodal degree ( ) of the changed network associated with age in the six tasks.
represents the number of edges connecting the brain region (node) with other regions. The relevance results of the inter-regional network show both similarities and differences with those of the intra-regional network. Both show agerelated overactivation at the frontal area in the Single both , Single , and Dual ℎ tasks. However, compared to the intra-regional FC, inter-regional FC was higher in HiFC than in LoFC, on average. The average over all the tasks is higher in HiFC (1.26) than in LoFC (0.73).
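The nodal degree described above can be illustrated with a minimal sketch. It assumes an undirected network whose nodes are the seven brain regions listed in the Table 2 note; the edge list below is hypothetical and is not taken from Table 2, and the paper's own definitions in its equations (2) and (3) may differ in detail (for example, in how averaging over tasks is done).

```python
# Seven regions named in the Table 2 note of the paper.
REGIONS = ["PF", "F", "C", "P", "LT", "RT", "O"]


def nodal_degrees(edges):
    """Count, for each region, the edges linking it to other regions."""
    degree = {region: 0 for region in REGIONS}
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # self-connections are intra-regional, so they are skipped
        pair = frozenset((a, b))
        if pair in seen:
            continue  # undirected network: count each region pair once
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    return degree


def average_degree(degree):
    """Mean nodal degree over all regions."""
    return sum(degree.values()) / len(degree)


# Hypothetical set of age-changed inter-regional connections (illustrative only).
example_edges = [("PF", "F"), ("PF", "C"), ("F", "C"), ("PF", "P")]
example_degree = nodal_degrees(example_edges)
example_average = average_degree(example_degree)
```

In this toy network the prefrontal node has the highest degree, which is the kind of pattern the nodal degree is meant to summarize.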

Discussion

Higher classification accuracy in the elderly than in younger adults
The CNN using FC yielded better performance than the FBCSP in the classification of the six-class cognitive-motor tasks. There are two possible explanations for the improvement achieved by our proposed method. First, the proposed method exploits the fact that neuronal populations communicate to represent high-level cognitions, such as attention, emotion, memory, and planning, which are not explained by regional activation. The FBCSP is based on the functional segregation of the brain, which is insufficient to understand the neurophysiological source of the human mind in complex tasks. No matter how complex the structure of functional segregation is, it is insufficient to explain how all the brain functions are formed [45]. Second, the use of a state-of-the-art nonlinear classifier, the CNN, contributes to the substantial accuracy even when prior knowledge about the physiological basis of FC is lacking. Linear classifiers often exclude potentially necessary information because the dimension of the input feature set needs to be reduced [28].
We show that the proposed classification method using LoFC and HiFC with CNN is more effective in multi-class classification for the elderly than for younger adults. The older group performed better in terms of classification accuracy (elderly: 75.3%, young: 70.7%), confusion matrix (Figure 4), and Cohen's kappa coefficient (elderly: 0.70, young: 0.65) for the six cognitive-motor tasks. This result does not agree with that of a recent EEG-BCI study in which the older group yielded lower classification accuracy than the younger group [8]. That study reported average classification accuracies of 66.4% and 82.3% in the older and younger groups, respectively, for binary classification using EEG oscillatory power with linear discriminant analysis (LDA) and common spatial pattern (CSP) [22,35,62].
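As a hedged illustration of how Cohen's kappa relates to accuracy, the sketch below computes kappa from a confusion matrix (rows: true class, columns: predicted class) and, separately, from an accuracy value under the simplifying assumption that chance agreement equals 1/6 for six balanced classes. The function names are illustrative and the toy matrix is not from the paper. Under that assumption, the reported accuracies of 75.3% and 70.7% give kappa values that round to the reported 0.70 and 0.65.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Expected agreement by chance, from the marginal totals
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / (total * total)
    return (observed - expected) / (1.0 - expected)


def kappa_from_accuracy(accuracy, n_classes):
    """Kappa when chance agreement is assumed to be 1 / n_classes."""
    chance = 1.0 / n_classes
    return (accuracy - chance) / (1.0 - chance)


# Toy two-class matrix (illustrative only, not data from the paper)
toy_kappa = cohens_kappa([[5, 1], [1, 5]])

# Accuracies reported in the text, assuming balanced six-class chance level
kappa_elderly = kappa_from_accuracy(0.753, 6)
kappa_young = kappa_from_accuracy(0.707, 6)
```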
There are two reasons for the improved classification for the elderly population in our study. First, FC shows larger age-related changes than the oscillatory power of EEG [34]. In the brain of the elderly, FC activation contributes more to cognitive functions than regional neural activity does. Second, CNN extracts task-related information better than CSP [55] by accounting for age-related changes in brain activation. The FC pattern of an aging brain has different temporal and spatial properties from that of a younger brain. While CSP omits some meaningful information during feature selection, CNN can use as much information as possible for machine learning regardless of age-related changes in the FC patterns. The results imply that a combination of FC with CNN would be preferable to a spectral feature with a linear classifier, especially for BCI systems targeting older people.
Our results also support CRUNCH: the behavioral performance in mental calculation was positively correlated with the classification accuracy in the elderly group but not in the younger group. The connectivity of the aging brain works hard to produce the best performance, which explains the similar performance in mental arithmetic between the younger and older groups (Table 1). The age-related overactivation may not be reflected in the performance of the motor tasks because of physical impairment [42].

Age-related compensatory overactivation in the prefrontal cortex
LRP provides specific evidence and characteristics for the age-related compensatory activities in LoFC and HiFC. Once the CNN has learned a representation of the activity through its hidden layers, the LRP-derived relevance values, obtained by backward propagation, enable us to describe the processes and causes of that activity. Our results show that the age-related overactivation of FC was dominant at the PFC (Figures 7 and 8, Table 2). This finding is consistent with previous works on age-related compensation in the brain network [3,43]. The frontal overactivation of FC is found while performing a single task, which implies that the decline in motor function in an aged person is due to physical impairment rather than a deficit in compensatory neural correlates. The dual task evokes less age-related overactivation than the single task because the increased difficulty caused by the cognitive load decreases the age-related overactivation [42]. PFC overactivation is evoked by bimanual coordination, even in the dual task, because the PFC is associated with motor coordination [12]. The frontal lobe is recruited more during bimanual movement in the elderly than in their younger counterparts [15]. The LRP results explain why the proposed method showed even higher classification performance in the elderly group than in the younger group despite the neurocognitive declines from aging.
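To make the backward redistribution of relevance concrete, the following is a minimal sketch of one commonly used LRP rule (the epsilon rule) for a single bias-free dense layer. It is an assumption for illustration: this section does not state which LRP rule or layer types were used, and the activations, weights, and relevance values below are arbitrary.

```python
def lrp_epsilon_dense(activations, weights, relevance_out, eps=1e-9):
    """Redistribute output relevance to the inputs of one bias-free dense layer.

    activations:   input activations a_i
    weights:       weights[i][j] connects input i to output j
    relevance_out: relevance R_j already assigned to each output
    """
    n_in, n_out = len(activations), len(relevance_out)
    # Forward pre-activations z_j = sum_i a_i * w_ij
    z = [sum(activations[i] * weights[i][j] for i in range(n_in))
         for j in range(n_out)]
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        # Stabilizer keeps the denominator away from zero without flipping its sign
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i in range(n_in):
            # Each input receives a share proportional to its contribution a_i * w_ij
            relevance_in[i] += activations[i] * weights[i][j] / denom * relevance_out[j]
    return relevance_in


# Arbitrary illustrative values
a = [1.0, 2.0, 0.5]
w = [[0.2, -0.1], [0.4, 0.3], [-0.6, 0.8]]
r_out = [1.0, 0.5]
r_in = lrp_epsilon_dense(a, w, r_out)
```

With a small stabilizer and no bias, the total relevance is approximately conserved from the outputs to the inputs, which is why relevance at the input feature map can be read as each FC element's contribution to the classification decision.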

Age-related increase in functional connectivity within hemispheres rather than across hemispheres
In this study, FC was increased in the older group compared to the younger group within hemispheres rather than across hemispheres (Figure 5). There is evidence supporting the hemispheric cooperation model, in which inter-hemispheric interaction increases to compensate for the neural deficit of one hemisphere in the aging brain [61]. However, contrary to this model, our results show intra-hemispheric rather than inter-hemispheric overactivation. This is because age-related hemispheric cooperation occurs in tasks where the activation of the younger brain is lateralized [4]. In this study, LoFC and HiFC were evenly activated in the left and right hemispheres in younger adults (Figure 5). Therefore, the aging brain does not need to increase long-distance interaction across hemispheres.

Compensatory overactivity in higher-order FC
Although the effects of aging on neural networks between brain regions have been studied previously, the age-related changes in the relationship between the networks have rarely been studied. In this study, we found that normal aging affects not only the connectivity between brain regions (LoFC) but also the connectivity between brain networks (HiFC). LoFC and HiFC have similarities and differences in task-specific patterns and age-related compensation. Both inter-regional low- and high-order FCs increase with age at the frontal area for the same tasks (Figures 7 and 8). This suggests that LoFC and HiFC have similar functional roles triggered in a specific location. They may cooperate to complement performance for tasks hindered by age-related functional declines.
On the other hand, HiFC has distinctive characteristics compared to LoFC. Age-related activity in HiFC was observed mostly in the beta band (Figure 5). Beta activity is responsible for high-level cognitive functions [39], such as the integration of sensory inputs for older people [60]. Therefore, we can assume that HiFC is associated with higher-level cognitive processes such as functional integration. Moreover, HiFC shows higher inter-regional overactivation than LoFC (Table 2). Considering that LoFC serves to gather information and that HiFC plays a role in abstracting that information [66], the results indicate that HiFC contributes to the preserved or improved function of high-level cognitive processing in the elderly by integrating neural inputs from different areas. The inter-regional increase of HiFC in the aging brain could explain why some elderly persons successfully achieve complicated tasks despite their neurocognitive decline [33].
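The distinction between connectivity between regions (LoFC) and connectivity between networks (HiFC) can be sketched as follows. This is only one plausible reading, assuming that LoFC is the Pearson correlation between channel signals within a sliding window and that HiFC is the correlation between the time courses of LoFC edges across windows; the paper's exact estimation procedure (window lengths, frequency bands, number of channels) is defined elsewhere and may differ. The signals are synthetic and all names are illustrative.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def lofc(signals):
    """Low-order FC: channel-by-channel correlation matrix of one segment."""
    n = len(signals)
    return [[pearson(signals[i], signals[j]) for j in range(n)] for i in range(n)]


def sliding_windows(signals, width, step):
    """Cut every channel into overlapping segments of the given width."""
    length = len(signals[0])
    return [[ch[s:s + width] for ch in signals]
            for s in range(0, length - width + 1, step)]


def hifc(signals, width, step):
    """High-order FC: correlation between LoFC edge time courses across windows."""
    mats = [lofc(win) for win in sliding_windows(signals, width, step)]
    n = len(signals)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # One time series per edge: its LoFC value in each successive window
    series = [[m[i][j] for m in mats] for (i, j) in edges]
    return edges, [[pearson(a, b) for b in series] for a in series]


# Synthetic three-channel signal (illustrative only, not EEG data)
rng = random.Random(0)
demo = [[rng.gauss(0.0, 1.0) for _ in range(60)] for _ in range(3)]
lo = lofc(demo)
edge_list, hi = hifc(demo, width=20, step=5)
```

Here `lo` relates channels to each other, whereas `hi` relates pairs of connections to each other, so its size grows with the number of edges rather than the number of channels.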

Conclusions
The primary finding of this study is that the age-related compensatory overactivation in multi-order FC results in a higher accuracy in multi-class BCI for the elderly than for the younger population. We designed and used a CNN to maximize the ability to extract information from high-dimensional FC maps in both age groups. Compared to FBCSP, the proposed method improves the classification accuracy in the six-class problem by 67.1% in the younger group and 88.3% in the elderly group. Classification accuracy was 75.3% for older adults, which was 4.6 percentage points higher than that for younger adults. LRP, one of the explanatory techniques of deep learning, gave a neurophysiological explanation of the impact of age-related changes in the FCs on classification performance in the younger and older populations. Low- and high-order FCs in the prefrontal cortex were more activated in the aging brain than in the younger brain, depending on the type of task. High-order FC increased in the beta band to integrate neural inputs from different brain regions in the aging brain. Therefore, our results show how and why a multi-order FC with an explainable CNN can be an optimal method for BCI applications in the elderly. Future BCI research for older people should further investigate the impact of age by including higher orders of functional connectivity, in order to develop appropriate features that account for age-related neurophysiological changes. This study can potentially extend BCI use to healthy elderly subjects for improving quality of life.

Schematic flow chart of the estimation of the low- and high-order functional connectivities from EEG signal segments. The procedure proceeds from top to bottom. Rectangles with wave and plaid patterns represent segments of the EEG time series and correlation matrix, respectively. In the first and second steps, the dashed rectangles inside the rectangles represent sliding windows that form the segments for the next step.

Figure 3.
Structure of the CNN for the classification of a three-dimensional input consisting of low- and high-order FCs at five frequency bands. The cuboids represent the input and output variables of each process. Circles inside the rectangles represent data points. The first input variable consists of ten layers of feature maps. Rectangles with solid and dashed lines inside the cuboids represent the convolutional and pooling filters, respectively.

Figure 4.
Confusion matrix of the six-class classification results using CNN in (a) the younger and (b) older groups. Each row and column shows the true and predicted class, respectively. The values inside the cells represent the mean and standard error of the intrasubject scores of classification in the corresponding group in the form of Mean (SE). The rightmost column and bottom row represent the recall and precision of the classification results. The overall accuracy of each group is presented at the bottom-right corner. The asterisks show the significant difference in the classification results between the younger and older groups (two-sample t-test, p<0.05) except for the recall and precision lines.

Figure 5.
Bar graphs of the relevance values for the intra- and inter-hemispheric FC in (a) LoFC and (b) HiFC. Bars with solid and dashed rectangles represent the younger and older groups, respectively. The gray, medium gray, and dark gray bars represent the intra-hemispheric FC in the left hemisphere, intra-hemispheric FC in the right hemisphere, and inter-hemispheric FC, respectively. The asterisks show the significant differences in the relevance values between the younger and older groups (two-sample t-test, p<0.05).

Figure 6.
Ratios of the relevance values between the elderly and younger adult groups for (a) LoFC and (b) HiFC. Relevance values were averaged over all brain regions and frequency ranges. Solid lines marked by circles and dashed lines marked by triangles represent the relevance ratios of intra-hemispheric FC in the left and right hemispheres, respectively. The relevance ratios in inter-hemispheric connectivity are indicated by the dash-dotted line with diamond marks. The asterisks show the significant difference in relevance ratios between the younger and older groups (two-sample t-test, p<0.05).

Figure 7.
Bar graphs of the relevance values of inter-regional LoFC in (a) Single ℎ , (b) Single ℎ , (c) Single , (d) Dual ℎ , (e) Dual ℎ , and (f) Dual . The defined brain regions are shown in the top left corner. Gray- and black-colored bars represent the younger and older groups, respectively. The asterisks show the significant differences in the relevance values between the younger and older groups (two-sample t-test, p<0.05).

Figure 8.
Bar graphs of the relevance values of inter-regional HiFC in (a) Single ℎ , (b) Single ℎ , (c) Single , (d) Dual ℎ , (e) Dual ℎ , and (f) Dual . The defined brain regions are shown in the top left corner. Gray- and black-colored bars represent the younger and older groups, respectively. The asterisks show the significant differences in the relevance values between the younger and older groups (two-sample t-test, p<0.05).

Root-mean squared error (RMSE) and calculation speed are shown as the Mean ± SE. The Average column shows the mean RMSE or calculation speed over three motor conditions for the single and dual tasks.

The seven brain regions are the prefrontal (PF), frontal (F), central (C), parietal (P), left-temporal (LT), right-temporal (RT), and occipital (O) areas. The degree of node and the average nodal degree are calculated by (2) and (3), respectively.