Assessing language-induced motor activity through Event Related Potentials and the Grip Force Sensor, an exploratory study

The link between language processing and motor systems has been the focus of increasing interest in Cognitive Neuroscience. Some classical papers studying Event Related Potentials (ERPs) induced by linguistic stimuli have found differences in electrophysiological activity when comparing action and non-action words; more specifically, a larger p200 for action words. In parallel, a series of studies has validated the use of a grip force sensor (GFS) to measure language-induced motor activity during listening to both isolated words and sentences, finding that action words induce an increase in grip force around 250-300 ms after stimulus onset. The purpose of the present study is to combine both techniques to assess whether the p200 is related to the increase in grip force measured by a GFS. We measured ERP and GFS changes elicited by listening to action and non-action words while maintaining an active grasping task in 10 healthy subjects. Our results show that the amplitude of the p200 in central electrodes is correlated with the increase in grip force around 300 ms induced by linguistic stimuli. To our knowledge, this is the first study in which electrophysiological activity and the changes in grip force induced by auditory language processing are examined together, opening new avenues of interpretation for the sensorimotor interaction in language processing.


Introduction
The functional link between language and the motor system has been of interest to Neurology since the earliest description of a cortical seat for linguistic function (Broca, 1861). In the following century, descriptions of clinical syndromes following brain lesions confirmed the existence of information transfer between the language and motor brain systems (Liepmann, 1905; Geschwind, 1965). This topic is also at the core of one of the central debates in cognitive science: the nature of "meaning" and the format of linguistic representations in the human mind (Mahon & Hickok, 2016).
Classical views of cognition suggested that human language existed in the brain as a system of purely abstract symbols with no obvious relation to their external referents in the world. In this view, a person's sensorimotor experience of the external world would be irrelevant to acquiring or retrieving words in their mental lexicon (Collins & Quillian, 1969; Fodor, 1987; Kintsch, 1988; Pylyshyn, 1986; Newell, 1980). "The symbol grounding problem" offers a different interpretation that challenges this view, postulating that, to be interpretable, symbols must be grounded in their referents (Harnad, 1990). Based on this understanding, the way to "ground" a symbol would be through the sensorimotor capacities of each organism, that is, through direct experience with the symbols' referents.
Related to this approach, the embodied cognition view suggests that sensory and motor information are a necessary part of meaning acquisition in human language (Barsalou, 1999; Glenberg & Kaschak, 2002; Meteyard, Cuadrado, Bahrami, & Vigliocco, 2012). In this perspective, the linguistic message is functionally assimilated when people are engaged in action (Frak, 2010). According to the experiential trace model (Zwaan & Madden, 2005), it is the co-occurrence of perceptual input and action with linguistic input that generates embodied language acquisition. In terms of biology, this model is supported by the Hebbian theory of learning (Hebb, 1949): when we learn a word through experience, our linguistic and motor brain networks are activated simultaneously, generating the shared circuits that will later be involved in semantic recognition.
In recent decades, the emergence of functional neuroimaging methods has provided novel approaches to study the link between language and motor function in the brain. Converging experimental evidence using fMRI and PET has found activation of the brain sensory-motor areas during language comprehension, mostly for action-related words (for comprehensive reviews, see Hauk, Shtyrov, & Pulvermüller, 2008; Hauk & Tschentscher, 2013). However, the limited temporal resolution of these methods makes it difficult to determine the processing stage at which these activations occur. Are they truly part of meaning retrieval, or do they correspond to secondary cognitive processes subsequent to comprehension and semantic access (Glenberg & Kaschak, 2002)? An answer to these questions can be provided by methods that measure language comprehension processes in real time.
Given their temporal precision, electrophysiological methods like Event Related Potentials (ERPs) are highly appropriate to study the time course of language comprehension. The first evidence for word-class specific event-related activity in the brain was reported by Dehaene, who found a left inferior frontal positivity around 250 ms after stimulus onset, specific to action verbs (Dehaene, 1995). In agreement, a team led by Pulvermüller reported an increased positivity around 200 ms (p200) in frontal and central electrodes when comparing the ERP elicited by verbs to that elicited by nouns, using visual stimuli (Preissl, Pulvermüller, Lutzenberger, & Birbaumer, 1995).
The role of the p200 as a word-class neural marker is consistent with the time of semantic retrieval, for its earlier stages have been found within the first 250 ms after word presentation (for a review, see Hauk, Shtyrov, & Pulvermüller, 2008). According to a study of intra-cranial recordings in Broca's area during a reading-aloud task, three sequential peaks in activity happen at 200, 320 and 420 ms, and these peaks corresponded to lexical retrieval, syntactic/inflectional processes and phonological changes, respectively (Sahin, Pinker, Cash, Schomer, & Halgren, 2009). In addition, recent behavioral and ERP studies have indicated that semantic information retrieval begins around 160 ms (Mollo, Pulvermüller, & Hauk, 2016).
In a follow-up study to their 1995 publication, Pulvermüller et al. suggested that these differences in p200 amplitudes reflected the "strong motor associations" evoked by action word recognition in frontal (motor and premotor) cortices, absent in the processing of non-action words, which would rather involve occipital areas, due to the salience of their visual attributes (Pulvermüller, Lutzenberger, & Preissl, 1999). However, this remained an interpretation, for the degree to which these electrophysiological correlates were related to measurable motor activity

The present study
Some groups of researchers have studied the brain correlates of action words, suggesting that the p200 indexes motor activations induced by action-word presentation. Other groups have studied the peripheral changes in force induced by action words but, to our knowledge, no one has yet brought these two approaches together. The time courses revealed by both types of studies suggest that these two phenomena could be functionally related, for the p200 occurs between 180 and 240 ms and the GFS experiments show an increase in force beginning around 300 ms. On this basis, the present study represents a first methodological attempt to bring the two techniques together, assessing whether their measurements are temporally and/or statistically correlated.

Participants
Ten adult volunteers (mean age = 28.17, SE = 2.85), all native French speakers, took part in the study. All of them were right-handed (Edinburgh Inventory mean score = 73.05, SE = 5.29) and had no history of hearing problems or of psychiatric or neurological disorders. They were recruited through inter-institutional advertising.

Ethics
The study was approved by the ethical committee CIEREN (Comité institutionnel d'éthique de la recherche avec des êtres humains) of UQAM, Montreal, Canada.

Stimuli
The stimuli consisted of a list of 70 French words, divided into two groups: 35 action words and 35 non-action words. All selected words were bi- or trisyllabic. In each group, words were controlled for frequency, number of syllables, and bigram and trigram frequency. All action words denoted actions performed with the hand or arm (e.g., grab, throw), while non-action words referred to imaginable concrete entities without specific motor associations (e.g., storm, terrain) and were used as control words. Words were spoken by an adult male and recorded on a digital voice recorder. Ten words were chosen as targets, controlling for the linguistic parameters mentioned above. The mean word duration was 684 ms (SD = 98 ms).

Procedure
The experiment was conducted in a sound-isolated chamber with dim light and no other sources of electromagnetic interference. Participants were seated in a comfortable armchair in front of a desk and were told to hold a grip force sensor (a metallic disc whose characteristics are detailed below) in each hand with a tridigital neutral precision grip (thumb, index and middle finger). Their forearms rested on the desk, which was covered with a foam mat for better comfort (Fig. 1). Participants also wore an elastic cap connected to 64 electrodes through which EEG was recorded. During the whole session, participants wore headphones, through which they listened to the list of words.

Task
Using the 70 words described above, we built a stimulus sequence composed of 10 blocks of 40-42 words each (half non-action words, half action words) with the E-Prime 2.0.3 software. The participant sat in a chair and the researcher instructed him or her to count the number of times a target word occurred while performing the motor task explained above (holding a grip force sensor in each hand). In each block, one target word was repeated 10-12 times. Between blocks, the subject could take a short break, during which he or she would set down the metallic discs and report how many times the target word had occurred. The target word was an action word in half of the blocks and a concrete non-action word in the other half. The overall session lasted between 30 and 40 minutes. The list of words can be consulted in the Appendix (Table 1).

EEG Recording
A Biosemi 64-electrode international reference cap was placed on the participants' heads according to head circumference, and electrodes were connected to the cap using a column of conductive gel to fill the gap between the skin and the electrodes. Electrode signals were received by a Biosemi ActiveTwo amplifier at a sampling rate of 2048 Hz with a band pass of 0.01-70 Hz. Impedance of all electrodes was kept below 5 kOhm. Data collection was time-locked to time point zero at the onset of auditory stimulus presentation.

EEG Data Analysis
We used the EEGLab 13.4.4b open source software (Delorme and Makeig, 2004) to process raw EEG files through the following steps. (1) We filtered the continuous EEG data using a high-pass filter with a lower edge frequency of 1 Hz. (2) We down-sampled our data to 500 Hz to decrease computational requirements for grand averaging. (3) We identified "bad channels" using the automatic channel rejection function in EEGLab. (4) We rejected those channels whose distributions of potential values deviated most from a Gaussian distribution, using the pre-set value of 5%. (5) We interpolated said channels using the average of the neighbouring electrodes. (6) We re-referenced to a virtual average reference including all head electrodes but excluding the facial ones. (7) We segmented the data into 2000 ms epochs spanning from -500 to 1500 ms around time zero. (8) We performed a baseline correction based on the 200 ms preceding stimulus onset. To correct for potential artifacts, we rejected epochs using a voltage threshold of 50 μV. The percentage of rejected epochs varied between 4% and 18% among datasets.
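As an illustration of steps (7) and (8) and of the amplitude-based rejection, the following Python sketch epochs a continuous recording, subtracts the pre-stimulus baseline and discards epochs that exceed the threshold. It is not the EEGLab code used in this study: the function name, the array layout, and the reading of the 50 μV criterion as an absolute post-baseline limit applied to every channel are assumptions made for the example.

```python
import numpy as np


def epoch_and_clean(data, onsets, fs=500, tmin=-0.5, tmax=1.5,
                    baseline=(-0.2, 0.0), reject_uv=50.0):
    """Cut epochs around stimulus onsets, baseline-correct, reject by amplitude.

    data   : array (n_channels, n_samples) in microvolts (hypothetical layout)
    onsets : iterable of stimulus-onset sample indices
    Returns an array (n_kept_epochs, n_channels, n_epoch_samples).
    """
    n_pre = int(round(-tmin * fs))             # samples before onset (250)
    n_post = int(round(tmax * fs))             # samples after onset (750)
    b0 = n_pre + int(round(baseline[0] * fs))  # baseline start inside the epoch
    b1 = n_pre + int(round(baseline[1] * fs))  # baseline end (the onset sample)
    kept = []
    for onset in onsets:
        start, stop = onset - n_pre, onset + n_post
        if start < 0 or stop > data.shape[1]:
            continue  # epoch would run past the edges of the recording
        epoch = data[:, start:stop].astype(float)
        # Subtract each channel's mean over the 200 ms preceding onset.
        epoch = epoch - epoch[:, b0:b1].mean(axis=1, keepdims=True)
        # Assumed criterion: drop the epoch if any channel exceeds +/- 50 uV.
        if np.abs(epoch).max() <= reject_uv:
            kept.append(epoch)
    if not kept:
        return np.empty((0, data.shape[0], n_pre + n_post))
    return np.stack(kept)
```

Averaging the returned array over its first axis gives one participant's ERP for a condition; the share of discarded epochs can then be compared with the 4% to 18% range reported above.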
We ran grand averages to obtain the waveforms elicited by four conditions: target non-action words, target action words, non-target non-action words and non-target action words. We plotted our ERPs from -200 to 1100 ms around time zero and identified the scalp locations at which statistically significant effects were found. These preliminary statistical analyses were conducted using the EEGLab software (parametric statistics, p<0.05, with FDR correction). After identifying our Regions of Interest and significant time windows (see below), we extracted the mean and maximal voltages to determine the amplitude (mean voltage) and peak (maximal voltage) of each component of interest. Within-subject amplitude differences were assessed with Student's t-tests and repeated measures ANOVAs, using the IBM SPSS 23 Statistical Software. Cohen's d values were calculated to obtain effect sizes, using means, standard deviations and the correlation between conditions. We also plotted scalp distributions for each condition in the time-windows of interest.
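One common way to obtain a paired-samples Cohen's d from means, standard deviations and the correlation between conditions is the d_z variant, which divides the mean difference by the standard deviation of the difference scores. The sketch below assumes this variant; the text does not state which formula was applied, and the function names are hypothetical. The grip-force comparison reported in the results (t(8) = -2.634, d = -0.8779) is numerically consistent with d_z = t / sqrt(n) for n = 9.

```python
import math


def cohens_dz(mean1, mean2, sd1, sd2, r):
    """Paired-samples effect size from summary statistics (assumed d_z variant).

    r is the correlation between the two conditions across participants.
    """
    # Standard deviation of the difference scores.
    sd_diff = math.sqrt(sd1 ** 2 + sd2 ** 2 - 2.0 * r * sd1 * sd2)
    return (mean1 - mean2) / sd_diff


def dz_from_t(t, n):
    """Equivalent computation from a paired t statistic and the sample size."""
    return t / math.sqrt(n)


# Consistency check against the values reported for the grip-force contrast.
print(round(dz_from_t(-2.634, 9), 3))
```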

Grip-force data acquisition
The grip force sensor (GFS) in Figure 1 is uniaxial. It is a metallic disc 1.8 cm thick and weighing 55 grams. It has two 5 cm diameter aluminum washers screwed to each side, and it can withstand a load of up to 1 kg. The amplitude of the output signal is 1.0 mV/V +/- 10%. The linearity error and the hysteresis are 0.02% (of the total scale). The temperature compensation range goes from -10 °C to 40 °C. The GFS is connected to a Honeywell DV10L amplifier, which is in turn connected to an acquisition card (Measurement Computing USB-1608GX), connected to a portable computer. The stimulus triggers come from the same computer that sends the triggers to the EEG acquisition equipment and are received by the portable computer, which processes the GFS data through the Measurement Computing Dasylab software. Figure 1 shows the complete setup, except for the grip force sensor data acquisition computer.
Figure 1: The whole portable GFS data acquisition device, showing the grip force sensors, the Honeywell DV10L amplifiers, the Measurement Computing USB-1608GX data acquisition card and the headphones.

Grip-force data processing and analysis
The data is transmitted at 1 kHz from the acquisition card to a portable computer, where it is preprocessed by the Dasylab 11.0 software. The software applies two filters: a fourth-order, zero-phase, low-pass Butterworth filter with a 15 Hz cutoff, and a 50 Hz notch filter.
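For readers who wish to reproduce a comparable filtering stage outside Dasylab, a minimal SciPy sketch is given below. It is an approximation under stated assumptions and not the Dasylab implementation: the notch quality factor (q = 30) is an arbitrary choice, the function name is hypothetical, and zero phase is obtained here by forward-backward filtering, which doubles the effective filter order.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch


def filter_grip_force(x, fs=1000.0, cutoff=15.0, order=4, notch=50.0, q=30.0):
    """Apply a 50 Hz notch and a 15 Hz low-pass Butterworth filter to a force trace.

    x  : 1-D grip-force signal sampled at fs Hz
    q  : notch quality factor (assumed value, not given in the text)
    """
    b_notch, a_notch = iirnotch(notch, q, fs=fs)
    b_low, a_low = butter(order, cutoff, btype="low", fs=fs)
    # filtfilt runs each filter forward and backward, so no phase lag is added.
    y = filtfilt(b_notch, a_notch, np.asarray(x, dtype=float))
    return filtfilt(b_low, a_low, y)
```

Because the grip-force modulations of interest evolve over tens to hundreds of milliseconds, a 15 Hz cutoff retains them while suppressing mains interference and higher-frequency noise.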
The grip force sensor data is processed through the following steps. 1) We select the segment of data from 200 milliseconds before word onset to 800 milliseconds after onset. 2) The selected data is normalized to each word's baseline, defined as the mean of the data between -200 ms and 0. 3) We apply artifact rejection to exclude outlying modulations, which can be generated by a hand movement. Data is rejected when it exceeds 200 millinewtons (mN) or when a modulation of 100 mN within 100 ms is present, following the exclusion criteria of Nazir et al., 2017. If more than 30% of the data is rejected for a participant, the participant is excluded from the analysis. Following these criteria, one participant was excluded from our sample.
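The three steps above can be sketched as follows. This is one possible reading of the criteria rather than the code used in the study: we assume that the 200 mN limit applies to the absolute baseline-corrected force, that "a modulation of 100 mN within 100 ms" means a peak-to-peak range of at least 100 mN inside any 100 ms span, and that rejection operates on whole trials. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def preprocess_trial(force_mn, fs=1000, pre_ms=200,
                     abs_limit=200.0, step_limit=100.0, step_ms=100):
    """Baseline-correct one trial (-200 to +800 ms) and flag artifacts.

    force_mn : 1-D force trace in mN, starting 200 ms before word onset
    Returns (baseline-corrected trace, keep flag).
    """
    n_pre = int(pre_ms * fs / 1000)
    trace = np.asarray(force_mn, dtype=float)
    trace = trace - trace[:n_pre].mean()        # step 2: subtract the baseline mean
    too_large = np.abs(trace).max() > abs_limit
    span = int(step_ms * fs / 1000) + 1         # samples covering 100 ms
    windows = sliding_window_view(trace, span)
    peak_to_peak = windows.max(axis=1) - windows.min(axis=1)
    too_fast = peak_to_peak.max() >= step_limit
    return trace, bool(not (too_large or too_fast))


def participant_is_excluded(keep_flags, max_rejected=0.30):
    """True when more than 30% of a participant's trials were rejected."""
    rejected = np.mean(~np.asarray(keep_flags, dtype=bool))
    return bool(rejected > max_rejected)
```

A sliding peak-to-peak range is used instead of a simple difference between samples 100 ms apart so that brief excursions that return to their starting level within the window are also detected.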
For the statistical analysis, we used the grip force measurements between 100 ms and 800 ms after the presentation of the stimulus and grouped them into fourteen windows of 50 ms. Statistical analyses were performed with version 24 of the IBM SPSS software. We performed statistical analyses for both target conditions: action and non-action. We performed pairwise t-tests comparing each of the fourteen windows to the baseline, to determine the moment at which the grip force was significantly modulated by our stimuli. We also ran pairwise t-test comparisons between conditions (target action word and target non-action word) for each window after the first one that showed a significant modulation compared to the baseline.
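The windowing and the two families of t-tests can be illustrated with the sketch below, written with NumPy and SciPy rather than SPSS. It assumes that each participant contributes one baseline-corrected average trace per condition, so that comparing a window with the baseline reduces to a one-sample t-test against zero; the function names and the array layout are hypothetical, and no correction for multiple comparisons is applied in the sketch.

```python
import numpy as np
from scipy import stats


def window_means(trace, fs=1000, pre_ms=200,
                 start_ms=100, stop_ms=800, width_ms=50):
    """Average a baseline-corrected trace (-200 to +800 ms) in 50 ms windows."""
    def to_index(ms):
        return int((ms + pre_ms) * fs / 1000)

    starts = range(start_ms, stop_ms, width_ms)   # 100, 150, ..., 750 ms
    return np.array([trace[to_index(t0):to_index(t0 + width_ms)].mean()
                     for t0 in starts])


def compare_to_baseline(windows):
    """windows: (n_participants, 14). One t-test per window against zero."""
    result = stats.ttest_1samp(windows, popmean=0.0, axis=0)
    return result.statistic, result.pvalue


def compare_conditions(action, non_action):
    """Paired t-test per window between two (n_participants, 14) arrays."""
    result = stats.ttest_rel(action, non_action, axis=0)
    return result.statistic, result.pvalue
```

With nine participants retained, each test has 8 degrees of freedom, which matches the t(8) values reported in the results.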

Central p200 effects:
We plotted the waveforms for the 64 electrodes for a first exploration of possible significant between-condition, within-subject differences. We began by extracting the data from the midline electrodes Fz, Cz, CPz, Pz and Oz to determine our ERP time-windows and Regions of Interest (Figure 6). We observed a difference in amplitude between action and non-action words in the Cz, CPz and C3 electrodes for the window between 180 and 240 ms (the p200), for the target conditions. Amplitudes were significantly larger for action words than for non-action words (Figure 2). The complete descriptive statistics for the amplitude in the examined electrodes, including means, standard deviations and Confidence Intervals, can be found in Table 2.

Parietal p300 target effects:
We found a significant difference between target and non-target words.
Figure 2: Above: p200 effects for target words. We observed a larger amplitude for action words than for non-action words at Cz in the p200 window. Below: p3b target effect. Target words elicited significantly larger amplitudes than non-target words in a cluster of parietal electrodes, maximal at Pz.
In summary, our linguistic stimuli induced two main effects in our subjects: 1) the p200 "verbal signature" effect (Blaszczak, Csypionka, & Klimek-Jankowska, 2018), sensitive to differences between action words and non-action words when these were presented as targets, and 2) the p3b parietal effect, which distinguished between target and non-target stimuli but was not sensitive to differences in word class.
When listening to a non-action word, the grip force became significantly greater than the baseline with a later onset, from 350 ms until 800 ms (right hand: mean = 6.570, t(8) = 2.911, p < 0.05; left hand: mean = 8.814, t(8) = 2.983, p < 0.05). The complete descriptive statistics for the grip force modulation for each hand, including means, standard deviations and Confidence Intervals, can be found in Table 4. For our between-condition analysis, we found a significant difference between target action words and target non-action words in the window from 250 ms to 400 ms (mean diff = 0.005441, t(8) = -2.634, p = 0.030, d = -0.8779) for the right hand only: the grip force of the right hand was stronger following an action word than following a non-action word (Figure 3).
The differences in grip force between conditions in each hand, assessed by paired-samples t-tests and subdivided into windows of 50 ms, can be found in Table 5.
Figure 3: Differences in the grip force comparing Action to Non-Action Words. The grey rectangle represents the time-window in which the differences between conditions were significant for the right hand. For the left hand, we did not observe significant differences.

Correlations between p200 and grip force effects.
Having obtained our electrophysiological correlates and changes in GFS, we ran a series of correlations between the force amplitudes and the p200 amplitude in Cz, CPz, C3 and C4 in said time-windows. We used Spearman correlation coefficients, considering the size of our sample (9 subjects) and the fact that our electrophysiological variables for non-action words (p200 amplitude) did not have a normal distribution according to the Shapiro-Wilk normality test. We found significant correlations between the p200 amplitude and the grip force modulation between 250 and 400 ms for action words (in Cz and C3) and for non-action words (in C3 only). See Figure 4.
Figure 4: Scatterplots depicting the data distribution for each correlation. Above: Correlation between p200 in the CPz electrode and GFS elicited by Action Words. Correlations were significant for action words in both the right hand (rho = 0.717, p < 0.030) and the left hand (rho = 0.583, p < 0.099). Below: Correlation between p200 in the C3 electrode and GFS elicited by Action Words (significant for both the left hand (rho = 0.700, p < 0.036 for Action Words; rho = 0.667, p < 0.050 for Non-Action Words) and the right hand (rho = 0.667, p < 0.050 for Action Words; rho = 0.650, p < 0.058 for Non-Action Words)).
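The correlation step can be sketched with SciPy as shown below. The function name and the input layout (one p200 amplitude and one grip-force value per participant) are assumptions for the example, and no real data from the study is reproduced here.

```python
import numpy as np
from scipy import stats


def p200_force_correlation(p200_uv, force_mn):
    """Shapiro-Wilk check on the p200 amplitudes, then a Spearman correlation.

    p200_uv  : p200 mean amplitude per participant at one electrode
    force_mn : grip-force modulation per participant (250-400 ms window)
    """
    p200_uv = np.asarray(p200_uv, dtype=float)
    force_mn = np.asarray(force_mn, dtype=float)
    _, shapiro_p = stats.shapiro(p200_uv)   # a small p suggests non-normal data
    rho, p_value = stats.spearmanr(p200_uv, force_mn)
    return {"shapiro_p": float(shapiro_p), "rho": float(rho), "p": float(p_value)}
```

Spearman's rho is computed on ranks, so it captures any monotonic association and is less sensitive to outliers than Pearson's r, which matters for a sample of nine participants.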

Discussion:
As stated in the introduction, a recent series of neuroimaging studies has found sensorimotor activations related to word processing. However, as pointed out by Hauk in a recent review article, they lack temporal precision, essential to understanding if those activations can be classified as "semantic" (Hauk, 2016). Our study provides a new methodological approach integrating two pieces of temporal evidence (ERPs and online grip-force measurement) that support this idea, and it also explores the link between these two.
When analyzing the ERP elicited by our auditory stimuli, we observed the p200 effect already described by other authors using isolated words as stimuli, that is, a larger p200 positivity for action words when compared to non-action words. We also observed a classic p3b target effect: a larger p3b amplitude for the target words (Polich, 2009). The p3b component is thought to be generated by parietal cortices, and to be related to the process of voluntary stimulus evaluation (Wronka, Kaiser, & Coenen, 2012). We interpret the p3b effect as a confirmation of the allocation of attentional resources to the presented stimuli, which is important because attention is known to increase processing efficacy in mental imagery (Fazekas & Nanay, 2017).
An additional asset of this exploratory piece is the confirmation of previously reported results in terms of grip force enhancement between 250 and 400 ms after listening to action words for both hands, but more importantly for the right (dominant) hand (Da Silva, Labrecque, Caromano, Higgins, & Frak, 2018). An interesting and somewhat controversial finding was that non-action words also induced a modulation of the grip force that was significant compared to the baseline. While the grip force enhancement induced by action words was more prominent (in terms of earlier onset, longer duration and larger amplitude), it is nonetheless important to acknowledge the increase in grip force induced by non-action words. To interpret these findings, we propose two possible explanations. The first one falls within the view that all words, regardless of word class, imply simulation and therefore a certain degree of activation of the primary motor cortex (M1). This hypothesis is supported by recent results reported by Dreyer & Pulvermüller (2018), who observed an involvement of motor areas in the processing of abstract nouns. If abstract nouns, thought of as the cardinal example of words without embodied meaning, can activate motor areas, then concrete nouns, referring to entities in the world with which humans interact in motor ways (Greeno, 1994), could plausibly induce motor activations related to the affordances of each one of these objects or entities. A second explanation is that the force enhancement in the case of non-action words corresponds to an additional cognitive process happening during the experiment. A recently published meta-analysis gathered the evidence for the implication of M1 in different cognitive functions (Tomasino & Gremese, 2016).
In our experiment, the working memory processes needed to accomplish the task (to keep in mind the number of times a target word appears) could underlie the activation of M1 and the augmentation in the grip force for the target non-action words (Da Silva et al., 2018).
In any case, the fact that linguistic stimuli were able to induce involuntary motor activations speaks strongly for embodied theories of word meaning. The p200 is recognized as a word-class neural marker, sensitive to changes in semantic properties of words, and its time-course is also coherent with evidence of the earliest onset of semantic retrieval. Theoretically, the source of this To our knowledge, this is the first time that the p200 verbal signature is replicated while subjects perform an active motor expectation task (grasping two objects). We acknowledge that the small number of subjects is a limitation for interpreting these differences in the p200 for action and non-action words, and that additional studies with larger samples are necessary to confirm our observations with greater statistical power.
It is also the first time that electrophysiological correlates of word processing are correlated with a peripheral measurement of motor activity; the existing literature has so far studied these two types of measurements separately. Our results fit a temporal pattern of information flow, considering that the conduction time between the primary motor cortex and the hand muscles is approximately 18-20 ms (Rossi, Pasqualetti, Tecchio, Pauri, & Rossini, 1998). The link between the p200 and the grip force enhancement, suggested by the temporal onset of both measurements, is further supported by the statistical correlation between the amplitudes of the p200 and the grip force between 250 and 400 ms. Subsequent studies with larger samples are also needed to corroborate the link between the electrophysiological correlates and the grip force modulation.
This study provides a first methodological basis to integrate these two measures within the same theoretical framework. Based on previous studies of the timing of linguistic processing (Mollo et al., 2016; Sahin et al., 2009), we interpret the p200 elicited by hearing a word as an index of early stages of semantic retrieval, including the imagery and sensorimotor properties of the presented word. When it comes to action words, imagery is thought of as mental action simulation (Jeannerod & Frak, 1999). Therefore, when comparing action to non-action words, the larger p200 amplitudes we observe may correspond to action simulation inducing stronger activation of motor and premotor cortices (Pulvermüller et al., 1999). Within this framework, the subsequent onset of involuntary force augmentation in our experiment (happening 10 to 20 ms after the end of the p200 window) could be explained by a failure to inhibit these motor areas engaged in language-induced action simulation (Jeannerod, 1994). We sketch a potential

Conclusions
In the light of our findings, a potential explanation for the larger p200 (160-240 ms) observed for action words is that it reflects action simulation that complements semantic recognition (indexed by stronger activation of neurons in the premotor cortex). If confirmed by subsequent studies with larger samples, the temporal onset of the subsequent grip force enhancement (250 ms) and its correlation with the p200 amplitude would provide further evidence to interpret this motor phenomenon as part of a semantic process, more specifically, as the involuntary motor outflow resulting from language-induced action simulation. We believe that the results of this groundwork are promising enough to consider the combination of Event Related Potentials and the Grip Force Sensor as a valid and enriching approach to study language-induced motor activity, generating valuable and interesting hypotheses regarding the temporal pattern of information flow.

Acknowledgements
We would like to thank the personnel at the Cognition and Communication Laboratory of UQAM for their collaboration in the EEG data acquisition and analysis; the principal investigator of this laboratory, Stevan Harnad, for his comments on the theoretical framework of this research; and the CML (Cerveau, Motricité, Langage) laboratory at UQAM for their support during data processing and experimental setup.
A preliminary version of these results was presented at the Movementis (Movement: Brain, Body, Cognition) conference, Harvard Medical School, July 2018, and included in the proceedings of said conference as an abstract.
Legend: * = p value < 0.05; ** = p value < 0.01; *** = p value < 0.001; NATL = Non-Action Target Left Hand; NATR = Non-Action Target Right Hand; ATL = Action Target Left Hand; ATR = Action Target Right Hand.