Analysis of the Uncertainty in the Relationship between Self-Determined Motivation and Competitive Anxiety in Dual-Career Students: Application of Information Theory and Bayesian Networks

This study uses Information Theory as a constructive criterion for generating probability distributions, through the construction of Bayesian networks, and for reducing the uncertainty in the occurrence of, and the relationship between, two key psychological variables associated with sports performance: Self-Determined Motivation and Competitive Anxiety. We analyzed 674 university student-athletes from 44 universities who competed in the 2017 University Games (Universiade) in Mexico, with a mean age of 21 years (SD = 2.07) and a mean sporting experience of 8.61 years (SD = 5.15). Methods: For the data analysis, a CHAID algorithm was first applied to identify the independence links among variables, and then two Bayesian networks (BN) were built. The validation of the BN revealed AUC values ranging from 0.5 to 0.92. Subsequently, various instantiations were carried out with hypothetical values applied to the "bottom" variables. Results showed two probability trees that have Extrinsic Motivation and Amotivation at the top, while the anxiety/activation due to worry about performance was at the bottom of the probabilities. The instantiations support the existence of these probabilistic relationships and show that intrinsic motivation has little influence on competitive anxiety. In conclusion, the reduction of uncertainty achieved by the restricted BN may allow Information Theory principles to be reintroduced in psychosocial studies, allowing authors to obtain useful probability values for target psychological variables related to sports performance.


Introduction
One of the biggest problems faced by the social sciences is the great difficulty of predicting human behavior. This may be attributed to the large number of variables that influence it, but explanations have also been proposed within the framework of Information Theory (IT) and the entropy inherent in the development of closed and open systems, although they do not form part of the paradigm currently accepted in psychological research [1][2][3]. In fact, the difficulty of reconciling the concept of "pure" information (beyond the widespread use of the word "bit") with the "processing" of information in psychology reveals that the approach is only superficial, since in reality IT plays no role in it. Among statistical methods in psychology, something very similar has happened with the analysis derived from Bayes' theorem, which nevertheless has a very direct relationship with the classical concept of entropy [4,5]. Supporting the link between these two concepts, we may argue that "information theory provides a constructive criterion for setting up probability distributions on the basis of partial knowledge, and leads to a type of statistical inference which is called the maximum-entropy estimate" [6]. However, we must bear in mind that since the classical approaches of Heraclitus and Parmenides, the dialectic between change and permanence has been present in one way or another in the study of natural phenomena, including individual and group behavior.
Therefore, it is plausible to assume that the conceptual framework of entropy is well suited to explaining the situation of an athlete who faces a competition at a given moment in time. The so-called entropic "time arrow", which implies the irreversibility of events by prohibiting symmetry between past and future, implies, when understood psychologically, the generation of a past through the recall of past events and memory-building, set against a mobile future that is shaped mostly by expectations and a wide range of emotions, from hope to fear.
In sports psychology, one of the most relevant issues, extensively studied but still open to discussion, concerns precisely these two aspects of the "time arrow": how do emotions associated with future performance combine with the motivational past of athletes?
Motivation is one of the most studied variables in the sport context [7,8], and is defined as the cause of a behavior, which operates at the psychological level within the individual and determines whether or not a certain activity is carried out [9]. Intrinsic motivation can be defined as that in which individuals move autonomously towards new challenges, broader frameworks of experience and greater coherence in understanding. They engage in behaviors that interest them, seek stimulation, test limits and openly assimilate novelty. In addition, four motivational regulations offer a broader framework for extrinsic motivation. The first is external regulation, in which individuals regulate their behavior through externally controlled rewards and punishments. The second is introjected regulation, in which individuals, by complying with internal demands, can attain certain forms of self-esteem, self-satisfaction and feelings of pride. In identified regulation, people value a behavior and see it as important for themselves. Integrated regulation implies that individuals bring a value or regulation into congruence with the other aspects of themselves: with their basic psychological needs and with their other identifications. Finally, amotivation describes a state in which the individual is not motivated to behave, or behaves in a way that is not mediated by intentionality [10].
Another of the most studied variables, related to the activation of individual and sports performance, is anxiety [11,12], which is defined as an immediate emotional state characterized by the apprehension and tension associated with the activation of the organism that occurs in competition situations [13]. Competitive anxiety has also been characterized by two cognitive components. The first is worry, understood as concern about the potentially negative consequences associated with poor performance. The second is deconcentration, the athlete's difficulty in focusing on the key aspects of the task to be performed, which impedes clarity of thought during the competitive situation [14].
The relationships between psychological variables have been studied through the Bayesian network (BN) methodology, which aims to describe graphically the dependencies and independencies between the variables in a database, reflecting the domains of influence between one variable and another [15].
For this reason, and given the lack of literature [16], the objective of this paper is to investigate, through BN, the existing relationships between motivational regulations (intrinsic regulation, integrated regulation, identified regulation, introjected regulation, external regulation and amotivation) and anxiety related to sports competition (cognitive anxiety, worry and deconcentration).
In this study, framed in IT and in the entropic consideration of a closed system (a sports competition involving a homogeneous sample of dual-career athletes), we chose a tool recommended for this type of situation: analysis through Bayesian networks (BN) [17]. This approach should allow us, in part, to simulate the inversion of the "temporal arrow" by modifying the parameters of the temporal succession of events, based on the principles of probability inherent to BNs, which presuppose the independence of the events studied.
BNs are beginning to be used more widely in the social sciences [2,18-20], and more recently they have been introduced as a useful methodology in sports psychology, given their ability to provide information on the probability of occurrence of events (some of them psychological) related to sports performance or, for example, on the likelihood of sports injuries. BNs have been used to discover relationships between negative features in sport, cooperative teamwork, motivation and types of sporting cooperation among players on competitive teams, and motivational climate and anxiety [16,21-23].
As indicated, the selected psychological variables (competitive athletes' motivational and anxiety features) are located on both sides of the fulcrum of the present, between past and future. For all the reasons given above, our aim in this study is to identify the probabilistic links between the different factors of self-determined motivation and those related to competition anxiety in young athletes from different sports, especially in order to reduce the likelihood of anxiety, and then to interpret the results according to the entropy inherent to the studied system.

CHAID algorithm
The CHAID (Chi-squared Automatic Interaction Detector) algorithm is used to discover relationships between a categorical or ordinal dependent variable and other categorical predictors. It builds a decision tree whose nodes classify a nominal or ordinal dependent variable [24]. It is a convenient way to summarize data because the relationships are easy to visualize. It relies on the chi-square test to determine the best next split at each node of the tree. To obtain the decision tree, the R package "CHAID" was applied to the dataset. To apply the algorithm, a dependent variable and the independent variables must be chosen beforehand. We selected "Somatic Anxiety" as the dependent variable and the remaining variables as the independent variables.
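The chi-square split selection described above can be illustrated with a minimal Python sketch. It is not the R "CHAID" package used in the study: it only handles High/Low variables (so each table is 2x2 with one degree of freedom), it omits CHAID's category merging and Bonferroni adjustment, and the toy data and variable names are hypothetical.

```python
import math
from collections import Counter


def chi_square_statistic(xs, ys):
    """Pearson chi-square statistic and degrees of freedom for two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    stat = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            stat += (joint[(x, y)] - expected) ** 2 / expected
    dof = (len(x_totals) - 1) * (len(y_totals) - 1)
    return stat, dof


def p_value_one_dof(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def best_split(data, target, predictors):
    """Return the predictor with the smallest p-value against the target, plus all p-values."""
    p_values = {}
    for name in predictors:
        stat, dof = chi_square_statistic(data[name], data[target])
        if dof != 1:
            raise ValueError("this sketch only handles 2x2 tables")
        p_values[name] = p_value_one_dof(stat)
    return min(p_values, key=p_values.get), p_values


# Hypothetical toy data (not the study's data), dichotomized as High/Low.
toy = {
    "somatic_anxiety": ["High"] * 4 + ["Low"] * 4,
    "external_regulation": ["High"] * 4 + ["Low"] * 4,  # perfectly associated with the target
    "worry": ["High", "Low"] * 4,                       # unrelated to the target
}
split_variable, p_values = best_split(
    toy, "somatic_anxiety", ["external_regulation", "worry"]
)
print(split_variable, p_values)
```

In this toy example the predictor with the strongest association is chosen for the first split; a full CHAID implementation would then repeat the procedure within each resulting partition.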

Bayesian networks
To obtain a BN, it is necessary to determine a structure, defined by a directed acyclic graph (DAG), and the conditional probabilities assigned to each node of the DAG. Therefore, learning a BN involves two tasks: (i) structural learning, i.e., the identification of the topology of the BN, and (ii) parametric learning, i.e., the estimation of numerical parameters (conditional probabilities) given a network topology.
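As an illustration of task (ii), the following Python sketch estimates a conditional probability table by relative frequencies (maximum likelihood) once the parents of a node are known. The paper does not state which estimator was used, so maximum likelihood is an assumption here, and the records and variable names are hypothetical.

```python
from collections import Counter, defaultdict


def fit_cpt(records, child, parents):
    """Maximum-likelihood estimate of P(child | parents) from a list of dict records."""
    counts = defaultdict(Counter)
    for row in records:
        parent_values = tuple(row[p] for p in parents)
        counts[parent_values][row[child]] += 1
    cpt = {}
    for parent_values, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[parent_values] = {value: n / total for value, n in child_counts.items()}
    return cpt


# Hypothetical records (not the study's data): external regulation -> worry.
records = (
    [{"external": "High", "worry": "High"}] * 18
    + [{"external": "High", "worry": "Low"}] * 2
    + [{"external": "Low", "worry": "High"}] * 5
    + [{"external": "Low", "worry": "Low"}] * 15
)
cpt_worry = fit_cpt(records, child="worry", parents=["external"])
print(cpt_worry)
```

Each row of the resulting table sums to one, which is the property the BN needs for every node given each configuration of its parents.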

Structural learning
To obtain the BN, the bnlearn package [25] for the R language [26] was used. To obtain the structure, we could follow either a search-and-score algorithm [27], which assigns a score to each BN structure and selects the structure with the highest score, or a constraint-based search algorithm [28], which runs a set of conditional independence tests on the data to generate an undirected graph that is then converted into a BN using an additional independence test. We used the score-based tabu algorithm [27], which gave us a plausible model for our data. At each step, the search procedure selects the change to the structure that most improves the score, here the Bayesian Information Criterion (BIC).
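To make the scoring step concrete, the sketch below computes a BIC score for two candidate structures over discrete data, using the convention in which higher is better (log-likelihood minus a penalty of half the logarithm of the sample size per free parameter). It only scores given structures; a tabu search would additionally explore arc additions, deletions and reversals while keeping a list of recently visited structures. The data and names are hypothetical, and this is not the bnlearn implementation.

```python
import math
from collections import Counter


def node_bic(records, child, parents, levels):
    """BIC contribution of one node given its parents (higher is better)."""
    n = len(records)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in records)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in records)
    loglik = sum(
        count * math.log(count / parent_counts[parent_values])
        for (parent_values, _), count in joint.items()
    )
    n_parent_configs = math.prod(len(levels[p]) for p in parents)
    n_params = (len(levels[child]) - 1) * n_parent_configs
    return loglik - 0.5 * math.log(n) * n_params


def network_bic(records, structure, levels):
    """Sum of node scores; `structure` maps each node to a tuple of its parents."""
    return sum(
        node_bic(records, child, parents, levels)
        for child, parents in structure.items()
    )


# Hypothetical data (not the study's data) with a strong external -> worry dependence.
records = (
    [{"external": "High", "worry": "High"}] * 18
    + [{"external": "High", "worry": "Low"}] * 2
    + [{"external": "Low", "worry": "Low"}] * 18
    + [{"external": "Low", "worry": "High"}] * 2
)
levels = {"external": ("High", "Low"), "worry": ("High", "Low")}
with_arc = {"external": (), "worry": ("external",)}
no_arc = {"external": (), "worry": ()}

score_with_arc = network_bic(records, with_arc, levels)
score_no_arc = network_bic(records, no_arc, levels)
print(score_with_arc, score_no_arc)
```

Here the structure containing the arc scores higher because the gain in log-likelihood outweighs the penalty for the extra parameter, which is the kind of comparison a score-based search repeats at every step.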

Participants
The study was carried out in Mexico with 674 university student-athletes from 44 universities who competed in the University Games (Universiade) in 2017, with a mean age of 21 years (SD = 2.07) and a mean sporting experience of 8.61 years (SD = 5.15).
All participants were previously informed about the protocol and purposes of the study. The study protocol was approved by the local ethical committee at the Autonomous University of Nuevo León (México) in accordance with the ethical standards in sport and exercise science research.

Procedure
The data were collected during the Universiade games. All participants were legal adults, and their participation was voluntary; the authors obtained written consent from each participant. Participants completed the questionnaires in the changing room before a training session.

Instruments and material
Competitive anxiety in sport was measured using the Spanish-language adaptation (Escala de Ansiedad Competitiva [29]) of the Sport Anxiety Scale 2 (SAS-2) [30]. The SAS-2 consists of three 5-item scales measuring three factors: somatic anxiety, worry, and lack of concentration or deconcentration. Each item was answered on a 4-point Likert scale ranging from "nothing" to "a lot".
Self-determined motivation. An adapted version of the Sport Motivation Scale [31] was used in this study. This measure had previously been translated into Spanish [32]. The SMS-II is an 18-item inventory comprising six factors of behavioral regulation. These factors were derived from Self-Determination Theory in order to test a model that allows the assessment of Autonomous and Controlled Motivation. The subscales are intrinsic motivation (e.g., "for the pleasure it gives me to know more about the sport I practice"); identified regulation (e.g., "because in my opinion it is one of the best ways to meet people"); introjected regulation (e.g., "because it is absolutely necessary to do sports if one wants to be in shape"); external regulation (e.g., "because it allows me to be well regarded by people that I know"); and amotivation (e.g., "I used to have good reasons for doing sports, but now I am asking myself if I should continue doing it"). Each item was answered on a 7-point Likert scale ranging from "nothing" to "a lot".
The project in which this study is embedded (ELIT-in) obtained the approval of the Ethical Committee of the University of Trás-os-Montes e Alto Douro (UTAD, Portugal), with the code 23/20/CE/2018.
Regarding the values found in the variables studied, it is notable that external regulation has little weight compared with intrinsic motivation and, to a lesser extent, with Amotivation. The values of the intrinsic and internalized regulation subscales are high, with identified regulation showing the lowest values among them.

Descriptive data of the variables studied
The values of competition-related anxiety are, in all cases, below the means of the possible ranges. Somatic anxiety and lack of concentration are the dimensions with the lowest values, while worry about performance is above the average. The SD values, with one exception, are consistent with a relatively homogeneous sample, and none of the values found is especially striking. The SD of Amotivation is very close to its mean value, indicating that the answers were not very homogeneous for this variable.

CHAID algorithm.
In Figure 1 we can see the result of applying the CHAID algorithm, which yields a tree prediction model for the "Somatic Anxiety" variable and therefore reduces the uncertainty in the data obtained, thus allowing a BN analysis with restrictions that reduce the entropy of the whole system. Five variables were found to predict "Somatic Anxiety". Four of them are motivational: intrinsic and external global regulations, and two subscales of intrinsic motivation, identified and integrated regulation; the last one is anxiety related to performance. The CHAID tree starts with the top decision node "External Regulation", with the 674 instances of the data set divided into two partitions according to the node's two categories, "High" (n = 67) and "Low" (n = 607). The "Low" category shows a majority of cases associated with "Somatic Anxiety". This node is further split on the value of the predictor variable "Worry for performance Anxiety", resulting in two more nodes, one per category. Splitting continues until either a further split does not improve predictive accuracy or a node contains fewer instances than the predefined minimum size.
To validate the BN we performed a 10-fold cross-validation, taking into consideration the area under the curve (AUC), which is defined as the probability of correctly ranking a pair of cases (one positive and one negative). As can be seen in Table 2, the validation values of the BN generated with all the variables studied are acceptable. However, in line with the descriptive values, the lowest AUC values correspond to Amotivation and identified regulation, as well as to the anxiety due to worry about performance. To gain a better understanding of classification accuracy, sensitivity and specificity were also calculated. Although accuracy is high for some variables, the sensitivity and specificity values reveal a null ability to classify positive cases for Identified Regulation, a low ability for Intrinsic and Integrated Regulation, and a generally high ability to classify negative cases.
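The validation measures mentioned above can be sketched in a few lines of Python. The example computes the AUC directly from its pairwise definition and shows why a model that never predicts the positive class can still have high specificity but null sensitivity. Scores, labels and the 0.5 threshold are hypothetical and are not taken from Table 2.

```python
def auc_pairwise(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(predicted, labels):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities of the "High" class and true labels (1 = High).
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2]

auc = auc_pairwise(scores, labels)
thresholded = [1 if s >= 0.5 else 0 for s in scores]
sens, spec = sensitivity_specificity(thresholded, labels)

# A model that never predicts the positive class: no ability to detect positives.
always_negative = [0] * len(labels)
null_sens, null_spec = sensitivity_specificity(always_negative, labels)
print(auc, sens, spec, null_sens, null_spec)
```

An AUC of 0.5, the lower bound reported in the validation, corresponds to a model whose scores do not separate positive from negative cases at all.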

Validation and elaboration of the BNs with and without restrictions.
Figure 2 shows the BN generated with the entropy-reduction restrictions found after the CHAID analysis, using an acyclic graph with no possibility of feedback. The top variable is external regulation, which has shown a low probability, while the bottom variable (probabilistically dependent on the others) is identified regulation, which in turn depends on global intrinsic regulation. The two intermediate variables (no nodes have been found) are the anxiety due to worry about performance and the identified intrinsic regulation subscale. The probability values found in the studied sample indicate that the participants have a high probability of being intrinsically motivated and a low probability of perceiving external rewards or benefits (although this variable appears as a key trigger of the probability of occurrence of the other variables), together with an average probability of anxiety due to worry about performance, which acts as a buffer for the other variables.
In Figure 3 we can observe the probabilistic "ladders" among the studied variables, using an unrestricted BN built on a cyclic graph. The top variable is the anxiety related to lack of concentration, which has a probabilistic impact on the rest of the variables, directly or indirectly through external regulation and worry-about-performance anxiety. In the tree as a whole, we found two key bottom variables (in red in Figure 3): Somatic anxiety and intrinsic introjected regulation (which also acts as a node). The two "big" factors of Self-Determination Theory (SDT), Amotivation and Global Intrinsic regulation, are intermediate variables in this BN (the latter being the second node of the tree). Finally, since we are considering a cyclic graph, we found four feedback loops. Two of them relate to the probabilistic impact of Identified Intrinsic regulation on the other two Intrinsic sub-variables, the Integrated and the Introjected ones. The other two, coming from Somatic anxiety and External regulation, impact on worry-about-performance anxiety. The probability values of the variables are the same as those observed in Figure 2, which represents the BN built with CHAID restrictions.

BN Instantiations with hypothetical data.
When we begin the analysis by means of instantiations, that is, when hypothetical probability values are "injected" into some key variables in an attempt to reverse the entropic development of events, we start by selecting the variables. Three variables were selected: 1) the most relevant Anxiety subscale found, the one related to worry about performance, which appears in both BNs; 2) Global Intrinsic Motivation, the most significant node in the two BNs; and 3) Somatic anxiety, the "bottom" variable in the BN generated without restrictions.
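The idea of an instantiation can be sketched as follows: a variable is fixed to a hypothetical state and the probabilities of the remaining variables are recomputed. The Python example below does this by brute-force enumeration on a three-variable chain. The structure, the probability tables and the names are hypothetical and are chosen only to illustrate the mechanism; they are not the networks of Figures 2 and 3.

```python
from itertools import product

STATES = ("High", "Low")

# Hypothetical chain: external regulation -> worry -> somatic anxiety.
P_EXT = {"High": 0.1, "Low": 0.9}
P_WORRY = {  # P(worry | ext)
    "High": {"High": 0.7, "Low": 0.3},
    "Low": {"High": 0.3, "Low": 0.7},
}
P_SOM = {  # P(som | worry)
    "High": {"High": 0.4, "Low": 0.6},
    "Low": {"High": 0.05, "Low": 0.95},
}


def joint(ext, worry, som):
    """Joint probability of one full assignment under the chain factorization."""
    return P_EXT[ext] * P_WORRY[ext][worry] * P_SOM[worry][som]


def query(target, evidence=None):
    """Posterior distribution of `target` given instantiated variables (by enumeration)."""
    evidence = evidence or {}
    totals = {state: 0.0 for state in STATES}
    for ext, worry, som in product(STATES, repeat=3):
        assignment = {"ext": ext, "worry": worry, "som": som}
        if any(assignment[name] != value for name, value in evidence.items()):
            continue  # skip assignments that contradict the instantiation
        totals[assignment[target]] += joint(ext, worry, som)
    normalizer = sum(totals.values())
    return {state: p / normalizer for state, p in totals.items()}


baseline = query("som")                       # no instantiation
forced = query("som", {"worry": "Low"})       # worry instantiated to 100% Low
backward = query("ext", {"som": "Low"})       # reasoning "against the arrow"
print(baseline, forced, backward)
```

The last query shows the feature exploited in the instantiations: fixing a "bottom" variable changes the probabilities of its ancestors, which is how the networks are read backwards.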
We begin by analyzing which modifications to other BN variables are needed to reduce as far as possible the probability of anxiety due to worry about performance, in this case in the CHAID-restricted BN (see Table 3). All the steps involve reducing other probabilities. The first two steps produce the largest changes: lack of concentration and introjected regulation are the variables whose probability must decrease to give an 18% gain in the reduction of the probability of worry anxiety, which starts from a low value (about one third, the probability actually found in the sample). The following steps carry less weight; in order of importance, they are the reduction of somatic anxiety and, successively, of the different subscales of self-determined motivation, although the latter account for a change of less than a tenth in probability. The result reaches a maximum of 64.83%, barely double the probability actually obtained in the sample (35.88%). This indicates that worry anxiety about performance is very robust: it cannot be eliminated even when forced through hypothetical "anti-entropic" values, and can only be brought to values around the average probability of occurrence.
In Table 4 we can see that, when trying to reach the maximum probability of the identified intrinsic motivation subscale (the most relevant bottom motivational variable in the BN with restrictions), the reversed BN achieves an increase of only 6%. In this sense, it can also be said to be a very solid variable. This maximum is reached in three steps with fairly similar values: two of them involve bringing global intrinsic motivation and introjected regulation to their maximum probability, and the other involves bringing the anxiety derived from worry about performance to its minimum.
The third instantiation (see Table 5) analyzes how somatic anxiety can be reduced to a minimum, starting from an already very low probability of occurrence (Low = 87.13%). We obtained a reduction that brings the Low probability to almost 100% by changing the probability values of several variables in seven steps. The first involves maximizing integrated regulation, but the largest jump occurs when external regulation has a 0% probability of occurrence. The last four steps together account for a likelihood reduction of only 3%; they require minimizing performance worry anxiety, amotivation and identified regulation, while maximizing the probabilities of global intrinsic motivation and introjected regulation. It is important to note that several steps used a feedback loop (as derived from a cyclic graph) to contribute to the maximum hypothetical reduction of somatic anxiety, such as those involving integrated and identified regulation, as well as external regulation.

Discussion
First, in response to the objective and the question contained in the title of the manuscript, all the analyses carried out indicate that it is possible to reduce the uncertainty in the relationships between motivation and anxiety in the practice of competitive sport.
Likewise, we would like to emphasize that, to our knowledge, hardly any studies have approached the problems of social sciences such as psychology from the standpoint of IT, for the historical reasons discussed previously. Thus, the use of a CHAID algorithm to reduce the entropy of the studied system has allowed us to contrast an analysis of "free" probabilities, without constraints and with feedback loops, with another that is more closed and limited to the variables that had shown a connection before their inclusion in the Bayesian analysis.
The main findings can be summarized as follows: competitive anxiety is completely "disassembled" into its three factors with respect to their probabilistic weight of occurrence. The predecessor of the others is worry about performance, while the other two dimensions also occupy key positions: deconcentration (or lack of concentration) anxiety, the one most responsible for decreases in performance, acts as a "modulator" (working through a feedback loop in the unrestricted, cyclic BN) backwards on worry anxiety, triggered by the weight of the probability of external motivation; somatic anxiety, in turn, becomes the "final" by-product of the other variables. Other studies from our group have shown the caution with which external observers must treat somatic signs of anxiety when judging subjects' ability to perform their tasks [33].
However, it should be noted that this sample, made up of young student-athletes of a medium (not top) performance level, presents several clear biases: a low probability of anxiety, external regulation and amotivation, contrasting with a high probability of self-determined motivation, without a clear predominance of any source of intrinsic regulation. As indicated in other studies [34], competition-related anxiety cannot be considered as a whole; its three factors must be considered separately in both psychological evaluation and intervention, as is clearly shown in the BN without restrictions with regard to somatic anxiety.
The results found in the BN with restrictions showed that five variables predict somatic anxiety. Four of them are motivational: intrinsic and external global regulations, and two subscales of intrinsic motivation, identified and integrated regulation; the last one is anxiety related to performance.
The probability values found in the studied sample indicate that the participants have a high probability of being intrinsically motivated and a low probability of perceiving external rewards or benefits, although this variable triggers the probability of occurrence of the other variables, together with an average probability of performance-related anxiety, which acts as a buffer for the other variables.
On the other hand, the results found through the unrestricted BN show that the main variable is the anxiety related to lack of concentration, which has a probabilistic impact on the rest of the variables, directly or indirectly, through external regulation and performance-related anxiety. This lack of concentration could be explained by athletes thinking that, if they do not obtain a favorable result in the competition, they will not receive any reward. In turn, this triggers somatic anxiety, which can lead to potentially negative consequences related to poor performance, since athletes are trying to meet their internal demands.
In addition, Amotivation depends on external regulation, which in turn has a probabilistic impact on global intrinsic motivation, but with a very low likelihood of occurrence (are athletes highly self-motivated, or have they not understood sport practice as a habit?).
Global intrinsic motivation impacts on identified intrinsic regulation, while identified intrinsic regulation has a probabilistic impact on the other two intrinsic sub-variables, integrated and introjected regulation. This suggests that athletes, by valuing their sport as something important for themselves, regulate their behavior because it is congruent with other aspects of themselves, while improving their self-esteem, self-satisfaction and feelings of pride.
So, what are the major differences between the two BNs from the IT point of view? The reduction of entropy and uncertainty achieved with the CHAID algorithm also reduces the probabilistic impact of deconcentration, somatic anxiety and amotivation (which remains outside the set of probabilities), while underlining the triggering (ancestor) role of worry about performance and external regulation, with no more than minimal differences in the top-bottom and bottom-top links between the self-determination variables.
Working with the instantiations makes the probability landscape much clearer. To obtain the lowest possible probability of worry anxiety, the one least associated with a performance deficit [16,35], the variables whose probability of occurrence must be reduced most are deconcentration and introjected regulation (which is usually accompanied by "negative" emotions such as guilt); to a lesser extent, fewer bodily signs of anxiety should be perceived and, above all, any motivational increase (external or internal) should be minimized. If we combine these data with the "real" probabilities found in the BNs and with the nodal or ancestor position occupied by worry-about-performance anxiety, we can see that this variable is perhaps the key element of the studied system. In support of this, the fact that a zero probability (100% Low) cannot be reached may indicate a characteristic of this type of athlete, who is also concerned about an academic career [36,37] and who may have some kind of "immunity" to external regulation, which in turn triggers a lower probability of somatic anxiety.
Complementing the previous argument, the probability of identified intrinsic regulation (the bottom variable in both BNs), which starts from a high value (almost 90%), can be increased to its maximum when introjected and global intrinsic regulation reach a 100% probability of occurrence and when the anxiety due to worry about performance is at 100% Low probability, that is, zero. There is, then, a clear and strong opposition between the emotion of guilt and the anxiety due to worry about performance in the athletes' minds.
To complete the scenario described by the BNs, the last instantiation tries to reduce as far as possible the somatic anxiety that proved to be a bottom variable in the BN without restrictions. We observe that its probability can be reduced to almost zero (100% Low), although its "real" Low value is already quite high (87.13%), and that the most relevant steps are the hypothetical reduction to 0% probability of external regulation (acting in feedback, which indicates the existence of a cycle between these two variables), of amotivation and of the anxiety due to worry about performance, as well as an increase in all the self-determination variables.
To summarize the results more succinctly, we can conclude that the athletes studied start from high values of self-motivation and low competition-related anxiety, and that the BNs showed a probabilistic "constellation" that places anxiety due to worry about performance and external regulation as the basic predecessors, and intrinsic regulation and somatic anxiety as bottom variables, with amotivation and deconcentration losing informative value. All these findings are confirmed when hypothetical values are introduced in these key variables and we observe how the probabilities of the other variables must be modified for them to reach their maximum or minimum probability of occurrence.
This work has some limitations. The most important ones derive from the composition of the studied sample and from the impossibility of relating the variables studied to either objective or subjective athletic performance. In addition, the bias shown by these dual-career athletes in their type of motivation and competitive anxiety has limited the results obtained from the BNs with hypothetical values. Finally, in this study IT has not really been applied to the probability values obtained, since they have been associated too closely with their qualitative component (the meaning of the variable).
Considering this study as a whole, the next research step should be to structure the studied system in a way that is more suitable for an analysis carried out completely "inside" IT, perhaps using an acyclic BN with restrictions (e.g., through the CHAID algorithm used here), with the psychological variables encoded in less qualitative terms, now that their relations of probabilistic chaining are known.

Figure 2. BN generated using the restrictions made after the CHAID algorithm (inter-dependent variables only).

Figure 3. BN generated without restrictions, considering all studied variables in a cyclic graph.

Table 1. Descriptive data of the variables studied (N = 674).

Table 2. Validation of the BN built on the variables studied: AUC, accuracy, sensitivity and specificity values.

Table 3. Step-by-step instantiations leading to maximization of the likelihood of LOW performance Worry anxiety in the BN with CHAID restrictions.

Table 4. Step-by-step instantiations leading to maximization of the likelihood of HIGH Identified intrinsic regulation in the BN with CHAID restrictions.

Table 5. Step-by-step instantiations leading to maximization of the likelihood of LOW Somatic anxiety in the BN without restrictions and allowing feedback among variables.