An Analysis of Dual Diagnosis Revealed a Possible Deviation of the Schizophrenia Polygenic Risk Score by Mexican Amerindian Ancestry

Polygenic risk scores (PRS) have been developed to summarize the polygenic background of psychiatric diseases. Recently, PRS have been used to predict which patients have higher comorbidity in psychiatric diseases, such as dual diagnosis. Because PRS are derived principally from analyses of Caucasian and Asian populations, it is not known how these PRS can be applied in populations with high admixture. To explore this, the present work aimed to analyze whether previously calculated PRS for psychiatric diseases can predict dual diagnosis in the Mexican population, and whether PRS calculation is influenced by Mexican Amerindian (MA) global ancestry. We calculated PRS using PRSice with previously published genome-wide association summary statistics for psychiatric diseases, and performed Nagelkerke correlation tests to establish whether PRS are correlated with dual diagnosis. We found that dual diagnosis could be predicted by the major depressive disorder polygenic risk score. Nevertheless, the schizophrenia polygenic risk score was highly correlated with global MA ancestry, independently of the schizophrenia diagnosis. Our results reinforce the notion that PRS calculation could be deviated by MA global ancestry; nevertheless, analyses of larger samples are required to clarify this issue.


Introduction
Psychiatric disorders are complex diseases affected by multiple risk factors [1,2]. In the last decade, the discovery of genetic factors affecting these disorders has increased rapidly, principally through techniques such as genome-wide association studies (GWAS) [3][4][5][6]. GWAS have identified multiple loci associated with psychiatric disorders. To summarize these disease-associated loci and apply them as a diagnostic tool, measures such as polygenic risk scores (PRS) have been developed [7,8]. A PRS can be briefly explained as the summary, in one patient, of the effects of the large number of variants that GWAS have associated with a particular disease [9]. As diagnostic tools, PRS have mainly been used to determine how much polygenic background two different complex diseases share, and whether a PRS can predict a subtype or another trait within groups of individuals with the same diagnosis [10,11]. Nevertheless, most individuals included in the GWAS of psychiatric disorders are of Caucasian or Asian ancestry, with small numbers from other populations such as African-American or Latin American populations. Some studies exploring the application of PRS to non-Caucasian populations have shown that ancestry could be a deviating factor. Otto et al. found that a Caucasian-derived PRS could not be used to predict nicotine dependence in individuals of a Native American population [12].
In some epidemiological analyses, psychiatric disorders have shown high comorbidity with substance use [13,14]. This comorbidity is so high that it has been named dual diagnosis [15]. Dual diagnosis in psychiatric disorders has been associated with multiple negative physical and psychosocial outcomes, such as poorer quality of life and a higher rate of relapse [16][17][18]. Although the etiologic mechanisms underlying dual diagnosis have not been established, some studies have found that PRS for psychiatric diseases could have a high impact on substance use disorder [19].
The Mexican population has previously been found to share genomic background with three main populations: African, Caucasian and Mexican Amerindian (MA) [20]. Regarding its ancestry components, the Mexican population can be considered to have a unique admixture and, to the best of our knowledge, the application of PRS for psychiatric disorders has not previously been explored in it. Our aim was therefore to explore the role of PRS, calculated from previously performed GWAS of psychiatric diseases, in dual diagnosis in Mexican individuals diagnosed with schizophrenia and bipolar disorder, and to explore whether global MA ancestry has an impact on the calculation of PRS.

Sample population.
All individuals were evaluated with the Diagnostic Interview for Genetic Studies [21] .
Diagnoses were made using the DSM-5 criteria for bipolar disorder, schizophrenia and substance use disorder. As inclusion criteria, all individuals were of Mexican descent, with a father and mother of Mexican descent and grandparents born in Mexico, and were 18 years old or older. For the healthy group, subjects with depression, substance use disorders, anxiety or suicidal behavior were excluded, as were subjects with relatives diagnosed with a psychiatric disease.
A total of 192 individuals of Mexican descent were included in the analysis. We included 125 patients with psychiatric disease: 53 patients diagnosed with bipolar disorder (25 male, 28 female) and 72 patients diagnosed with schizophrenia (49 male, 23 female).
The control group comprised 67 subjects. The prevalence of lifetime substance use disorders (dual diagnosis) was 54.71% (n=29) in the group of patients with schizophrenia and 43.78% (n=29) in the group with bipolar disorder. An overview of the sociodemographic characteristics of the sample is reported in Table 1.
The protocol was approved by the ethics and research committees of the National Institute of Genomic Medicine under approval number 23/2015/I. All subjects were informed of the aims of the study and gave their informed consent before inclusion. All protocols were performed in accordance with the Declaration of Helsinki.

Genotyping and imputation.
DNA was extracted from peripheral leukocytes using a commercial salting-out protocol, following the specifications established by the provider (Qiagen, USA). Genotyping was performed using the whole-genome genotyping platform PsychArray BeadChip (Illumina, USA) with the protocol and conditions established by the provider. PsychArray includes approximately 560,000 polymorphisms distributed across the whole genome, as well as variants previously associated with diverse psychiatric disorders.
After genotyping, we performed imputation using the Beagle software with the 1000 Genomes database as reference [22][23][24]. For the subsequent analysis steps, we included only single-nucleotide polymorphisms (SNPs) with a chi-square test p-value for Hardy-Weinberg equilibrium (HWE) higher than 0.00001, a minor allele frequency (MAF) higher than 0.05 and an allelic R² higher than 0.4 [25]. After imputation and the QC filter, we obtained a total of 4,835,917 SNPs.
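The filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline actually used: the `Variant` record, its field names and the toy SNP identifiers are hypothetical, and only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Per-SNP quality metrics (hypothetical field names)."""
    snp_id: str
    hwe_p: float       # p-value of the Hardy-Weinberg equilibrium test
    maf: float         # minor allele frequency
    allelic_r2: float  # imputation quality (allelic R^2)


# Thresholds quoted in the text.
HWE_P_MIN = 1e-5
MAF_MIN = 0.05
R2_MIN = 0.4


def passes_qc(v: Variant) -> bool:
    """A variant is kept only if it exceeds all three thresholds."""
    return v.hwe_p > HWE_P_MIN and v.maf > MAF_MIN and v.allelic_r2 > R2_MIN


def filter_variants(variants):
    """Return the variants that survive the QC filter, in input order."""
    return [v for v in variants if passes_qc(v)]


# Toy input: one variant passes, the others each fail one criterion.
toy_variants = [
    Variant("toy_snp_1", hwe_p=0.50, maf=0.20, allelic_r2=0.90),
    Variant("toy_snp_2", hwe_p=1e-6, maf=0.20, allelic_r2=0.90),  # fails HWE
    Variant("toy_snp_3", hwe_p=0.50, maf=0.01, allelic_r2=0.90),  # fails MAF
    Variant("toy_snp_4", hwe_p=0.50, maf=0.20, allelic_r2=0.30),  # fails R^2
]
print([v.snp_id for v in filter_variants(toy_variants)])
```

The comparisons are strict ("higher than"), matching the wording of the thresholds.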

Polygenic risk score calculation.
A polygenic risk score is a measure developed to reduce the polygenicity of complex diseases to a simpler, more manageable score. Individual PRS were calculated in PRSice, using summary statistics from GWAS previously published by the Psychiatric Genomics Consortium [9]. For each PRS the best estimates of the models are reported; all models were adjusted for age, gender and the first five principal components of global ancestry. Global ancestry was estimated by principal component analysis implemented in the GCTA software [26], using a panel of 200 ancestry informative markers (AIMs) previously reported for ancestry estimation in American populations [27]. Global ancestry was also estimated by a clustering analysis performed with ADMIXTURE [28], using the same AIM panel.
For ancestry calculation, the following populations were used as reference: Utah residents with northern and western European ancestry (CEU) and Yoruba in Ibadan, Nigeria (YRI), both reported in the 1000 Genomes Project [24], and a total of 25 individuals of Mexican Amerindian (MA) ancestry genotyped with the Multi-Ethnic Genotyping Array (Illumina, San Diego, CA, USA).
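As a rough illustration of the score described in this section, a PRS can be written as a weighted sum of risk-allele dosages, with GWAS effect sizes as weights and a p-value threshold selecting the contributing SNPs. The sketch below assumes this simple form. It does not reproduce PRSice itself, the covariate adjustment or any handling of linkage disequilibrium, and all function names and toy values are hypothetical.

```python
def select_snps(summary_stats, p_threshold):
    """Keep the effect sizes of SNPs whose GWAS p-value is at or below the threshold.

    summary_stats: dict mapping SNP id -> (effect size, p-value).
    """
    return {snp: beta for snp, (beta, p) in summary_stats.items() if p <= p_threshold}


def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages for one individual.

    dosages: dict mapping SNP id -> risk-allele dosage (0 to 2).
    weights: dict mapping SNP id -> effect size (e.g. a log odds ratio).
    SNPs absent from either input are skipped.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)


# Toy summary statistics and one toy genotype (illustrative values only).
toy_stats = {
    "toy_snp_a": (0.10, 1e-8),
    "toy_snp_b": (-0.05, 0.03),
    "toy_snp_c": (0.20, 0.60),
}
toy_dosages = {"toy_snp_a": 2, "toy_snp_b": 1, "toy_snp_c": 2}

weights = select_snps(toy_stats, p_threshold=0.05)
print(sorted(weights), polygenic_score(toy_dosages, weights))
```

Repeating the calculation over several p-value thresholds and keeping the best-fitting one would correspond to the "best model" mentioned in the text.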

Calculation of Substance Use Disorder correlation with Psychiatric Diseases polygenic risk scores.
To find which psychiatric polygenic risk score (ADHD, ASD, MDD, SCZ or BD) is most correlated with the SUD phenotype, we performed the Nagelkerke correlation test implemented in PRSice, using the previously reported summary statistics [29]. In these correlation tests, we coded individuals with dual diagnosis as cases, and individuals without a SUD diagnosis and healthy controls as controls. For this analysis we also used the best models. After finding which psychiatric disease PRS was most correlated with SUD, we performed pairwise Student's t-test comparisons of this PRS between the patients diagnosed with only a psychiatric disease and the patients with dual diagnosis (BD or SCZ and SUD), and considered a p-value lower than 0.05 statistically significant.
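For readers unfamiliar with the statistic, the Nagelkerke pseudo-R² rescales the Cox-Snell R² of a logistic model by its maximum attainable value. The sketch below computes it from the log-likelihoods of a null and a fitted model. The function names, the toy labels and the predicted probabilities are hypothetical, and the way PRSice accounts for covariates is not reproduced.

```python
import math


def bernoulli_log_likelihood(labels, probs):
    """Log-likelihood of binary labels (0/1) under predicted case probabilities."""
    return sum(math.log(p if y == 1 else 1.0 - p) for y, p in zip(labels, probs))


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R^2 from null and fitted model log-likelihoods."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Toy data: 1 = dual diagnosis (case), 0 = control.
toy_labels = [1, 1, 0, 0, 1, 0]
# Null model: every individual gets the sample case frequency.
case_rate = sum(toy_labels) / len(toy_labels)
ll_null = bernoulli_log_likelihood(toy_labels, [case_rate] * len(toy_labels))
# Hypothetical probabilities from a model that includes the PRS.
toy_probs = [0.8, 0.7, 0.3, 0.2, 0.6, 0.4]
ll_model = bernoulli_log_likelihood(toy_labels, toy_probs)

print(nagelkerke_r2(ll_null, ll_model, len(toy_labels)))
```

The statistic is 0 when the fitted model is no better than the null model and 1 when it predicts every label perfectly.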

PRS ancestry dependence estimation.
To establish whether the Mexican Amerindian ancestry component influences the PRS that reached significant correlation with dual diagnosis, we performed Spearman correlation tests implemented in R [30]. We compared the global ancestry principal component 1 (PCA1) and principal component 2 (PCA2) with the PRS. We included only PCA1 and PCA2 because these two global ancestry principal components could separate individuals into the main populations. We considered a PRS correlated with a global ancestry component if the p-value was lower than 0.05. We also stratified the samples into groups spanning 20% of the global MA ancestry proportion calculated by ADMIXTURE. Once individuals were stratified by this arbitrary classification, we compared all the groups with pairwise Mann-Whitney tests, considering a p-value lower than 0.05 statistically significant.
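The two steps described above, a rank correlation between a principal component and a PRS and a stratification into 20% ancestry groups, can be sketched as follows. This is an illustration in plain Python rather than the R code used in the study: the function names and toy values are hypothetical, the p-value of the correlation is not computed, and treating each group as a half-open interval (for example 40% up to, but not including, 60%) is an assumption.

```python
from bisect import bisect_right


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Upper edges (in percent) separating the five 20%-wide MA ancestry groups.
BIN_EDGES_PERCENT = [20.0, 40.0, 60.0, 80.0]


def ancestry_bin(ma_percent):
    """Group index: 0 for the lowest MA ancestry group, 4 for the highest."""
    return bisect_right(BIN_EDGES_PERCENT, ma_percent)


# Toy values: a principal component and a PRS for five individuals.
toy_pc1 = [-0.04, -0.01, 0.00, 0.02, 0.05]
toy_prs = [0.10, 0.35, 0.30, 0.60, 0.90]
print(spearman_rho(toy_pc1, toy_prs))
print([ancestry_bin(p) for p in (5.0, 38.1, 55.23, 79.9, 100.0)])
```

Working in percent with explicit edges avoids the floating-point surprises of dividing a proportion by 0.2.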

Schizophrenia and major depression polygenic risk scores are correlated with dual diagnosis in Mexican patients.
To establish how many of the multiple loci previously associated with psychiatric diseases are correlated with dual diagnosis in the Mexican population, we performed PRS calculation and Nagelkerke correlation tests. Once we had identified that the MDD and SCZ PRS are correlated with SUD, we performed pairwise comparisons of the individual PRS between patients with and without dual diagnosis. In the group diagnosed with bipolar disorder, patients with dual diagnosis had a higher MDD-PRS than patients without dual diagnosis, and this difference reached statistical significance (p-value < 0.05) (Figure 1). In contrast, among patients with schizophrenia, the MDD-PRS did not differ between patients with and without dual diagnosis. Likewise, when we compared the SCZ-PRS between patients with and without dual diagnosis in the schizophrenia and bipolar disorder groups, we found no differences between the groups. The results of the pairwise comparisons are given in Table S1.
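The pairwise comparisons reported here rest on a two-sample Student's t statistic. A minimal sketch follows, assuming the classical pooled-variance form (the text does not state which variant was used) and hypothetical PRS values. Converting the statistic to a p-value requires the t distribution, which is left out.

```python
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    # variance() is the sample variance (n - 1 in the denominator).
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    standard_error = (pooled_var * (1.0 / na + 1.0 / nb)) ** 0.5
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical PRS values for patients with and without dual diagnosis.
toy_dual = [0.42, 0.51, 0.47, 0.55]
toy_no_dual = [0.38, 0.40, 0.45, 0.36]
print(student_t(toy_dual, toy_no_dual))
```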

Ancestry Dependence of MDD and SCZ polygenic risk score.
One particularity of the global ancestry of the Mexican population is the influence of three main populations, which makes it a distinctive population. To explore whether these global ancestry differences could influence the PRS, we performed a comparison with global ancestry. First, the global ancestry PCA plot of the present sample (Figure 2) shows the separation of the main populations used for the ancestry calculations, with the CEU population at the top (blue), the YRI population in the right corner (orange) and the MA population in the left corner (red) of the plot. It is also notable that the analyzed sample (green) shows an inverse gradient of MA and CEU global ancestry. Individuals with high global MA ancestry are grouped closer to the left corner, where the MA individuals are found, and individuals with higher global CEU ancestry are grouped toward the top. To find the proportions of the main populations' global ancestry, we performed an ancestry clustering analysis. In this analysis we found that the analyzed Mexican population sample has an average YRI ancestry proportion of 6.67% (SD ± 4.85%), a CEU ancestry proportion of 38.10% (SD ± 18.16%) and an average MA proportion of 55.23% (SD ± 19.10%). After this, we performed an arbitrary classification of the subjects into groups spanning 20% of the global MA ancestry proportion. Table S2 reports the number of individuals by diagnosis and ancestry proportions. [...] were correlated with PCA1 and PCA2. Nevertheless, the SCZ-PRS had a higher correlation with the global ancestry components. Figure 3 shows the comparisons between the SCZ-PRS and MDD-PRS and the principal components. [...] S3. In contrast, when we analyzed the SCZ-PRS, we found that it differed significantly between the groups with a high degree of admixture, but not between the groups with higher global MA ancestry (100% to 79.9% compared with 80% to 59.9% of global MA ancestry) or between the groups with lower global MA ancestry (0% to 20% compared with 20% to 39.9% of global MA ancestry). These results suggest that the SCZ-PRS, and not the MDD-PRS, could have a bias dependent on global Mexican Amerindian ancestry.
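The group comparisons above use pairwise Mann-Whitney tests. The sketch below shows one way to compute the U statistic and an approximate two-sided p-value for every pair of groups. It is illustrative only: the normal approximation is used without tie or continuity correction, which is a simplification suited to samples with no ties, and the function names, group labels and PRS values are hypothetical.

```python
import math
from itertools import combinations


def mann_whitney_u(sample_a, sample_b):
    """U statistic of sample_a: each pair with a > b counts 1, each tie 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def two_sided_p_normal(u, n_a, n_b):
    """Two-sided p-value from the normal approximation of U (no corrections)."""
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))


def pairwise_mann_whitney(groups):
    """Run the test for every pair of groups; returns {(a, b): (U, p)}."""
    results = {}
    for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
        u = mann_whitney_u(a, b)
        results[(name_a, name_b)] = (u, two_sided_p_normal(u, len(a), len(b)))
    return results


# Hypothetical PRS values in three MA ancestry groups.
toy_groups = {
    "low_MA": [0.10, 0.15, 0.12, 0.18],
    "admixed": [0.30, 0.28, 0.35, 0.40],
    "high_MA": [0.42, 0.38, 0.45, 0.50],
}
for pair, (u, p) in pairwise_mann_whitney(toy_groups).items():
    print(pair, u, p)
```

With several pairwise comparisons, a multiple-testing correction may be appropriate. The text does not mention one, so none is applied here.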

Discussion.
Evidence suggests that dual diagnosis is highly heritable [31,32]. Dual diagnosis has also been related to increased severity and poorer outcomes for various psychiatric disorders, which increases the need for a tool that can predict this phenotype [33]. Polygenic risk scores (PRS) were developed to measure polygenic risk and have therefore become an important tool for uncovering phenotypes within a main diagnosis [34,35]. We first conducted an analysis to evaluate the possible correlation between dual diagnosis, in patients diagnosed with schizophrenia and bipolar disorder, and the polygenic background of ADHD, ASD, BD, SCZ and MDD. This analysis revealed that the MDD and SCZ polygenic risk scores were correlated with dual diagnosis. Nevertheless, only the MDD-PRS could differentiate patients diagnosed with bipolar disorder (BD) and dual diagnosis from patients diagnosed with BD without dual diagnosis. Our results agree with the study of Andersen A et al., who suggest that shared genetic susceptibility contributes to the comorbidity of MDD and alcohol dependence (AD), so that individuals with elevated polygenic risk for MDD have elevated rates of AD [36].
On the other hand, the Mexican population is genetically admixed, and individuals with different degrees of admixture can be found in the general population. We therefore considered it important to address the ancestry dependence of the PRS, principally on Mexican Amerindian global ancestry. When we evaluated the ancestry dependence, the SCZ-PRS showed a link with Mexican Amerindian ancestry. Differences in PRS based on demographic history have been explored previously [37,38]. Martin et al. evaluated PRS for 8 traits in the 1000 Genomes Project panel and found that the SCZ-PRS deviated according to the ancestry of the main populations.
In this analysis they also reported that how the PRS changes with ancestry remains unpredictable [38]. Reinforcing the notion of this ancestry-based unpredictability of PRS, we report that only the SCZ-PRS, and not the MDD-PRS, deviated with MA ancestry. Nevertheless, this deviation could be influenced by other, still unknown factors, such as socioeconomic status or undetected correlation between the genetic markers analyzed in the GWAS [39,40]. Moreover, because these findings are derived from a small sample, they have to be considered preliminary, and the sample size must be increased to establish this relationship more solidly. However, we consider this preliminary finding significant because, to establish PRS as a risk prediction tool, their validity and utility have to be demonstrated in all populations. Our findings are a first view of the involvement of polygenic risk in dual diagnosis in the Mexican population.

Conclusions.
The results of the present study should be considered preliminary, which can help reduce [...]
In the Nagelkerke correlation tests of the psychiatric disease PRS (ADHD, ASD, BD, SCZ and MDD) with dual diagnosis in the Mexican population, the attention deficit hyperactivity disorder, autism spectrum disorder and bipolar disorder polygenic risk scores did not reach statistical significance: ADHD (p-value = 0.1570), ASD (p-value = 0.0538), BD (p-value = 0.1585). In contrast, the SCZ and MDD polygenic risk scores reached statistical significance in their correlation with dual diagnosis: SCZ (Nagelkerke pseudo-R² = 2.83%, p-value = 0.0423, n = 8058 SNPs) and MDD (Nagelkerke pseudo-R² = 4.51%, p-value = 0.0118, n = 334 SNPs), which explained a higher variance.

Figure 1. Comparisons of schizophrenia (SCZ) and major depressive disorder (MDD) polygenic risk scores in dual diagnosis. (a) Comparison of the major depressive disorder polygenic risk score. In patients with bipolar disorder (BD), the MDD-PRS differed significantly between patients with and without dual diagnosis (p-value = 0.0005). In contrast, in patients with schizophrenia (SCZ) the difference was not significant. (b) Comparison of the schizophrenia polygenic risk score. In this analysis the SCZ-PRS differences were not significant in either of the two groups.
Regarding ancestry proportions, almost 50% of the total sample (45.83%, n = 88) had a global MA ancestry proportion of 40 to 60%, which reinforces the notion that the Mexican population has a great degree of admixture.

Figure 2. Principal component analysis of the Mexican population sample included. Individuals of Mexican Amerindian (MA) ancestry are shown in red, Utah residents with northern and western European ancestry (CEU) in blue, Yoruba in Ibadan, Nigeria (YRI) in orange and the analyzed sample in green. The plot shows that the individuals from the Mexican population (green) have a high degree of admixture, ranging from those with high Caucasian ancestry to those with high MA ancestry.

Figure 3. Comparison of the global ancestry principal components with the schizophrenia (SCZ) and major depressive disorder (MDD) polygenic risk scores (PRS). The x-axis represents global ancestry PCA1 and the y-axis global ancestry PCA2. The color gradient represents the polygenic risk score, with higher values in red, lower values in blue and values close to the mean PRS in green. Triangles represent individuals diagnosed with bipolar disorder (BD), squares the healthy controls (CTR) and circles the patients diagnosed with schizophrenia (SCZ). (a) The SCZ-PRS shows a clear increase with the MA global ancestry component. (b) The MDD-PRS does not show this ancestry dependence.

Figure 4. Comparisons of the major depressive disorder (MDD) and schizophrenia (SCZ) polygenic risk scores (PRS) across the Mexican Amerindian (MA) ancestry stratification groups. Groups with higher MA ancestry are shown in red, individuals with a high degree of admixture in green and individuals with low MA ancestry (but high CEU ancestry) in blue. (a) MDD-PRS comparison: none of the pairwise comparisons was statistically significant. (b) SCZ-PRS comparison: the pairwise comparison between the individuals with higher admixture was the only significant one.

Table 1. Summary of sociodemographic data. Dual diagnosis was considered if a patient had a psychiatric disease diagnosis (BD or SCZ) and also a substance use disorder (SUD) diagnosis. SD: standard deviation; No SUD: patients without dual diagnosis.