Evidence for Recent Polygenic Selection on Educational Attainment Inferred from GWAS Hits

The genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment were used to test a polygenic selection model. Average frequencies of alleles with positive effect (polygenic scores or PS) were compared across populations (N=26) using data from 1000 Genomes. The PS of 161 GWAS significant SNPs in a recent meta-analysis was highly correlated to population IQ (r=0.863) and to the polygenic score of four alleles independently associated with general cognitive ability. High correlations with PISA scores for a subsample were observed. SNP p value predicted correlation to population IQ and factors from the two previous GWAS (r= -.25). Factor analysis produced similar estimates of selection pressure for educational attainment across the three datasets. Polygenic and factor scores computed using the top 20 significant SNPs showed very high correlation to population IQ (r=0.88; 0.9). Similar findings were obtained using 52 populations from another database (ALFRED). The results together constitute a replication of preliminary findings and provide strong evidence for recent diversifying polygenic selection on educational attainment and underlying cognitive ability.


Introduction
Over the last decade, population geneticists have recognized that most traits are highly polygenic, and hence have moved away from the study of genetic evolution using the single-gene, Mendelian approach, towards models that examine many genes together (i.e.polygenic models).The origin of this trend can be traced back to Fisher (1918).Novel methods for identifying polygenic evolution were recently proposed by Piffer (2013), Berg andCoop (2014), Field et al. (2016), and Leinonen et al. (2013).The aim of this study is to replicate findings by Piffer (2013Piffer ( , 2015) ) suggesting the action of polygenic selection on single-nucleotide polymorphisms (SNPs) that had been associated with educational attainment and cognition in genome-wide association studies (GWAS).To date, GWA studies have identified several alleles with replicated association with intelligence and proxy phenotypes, such as educational attainment.However, each allele typically accounts for a tiny portion of the variance in IQ scores (about 0.5 IQ points per allele).Thus, more recent studies rely on larger and larger samples to identify the greatest possible number of significant associations.Piffer (2013) employed the genetic variants associated with educational attainment and general cognitive ability, showing population differences in allele frequencies that closely matched IQ and educational outcomes differences between ethnic groups and nations.Since then, two big GWAS carried out on independent samples have been published.Thus, a goal of this study is to test the results obtained by Piffer (2013Piffer ( , 2015) ) against the genetic variants found by the latest GWAS of educational attainment.The important question is whether the new GWAS findings support the results obtained by Piffer (2015) using the first GWAS of this kind (Rietveld et al., 2013) and subsequent studies on general cognitive ability (Davies et al., 2015;Ibrahim-Verbaas et al.,, 2015).In other words, the question is: are the population frequencies of the latest GWAS hits correlated to the frequencies of the original GWAS hits employed by Piffer (2013Piffer ( , 2015))?Can the claim (Piffer, 2013) that they predict population IQ be substantiated?To this end, the hits from the two most recent and largest GWAS of educational attainment, measured as completed years of education (Davies et al., 2016;Okbay et al., 2016) will be used in the analysis.The first GWAS was carried out using the UK Biobank sample (N>100,000).Over a thousand SNPs reached genome-wide significance (p< 5 x 10 -8 ), but after controlling for linkage disequilibrium.Genotypes were pruned for linkage disequilibrium (LD) using clumping to obtain SNPs in linkage disequilibrium with an r 2 <0.25 within a 200 kbp window.A few independent signals were identified (Davies et al., 2016).For the sake of simplicity, the three hits found by Rietveld et al. (2013) were lumped together with this polygenic score.The latest GWAS was carried out on a sample of >293,000 individuals (Okbay et al., 2016) and produced 74 independent ("LD-free") hits.A replication of this study found that the polygenic score predicted 9% of the individual variance in educational achievement at age 16 (Selzam et al., 2016).Population differences in allele frequencies are normally correlated to phenotypic differences.If there is a correlation between the population frequencies of unlinked alleles favouring a phenotype, this suggests diversifying selection on the phenotype.Recently, Robinson et al. (2015) used a similar approach and showed a correlation between average height of populations and the frequencies of height increasing alleles.IQ will be used as the phenotype of interest and main dependent variable in the analyses.This choice can be justified by its privileged status in psychometric research and its robust genetic correlation (r= around 0.7) with educational performance (Krapohl et al., 2014) and attainment (Bulik-Sullivan et al., 2015).Moreover, the GWAS hits identified by the three educational GWAS also predict general cognitive ability in their samples (Rietveld et al., 2013;Okbay et al., 2016).A reanalysis of the Okbay et al. dataset revealed that the polygenic score also predicts general intelligence (3.6%) compared to 2% for the 2013 polygenic score (Selzam et al., 2016).Populationlevel polygenic scores (the average population frequency of alleles with a positive GWAS Beta) will be calculated to test the prediction that they explain variance in population IQ.Factor analysis will be used to extract a factor accounting for cross-population variation in allele frequency, hence representing a signal of polygenic selection (Piffer, 2015).Factor loadings will be examined to ascertain the reliability of the factor (i.e.do most alleles with positive GWAS effect load positively on the factor?). Predictive validity will be measured by computing the correlation between factor scores and population IQ.If alleles with positive GWAS beta (i.e.alleles with a positive effect on a trait within population/between individuals) load positively on a factor that is positively correlated to population IQ, this is interpreted as evidence of directional selection on the phenotype (educational attainment or related cognitive abilities).The robustness of the findings will be tested against a null model using simulations with random SNPs.Moreover, the issue of phylogenetic autocorrelation will be addressed: genome-wide genetic distances (Fst index) representing a neutral evolutionary model will be tested against a polygenic selection model.Autocorrelation is considered problematic because it violates the assumption of independent observations, hence leads to false positive (Type I) errors.Demonstrating that the alleles predict country-level differences in cognitive ability above and beyond that predicted on the basis of migration, drift etc, can be taken as evidence for the hypothesis that these differences have been caused by diversifying polygenic selection operating on these alleles.If scores computed from GWAS hits predict cross-population phenotypic differences independently of genome-wide genetic distances, then the result is more likely to be indicative of polygenic selection.

Genomes
Frequencies were calculated from VCF files belonging to the phase 3 data: ftp://ftptrace.ncbi.nih.gov/1000genomes/ftp/release/20130502/Rietveld et al. (2013) produced 3 SNPs reaching GWAS significance for educational attainment.Davies et al. (2016) reported 1115 SNPs reaching GWAS significance, of which 15 were independent signals for educational attainment.942 SNPs were found on 1000 Genomes.Among the 15 independent signals, one (2:48696432_G_A) was missing.Okbay et al. (2016) reported 74 SNPs associated with years of education.70 were found in 1000 Genomes (the other 4 variants were flagged because they had more than 3 different alleles).Population IQs for the 1000 Genomes populations were obtained from Piffer (2015).Country-level average IQs were obtained from Lynn and Vanhanen (2012).The Finnish and Vietnamese IQs were both adjusted upwards to 101 (from 97 in Lynn & Vanhanen, 2012), to account for recent, more accurate estimates of the Finish population (Armstrong, Woodley, & Lynn, 2014) and a recent reanalysis of the Vietnamese data (Rindermann, Hoang, & Baumeister, 2013).IQ for Tuscany was calculatedas the average between the IQs estimated from PISA Creative Problem Solving (Piffer & Lynn, 2014) and from PISA Math, Science, Reading.There were three populations (Chinese Dai, Gujarati Indian and Indian Telegu) for which no estimates of average IQ were available.These populations were therefore excluded from the analyses.Polygenic score refers to the average frequency of alleles with positive effect at the individual level (i.e.GWAS beta).PISA 2012 results for countries matching 14 of the 26 populations of 1000 Genomes were obtained (OECD, 2012;OECD, 2014).The average score of performance in Mathematics, Reading and Science was used.For the US, scores were broken down by race (Hispanics, African-American and White) and data for Puerto Rico were obtained from a separate report (OECD, 2014).
Statistical analyses were carried out using R (v. 3.2.3).Factor analysis used Ordinary Least Squares (OLS) to find the minimum residual solution (fa function in the R "psych" package) and factor scores were computed using the regression method.Fst distances (Weir & Cockerham, 1984) for Chromosome 21 and 1 (largest and smallest) will be utilized.These were calculated using VCFtools on the 1000 Genomes, phase 3 files.VCFtools is a program for working with Variant Call Format (VCF) files, like those generated by the 1000 Genomes Project (Danecek et al., 2011).

Factor analysis
The 17 hits from the two earlier GWAS (Rietveld et al., 2013 andDavies et al., 2016) were lumped together and one hit from Davies et al., 2016 (rs1906252) was removed because it was in LD with rs9320913 from Rietveld et al., 2013.This yielded a set of 16 LD-free SNPs.A factor analysis (function "fa", package "psych") was carried out using Ordinary Least Squares to find the minimum residual solution.The proportion of variance explained was 0.54.This factor was correlated to population IQ (r= 0.89).14/16 alleles loaded positively and the average loading was 0.494.
A factor analysis was carried out for 7 sets of 10 SNPs belonging to the 74 Okbay et al. (2016) independent hits (4 were missing).The number 10 was chosen for two reasons: 1) To follow the recommendation that the subject to item ratio be >2:1; 2) Because 70 (the total number of SNPs) is a multiple of 10.These were sorted by p value, with the first group having the lowest p value (i.e.highest GWAS significance).

Spatial Autocorrelation (SAC)
Spatial (phylogenetic) correlation was calculated using the procedure illustrated in a previous paper (Piffer, 2015), which was based (then unknown to the author) on the Mantel test (Mantel, 1967).Regression analysis applied to the Mantel test enables estimation of polygenic selection pressures (Piffer, 2015).A total of 325 pairwise comparisons were obtained for the 26 populations from the 1000 Genomes database utilized in the present study.Distances were calculated as the absolute value of the difference between population pairs on the selected variables (country IQ, GWAS hit metagene and polygenic score) (Piffer, 2015).Two matrices with N unique pairwise comparisons are generated, where N = n*(n − 1)/2.One matrix represents genetic distance, and the other matrix represents average population IQ.The country IQ variable had missing values, so a total of 253 distances was calculated.Fst distances were used as a measure of population differentiation due to genetic structure, which is based on the partitioning of genetic diversity within and between populations.
The vast majority of random SNPs all over the genome are believed not to be associated with specific phenotypes and therefore not to have been subject to selection.They rather represent population structure resulting from random genetic drift, migration, admixture, and similar processes.Fst is calculated as the ratio of genetic variance in allele frequencies between populations to the sum of variance between gametes within individuals and within populations (Holsinger & Weir, 2009;Weir & Cockerham, 1984).Regression of phenotypic distances on factor distances coupled with genome-wide Fst distances is carried out.If factor distances have an independent positive effect on the dependent variable (phenotypic distances), then the result is more likely to be indicative of polygenic selection.
The SAC-free effect size (standardized beta) of polygenic and factor scores on population IQ are reported in tables 2 and 3.

MCV
The method of correlated vectors (MCV) originated in psychometric research to assess the validity of items in psychometric batteries (Jensen & Weng, 1994).It consists of a meta-correlation, that is computing the correlation between two sets of correlation coefficients.Applied to GWAS results, the correlation between allele frequencies and the average phenotypes of different populations is computed; then, the resulting correlation coefficients are correlated with the corresponding alleles' genome-wide significance (Piffer, 2016).
MCV was applied to the 70 SNPs from Okbay et al. (2016): the vector of the correlation of each SNP's GWAS p value was correlated to the vector of the correlation between each SNP's frequency and population IQ (r x IQ) and the vector of the correlation with the factor extracted from the two previous GWAS.Negative correlations of the p value were found with r x IQ and with r x Rietveld_Davies factor (r= -0.26; -0.25).Smaller, non-significant correlations (although in the right direction) were found for effect size (0.079 and 0.087 for population IQ and R_D factor).
Four indicators of factor reliability were devised: 1) Average factor loading (mean loading of the 10 SNPs on the factor calculated from all 70 SNPs); 2) correlation to the factor scores obtained from Rietveld et al. (2013) and Davies et al. (2016); 3) Correlation with population IQ; 4) SAC-free Beta ("SAC Beta" for short).The values of these indicators are reported in table 4 for each of the 7 SNPs sets, along with their p value rank.Table 5 reports the intercorrelations between the accuracy measures and p value rank.

Table 5. Intercorrelations between the accuracy measures and p value rank
A novel measure of factor accuracy ("meta-accuracy") was calculated as the mean between the four indicators.In turn, the Spearman-rank correlation between the meta-accuracy vector and p value rank was computed.A negative correlation was found: r= -0.408.With the aim of validating the meta-accuracy measure, a meta-factor was created by factor analyzing the scores of the 7 factors.The factor loadings ("meta-loadings") were in turn correlated to the meta-accuracy vector, thus producing a "meta-Jensen coefficient".The correlation between the two meta-vectors was r= 0.969.There were substantial intercorrelations between the meta-factor, the Rietveld+Davies factor scores and IQ.SAC-control was applied to the meta-factor.This produced a very weak SAC-free effect (B= 0.097; Fst B= 0.508).
To extract a reliable estimate of polygenic selection, the average of the two factors ("average factor") was computed.The correlation between the average factor and population IQ was r= 0.858.

LD pruning
Cross-GWAS linkage was checked by feeding SNPSNAP with the list of 86 SNPs, with LD thresholds of 500kb and r= 0.5.In total, 8 SNP pairs were found to be in LD.One SNP was present in two GWAS datasets (rs9320913).
A list of replicated or pseudo-replicated (in LD across studies) SNPs was created, composed of one of the two linked SNPs (one for each pair) and the 8 SNPs in LD across GWAS (table 13).The polygenic scores from the linked SNPs are reported in table 6.The correlation between the two scores is r= 0.919.Frequencies of the replicated hits were also calculated for the 5 super-populations (i.e.races) of 1000 Genomes for both SNP sets.A boxplot is shown in figures 2a and 2b.The replicated SNPs were factor analyzed.The two factors were almost identical (r= 0.998).Finally, a list of cross-GWAS clumped SNPs was created by keeping only one SNP for each LD pair (e.g.rs12042107 (Davies) -rs1008078 (Okbay).Only the latter (rs1008078) was preserved).Obviously, the replicated SNP (rs9320913) was counted only once.This resulted in a list of "LD-clumped" (86-8-1)= 77 SNPs.LD-clumped polygenic score (LD clumping across independent hits from three GWAS.Preclumping N=86; Post clumping and overlap: N=77).
The LD-clumped PS had the following correlations with the other variables: r x IQ: 0.766; r x FactorRietvDavies: 0.835; r x Metafactor: 0.475.
The population IQ variable had some missing cases so the correlations are reported both with and without IQ (table 7a and 7b, respectively).
161 SNPs were found in 1000 Genomes.The polygenic score was computed (table 8).Its correlation to population IQ was r= 0.856 (scatterplot figure 3) and PISA 2012 (r=0.851).The polygenic score of the 7 hits that replicated with a genome-wide significant p value had a correlation r= 0.833 with population IQ and PISA 2012 (r= 0.681).

Method of correlated vectors
For the 161 significant SNPs in the meta-analysis, the correlations of p value and GWAS effect size with the IQ correlation (meta-correlation) were r= -0.009 and -0.048, respectively.

Spatial autocorrelation (SAC): clumped and replicated SNPs
Spatial autocorrelation analysis was run on the three scores.The effect size (Standardized beta) is reported in table 9.In order to show that factors formed from educational attainment SNPs are more coherent (i.e. have higher loadings) than factor formed from random SNPs, a simulation was carried out.100 sets of 10 SNPs matched to the top significant SNPs in Okbay et al. (2016) were obtained from SNPsnap (http://www.broadinstitute.org/mpg/snpsnap/).After removal of problematic SNPs (when frequency was 0 for a population, that population was not counted, creating mismatch between rows, hence these had to be removed).Among those, the first 200 sets (to speed up computation) of 10 random SNPs were chosen for a simulation.Factor analysis was iterated over each set.The average factor loading was 0.268 (SD=0.176).
The distribution was non-normal, with a strong positive skew (figure 5).This information was used as a baseline, null model to test against polygenic selection.

Factor scores
The correlations between the factor scores for the 200 sets of 10 random SNPs and population IQ were computed.The average Pearson's r was 0.22 (95% CI= -0.757; 0.823; 99% CI= -0.826; 0.886).Thus, the correlations between the factors (pseudoreplicated hits) and IQ (r=0.89) is significant according to the conventional p value (0.05).

Polygenic scores
The set of 7914 random SNPs was divided into 52 sets of 150 SNPs.N=150 was chosen because it is close to the number of SNPs in the pooled Okbay et al. (2016) sample.The average correlation between the polygenic scores for 52 sets of 150 SNPs and IQ was 0.467 (95% CI= -0.100; 0.817).The upper limit of the 95 % CI was almost identical to that obtained for the factor scores simulation (0.823).Hence, the correlation between the 152 SNPs GWAS hits polygenic score and population IQ (r= 0.863) is significant according to the conventional p value (0.05).

Meta-accuracy
The 161 SNPs were divided into 32 sets of 5. Factor analysis was carried out on each set.The average factor loadings for each set and the correlation of the factor scores to population IQ are reported in table 20.The average factor loading was 0.225, and the average corr x IQ was 0.112.
The correlations of factor loadings and corr x pop IQ with p value were r= -0.273 and -0.008, respectively.Moreover, the two vectors (factor loadings and corr x pop IQ) were intercorrelated (r= 0.223).
The top 4 significant sets were chosen because their loadings, p value and correlation with population IQ suggested the presence of stronger signal (average corr x population IQ= 0.83; average loading= 0.383).The correlation between population IQ with the average factor score and the polygenic score were r=0.923 and 0.867.

Controlling for baseline population differences in derived allele frequencies
At a theoretical level, an ancestral allele is the allele that was carried by the last common ancestor between humans and other primates whereas an allele is derived when it arose in the human lineage after the split from other primates.In practice, this allele is usually ascertained via comparison with chimpanzees.One limitation of this procedure is that if a mutation arose in chimpanzees after the split from humans, then the ancestral allele is not the chimp allele.Thus, 1000 Genomes infers ancestral alleles via alignment with 6 primate species (Ensembl, 2015).Allele status could be ascertained for 156 out of 162 trait-increasing alleles in the Okbay et al. ( 2016) pooled meta-analysis.Of those, 86 and 70 were derived (55.13%) and ancestral, respectively.Since this is not an equal representation, it creates a potential confounding factor.Average derived and ancestral allele frequencies for positive effect alleles (DAF and AAF, respectively) were computed.These indeed confirmed previous findings that non-African populations have higher frequencies of derived alleles (Henn et al., 2015).A new polygenic score was computed by averaging the two scores.This gives equal weights to derived and ancestral alleles, hence eliminating the potential bias due to different baseline levels of derived alleles among populations.The correlation between this "corrected" score and population IQ was r= 0.828.The derived and ancestral polygenic scores had opposite correlations to population IQ (r= 0.848 and -0.516, respectively).

Do "educational" SNPs population frequencies reflect selection pressures on general intelligence?
Piffer (2015) analysed 9 GWAS hits of educational attainment and IQ.Five of these were related to general cognitive function (rs10457441 C; rs17522122 G; rs10119 G; rs236330 C; rs17518584 T), as reported in two recent studies (Davies et al, 2015;Ibrahim-Verbaas, 2015) .The polygenic score is reported in table 1 (col.5).Its correlation to population IQ is r= 0.847.Its correlations to the four educational attainment polygenic scores (in the order reported in the table) are: 0.896; 0.822; 0.655;0.856.

Polygenic scores
The analysis was replicated using ALFRED (lhttps://alfred.med.yale.edu/),which contains more populations than 1000 Genomes (52 after pairwise deletion of missing values) but has smaller samples and much lower coverage.Thus, only a minority of the SNPs available in 1000 Genomes are present in ALFRED.In order to overcome this limitation, two strategies were used: 1) For the SNPs reported in Rietveld et al. (2013) and Davies et al. (2016), the full list of significant hits was searched, comprising also the non-independent hits.Then, the hits in LD (r>0.5) were removed.2) Okbay's 162 independent hits (pooled meta-analysis) were searched for those in LD (r>0.9) using SNPSnap.In total, 12 SNPs from Rietveld + Davies and 72 SNPs from Okbay et al. (2016) were found in ALFRED.
Polygenic scores were computed for both datasets and are reported in table 10.No valid IQ estimates are available for those populations, so this is not reported.The correlation between the two polygenic scores was r=0.781 (Figure 6).Seven racial groups were identified and average polygenic scores computed for them (table 11).Geographic distances for the ALFRED populations were obtained from an analysis of genetic variation using the HGDP-CEPH data, which showed that 77% of the genetic variance (Fst) could be explained by geographic distance between populations (Lawson Handley, Manica, Goudet and Balloux, 2007).There were 54 and 52 populations in the HGDP and ALFRED datasets, respectively.Of these, 49 were overlapping and were used for calculation of PS distances (N= 1177).Correlations are reported in table 12.The two polygenic scores were averaged.The correlation between the average PS and distance from AA is r=0.171,indicating little effect of neutral population history on polygenic scores (Fig. 7).

Discussion
The two largest GWAS of educational attainment, published in 2016, appear to confirm the population-level findings based on the first large GWAS of this kind (Rietveld et al., 2013).Specifically, higher frequencies of alleles with a positive effect on educational attainment are observed among East Asians than in Europeans, and the latter have higher frequencies than Africans.Moreover, the polygenic scores obtained from two independent samples are strongly correlated (table 2) to those obtained from the first GWAS (r= 0.68-0.88).
The frequencies of 4 SNPs associated with general cognitive ability in previous studies (reviewed by Piffer, 2015) are strongly correlated to the educational attainment polygenic scores.The genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment were used to test a polygenic selection model.Strong inter-correlations among population-level polygenic scores of alleles found by three independent GWAS to be associated with educational attainment were observed.Moreover, these polygenic scores were substantially correlated to estimates of average population IQ (table 2).
The polygenic score of 161 SNPs that reached genome-wide significance in the meta-analysis by Okbay et al. (2016) of the discovery and replication samples (N =405,072) was highly correlated to population IQ (r=0.854).A similar correlation was produced when using only the 7 SNPs that reached significance in both datasets separately (r= 0.833).
Using the hits by Okbay et al. (2016), the method of correlated vectors revealed the presence of a "Jensen effect" (meta-correlation) of SNP p value on population IQ and factor from the two previous GWAS (r= -.25).That is, frequencies of alleles with lower p value (more GWAS significance and higher likelihood of being true positives) had stronger correlations to population IQ and a factor extracted from an independent dataset.Factor analysis produced similar estimates of polygenic selection strength for educational attainment across the three datasets.The SNPs from the largest GWAS (Okbay et al., 2016) were subset by p value (N= 7) and factor analyzed.Variables indicating reliability (factor loadings), predictive (r x population IQ and SAC-free Beta) and convergent validity (r x factor extracted from previous GWAS) were computed.The correlation with p value rank was in the expected direction in three out of four (except p value rank x SAC Beta) instance and of low magnitude.A composite index was created by calculating the average of the four variables.This index substantially correlated to SNP set's P value rank (r= -0.4).There was also a positive correlation between average factor loading and corr.x IQ, corr.x factor, corr.x SAC-free effect.
This result also highlights a potential problem, that is the high noise present in the data.Some of the SNPs with very small effects on educational attainment may have been subject to such minimal selection that chance rather than selection is the main reason for their present allele frequencies.In other cases, very small effects on education may indicate that their main phenotypic effect is somewhere else.There could be small effects on the risk of psychiatric and other diseases, for example, that were more important than the marginal effects on learning ability or interest in learning.
Factor analysis was carried out on the 7 factors belonging to the 7 sets.The loadings on this "metafactor" were strongly correlated to the composite index of factor accuracy (r=0.96).That is, the factors' independent correlations to measures of accuracy were in turn correlated to their loadings on the "meta-factor".The factor extracted from the hits of the first two GWAS (Rietveld et al., 2013;Davies et al,, 2016) survived control for phylogenetic correlation using Fst distances (test of the hypothesis that random factors such as drift or migration confound the results, see Piffer, 2015) quite well (B= 0.861), but among the Okbay et al. 2016 hits, only the top 10 significant ones had some residual predictive validity (B= 0.122).This discrepancy was also reflected in the average factor loadings, which were much higher in the former than in the latter (0.49 vs. 0.19).
There were substantial intercorrelations (r= 0.77-0.89)between the meta-factor, the Rietveld+Davies factor scores and IQ (table 7a).LD clumping was performed on the entire SNPs set from the three GWAS (N=86).This produced 77 LD-clumped SNPs, 1 SNP in common between two publications and 8 linked SNPs.The polygenic score calculated from the LD-clumped SNPs was correlated to population IQ (r=0.77) and to the factor (r= 0.85) and polygenic scores extracted from the 9 replicated/linked SNPs (r= 0.94 and 0.8) (tables 7a and 7b).The two PS obtained from the cross-publication linked/replicated hits, were strongly correlated to population IQ (r=0.82;0.9).The two factors of the replicated/linked hits were similarly correlated to population IQ (r=0.89).
All four vectors survived control for spatial autocorrelation by partialling out of Fst distances (table 9), reaching Betas= 0.25; 0.35; 0.79; 0.69).The PS obtained from the 161 SNPs of the Okbay et al. (2016) meta-analysis had "SAC-free" B=0.500.Another PS was created using 7 hits that reached genome-wide significance of p<5 x 10 -8 in Okbay et al. 2016's study using the UK Biobank replication sample and this had a similar correlation to population IQ (r= 0.83).
The correlation between polygenic, factor scores and population IQ exceeded the confidence interval obtained via simulations with hundreds of random SNPs.Factor analysis also produced more coherent factors than those extracted from random sets of SNPs.The average factor loadings for random SNPs was 0.268, but it was generally higher for the replicated loci or the hits from Rietveld et al. (2013) and Davies et al. (2016).However, the SNPs from Okbay et al. (2016) did not produce loadings higher than chance expectations.Both controlling for spatial autocorrelation (SAC) via regression analysis and simulations with random SNPs revealed that the high correlation coefficients between polygenic scores and IQ are inflated due to population history mechanisms (drift, migrations, etc.) unrelated to natural selection.That is, both random SNPs and Fst distances had substantial correlations to population IQ and to polygenic scores, although not as strong as the correlation between the latter two.Thus, the reported effects are overestimates and the magnitude of the real effect is probably closer to the "SAC-free" estimate ranging from 0.25-0.79obtained with polygenic scores (table 9).It should also be pointed out that it is most likely not the SNPs or the polygenes per se that explain the group differences in IQ.These are probably only indicators, picking up signal of differential selection strength, supposedly acting on many other genes, scattered across the genome, whose association with IQ is not known yet.A prediction is thus that as future GWAS will produce more hits, they will fit the pattern found by this study.This study is important because it confirms the prediction of an earlier one (Piffer, 2015) -which used fewer SNPs -that cognition-enhancing alleles identified by future GWAS would show similar population frequencies.The fact that the addition in the present study of over 150 SNPs to the original analysis carried out by Piffer using only 9 SNPs produced very similar polygenic scores is remarkable and suggests that the selection signal is robust and relatively independent of the variance explained by the polygenic score.However, since the SNPs themselves only contribute to a small % of phenotypic IQ, it is still possible that discovery of additional SNPs could substantially change the PS-group IQ correlations.
Factor analysis carried out on 32 sub-sets (N=5) of the 161 SNPs from Okbay et al. (2016) confirmed a tendency for lower p value (more significant) SNPs to have higher loadings (r= -0.23) and a stronger correlation of their factor scores to population IQ (r= -0.22).Moreover, the two vectors (factor loadings and corr x pop IQ) were intercorrelated (r= 0.27).Polygenic and factor scores computed using the top 20 significant SNPs showed very high correlation to population IQ (r=0.88;0.9).
Controlling for differing frequencies of derived alleles across populations due to drift or GWAS ascertainment bias and the overrepresentation of derived alleles among alleles with positive effect did not significantly alter the polygenic score nor its relationship to population IQ.A new polygenic score was computed by averaging the two scores.This gives equal weights to derived and ancestral alleles, hence eliminating the potential bias due to different baseline levels of derived alleles among populations.The correlation between this "corrected" score and population IQ was r= 0.851.
However, there is a possibility that giving equal weight to derived and ancestral alleles leads to overcorrection if intelligence-enhancing alleles tend to be derived.This is likely because the increase in intelligence during evolution of the Homo lineage has been favoured by mutations that arose after the split from the last common ancestor with chimpanzees.
The major difference was an attenuation of the African-Eurasean differences due to derived alleles being in general at lower frequencies among Africans, and derived alleles being also overrepresented among those with positive GWAS effect.Previous findings that non-African populations have higher frequencies of derived alleles were confirmed within this sample of SNPs (Henn et al., 2015).The replicated/linked SNPs exhibit a clear clustering of alleles across 5 major racial groups (figure 2), showing a pattern matching cross-racial differences in IQ: East Asians>Europeans>South Asians> Hispanics>Africans.
A replication of the analysis using 52 populations from ALFRED revealed a similar pattern (figure 6).Surprisingly, distance from the putative ancestral environment in Eastern Africa (Addis Abeba) was not correlated to polygenic score (table 12), suggesting that selection rather than drift due to geographic distance or isolation is responsible for the observed pattern (figure 7).This is reflected also in the lower polygenic scores obtained by Native Americans, despite their genetic similarity to East Asians.If genuine, this finding would suggest that selection pressures on higher IQ for East Asians date after the split from Americans ca.13Kya and are thus relatively recent.Lower population density in America could account for relaxed selection pressure, possibly mediated by lower social or sexual competition.Another potential issue raised by one reviewer is that "the extrapolation to non-European populations is still problematic because the accuracy of the polygenic score declines in such populations as a result of differing LD patterns (Scutari et al., 2015)".In fact differences in LD should simply reduce the frequency differences at the tag SNPs between populations, compared to the real causal SNPs.This is due to a phenomenon called "attenuation".Indeed, correction for attenuation is used "to rid a correlation coefficient from the weakening effect of measurement error (Jensen, 1998).This scenario works in the case that the frequency differences between tag and causal SNPs are due to random error, so that the mean frequency of the cognitive ability alleles is equal to the (genome-wide) background frequency (which for a mathematical reasons, is 50%).If a non-European population's tag SNPs (GWAS hits) are at frequencies lower than 50%, LD decay implies that the causal SNPs are at even lower frequencies because attenuation will push frequencies up towards the baseline frequency of 50%.Conversely, if tag SNPs are at frequencies >0.5, then the frequency of causal SNPs will be underestimated by the tag SNPs.Hence, controlling for LD decay would actually increase frequency differences when non-European populations exhibit frequencies that markedly differ from 50%.For example, the average frequencies of the replicated sites are all markedly lower than 50% (CEU= 38%; Chinese: 39%, Nigerian: 26%), suggesting the causal SNPs are at even lower frequencies, and this effect will be stronger for populations whose frequencies are farther away from the mean of 50%.Future studies should examine which of the SNPs predict educational attainment in non-European populations, because the linkage phase between tag SNP and causal SNP will in some cases be different in different populations.In some cases, the causal SNP may even be monomorphic in some parts of the world, especially if it arose recently.Initially, this means that comparisons between populations should be limited to those SNPs that reliably predict educational attainment or related traits in all populations.In addition, different linkage phase in different populations is not only a nuisance factor that limits the reproducibility and usefulness of reported associations.It is also a great tool for trying to find the causal polymorphism among clusters of SNPs that may have very high linkage disequilibrium in any one population or haplotype.
Another potential problem is that of population stratification in the original GWAS.Population stratification refers to the "allele frequency differences between cases and controls due to systematic ancestry differences" (Liu et al., 2013).Obviously controlling for population stratification is important because if the putatively causal SNPs were actually ancestry markers, the crosspopulation differences in allele frequencies would be due to population stratification signal picked by the GWAS, instead of genetic variants which are causally related to the phenotype.Fortunately, the GWAS publications dealt with the issue of population stratification, and identified stratification effects that were small in magnitude, about 8% of the signal due to population stratification and not to polygenic signal (Okbay et al., 2016).Okbay et al. (2016) also employed within-family analysis, which is a more direct approach to isolate polygenic association signal from population stratification but is not powered for individual SNP association testing.
Overall, this study provides strong evidence that there has been recent polygenic and diversifying selection on educational attainment, hence producing different levels of cognitive capacity and other traits related to educational attainment among populations.The fuzzy nature of the results is expected given the highly polygenic nature of the trait, the small effect sizes associated with each SNPs and the likely presence of many false positives.Pleiotropy, random genetic drift, different linkage equilibrium patterns and differential distribution of ancestral and derived alleles are also expected to increase the noise in the data.Given all these factors, it is remarkable that a relatively clear pattern emerged, especially when focusing on the associations produced by the largest sample or the replicated SNPs and loci.
A limitation of this study is the reliance on GWAS hits for a complex phenotype such as educational attainment, which shares the majority of additive genetic variation with general intelligence, but also other personality and health-related traits (Krapohl et al., 2014 and2015).It is possible that there are other SNPs affecting cognitive variables unrelated to educational attainment and these may not necessarily be subject to the same selection pressures.Future studies should carry out "deep phenotyping" with SNPs from each chromosomal region that has produced hits in the educational attainment studies, to find out which of them are related to IQ (and its components, such as long-term memory, working memory, and abstract reasoning), and which are related to non-cognitive traits such as ambition, self-control, preference for abstract thinking, or incompetence in practical pursuits.There are reports indicating that school success is as much related to preferences and personality as to intelligence, and that these non-cognitive traits, if properly measured, are as heritable as IQ (Kovas et al., 2015).
Another limitation is the reliance on estimates of population IQ as the phenotypic variable, which are not perfectly accurate, besides reflecting environmental and economic differences between populations.

Appendix
Raw data and code can be downloaded from: osf.io/z5nju

Figure 1a .
Figure 1a.Regression of population IQ on factor extracted from the Okbay et al. (2016) dataset.

Figure 6 .
Figure 6.Scatterplot showing the correlation between the polygenic scores obtained from three GWAS publications.ALFRED populations.r= 0.781

Figure 7 .
Figure 7. Scatterplot showing correlation between geographic distance from Addis Ababa and averaged polygenic scores.

Table 3 . SAC control for factor scores: Betas.
Factor scores extracted from Okbay et al. 2016 GWAS.7 sets of 10 SNPs sorted by p value and factor score extracted from Rietveld et al. (2013) and Davies et al. (2016).

Table 9 . SAC control for polygenic and factor scores.
Preprints (www.