Differences in citation patterns across areas, article types and age groups of researchers

The evaluation of research proposals and academic careers relies on indicators of scientific productivity. Citations are critical signals of impact for researchers, and many indicators are based on these data. The literature shows that there are differences in citation patterns between areas. The scope and depth of these differences motivate extending such studies to consider types of articles and age groups of researchers. In this work, we conducted an exploratory study to elucidate what evidence exists for differences in citation patterns along these dimensions. To perform this study, we collected historical data from Scopus and analyzed whether there are measurable differences in citation patterns. The study shows that there are evident differences in citation patterns between areas, types of publications, and age groups of researchers that may be relevant when carrying out researchers' academic evaluation.


Introduction
A seminal contribution to the measurement of scientific impact is the one proposed by Eugene Garfield, who noted that the best way to follow the life cycle of a scientific article is through its citations. He claims that an article relevant to a community is frequently cited, so by tracking its citations, we can measure its impact [1]. Garfield extended the notion of impact from an article to a journal, introducing the journal impact factor (IF). Despite their wide use in research evaluation, citation-based metrics face criticism and have limitations. Moed highlighted the importance of using more complex measures that consider differences in citation patterns and dynamics between disciplines [2]. Using a single index that ignores the specificities of each discipline produces distortions at different levels of research evaluation: for projects, people, and even institutions [3].
When evaluating individuals, citation-based indices at the researcher level take an even greater role [4]. Hirsch [5] introduced the H-index. An H-index equal to h indicates that the researcher authored h papers with at least h citations each. The H-index avoids the evaluation of long lists of uncited articles, determining which papers have had an impact on the scientific community. However, citation patterns differ between disciplines, and in some disciplines the evolution of citations is much slower than in others [6]. Accordingly, to make a fair comparison, citation-based indices such as the H-index [5] should be observed over very long time windows. These delays in citation dynamics discourage their use in the evaluation of young researchers.
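As an illustration, the H-index definition can be implemented in a few lines (a minimal sketch for clarity, not code from the original study):

```python
def h_index(citations):
    """H-index: the largest h such that the author has h papers
    with at least h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(cites, start=1):
        if c >= rank:  # the rank-th most cited paper still has >= rank citations
            h = rank
        else:
            break
    return h

# An author with papers cited [10, 8, 5, 4, 3] times has h = 4:
# four papers with at least four citations each.
```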
One of the crucial points in the criticism of scientific impact indicators is whether they have predictive capacity [7]. Robien [8] showed that several of the most relevant works in Chemistry did not come from the journals with the highest impact factor. The same observation was made in entrepreneurship journals [9]. Finardi [10] observed something similar in a study that covered more disciplines. Vanclay [11] showed that the IF is based on a sample that may represent half the whole-of-life citations to some journals and a small fraction of the citations accruing to others. Since citation dynamics differ across disciplines, the IF penalizes disciplines with slow dynamics. Other factors, such as self-citations, have also been shown to produce abrupt changes in the IF of some journals [12]. In addition, journals that publish reviews tend to be highly cited, whereas journals devoted to very specific topics often have low citation impact [13].
Variants of the IF and H-index have been proposed to overcome their limitations. For example, Braun et al [14] proposed weighting the effect of citations in the H-index computation by the citing journal's IF, giving high weight to citations that come from journals with greater impact. Balaban [15] proposed the opposite, giving high weight to citations from low-IF journals, since there would be greater merit in receiving them. Bergstrom [16] introduced the eigenfactor score, which equates prestige with the stationary probability of a random walk on a citation graph. An extension of the eigenfactor is the Article Influence Score (AIS), which calculates the average impact of the articles in a given journal, measuring impact in terms of eigenfactors. To infer prestige agnostically at the scale of a community, González-Pereira et al [17] introduced the SCImago Journal Rank (SJR), which ranks journals based on citation weighting schemes and eigenvector centrality. Moed [18] introduced SNIP (Source Normalized Impact per Paper), the ratio of a journal's citation impact to the citation potential of its subject field. Liu and Fang [19] examined the differences between citation-based indicators, comparing the IF with a 2-year window (IF2) to the IF over five years (IF5). Their research shows that IF2 exhibits much larger quartile changes than IF5. IF5 was also found to correlate strongly with AIS, giving it a better ability to predict paper influence than IF2.
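The random-walk idea behind the eigenfactor can be illustrated with a short power iteration (a schematic sketch under simplifying assumptions; the actual eigenfactor computation also excludes self-citations and normalizes by article counts, which is omitted here):

```python
import numpy as np

def stationary_prestige(C, damping=0.85, iters=200):
    """Approximate the stationary distribution of a random walk on a
    citation graph. C[i][j] = citations from venue i to venue j."""
    C = np.asarray(C, dtype=float)
    n = C.shape[0]
    row_sums = C.sum(axis=1, keepdims=True)
    # Rows with citations become transition probabilities;
    # dangling venues (no outgoing citations) jump uniformly.
    P = np.divide(C, row_sums, out=np.full_like(C, 1.0 / n), where=row_sums > 0)
    v = np.full(n, 1.0 / n)
    for _ in range(iters):
        # With probability `damping` follow a citation, otherwise teleport.
        v = damping * (v @ P) + (1.0 - damping) / n
    return v / v.sum()
```

Venues that attract many citations from other well-cited venues end up with a larger share of the walker's time, which is the prestige notion the text describes.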
Whatever metric is used to evaluate, it is essential to get citations [20]. A recognized track record is a crucial element to access competitive funds and promote scholars [21]. The scientific environment is becoming increasingly competitive. In this context, delays in publication, measured as the time elapsed between review, acceptance, and definitive publication, affect research evaluation [22].
We introduce a citation-based study that elucidates differences between areas, types of publications, and researchers' age groups. This article expands the scope of previous studies, measuring the cross effect of citation patterns between areas and types of publications. In addition, we study differences in citation patterns in terms of age groups of researchers. By extending the domain of analysis, we find new citation patterns that have not been clearly evidenced in the related work. To achieve this goal, we conducted a data collection process in Scopus, including different types of publications by researchers from a broad spectrum of areas. The study elucidates the existence of differences in citation patterns between areas, conditioned by the type of media considered for publication. We show that these patterns differ clearly according to the age of the researchers. The conclusions of the study indicate the need for indicators that consider different publication media and incorporate the age factor when evaluating the scientific outputs of a researcher.
The contributions of this article are:
- We study differences in citation patterns considering a large volume of data that covers different areas, types of publications, and age groups of researchers.
- We detect differences in citation patterns conditioned on areas and types of publications.
- We detect differences in citation patterns between areas conditioned on age groups of researchers.
The article is organized as follows. Related work is discussed in Section 2. In Section 3, we present the materials and methods used in this study. Section 4 presents differences in citation patterns between areas and types of publications. In Section 5, we study differences in citation patterns across areas and age groups of researchers. Finally, Section 6 highlights findings and future work.

Related work
Lariviere et al [23] evaluated the concentration of papers and funding resources at the researcher level, showing the existence of differences between areas. The study highlights a higher concentration of research funds, publications, and citations in Social Sciences and Humanities than in other fields. Nederhof et al [24] detected that publications in Space-Life Physical Sciences are initially cited much less than similar ground-based research, showing that the use of a short citation window (1-4 years) is detrimental to impact assessment. The study states that bibliometric indicators should be calibrated to provide optimal monitoring of impact. Prins et al [25] detected that some areas have a greater diversity of media in which they produce scientific outputs. In these areas, the use of Google Scholar could be beneficial to improve the coverage of publications, offering advantages over databases such as WoS, which cover a lower diversity of publication media.
Studies of differences in citation patterns according to type of publication have strongly focused on measuring the impact of conferences in Computer Science. Freyne et al [26] show that Computer Science's emphasis on publishing in conference proceedings hurts research evaluation when that evaluation is based on WoS indices. Vrettas and Sanderson [27] corroborate that computer scientists value conferences as a publication venue more than any other area. Their analysis also shows that only a few conferences concentrate many citations, while the mean citation rate per paper is higher in journal articles than in conference papers. Glänzel et al [28] show that conference proceedings are a valuable source of information for research evaluation in Computer Science. Furthermore, the study shows interesting international collaboration patterns triggered by these types of events that are not observed in other areas. Meho [29] assesses the quality of Computer Science conferences by matching the CiteScore ranges of the top-quantile journals. The study shows that top conferences make up 30% of top-quartile publications. Thelwall [30] compares Scopus citations with Mendeley reader counts for conference papers and journal articles in eleven computing subfields, showing high correlations between both counts. Kochetkov et al [31] apply a methodology similar to that of SJR to rank conferences in Computer Science. The study shows that the top conferences have SJR values similar to those reached by the top journals in the area. Li et al [32] study the impact of conference rankings in Computer Science using regression analysis. The study determines how these rankings have changed citation patterns in Australia and China, showing that researchers align themselves with the rankings used in their country, modifying their publication patterns to fit these rankings better.
However, Qian et al [33] show that the relative status of journals and conferences varies significantly between different areas of Computer Science. Recently, Yang and Qi [34] showed that book publications are more relevant than conference publications in most areas. They also showed that citation-based productivity indicators fail to capture the relevance of these types of publications.
The skewness of citation distributions between areas has also received much attention from the community. Ruiz-Castillo and Costas [35] study the skewness of citations per area, comparing 27 different areas. The study shows that skewness is very similar across areas and that only a small part of the authors of each area are responsible for its production. The most prolific authors dominate citation skewness. Radicchi et al [36] show that the probability of an article being cited c times varies between areas. However, rescaling the distributions by the average number of citations per article reveals a universal, highly skewed citation distribution. Based on this rescaling, the study introduces a variant of the h-index that allows scientists from different areas to be compared. Albarrán et al [37] show that citation distributions based on 5-year observation windows are highly skewed. After rescaling these distributions, the level of skewness across areas is quite similar, showing that the 10% most cited articles concentrate more than 40% of the total citations. Reproducing the study at the level of subfields, Crespo et al [38] show that the differences between areas increase: at finer granularity, more differences between areas are detected in citation distributions. Bensman et al [39] study the mean citation rate per article in Mathematics, showing that the citation rate is highly skewed. Much of this phenomenon is explained by the many citations concentrated in Reviews. The study also analyzes the bibliometric structure of the area, detecting a weak central core of journals, and hypothesizes that these differences from other areas would explain why citation-based methods do not apply to this particular area.

Materials and methods
To build the data repository used for this study, we accessed information provided by Scopus. Scopus provides access to researcher profiles, in which lists of publications are displayed. To gather a significant number of researcher profiles, we built a profile crawler using the rscopus library. This library exposes functionalities of the Scopus API, from which the data can be accessed. Our crawler was built on top of rscopus, invoking the Scopus API to get the data. The crawler uses a set of seed profiles from which it retrieves their collaborators, who in turn are used as new seeds. Iteratively, the crawler builds a repository of researcher profiles along with their publication lists.
Our crawler stores the profiles in a relational database, which it queries to check whether a profile is already stored; if not, it indexes it. It does the same with publications, avoiding duplicates. Scopus tags each publication venue with its areas of knowledge, and an imputation method propagates this classification of areas to the researcher level. Our crawler downloaded all these data.
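The crawling loop described above can be sketched as follows (a simplified illustration: the `fetch_profile` callback stands in for the rscopus/Scopus API calls, and in-memory sets stand in for the relational database):

```python
from collections import deque

def crawl_profiles(seeds, fetch_profile, max_profiles=100000):
    """Breadth-first crawl over the co-author graph.
    `fetch_profile(author_id)` is assumed to return a dict with
    'publications' (list of publication ids) and 'coauthors'
    (list of author ids); in the study this role is played by
    Scopus API calls made through rscopus."""
    seen_authors, seen_pubs = set(), set()
    queue = deque(seeds)
    while queue and len(seen_authors) < max_profiles:
        author = queue.popleft()
        if author in seen_authors:           # skip profiles already indexed
            continue
        seen_authors.add(author)
        profile = fetch_profile(author)
        for pub in profile["publications"]:  # deduplicate publications too
            seen_pubs.add(pub)
        queue.extend(profile["coauthors"])   # collaborators become new seeds
    return seen_authors, seen_pubs
```

Running it over a tiny hypothetical co-author graph shows the iterative expansion from seeds to collaborators while both profiles and publications are deduplicated.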
Each author's publication list includes descriptive fields such as the title, author names, name of the venue, DOI, publication date, and some descriptive keywords. For each author, an author id provided by Scopus is stored. The identifier of each publication is its Scopus id. A crucial method used in the study is the one that retrieves the references of each publication, establishing the relationship between the Scopus id of the publication and that of each of its references. We use these records to count citations between the papers of the repository.
We conducted the data collection process from a set of profiles for each of the disciplines considered in the taxonomy of areas maintained by Scopus. This taxonomy considers 27 areas of knowledge. In each of them, we fed the crawler with a list of top cited authors.
To determine the seeds of each area, we looked for the most reputable journals by field according to the impact factor, and within them, we looked for the most cited papers. From these works, we selected the top ten most-cited authors. The crawler ran for about three months, collecting profiles from 111,813 researchers. We stored a total of 4,504,779 citable document records in our repository. Our collection spans the years 1970 to 2018. A summary of the collection's basic statistics is shown in the Appendix.
Each publication venue declares a list of related areas, which Scopus maps to 27 areas of knowledge. Scopus then classifies authors into knowledge areas according to the journals and conferences in which they publish. These areas are attributed to researchers according to the proportion of publications in each field; accordingly, each researcher is classified in the area in which he/she records the most publications. We discarded areas with too few authors, keeping those with at least 3000 authors in our repository. The resulting dataset comprises data records for ten areas from 95,668 different authors, as shown in Table 1.
Scopus classifies the documents into 15 different types of publications. Four publication types (Conference Review, Report, Abstract Report, and Business Article) have too few documents in our dataset, totaling fewer than 600 documents, and were discarded from the study. Accordingly, the resulting dataset has 4,102,168 documents classified into 11 types of publications. We show the number of documents per publication type in Table 2. Citations are verified reference relationships between citable documents in our collection. It should be noted that this track of citations is a sample of the total citations in Scopus, since it corresponds to citations between documents in our repository. Our collection totals more than eight and a half million citations among citable documents. However, we discarded citations from or to areas not considered in this study. In total, we discarded 976,349 citations, focusing the analysis on the 7,534,371 citations recorded between documents of the ten most salient areas of our repository.

Differences in citation patterns across areas and publication types
An essential factor in this study is to conduct an analysis that distinguishes between the types of media in which publications appear. We consider eleven types of publications, among which articles and conference papers are the ones with the most documents (see Table 2). Several of the other types are related to journals: review, letter, editorial, note, short survey, and erratum are publication types included within journals. We report them separately to distinguish these specific types from regular articles. Something similar happens with chapter and book, two related types, since a chapter is a publication type included within a book; we also report these separately to distinguish the differences between them.
The number of documents of each publication type varies between areas, and Table 3 shows significant differences. Reviews and Letters are more frequent in Medicine than in other areas, while Chapters and Books are more frequent in Social Sciences. Articles are the majority in all areas except Computer Science, where they cover only 31% of the sample. Conference papers, in contrast, cover 61% of Computer Science production, far surpassing the coverage shown in other areas. Conference papers are also relevant in Engineering (38%) and Physics (19%); in the remaining areas, their incidence is very low, around 3% or less.
Citations to each type of publication also vary between areas. Table 4 shows that Reviews, Letters, and Editorials receive more citations in Medicine than in other areas, while Chapters and Books are more cited in Social Sciences. This is explained by these publication types being more frequent in those areas than elsewhere. Articles receive more citations than the other types of publications in all areas, even in Computer Science, where articles are not the majority.
The volume of citations captured by articles is the majority in all areas. For example, in Mathematics, articles capture 95% of citations, while in Physics, they capture 92%. Something similar occurs in Biochemistry and in Agricultural and Biological Sciences. Psychology shows a similar pattern, with 89% of citations captured by articles. In Engineering, the incidence of citations to articles is 70%, with 25% of citations going to conference papers. This difference is most pronounced in Computer Science, where 51% of citations go to articles and 43% to conference papers.

Table 5 shows citations across areas. The quantities are shown as fractions of the total citations produced by a given area. For instance, the entry 0.12 from Medicine to Biochemistry indicates that 12% of the citations produced by Medicine go to Biochemistry. Table 5 shows that citations within the same area are the majority. We color the diagonal red to indicate a higher prevalence of within-area citations and blue for a lower prevalence of this pattern. Second majorities, indicated with underlines, are essential to understanding which areas are strongly related to others. Medicine and Biochemistry are firmly related, with the volume of citations from Biochemistry to Medicine being higher than the other way around. Other strong relationships occur between Psychology and Medicine and between Agricultural and Biological Sciences and Psychology. The most substantial connection is from Social Sciences to Psychology, which delivers 35% of its citations to publications in that area. Curiously, this relationship is not symmetric, since only 2% of Psychology citations go to Social Sciences.

Table 6 shows the average number of citations per paper according to the type of publication and area. These results show that the expected number of citations per paper varies across areas.
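The row normalization used in Table 5 is straightforward to reproduce (a sketch with hypothetical counts; each row is divided by its total so that entries read as fractions of the citations produced by the row area):

```python
def citation_fractions(counts):
    """counts[i][j] = raw citations from area i to area j.
    Returns the row-normalized fractions reported in the table."""
    return [[c / sum(row) for c in row] for row in counts]

# Hypothetical counts for three areas; the diagonal dominates,
# matching the within-area-majority pattern described in the text.
frac = citation_fractions([[800, 120, 80],
                           [200, 700, 100],
                           [50, 60, 890]])
```

Here `frac[0][1]` is 0.12, read as "12% of the citations produced by area 0 go to area 1".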
While the highest value is observed for Physics reviews, other publication types are far less frequent in several areas (see Table 3). Despite this, these types capture a great deal of attention, reflected in a high average number of citations. The average number of citations per article varies between areas, reaching its highest value in Physics, with 4.2 citations on average, followed by Biochemistry with 3.4. These numbers are very low in Agricultural and Biological Sciences, with only 0.29 citations per paper; in this area, the expected number of citations is higher for reviews. In general, the expected number of citations per paper in Agricultural and Biological Sciences is low, which is explained by many of its citations going to other areas. Interestingly, in Computer Science, the expected number of citations for conference papers reaches only 0.72, less than half of the index for articles. This finding suggests that many conference papers receive very few citations, a fact that can be corroborated from the Zipfian distributions of citations per paper illustrated in Brzezinski [40] and Moreira et al [41]. In Computer Science, reviews also have a high average number of citations, with 3.32 citations per review. In general, the results show that the attention given to each type of publication strongly depends on the area: reviews receive more citations than other types of publications, and both conference papers and articles show very different indicators between areas. It is also striking that in Biochemistry, the expected number of citations for conference papers is very high, 1.65 per paper, the highest in the repository. This occurs even though there are very few conference papers in Biochemistry (just over 1% of the area's papers, see Table 3), which indicates that although conference paper production is low, it attracts much attention from the community.
Our repository records documents from 1970 to 2018. In this period, the volume of citations to articles has grown in all areas, as shown in Figure 1. However, the volume of citations varies greatly between areas. While Physics, Medicine, and Biochemistry show explosive growth in citations since the '80s, growth in Social Sciences and in Agricultural and Biological Sciences has been slower. The differences in volume are noticeable: the gap between the most cited areas and the rest is one to two orders of magnitude (note that the figure uses a logarithmic scale). Figure 1 also shows a decrease in citation growth from 2013 onwards. This is due to the delay produced by citation dynamics: as the sample ended in 2018, citation growth is captured up to five years earlier. This observation indicates that the life span of many papers is longer than five years on average, corroborating the recent finding of Liu and Fang [19]. Figure 2 shows that citations to conference papers have grown since the '80s. This growth begins in Engineering and Computer Science. Other areas show slow growth in the volume of citations to conference papers. For example, Medicine started growing in the late 1980s, as did Biochemistry and Physics. Growth in Chemistry came even later (the '90s). In the other areas, this growth has been very slow or, for all practical purposes, non-existent.
The growth of these curves represents different phenomena. While in Engineering and Computer Science this dynamic represents a significant volume of the total citations, in the other areas the quantities are much lower. We observe this fact in Figure 3, where we show the fraction of citations to conference papers over the total number of citations per area and year. The figure clearly shows that the dynamics of citations to conference papers is a phenomenon of Engineering and Computer Science; in the other areas, it is practically non-existent. The curves show slower growth in Engineering than in Computer Science, where citations to conference papers reached almost half of the total citations in 2010.

We also analyze citation patterns in the top-1000 most cited papers per area. Figure 4 shows that most of these papers are articles, though some belong to other types of publications. For example, more than 200 highly cited papers in Medicine belong to other publication types, and something similar appears in areas such as Physics and Chemistry. This is explained by the high prevalence in these areas of citations to reviews and letters. Agricultural and Biological Sciences and Social Sciences also show highly cited papers in other publication types, explained by the high prevalence of citations to letters, books, and chapters. The area with the fewest highly cited papers in other publication types is Mathematics, where more than 900 of the highly cited papers are articles. On the other hand, in Computer Science almost 400 of the top papers are conference papers, the largest amount in this category in the entire repository. Curiously, very few of the top-cited Engineering papers are conference papers, indicating that even though citations to conference papers are growing in Engineering, its most cited papers continue to appear in journals.
The top-1000 papers per area span many years of the timeline. Figures 5 and 6 show that in Medicine, many of the top-cited articles were published in the '90s and '00s, with a distribution very similar to that exhibited by Physics, Biochemistry, and Chemistry. Mathematics and Psychology papers span more years, ranging from the '70s to the '00s. Computer Science is the area whose top papers are located in the shortest timeline: the bulk of these papers were published from the late '90s until the early '10s. This distribution is similar to Engineering, although Engineering's earliest highly cited papers are more recent. Psychology shows two periods with many highly cited papers, around 1996 and 2000. The distributions of Social Sciences and of Agricultural and Biological Sciences are similar. As for highly cited papers of other types, such as conference papers, letters, and reviews, these are relevant in Medicine and, to a lesser extent, in Physics and Biochemistry. In Engineering, the impact of these papers is much lower. It is striking that these papers acquired somewhat more relevance in Social Sciences in the mid-'00s, and to a lesser extent around the same time in Chemistry and in Agricultural and Biological Sciences. The area where conference papers have had the most impact is Computer Science: their relevance from 2005 onwards is significant, surpassing even articles. This trend is observed only in Computer Science, illustrating that this reality is unique to the area. We also analyze whether there are differences between areas in papers' life spans, measured by their activity in terms of citations. For this analysis, we counted the number of citations received by each paper in our repository since its year of publication. Our repository spans almost 50 years. In Figure 7, we show the average life span of all the papers in our repository.
These results show that the average life span of a paper begins in the year it is published and reaches its greatest relevance in its third year of life; then, its impact progressively wanes. Very few papers have a life span of more than 20 years, and as time progresses, the citations most papers receive diminish until they almost entirely cease. To study differences in citation life spans between areas, we split the timeline into six five-year periods, comparing papers published in close years within each period. Figure 8 shows some interesting dynamics. While various areas in the 1980s followed a life span similar to Figure 7, with a performance peak in the third year, Psychology and Social Sciences show a much slower growth curve. This effect becomes more noticeable in the '90s, where the curves of these two areas grow throughout the 20 years of the life span while the curves of the rest of the areas begin to decline after the third year. This finding suggests that cited papers in Psychology and Social Sciences have a longer life span than in other areas. The difference is also noticeable in the '00s: most areas show their citation peak in the third year, followed by a substantial decline, while Psychology and Social Sciences grow more slowly but live longer.
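The life-span curves in Figures 7 and 8 can be derived from per-paper citation tracks like the following (a minimal sketch with hypothetical publication and citation years, not the study's actual pipeline):

```python
def citation_track(pub_year, citing_years, horizon=20):
    """Citations received in each year of a paper's life,
    counted from its publication year."""
    track = [0] * horizon
    for year in citing_years:
        age = year - pub_year
        if 0 <= age < horizon:  # ignore citations outside the window
            track[age] += 1
    return track

def mean_life_span(tracks):
    """Average per-paper tracks into a single life-span curve."""
    n = len(tracks)
    return [sum(t[i] for t in tracks) / n for i in range(len(tracks[0]))]
```

For example, a paper published in 2000 and cited in 2000, 2001, and twice in 2003 has a track starting `[1, 1, 0, 2, ...]`; averaging such tracks over all papers of an area yields the area's curve.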

Differences in citation patterns across areas and age groups of researchers
We analyze differences in citation patterns by comparing groups of researchers of different ages. For this purpose, we calculate each researcher's academic birth, defined as the year in which the author published their first work. Authors span many years of the timeline, beginning in 1970. Figure 9 shows that many authors recorded their academic birth after the '90s, with most academic births located in the decade of the '00s.
We separate the authors into three groups according to their academic birth. Seniors are researchers whose academic birth is before the year 2000. Mid-age researchers are those born between 2000 and 2009. Young researchers are those born after 2009. The sizes of these groups do not differ greatly, which facilitates comparison between them: the senior group has 42,417 researchers (44%), the mid-age group 29,923 (31%), and the young group 23,321 (25%).
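The grouping rule above reduces to a couple of helper functions (a minimal sketch of the definitions just given):

```python
def academic_birth(publication_years):
    """Academic birth: the year of the author's first publication."""
    return min(publication_years)

def age_group(birth_year):
    """Age groups as defined in this study."""
    if birth_year < 2000:
        return "senior"
    if birth_year <= 2009:
        return "mid-age"
    return "young"
```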
For each of these groups, we compute authors' life spans, defined as the track of citations recorded over time. Each citation track records the total number of citations received by an author per year. We compute a life span of ten years for young researchers, since these authors were academically born after 2009; the other two groups are analyzed using 20-year life spans. Figures ?? and ?? show the life span per area and age group. The solid line shows the mean life span, while the colored area indicates the standard deviation around the mean. The axes use the same scales for each age group to facilitate comparison between areas. The only area that could not be drawn at this scale is Physics, which receives many more citations per author than the rest of the areas.
Authors' life spans vary widely between areas and age groups. While the young and mid-age life spans decrease towards the end of the timeline, the seniors' life spans grow towards year 20; this is attributable to the cut-off year in data acquisition. Some comparisons between groups are very interesting. Physics shows rapid growth in authors' life spans, with young authors averaging more than 20 citations by the third year of life. The same effect is observed in the other age groups, with a peak in mean citations in the tenth year of life for mid-age researchers and sustained growth that exceeds 50 citations on average by year 20 for seniors. Physics is out of scale in this comparison because the volume of citations recorded in the area is much higher than in the other areas. The remaining areas record comparable amounts of citations, which allows them to be compared on the same scale; accordingly, the axes in these figures are the same within each age group (except for Physics).
Mathematics shows a strong fit to the mean, with a low standard deviation (see Figure ??). This effect holds in all three age groups, indicating that very few authors experience rapid citation growth in this area. It is also noteworthy that the mean life span in Mathematics is very low, indicating that most authors accumulate few citations. This is due to two factors: first, the volume of citations in Mathematics is low (see Table 4); furthermore, the average number of citations per paper is also very low (see Table 6). Figure ?? shows that the life spans of Medicine and Biochemistry are similar, which may be because these two areas have a high volume of cross-citations. The standard deviation in Computer Science indicates that some authors receive many citations while others receive very few. In this area, the fit to the mean is tighter than in Medicine and Biochemistry, indicating that authors with rapid citation growth are scarcer than in those areas. Figure ?? shows that Agricultural and Biological Sciences and Social Sciences also exhibit a strong fit to the mean, with many authors receiving very few citations. The authors' low mean life span is due to the very low volume of citations in these areas (see Table 4) and to the fact that many of their citations go to other areas (see Table 5). In the case of Agricultural and Biological Sciences, only a fraction of 0.3868 of its citations remains in the area, while in Social Sciences the fraction of within-area citations is 0.429. The other areas show a weaker fit to the mean: Engineering, Psychology, and Chemistry all show author tracks that depart from the mean.

Table 7: Spearman's correlation between consecutive citation quinquennia in each age group. The correlation for young researchers is measured between the first and second quinquennium; for mid-age researchers, between the second and third; and for seniors, between the third and fourth.
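The within-area citation fractions quoted above can be computed from a table of citing-area/cited-area pairs; the pair-list input format of this sketch is an assumption, not the authors' data layout.

```python
def within_area_fraction(citation_pairs, area):
    """Fraction of the citations made by papers in `area` that point to
    papers in the same area. `citation_pairs` is a list of
    (citing_area, cited_area) tuples (illustrative sketch)."""
    outgoing = [cited for citing, cited in citation_pairs if citing == area]
    if not outgoing:
        return 0.0
    return sum(1 for cited in outgoing if cited == area) / len(outgoing)

# Toy example: Social Sciences sends one of its three citations elsewhere.
pairs = [("SocSci", "SocSci"), ("SocSci", "Psych"),
         ("SocSci", "SocSci"), ("Med", "SocSci")]
fraction = within_area_fraction(pairs, "SocSci")
```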
Chemistry shows more careers with a high volume of citations, with consolidation around year 10 of the life span for mid-age researchers. In young researchers, the peak of citations is observed at year 4. The mean number of citations in Engineering grows more slowly, which indicates that many authors receive few citations in this area. Psychology has a steeper growth curve than Engineering, with a peak of citations around year 10 for mid-age researchers.

A key question in citation-based research is whether citations do indeed have predictive power. The ground on which citation-based evaluation rests is that citations received in the past are a good indicator of future citations. A question for this study is whether citation data has the same predictive capacity in different areas. One possible scenario is that citation data has better predictive capabilities in some areas than in others. If this is the case, we may conclude that evaluation based on citations is more informative in some areas than in others.
We analyze the correlation of citations in each age group. A strong correlation indicates that the volume of past citations is a good indicator of future citations. To do this, and following the guidelines discussed by Liu and Fang [19], we aggregate each author's total citations over five-year periods. In the group of young researchers, we measure the correlation between the first and second five-year periods. For mid-age researchers, we measure the correlation between the second and third five-year periods of the life span. Finally, in the senior group, we measure the correlation between the third and fourth five-year periods. Spearman's correlation coefficients are shown in Table 7.

Table 7 shows that the correlation increases as the authors become more experienced. For the senior group, all areas show a correlation close to 0.8. The strong correlation in this group indicates that these researchers' careers are already consolidated: authors with low productivity will remain unproductive, and authors with high productivity will maintain their productivity. Academic assessment based on citations in this segment is therefore very informative. The correlation decreases in the mid-age group in all areas. It is striking that this segment shows the lowest correlations, even lower than those of young researchers. A plausible explanation is that many authors stop researching between the second and third five-year periods of the life span, which causes the correlation between past and future citations to decline. In young researchers, the correlation between the first and second five-year periods of the life span is stronger than in the mid-age group. In some areas, the correlations are much lower than in others. For example, for young researchers in Mathematics, the correlation is only 0.31, the lowest in the entire study. This indicates that the first five years of a mathematician's academic life are very uninformative.
In other areas, this period of the life span provides much more information. For example, in Social Sciences and Chemistry, the correlation exceeds 0.7. The reasons why these areas reach this value differ: in Social Sciences, young researchers experience slow growth (a cold-start phenomenon), which explains the high correlation, while in Chemistry, the first years of activity show rapid growth with a peak of citations between years 4 and 6.
To analyze how sensitive the evaluation is to the year of the life span in which it is carried out, we repeated the previous analysis using sliding quinquennia. If the evaluation is carried out in year i, the sliding five-year windows cover the five years before and the five years after i. We then measure the correlation between the sums of citations in both windows. In this experiment, we sweep the evaluation year across the first ten years of the life span. The results, disaggregated by age group, are shown in Figure ??. Figure ?? shows that there are differences between areas. For young researchers, Physics, Biochemistry, Engineering, Psychology, and Medicine show that the correlation between past and future citations reaches its maximum between years 3 and 4. In Computer Science, the first two years of the life span are very uninformative for an evaluation, but in the third year the correlation rises to 0.6. Both Social Sciences and Agricultural and Biological Sciences need a longer time for a fair evaluation, requiring an observation window of at least four years for a young researcher. In Social Sciences, the fourth year is very informative, with a correlation of 0.8. Chemistry is the area in which evaluation times could be shortest, since high correlations between past and future citations are already observed from the second year. Mathematics is the most challenging area for young researchers, with the lowest correlation values between citations. When analyzing the mid-age and senior groups, all areas show an improvement in the correlation between past and future citations. In fact, in most areas, the correlation increases with the year of evaluation.
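Both correlation analyses (fixed consecutive quinquennia and sliding windows around an evaluation year) can be sketched as below. Spearman's rho is implemented here with the classic d² formula, which assumes untied ranks; a real analysis would use a ties-aware implementation such as scipy.stats.spearmanr. All function names are illustrative.

```python
def _ranks(values):
    """Ranks 1..n; this simple sketch assumes no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def spearman(x, y):
    """Spearman's rho via 1 - 6*sum(d^2)/(n*(n^2-1)), valid without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(_ranks(x), _ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def quinquennium_total(track, q):
    """Total citations in quinquennium q of a life span (q=1 covers years 0-4)."""
    start = (q - 1) * 5
    return sum(track[start:start + 5])

def fixed_correlation(tracks, q):
    """Correlation between consecutive quinquennia q and q+1 across a cohort."""
    past = [quinquennium_total(t, q) for t in tracks]
    future = [quinquennium_total(t, q + 1) for t in tracks]
    return spearman(past, future)

def sliding_correlation(tracks, i):
    """Correlation between the five years before and after evaluation year i."""
    past = [sum(t[i - 5:i]) for t in tracks]
    future = [sum(t[i:i + 5]) for t in tracks]
    return spearman(past, future)
```

On a toy cohort where each author's second quinquennium grows monotonically with the first, `fixed_correlation` returns 1.0; when the ordering reverses, it returns -1.0.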

Conclusion
We have introduced a citation-based study that elucidates differences between areas, types of publications, and researchers' age groups. The main findings of this study are:

- Articles are the majority in all areas except Computer Science, corroborating the finding of Vrettas and Sanderson [27]. Conference papers cover 61% of Computer Science production, surpassing their coverage in other areas. Articles capture the majority of citations in all areas.

- Citations between areas show strong symmetric relationships between some specific areas (e.g., Biochemistry and Medicine) and robust asymmetric citation patterns from some areas to others (e.g., Social Sciences to Psychology). Asymmetric relationships reduce the citations received by articles in the same area.

- The average number of citations per paper varies significantly between areas and types of publications, corroborating previous studies [35]. The lowest values are seen in Social Sciences and Agricultural and Biological Sciences, contradicting the findings of Lariviere et al. [23]. Reviews attract many citations, their performance in Physics being notable. Computer Science conference papers have a lower average number of citations than journal articles in the same area. Citations to conference papers are only relevant in Computer Science, and only in this area does an important fraction of the top-cited papers belong to this category.

- The life span of a paper, measured from the track of citations it receives over time, reaches its maximum value in the third year of the publication's life. The life span in Psychology and Social Sciences grows more slowly than in the rest of the areas, suggesting that these papers have a longer useful life.

- The life span of authors differs widely between areas. While most areas show a weak fit to the mean number of citations, this indicator is more informative in some specific areas (e.g., Mathematics). The predictive capacity of citations, measured from the productivity correlation between consecutive five-year periods, is very informative for seniors and less informative for mid-age researchers.

- The evaluation of researchers based on citations is informative to different degrees depending on the year in which the observation is carried out. If the evaluation is done in the early years of academic life, it is more informative in some areas than in others. As researchers gain experience, citation-based evaluation becomes more informative in all areas.
This study has allowed us to identify some critical differences between areas, types of publication, and age groups of researchers that may be relevant when carrying out researchers' academic evaluations. Each area has its specificities, which must be taken into account when comparing scientific productivity. We expect the findings of this article to motivate further research and the development of more and better methodologies and metrics for the evaluation of scientific productivity, which should account for the differences in citation patterns between areas, types of publication, and age groups of researchers.