3.1. Dimension 1
Dimension 1 (D1) refers to informational vs involved discourse. The positive pole of this dimension is most typically associated with dialogues, whose language focuses on interaction and on expressing affective content rather than just delivering information [11]. The negative pole, on the other hand, is associated with information-rich and highly edited text, as would typically be expected from academic pieces [29]. Both research articles (RA) and reviews in our corpus have extremely low scores (Figure 1A), which have remained relatively constant over time for RA but have been slightly declining for reviews. To better understand how dimensions changed in our two article groups, we resorted to scatterplots that show the value of individual items for RA and reviews over time. The scatterplot in Figure 1B shows that RA with different Dimension 1 scores can be found for all publication dates, but older reviews tend to cluster around comparatively higher values than more recent ones, indicating a slightly more involved style for older reviews. To make the dimension score more transparent and gain further insight into the changes that have occurred in the literature, we decided to analyze the underlying linguistic features associated with D1.
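As a rough illustration of how a dimension score relates to its underlying features, the sketch below follows the general logic of Biber's multidimensional analysis: each feature frequency is standardized as a Z score against reference-corpus statistics, and the Z scores of features loading on the negative pole are subtracted from the sum of those loading on the positive pole. The feature subset, the reference means and standard deviations, and the function names are illustrative assumptions, not the values or the code used in this study.

```python
# Illustrative reference statistics (mean, standard deviation) per 1,000 words.
# These numbers are placeholders chosen for the example, NOT published norms.
REFERENCE = {
    "private_verbs": (18.0, 10.0),
    "present_tense": (78.0, 34.0),
    "nouns": (180.0, 35.0),
    "prepositions": (110.0, 25.0),
}

# Hypothetical subset of D1 features, grouped by the pole they load on.
POSITIVE_FEATURES = ("private_verbs", "present_tense")  # involved pole
NEGATIVE_FEATURES = ("nouns", "prepositions")           # informational pole


def z_score(value, ref_mean, ref_sd):
    """Standardize a normalized frequency against the reference corpus."""
    return (value - ref_mean) / ref_sd


def dimension_score(frequencies):
    """Sum of positive-pole Z scores minus sum of negative-pole Z scores."""
    z = {name: z_score(frequencies[name], *REFERENCE[name]) for name in REFERENCE}
    positive = sum(z[name] for name in POSITIVE_FEATURES)
    negative = sum(z[name] for name in NEGATIVE_FEATURES)
    return positive - negative


# A made-up, noun-heavy abstract (frequencies per 1,000 words).
abstract = {
    "private_verbs": 8.0,
    "present_tense": 44.0,
    "nouns": 250.0,
    "prepositions": 135.0,
}
print(dimension_score(abstract))
```

With these placeholder values, a text that is rich in nouns and prepositions and poor in private verbs and present tense verbs receives a strongly negative score, which is the pattern described for academic prose.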
A frequent use of nouns and long words is, unsurprisingly, a characteristic feature of texts with negative scores for Dimension 1, as these features require planning in production and are less suited to improvised speeches and dialogues. Consistent with this, our analysis reveals that, over the years, both RA and reviews have further increased their Z scores for these features in a similar way (Figure 2A, B), although no clear trend can be detected for the type/token ratio or the frequency of attributive adjectives (Figure S1), which are also typical of negative D1 scores.
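Two of the features mentioned above, word length and type/token ratio, are simple enough to be sketched directly. The snippet below is a minimal illustration under stated assumptions: the regular-expression tokenizer is a crude stand-in for a proper tagger, and computing the type/token ratio over a fixed initial window (400 tokens here) is an assumption made to keep texts of different length comparable, not a description of the tool used in this study.

```python
import re


def tokenize(text):
    """Crude word tokenizer: lowercase alphabetic strings, allowing inner ' or -."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def type_token_ratio(tokens, window=400):
    """Distinct word forms divided by total tokens within the first `window` tokens."""
    window_tokens = tokens[:window]
    if not window_tokens:
        return 0.0
    return len(set(window_tokens)) / len(window_tokens)


def mean_word_length(tokens):
    """Average number of characters per token."""
    if not tokens:
        return 0.0
    return sum(len(token) for token in tokens) / len(tokens)


sample = "The patients were screened and the patients were included."
sample_tokens = tokenize(sample)
print(type_token_ratio(sample_tokens), mean_word_length(sample_tokens))
```

In this toy sentence, 6 distinct forms out of 9 tokens give a type/token ratio of about 0.67; on abstracts of realistic length, both values would then be standardized as Z scores before being compared across years.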
Similarly, a movement toward an even more information-rich writing style can be detected in the decrease of other typical features of involved texts, such as analytic negation (Figure 2D), demonstrative pronouns, which often have a deictic use typical of spoken language and interaction (Figure 2E), private verbs (i.e. verbs that express internal, cognitive and thus private processing, e.g. to think, to feel, to perceive) and be as the main verb (Figure 2F). Interestingly, the use of prepositions, which is typically high in texts with strongly negative D1 scores, has been constantly decreasing in both corpora (Figure 3A), while non-phrasal coordination, which is associated with involved writing, increased in both corpora (Figure 3B).
RA and reviews, however, behaved differently in at least three features. The Z score for 1st person pronouns (a characteristic of dialogue and, more generally, of involved writing) is negative in both corpora, as expected, although there are a few striking examples of use, as in:
It was my second clinical placement and I was working on a surgical ward when I was asked to accompany a patient to theatre.
which is quite frankly an unusual style for academic prose, yet is found in our corpus.
However, the frequency of 1st person pronouns increased over the course of the 90s in RA but remained quite constant in reviews, and only in the first years of the new century did it start to increase in both text types (Figure 3C). The most likely explanation for this behavior is that, although passive verbs have been used abundantly in academic writing as a rhetorical device to highlight the detachment of the narrator from the events contained in the text and as a sign of objective observation [31], the use of active verbs and 1st person pronouns has been advocated in more recent times for the sake of clarity [32] and has been observed to be on the rise in academic writing [33]. It may be assumed that RAs were more prone to the use of 1st person pronouns, as they often report on the experimental activity of a research group, as opposed to reviews, which typically summarize the findings of other research groups, and thus this increase occurred earlier in RAs.
The use of present tense verbs is also strongly associated with the positive pole of Dimension 1, as it is very frequent in interactions between speakers. Although it is generally low in both corpora, our data indicate that it is higher in review papers (Figure 3D). A possible explanation for this discrepancy is that reviews often summarize the current knowledge in a certain area, using descriptive discourse, and may use the literature to draw conclusions that may be proposed as general rules, as in the following:
Primary care clinicians treat patients with cancer and cancer pain. It is essential that physicians know how to effectively manage pain including assessment and pharmacologic and nonpharmacologic treatment modalities.
where the use of the present tense is appropriate to convey the sense of lasting value that these conclusions have, while the purpose of RAs is usually to report on one or more experiments, which are situated in time and place, and are thus often described using past tenses as in:
During 8 observation days (with time delay of 10-14 days between each observation day), all adult patients hospitalized at an internal medicine ward of 4 Belgian participating hospitals were screened for AB use. Patients receiving AB on the observation day were included in the study and screened for signs and symptoms of AAD using a period prevalence methodology.
The use of the present tense increased slightly in RAs in the 90s and remained stable afterwards, while it started to drop in reviews around the same time. Any explanation is purely speculative at this stage; the drop may be related to the increase in systematic reviews, which may be more grounded in the research articles they are based on, or to a purely stylistic change. Notably, the Z score for RA remains significantly lower than for reviews (Figure 3D).
The use of possibility modals is also associated with positive Dimension 1 scores, as these are often used to express subjectivity or a guess, which is a common situation in a dialogue context. However, they can also be found quite regularly in academic writing [36], usually to express a hypothesis, as in:
Administration of thioredoxin may have a good potential for anti-aging and anti-stress effects.
Admittedly, the room for hypothesis, although hypothesizing is a common and actually quite essential practice in the scientific method [38], is quite limited in academic literature, given the need for evidence-grounded reasoning, hence the low Z scores. Interestingly, the use of possibility modals has been moving in opposite directions in the two corpora we analyzed: the Z score for this feature was slightly positive in reviews, as could be expected in a text genre that used to be quite prone to drawing conclusions based on the reviewed data, but negative in RAs, underlining that assumptions and hypotheses were likely confined to a few sentences in such texts. However, this index steadily decreased in the review group, reaching negative values in the last decade, maybe in association with the increase in systematic reviews, where the sometimes massive use of statistical tools might make guessing rarer, while it increased by almost 30% in the RA corpus, possibly in association with the use of a bolder or more personal style, as previously noted (Figure 3E) [33].
3.2. Dimension 2
Dimension 2 (D2) is associated with narrative discourse, so a positive score for this dimension indicates that the text has a narrative, active, event-oriented nature rather than a more descriptive or static quality [11].
Our corpora of RAs and reviews have a negative Z score for D2, with reviews having a lower score than RAs (Figure 4A). This is not unexpected, as RAs are more likely to report, by definition, on the execution of one or more experimental procedures, which are usually associated with some sort of activity, as in:
We investigated expression of the five ssts in various adrenal tumors and in normal adrenal gland. Tissue was obtained from ten pheochromocytomas (PHEOs)….
The text is all about action, about doing, selecting, analyzing and other similar activities, which are historically situated and hence require a narration to go through them.
Interestingly, the Z score for RAs tended to progressively decrease and become more negative with time, while the D2 score for reviews remained constant and even increased, becoming less negative, in the last 5 years (Figure 4A). This is also reflected in Figure 4B, which shows that RAs’ D2 score dropped in the 90s and early 2000s, and reviews’ score started to increase quite independently from RAs in the first decade of the 2000s. This trend is possibly explained by the change in Z score for the use of past tense verbs, which has the highest bearing on D2 [10]. This score, which was and has remained negative in both corpora for the whole timeframe (Figure 4C), decreased in RAs until the first decade of the 21st century, and this was followed by an increase in the score for review articles in the last two decades.
This means that an abstract from a review article in 1989 could more easily contain a passage like:
Several lines of evidence indicate that platelet-activating factor (PAF-acether) is implicated in hypersensitivity reactions. Indeed, PAF-acether reproduces the features of asthma in vivo and in vitro, since it induces bronchoconstriction, hypotension, and hemoconcentration and activates platelets and leukocytes.
which is rich in present tense verbs that are used to convey general principles about a phenomenon, e.g. a disease or a condition, while a more recent review text more easily incorporates some past tenses, such as in the case of:
Mammalian neonates have been simultaneously described as having particularly poor memory, as evidenced by infantile amnesia, and as being particularly excellent learners.
This phenomenon could be hypothesized to indicate that, since the early 2000s, review articles have tended to restrict their conclusions to the research papers they use as sources, contextualizing them and possibly being more wary of generalizations.
Other important linguistic features associated with D2 underwent similar changes in both corpora: the use of third person pronouns increased for both text types (Figure 4D), as did the use of present participial clauses (Figure 4F), while the frequency of perfect aspect verbs decreased in both RAs and reviews, although the scores for this feature remained significantly lower in RAs than in reviews (Figure 4E).
Notably, these findings apparently contrast with what we reported on the same corpora using LIWC 2022 [19]. In particular, we reported a higher Narrativity Overall score for reviews. That score was calculated based on adherence to a particular set of metrics, i.e. the three fundamental narrative curves that were measured in each abstract, namely Staging, Plot Progression and Cognitive Tension [42]. The theory behind these measures is that a narrative trajectory can be traced in a text, which follows Freytag’s dramatic arc: first the stage for the action is set, and characters and referents are introduced and presented; the action then begins and intensifies as the text progresses and the narrator describes events and activities; finally, cognitive tension refers to the struggles and conflicts that ensue in the story and that reach a culmination point with the resolution of the crisis that leads to the end of the narration [43]. To get an automated measure of these features, Pennebaker et al. decided to rely on grammatical words, which admittedly form a small set of words in English (and any language) [44]. In particular, Boyd et al. proposed to measure the frequency of articles and prepositions as proxies for the staging score, because they can be assumed to be more abundant when new referents are introduced in the text (via articles) and their relations are explained (possibly also through the use of prepositions), while auxiliary verbs and anaphoric pronouns are taken as proxy measures of plot progression, because they can be expected to be used when describing an action. Cognitive tension is measured by the abundance of words from a special dictionary created ad hoc, which includes words such as ‘think’ or ‘believe’ (which would be classified as ‘private verbs’ in Biber’s multidimensional analysis). Boyd et al. recommend splitting the texts into at least 5 segments, to monitor how these scores vary as the text progresses. It is therefore apparent that the LIWC 2022 and Biber narrativity scores are based on different features, and readers should consider the characteristics of the text these tools are actually measuring rather than getting hung up on the ‘narrativity’ label.
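The segment-based logic described above can be sketched as follows: the text is split into five contiguous segments, and the rate of articles and prepositions is computed in each as a rough staging proxy. This is only an illustration under stated assumptions; the word lists are short, hypothetical samples rather than the LIWC 2022 dictionaries, the function names are invented for the example, and the code does not reproduce the scoring used by that software.

```python
import re

# Short, hypothetical word lists for illustration only (not the LIWC dictionaries).
ARTICLES = {"a", "an", "the"}
PREPOSITIONS = {"of", "in", "on", "at", "by", "for", "with", "to", "from"}


def split_segments(tokens, n_segments=5):
    """Split tokens into n contiguous segments of near-equal length."""
    size, extra = divmod(len(tokens), n_segments)
    segments, start = [], 0
    for index in range(n_segments):
        # The first `extra` segments receive one additional token each.
        end = start + size + (1 if index < extra else 0)
        segments.append(tokens[start:end])
        start = end
    return segments


def staging_profile(text, n_segments=5):
    """Percentage of articles and prepositions in each segment of the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    staging_words = ARTICLES | PREPOSITIONS
    profile = []
    for segment in split_segments(tokens, n_segments):
        hits = sum(1 for token in segment if token in staging_words)
        profile.append(100.0 * hits / len(segment) if segment else 0.0)
    return profile


toy_text = ("the cat in the hat sat on a mat by the door "
            "and then it ran away fast today")
print(staging_profile(toy_text))
```

In the toy sentence the proxy is high in the opening segments and drops to zero toward the end, the declining shape that a staging curve is expected to show in a narrative; an analogous function over auxiliary verbs and pronouns would give a plot-progression proxy.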
3.3. Dimension 3
A high score for Dimension 3 (D3) is associated with explicit and context-independent reference, as opposed to nonspecific, context-dependent content [10]. This means that referents in the text are mentioned and described explicitly, so that there cannot be any doubt about their identity. According to our data, reviews have a higher D3 score than RAs, and both scores have been progressively increasing over time (Figure 5A, B). Among the features that affect D3, nominalization appears to have followed this trend and may be responsible for the visible changes in D3 over time.
Nominalization [45] indicates the replacement of a verb with a noun that denotes the same action. It is a common feature of technical language [46] and is often used to convey a more impersonal tone, because a noun, by describing an action as an entity, detaches it from the agent and confers greater independence on it [47]. The use of nominalization, albeit often deemed undesirable [48], has been growing in academic writing [49]. An example of nominalization in our corpus is the following:
Pancreatic cancer (PC) is characterized by high tumor invasiveness, distant metastasis, and insensitivity to traditional chemotherapeutic drugs….
Phrasal coordination is also positively associated with D3, as it may reflect a higher degree of descriptivity and a more thorough explanation of textual referents, and it displays a trend similar to that of nominalization. An example of phrasal coordination in a manuscript with a high score for this feature is:
.. the specific mechanisms are blurry, especially the involved immunological pathways, and the roles of beneficial flora have usually been ignored.
3.4. Dimension 4
Dimension 4 (D4) is associated with overt expression of persuasion [11], which refers not only to the writer’s opinion but also to the capacity of texts to prompt readers toward a certain course of action. Both our corpora have a negative score (Figure 6A), which indicates that both RAs and reviews tend to be non-persuasive, in line with the declared function of biomedical literature, as stated elsewhere [29]. Unsurprisingly, reviews tend to be less negative than RAs with regard to D4 score. This is easily explained by the fact that reviews, by nature, provide readers with an overview of facts and knowledge that can be used to draw up recommendations or guidelines. However, the D4 score changed over time: while RAs have been mostly stable over the years, displaying a slight trend for D4 to increase by about 10% over the course of the last 30 years, reviews have further decreased this score by the same amount in the last decade (Figure 6B), signaling a movement toward a more impartial stance in review papers. Among the factors that may have affected these changes, the use of infinitives has been increasing in both corpora in a similar way (Figure 6C), such as in:
Understanding the age-dependent neuromuscular mechanisms underlying force reductions … allows researchers to investigate new interventions to mitigate these reductions.
Suasive verbs are, understandably, another hallmark of overt persuasion, as in:
…an ad hoc committee of the American Venous Forum, working with an international liaison committee, has recommended a number of practical changes.
Their frequency, quite similar in both manuscript types, has however been decreasing steadily over the years (Figure 6E), consistent with the more neutral stance we mentioned above. However, prediction modals, which have quite a high bearing on this dimension, though displaying quite a high variability in our corpora, have mostly changed for RAs (Figure 6D), where a slight increase can be observed, while the use of split auxiliaries has changed for reviews only in the last decade (Figure 6F).
Prediction modals include forms like will, should, or must, which indicate the future directions that research or practice should take, as in:
The data suggest that treatment of H. pylori infection should be considered in children with concomitant GERD.
3.5. Dimension 5
Dimension 5 (D5) refers to the abstract or non-abstract nature of the information contained in the texts [11]. As already reported, academic texts, including those from the biomedical area, tend to have high scores for D5, as they tend to contain technical, abstract concepts.
In our corpora, review papers score higher than RAs regardless of the publication date (Figure 7A), although the D5 score decreased for both text types over the years, and the gap between the two groups vanished by the middle of the second decade of the 2000s (Figure 7A). In the last 5 years, the D5 score appeared to increase again in reviews only (Figure 7B). The frequent use of passives is a hallmark of abstract style, as it typically mitigates the action of an agent (even more so if the passive is agentless). These two indices, passives with a “by” agent and agentless passives, have been decreasing in both text types (Figure 7D, E), presumably driving the trend of the overall D5 score. The use of conjuncts, however, has increased in both reviews and RAs, and this increase has been quite sudden in the last 5 years for reviews, which might explain the surge in D5 score in that timeframe.
Our analysis indicates that, when considering a sample of more than 1.2 million abstracts from the biomedical literature, indexed in MEDLINE over the last 30 years, we can notice a consolidation of the informational tone of the texts (D1), which occurs in both RAs and reviews. This is combined with a decrease in the use of narrative devices (D2), a change that is most marked in the RA corpus, and a parallel increase in context-independent reference (D3) in both RAs and reviews. The relative lack of overt persuasion (D4) in the academic texts we examined has remained relatively stable over the years, while the degree of abstractness (D5) has been decreasing, concomitantly with a decrease in the use of passives. When RAs are compared to reviews, it is apparent that RAs used to rely on narration more heavily than reviews but have toned down the use of these stylistic devices to a level similar to that of reviews, while the latter manuscript type used to have a higher degree of context-independence, overt persuasion, and abstractness, which it has maintained over the years.