Who is contributing to academic research on blockchain oracles? A bibliometric analysis

With the advent of smart contracts, the benefits of decentralization offered by distributed ledger technologies could be implemented in sectors other than cryptocurrencies, such as Healthcare, Supply Chain, and Finance. Smart contracts, however, need oracles to fetch data from the real world, which, on the other hand, do not offer the same characteristics of decentralization as blockchain. Despite their importance, research on oracles is still in its infancy, and academic contribution on the subject is scarce and sporadic. With a bibliometric analysis, this study aims to shed light on the institutions and authors that are actively contributing to the oracle literature with the aim of promoting progress and cooperation. The study shows that although there is still a lack of collaboration worldwide, there are authors and institutions working in similar directions. On the other hand, it can be observed that most of the areas of research are poorly addressed while others


Introduction
"Although oracles play a critical role ... the underlying mechanics of oracles are vague and unexplored" [1]. The authors of the cited paper claim that, despite the massive amount of money managed by oracles in DeFi platforms, their function and role are still widely neglected. A previous study also shows that, despite the plethora of papers involving blockchain, less than 15% consider oracles, and an even smaller percentage further investigate related issues [2]. Blockchain oracles are a critical subject because the whole concept of blockchain applications revolves around the idea of decentralization and trustless transactions. Those pillars, however, are undermined when blockchain applications rely on centralized and trusted third parties to gather real-world data [3]. This issue, addressed either as the oracle problem [4], [5] or as the oracle paradox [6], [7], makes the community of blockchain enthusiasts quite skeptical of real-world applications [8]. Proposing a blockchain application that is robust against the oracle problem requires the drafting and discussion of the so-called "trust model," a document or scheme that broadly explains how data is fetched by oracles in a decentralized and trustless way [9]-[11]. Defining and adopting a robust trust model is not only essential for a blockchain application to work properly but is often considered the key to mass adoption [12]. However, it is rare to find academic works that investigate oracles or propose blockchain applications with a detailed "trust model" [2]. Proposing a real-world blockchain application without thoroughly explaining how oracles are managed casts serious doubt on the feasibility and genuineness of the underlying proposal [13]. It could then be argued that proposals with a detailed trust model are more grounded and advanced than those that neglect oracle-related features and issues [4].
Therefore, it would be interesting and important to know which institutions are actively undertaking research on blockchain oracles and which ones are already implementing them in real-world applications. Scholarly interest in blockchain has resulted in a number of literature reviews on this topic, but none has yet undertaken a bibliometric analysis of blockchain oracle research [14]-[16]. A bibliometric

Data Source: the source from which data is collected and stored. It may or may not eventually be used by a decentralized application. The data source can be a web API, a sensor, or a human aware of a specific fact or event [32], [33].
Communication channel: usually referred to as "node", it has the task to collect the data from the data source and deliver it to the smart contract for it to be executed. Sometimes, oracle nodes coincide with blockchain nodes, but it is not always the case [27], [34], [35].
Smart Contract: contains the code that establishes how to manage the collected data. Usually, it specifies quality criteria for data to be accepted or rejected. If necessary, it may also perform computations to extract the appropriate value for the contract [1], [36], [37].

Those three oracle parts are not always separate from each other, as the same entity may sometimes cover two or three roles at once. A human, for example, can serve as a data source and communicate the data directly to the smart contract [38], [39]. On the other hand, it is possible and desirable that more than one entity covers the role of data source or node, so that the smart contract can still execute when a data source is unavailable or a node malfunctions [40]. Depending on how those three parts are organized and interact, multiple types of oracles can be designed (centralized, decentralized, computation, etc.) [12]. The oracle ecosystems described above are typical of blockchains where smart contracts can be executed (e.g., Ethereum, Tron). Oracles are implemented differently for blockchains such as Bitcoin, where smart contracts (apart from a few scripts) are not available. In that case, oracles are implemented through M-of-N multi-signature wallets, which require more than one signature to broadcast a transaction [41]. The owner of a key therefore plays the role of an oracle and signs the transaction when a certain condition is met. For example, suppose an agreement sets a payment upon the delivery of a parcel (Figure 2). In that case, the owner of the oracle key signs the transaction once they acknowledge that the event has happened. Of course, this requires a considerable amount of trust in the owner of the oracle key [5]. However, a thorough explanation of all oracle types is beyond the scope of this study, and further information can be found in a recent book or in dedicated papers and web articles [12], [31], [39], [42].
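To make the three oracle parts more concrete, the following minimal Python sketch models them off-chain. It is an illustration under stated assumptions, not an implementation taken from the cited works: the names `node_collect`, `ContractSketch`, and `m_of_n_satisfied`, the median aggregation, and the two quality criteria (a minimum number of readings and a maximum relative spread) are all hypothetical choices.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, List, Optional, Sequence, Set

# Hypothetical model: a data source returns a reading, or None when unavailable.
DataSource = Callable[[], Optional[float]]


def node_collect(sources: Sequence[DataSource]) -> List[float]:
    """Communication channel: query every source and keep the readings that arrive."""
    readings = []
    for source in sources:
        value = source()
        if value is not None:
            readings.append(value)
    return readings


@dataclass
class ContractSketch:
    """Smart-contract side: accept or reject the collected data."""
    min_reports: int    # assumed quality criterion: minimum number of readings
    max_spread: float   # assumed quality criterion: maximum relative spread

    def settle(self, readings: Sequence[float]) -> Optional[float]:
        if len(readings) < self.min_reports:
            return None                      # too few sources answered
        mid = median(readings)
        if mid == 0:
            return None                      # relative spread is undefined
        spread = (max(readings) - min(readings)) / abs(mid)
        if spread > self.max_spread:
            return None                      # sources disagree too much
        return mid                           # value extracted for the contract


def m_of_n_satisfied(signers: Set[str], key_holders: Set[str], m: int) -> bool:
    """M-of-N rule: at least m distinct authorized key holders have signed."""
    return len(signers & key_holders) >= m


# One of four hypothetical sources is unavailable; the contract can still settle.
sources = [lambda: 100.0, lambda: None, lambda: 102.0, lambda: 101.0]
readings = node_collect(sources)
contract = ContractSketch(min_reports=3, max_spread=0.05)
accepted = contract.settle(readings)
```

The redundancy of sources is what lets the contract execute despite one failure, while `m_of_n_satisfied` mirrors the multi-signature case in which the oracle is simply one of the key holders.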
For the scope of this study, it is instead essential to note that oracle ecosystems work differently from blockchains; therefore, characteristics such as immutability, transparency, and trustless execution are not ensured [43]. In short, when applications run on a blockchain and need data from the external world, the characteristics of the oracles must be taken into serious consideration when evaluating the quality of the blockchain application. If the data source is unreliable, the node is not trusted (or is private), or the smart contract code contains bugs, the fact that an application runs on the blockchain is practically irrelevant [4], [5], [44].
This condition, widely explained by blockchain experts such as Andreas Antonopoulos [45], and named by Dalovindj [46] as "the oracle problem", must be considered when interacting with real-world blockchain applications such as Supply Chain, Healthcare, Academic Transcripts, and so on. Of course, as discussed in recent research, the oracle problem determines different consequences according to the specific application [4]. In the healthcare sector, the presence of oracles would constitute another possible source of data breach for patient records to be altered or stolen for malevolent operations [47], [48]. In the DeFi sector, the dependency on oracles makes decentralized applications rely on centralized or insecure data sources, putting millions of dollars in invested capital at risk [49], [50].
In the traceability sector, the blockchain technology was in principle proposed relying on the misconception that since it is possible to trace in a secure and trustless manner the origin and movement of a cryptocurrency on the blockchain, it would have been possible to do the same with a tangible asset such as food, clothes or medicine [45]. However, the dependency on oracles for real-world applications makes it unlikely to reproduce the same level of tracking accuracy, and only a few traceability projects show some robustness against that issue [11], [43].
Due to the oracle problem, numerous criticisms and concerns also arise for other blockchain applications such as IPR management, law, resource management, etc. [28], [51]-[53].
For all those applications to run genuinely decentralized and trustless, oracle ecosystems should be structured to ensure the same characteristics as blockchains. However, unlike blockchain technology that has a history and development of nearly thirty years (considering the work of Haber & Stornetta [54] as its precursor), oracle ecosystems are relatively new and unexplored spaces with few actors and scarce literature contributions [2]. The present study aims to shed light on the academic contributions concerning blockchain oracles promoting the harmonization of literature, cooperation, and progress.

Methodology
In order to answer the research questions, an appropriate methodology needs to be followed and outlined. As the aim is to make this research reproducible so that other authors can verify the genuineness of the results, all the steps followed will be thoroughly explained. Building on prior bibliometric analyses [55], [56], the methodology description will cover first database selection, then inclusion and exclusion criteria, and finally data extraction variables. Regarding data collection, the intention is to include as many articles as possible, as long as they are of an academic nature. Therefore, grey literature such as whitepapers, opinion posts, and news is not considered within this research. On the other hand, although they are not peer-reviewed, this analysis will also consider preprints. Following Butticè and Ughetto [23] and Martinez-Climent et al. [57], the selected databases were Scopus and Web of Science (WoS), but Google Scholar was also queried. As the analysis also comprises preprints and unpublished manuscripts, limiting the search to Scopus and WoS would not have been a coherent choice. For the three databases, the search was run on 28th June 2021. Using Blockchain and Oracle as keywords in the TITLE-ABS-KEY field of the Scopus database, 205 articles were identified. On the WoS database, two strings had to be used in the Topic section so that articles also containing the word "oracles" were included and identifiable. The search returned 119 results. The Google Scholar database was queried using the same keywords as Scopus, but it returned more than 10,000 entries due to its structural differences from Scopus and WoS. For that reason, and due to saturation of results, the author decided to stop the search at the 30th page (300 entries, with results organized ten per page). Table 1 outlines the databases and search strings.
Concerning exclusion criteria, the underlying idea is to be as inclusive as possible, so no language or timeframe restrictions were applied. Of course, unrelated or off-topic papers were to be excluded. In order to do so, the abstract and introduction were read, and it emerged that many documents were included in the sample for mentioning "random oracles" or "test oracles," which are not the oracles this article refers to. Other papers were included for mentioning the Oracle company, which again has no relation to the subject of this study. Following this approach, 105, 55, and 156 articles were removed from the Scopus, WoS, and Google Scholar samples, respectively. Since grey literature is also retrieved in the Google Scholar sample, six further articles, not written by academics and not published in official venues, had to be removed. The three samples were then merged and duplicates removed, obtaining a selection of 203 entries. The sample obtained with the above-mentioned procedure included papers with blockchain/oracle keywords that refer to the communication channels between the blockchain and the real world. However, the author intends to highlight the papers that not only mentioned oracles or explained their use, but also offered a real contribution to the oracle literature. To further clear the results of unrelated papers, all the articles were downloaded as PDFs and inspected one by one with a word processor, so that every occurrence of the word "oracle" was evaluated and contextualized. As expected, it emerged that many articles mentioned oracles but did not offer a contribution to the oracle literature, as their scope and role were not further investigated. Following this approach, nearly half of the sample was discarded, reducing it to 111 entries. Table 2 summarizes the methodology followed.

[Table 2: summary of the selection procedure. Final sample: 111 entries. Note: Google Scholar results were organized ten per page and the search was stopped after 300 entries.]
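The selection funnel described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the per-database counts are the ones reported in the text, while the number of duplicates removed is not stated in the paper and is derived here under the assumption that all removals happened before the merge. The helper `merge_without_duplicates` and its title normalization are hypothetical, as the paper does not describe how duplicates were matched.

```python
import re
from typing import Iterable, List

# Counts reported in the text (search run on 28th June 2021).
retrieved = {"Scopus": 205, "WoS": 119, "Google Scholar": 300}
off_topic = {"Scopus": 105, "WoS": 55, "Google Scholar": 156}
grey_literature = {"Google Scholar": 6}

kept = {
    db: retrieved[db] - off_topic[db] - grey_literature.get(db, 0)
    for db in retrieved
}
before_merge = sum(kept.values())
merged_sample = 203                  # reported size after removing duplicates
final_sample = 111                   # reported size after full-text inspection

# Derived figures (not stated in the paper).
duplicates_removed = before_merge - merged_sample
discarded_after_inspection = merged_sample - final_sample


def normalize_title(title: str) -> str:
    """Assumed matching key: lowercase title with punctuation collapsed."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_without_duplicates(*samples: Iterable[str]) -> List[str]:
    """Merge title lists, keeping the first occurrence of each normalized title."""
    seen = set()
    merged = []
    for sample in samples:
        for title in sample:
            key = normalize_title(title)
            if key not in seen:
                seen.add(key)
                merged.append(title)
    return merged
```

Under these assumptions, 302 entries enter the merge, which implies 99 duplicates, and 92 entries are discarded at the full-text stage, consistent with "nearly half" of the 203 merged entries.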

Data Extraction
Appropriate extraction variables (displayed in table 3) had to be identified to extract as much value as possible from the selected sample. Although this is the first bibliometric analysis on blockchain oracles, the aim of bibliometric analyses is relatively homogeneous, so it is arguable that extraction variables can be taken from similar papers investigating other literature [23], [57], [58]. The year of publication is considered to place the literature in a specific timeframe, while the type of element shows the most usual outlet for those publications. Authors, institutions, and countries of provenance geographically contextualize the research, highlighting the contributors to the academic advancements in the sector. Citations and keywords are instead used to analyze metrics. Finally, articles were divided by field, as in Butticè & Ughetto [23]. This division serves to investigate whether there are streams of literature to which researchers contribute more and others that require more effort. To avoid double entries, each article was associated with only one field category. Articles were first divided into two main categories and then into subcategories. To be as transparent as possible, a description of the field categories is provided below:
1) Oracle Theory. This category includes papers specifically focused on blockchain oracles, from either a theoretical or a practical point of view.
2) Oracle Applied. This category includes papers that focus on real-world applications such as Healthcare, Finance, and Business Process Management but provide a detailed analysis of the role of oracles in these fields with theoretical or experimental approaches.
The main categories are further divided into subcategories. Those belonging to Oracle Theory are:
1) Architecture. With an empirical or theoretical approach, papers in this category analyze the oracle framework to improve technical aspects, highlight current challenges, and identify new avenues for research.
2) Proposal. Those papers propose new oracle frameworks that may be implemented in real-world applications. Those may still be at a conceptual or prototype stage.
3) Oracle Problem. Those articles focus on the aspects related to the trustworthiness of oracles and their limits to decentralization.
Oracle Applied subcategories such as healthcare and Supply Chain are much more intuitive, but those that require clarifications are described hereafter.
1) Data Management. Articles concerning the transfer of data from the real world to the blockchain pertain to the Oracle Theory main category. This field instead covers articles that analyze data access management for reputation, privacy, or GDPR purposes. Cloud computing research is filed under its own category since it mainly concerns data processing.
2) Finance. This category groups articles that involve oracles applied in financial applications and those exploring the timeliness and gas usage of transactions. Articles concerning asset management on the blockchain are also included.
3) IoT. This category comprises papers that investigate oracles as efficient IoT systems but do not refer to a specific real-world application. A paper concerning IoT in the supply chain, for example, would instead be placed in the Supply Chain and Traceability category.
Only the first author was taken into consideration to extract the country and institution of provenance of a paper. Considering all the authors would probably have created a bias toward articles with a higher number of authors. Finally, citations were taken from Google Scholar because it is the only database in which all the papers in the sample are retrieved. The complete list of articles divided by category is provided in the appendix to facilitate replicability of the results.
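As an illustration of the extraction variables and of the first-author rule, the sketch below shows one possible record layout and the corresponding counts. The `Author` and `Entry` classes, their field names, and the sample records are hypothetical; they are not taken from table 3 or from the appendix.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Author:
    name: str
    institution: str
    country: str


@dataclass
class Entry:
    year: int
    outlet_type: str        # e.g. journal article, conference paper, preprint
    authors: List[Author]   # in byline order; authors[0] is the first author
    citations: int          # taken from Google Scholar in the study
    keywords: List[str]
    main_category: str      # "Oracle Theory" or "Oracle Applied"
    subcategory: str        # exactly one field category per entry


def count_by_first_author_country(entries: List[Entry]) -> Counter:
    """Attribute each entry to the country of its first author only."""
    return Counter(e.authors[0].country for e in entries if e.authors)


def count_by_first_author_institution(entries: List[Entry]) -> Counter:
    """Attribute each entry to the institution of its first author only."""
    return Counter(e.authors[0].institution for e in entries if e.authors)


# Hypothetical records, for illustration only.
sample = [
    Entry(2020, "journal article",
          [Author("A", "Univ X", "Italy"), Author("B", "Univ Y", "Germany")],
          12, ["blockchain", "oracle"], "Oracle Theory", "Oracle Problem"),
    Entry(2019, "conference paper",
          [Author("C", "Univ X", "Italy")],
          3, ["smart contracts"], "Oracle Applied", "Finance"),
]
by_country = count_by_first_author_country(sample)
```

Counting only `authors[0]` means that a paper with many co-authors contributes exactly one unit to its country and institution, which is the bias-avoidance argument made above.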

Results
In this section of the paper, the results of the bibliometric analysis are reported. With a quantitative approach, the status and trends of literature on blockchain oracles are shown. The analysis will first cover the time and space of the research; then, it will focus on the outlets, authors, and field of analysis.

Number of publications per year
The first academic papers considering blockchain oracles appeared in 2016 and were equally distributed between theoretical and applied [59], [60]. As figure 3 shows, interest in the topic remained low until 2018, with papers concerning oracle theory slightly outnumbering those discussing oracle applications. An upward trend is observable from 2019, with 2020 having four times more publications than 2018. This data reveals that the topic gained more impact and attention among academics, probably because of the greater development of blockchain-related platforms. However, in absolute terms, the overall numbers are still low, with a peak of 43 publications in 2020 and only 111 publications across all six years of academic production. Those numbers show that this is still a niche subject.

Productivity rate by geographical distribution
Tables 4 and 5 present the distribution of papers by countries and continents, respectively. We can observe that the continents with the highest productivity are Europe and Asia, with more than 75% of total paper production. Asia, however, appears to be more focused on Oracle Applied than Europe, which presents a balance between the two main categories. Concerning countries, the situation partially reflects what is observed with continents. The most productive countries are the Asian countries China and UAE, followed by the European countries Italy and Germany. Those four countries alone account for more than 43% of total publications. Concerning fields, countries appear to be sufficiently balanced except for UAE, which is more focused on oracle applications, while Australia, USA, and Austria mostly contribute to Oracle Theory.

Publications by outlets and publishers.
As figure 4 shows, the majority of papers published in this field are journal articles (48) and conference papers (41). A smaller portion consists of book sections (13) and preprints (9). This data contrasts with previous blockchain technology reviews, which showed that conference contributions were as much as four times more numerous than journal publications [2], [15]. This supports the idea that there seems to be no dedicated conference venue on blockchain oracles. Table 6 and figure 5, on the other hand, show the distribution of papers by journals and publishers, respectively. We observe that the largest share of papers (42) is published in IEEE outlets and venues, while Springer and MDPI count 16 and 9, respectively. If we consider only journal publications, however, the weight of the contributions changes, since 31 of the IEEE documents are conference papers and 13 of the 16 Springer entries are book sections. Excluding non-journal publications, IEEE has 11 publications, followed by MDPI with 9 and Elsevier with 8. This information is particularly insightful when observing table 6, which shows that only seven journals published more than one paper on the subject, and only two published more than two documents. Conference venues, on the other hand, contributed no more than one paper each. From table 6, it is also observable that the journal that published the most contributions is IEEE Access, while three others (Applied Sciences, Sustainability, Computers) belong to MDPI. In the author's opinion, it is no coincidence that IEEE Access is the first contributor and that three MDPI journals are in the list. In fact, those publishers share the idea of multidisciplinarity and encourage the submission of papers that do not fit the scope of other journals. Since the oracle subject is probably still not identified under a journal or subject category, multidisciplinary outlets may at the moment be perceived as the most appropriate venues for related research. Other aspects those outlets have in common are open access and the timeliness of publication. Those features guarantee that the contribution of a study is freely and quickly available to other authors and practitioners. Given the infancy and innovation rate of the oracle-related literature, it makes sense to opt for outlets with those features.

Article types, fields, and keywords
Figure 6 offers a breakdown of the whole sample by article type: more than half (68, or 61%) are empirical papers, while theoretical papers and reviews are 23% and 15%, respectively. At a general level, it is interesting to note that the majority of academic research on oracles is of an empirical nature. Still, it is also important to have this data distinguished by field of research. Table 7 therefore provides an overview of paper types by field, according to the main categories and subcategories indicated in the data extraction section. The first thing that emerges is that the majority of articles belong to the Oracle Theory category.
That is understandable since oracles are still in an early stage of development, and there is still heterogeneity of views on how they should function and operate. Although the majority are still empirical, articles in the Architecture and Oracle Problem subcategories are well balanced with theoretical and review types. Proposals, on the other hand, are mainly of an empirical/experimental nature, which bodes well for the birth of oracle frameworks developed fully by, or in cooperation with, academic institutions. Regarding Oracle Applied papers, since this is ideally a more practical area, the imbalance between empirical and theoretical papers is understandable. Furthermore, the low number of contributions explains why only three review papers are retrieved. Analyzing subcategories, it can also be observed that some have fewer contributions than others. The finance sector leads with 14 contributions, followed by data management (10) and IoT (7). Given the higher advancement level of these sectors and the empirical nature of academic contributions, it is also understandable why other sectors such as Healthcare and Cloud Computing only have three contributions. Keywords are also an important parameter to consider when evaluating the sample. Figure 7 shows the word cloud made with all the keywords in the sample. The most used keywords are Blockchain, Smart Contracts, and Oracles, with 91, 41, and 27 occurrences, respectively. Merging contract and oracle with their plurals, the occurrences would rise to 51 and 60, respectively. Other frequently used keywords are Ethereum (15), data (14), decentralized (8), and distributed (8), while the rest occur less often. It is also interesting to note that, of the 274 keywords in the whole sample, the majority (198) occur just once. Keywords were also divided by categories in order to have a better data breakdown.
Excluding the most common keywords, there is still excessive heterogeneity, even when keywords are divided by categories. Table 8 lists the keywords with the highest occurrences, divided by categories. This data is useful for indexing purposes and for research to be easily retrieved by the appropriate audience. The AI and Oracle Problem categories are excluded from the table due to excessive heterogeneity.
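The keyword counts discussed above, including the effect of merging singular and plural forms, can be reproduced with a simple frequency count. The following sketch is a minimal illustration: the alias table, the function names, and the sample keyword list are hypothetical and do not reproduce the study's data.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Assumed alias table: maps plural forms onto the singular keyword.
ALIASES = {"oracles": "oracle", "smart contracts": "smart contract"}


def count_keywords(keywords: Iterable[str],
                   aliases: Optional[Dict[str, str]] = None) -> Counter:
    """Count keywords case-insensitively, optionally merging aliased forms."""
    counts = Counter()
    for keyword in keywords:
        key = " ".join(keyword.lower().split())   # normalize case and spacing
        if aliases:
            key = aliases.get(key, key)
        counts[key] += 1
    return counts


def singletons(counts: Counter) -> int:
    """Number of distinct keywords that occur exactly once."""
    return sum(1 for n in counts.values() if n == 1)


# Hypothetical keyword list, for illustration only.
sample = ["Blockchain", "Oracles", "oracle", "Smart Contracts",
          "smart contract", "Ethereum", "blockchain"]
raw = count_keywords(sample)
merged = count_keywords(sample, ALIASES)
```

An explicit alias table is used instead of stripping a trailing "s", since naive stemming would also alter keywords such as "analysis" or "access".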

Contribution by Authors/Institutions and metrics
The most contributing papers, authors, and universities are displayed in tables 9, 10, and 11. Building on prior bibliometric analyses [61]-[64], the papers are ordered by citations; therefore, the ten papers displayed in table 9 are the most cited ones. Institutions, on the other hand, are ordered by number of papers produced. The list is not limited to ten but is restricted to those that provided at least two contributions. The most contributing authors are instead evaluated with a mixed approach. Ordering authors by citations alone would have resulted in a biased list due to papers with many co-authors and citations. Therefore, to be included in the list, an author must have produced at least two publications and be the first author of at least one of them.
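The inclusion rule for authors can be expressed as a short filter. The sketch below is one possible reading, not the author's actual procedure: it assumes that an author's citations are the sum over all papers they co-authored and that eligible authors are then ordered by that sum, neither of which is stated explicitly in the text. The function name and the sample data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# A paper is modelled as (authors in byline order, citation count).
Paper = Tuple[Sequence[str], int]


def most_contributing_authors(papers: List[Paper],
                              min_papers: int = 2) -> List[Tuple[str, int]]:
    """Keep authors with >= min_papers papers and at least one first authorship."""
    stats: Dict[str, Dict[str, int]] = {}
    for authors, citations in papers:
        for position, name in enumerate(authors):
            s = stats.setdefault(name, {"papers": 0, "first": 0, "citations": 0})
            s["papers"] += 1
            s["citations"] += citations      # assumption: sum over all papers
            if position == 0:
                s["first"] += 1
    eligible = [
        (name, s["citations"])
        for name, s in stats.items()
        if s["papers"] >= min_papers and s["first"] >= 1
    ]
    # Assumed ordering: citations descending, name as a tie-breaker.
    return sorted(eligible, key=lambda item: (-item[1], item[0]))


# Hypothetical data: B is highly cited but never first author, so B is excluded.
papers = [(["A", "B"], 50), (["A", "C"], 10), (["D", "B"], 100), (["D"], 5)]
ranking = most_contributing_authors(papers)
```

In this toy example the co-author with the most citations is left out because of the first-author requirement, which is the bias the mixed approach is meant to avoid.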
As explained, the information gathered with the above-mentioned approaches is provided in separate tables for clarity, but it is essential to analyze them together to better grasp the meaning of the data. The most contributing author is Xu, Xiwei (705 citations) from the University of New South Wales (UNSW). She has the two most cited papers and four among the first ten. She started contributing to the subject in 2016, but it appears that her production on this subject stopped in 2019. All the papers published by UNSW are hers except for one by Lo, Sing Kuang, who is also among the most contributing authors (59 citations). UNSW is the second most contributing institution, and its research is mainly focused on oracle architecture. The second most contributing author is Adler, John from the University of Toronto, who authored the third most cited paper (98 citations). Merlini, Marco, also from the University of Toronto, is among the most contributing authors as well, and it emerges that this institution is particularly focused on producing new oracle proposals. The fifth and sixth most contributing authors are Al-Breiki, Hamda and Omar, Ilhaam A. from Khalifa University, with 43 and 35 citations. From the same university are also Battah, Ammar and Madine, Mohammad Moussa, who are among the most contributing authors as well but with fewer citations (19 and 18, respectively). It should be noted that Khalifa University is the most contributing institution in the field, with 12 documents produced, of which two are among the ten most cited and four among the first twenty. Observing the co-authorship, apart from the four most contributing authors, many other authors from the same university are participating in the research. This gives the impression of an institution that is heavily investing in this sector. It should also be underlined that this institution contributed at least one paper to every oracle application category (except for BPM).
Furthermore, besides offering contributions to the healthcare and data management fields, they also produced research to address the oracle problem. The University of Verona, third by number of articles produced, is also focused on addressing the oracle problem. However, publications from this institution are relatively recent and are not among the top-cited ones. From the same country (Italy), the University of Insubria is also among the most contributing institutions, and two of its authors, Carminati, Barbara and Rondanini, Christian, are among the most cited. Works from this university and its researchers mainly concern applied oracles, such as IoT in business processes. Another notable institution is the University of Ljubljana, whose contributions focus on cloud/fog computing and the oracle problem. The sixth most cited paper and the third most contributing author, Petar Kochovsky, with 69 citations, also belong to that institution.
Among the most contributing institutions, six others emerge, whose researchers are also among the most impactful ones. Those institutions are Beijing University, Technische Universität Berlin, University of Potsdam, Chiba Institute of Technology, Fujian Agriculture and Forest University, and the INRS of Montreal. Lu, Xiaolong from Fujian Agriculture and Forest University emerges as the most cited in this group (26 citations), and his main contributions focus on oracle theory. Finally, the Austria Institute of Science and Technology deserves to be mentioned since, even though none of its authors are among the most impactful ones, a paper coming from this institution is among the ten most cited [65]. Although there is probably not a dedicated research group on oracle-related subjects, it can be argued that the quality of its research is reasonably high.

Converging Studies
Before collecting and observing the data, the idea was to undertake a social network analysis to show the cooperation between authors and institutions worldwide. Once the data was collected, however, it emerged that this approach was not feasible. Considering the most contributing authors, it emerges that the research produced is essentially done in cooperation with authors of the same department or with branches of the same university. Papers such as Kochovsky et al. [66], for example, show the contributions of multiple authors and institutions that, however, are not encountered in further studies. Therefore, it appears that there is no active and stable cooperation between institutions specialized in investigating blockchain oracles, nor among leading authors belonging to different institutions. For that reason, with the aim of promoting cooperation between those and other institutions that are entering the blockchain space, possible cooperation is proposed on the basis of the similarity of the subjects investigated and the methodologies applied.
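The observation that cross-institution collaborations do not recur can be checked with a simple count of institution pairs per paper. The sketch below is illustrative and is not the analysis performed in the study: the function names, the threshold of two joint papers as a proxy for "stable" cooperation, and the sample data are all assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]


def institution_pairs(papers: List[Sequence[str]]) -> Counter:
    """Count, for each pair of distinct institutions, the papers they share."""
    pairs = Counter()
    for institutions in papers:
        # Deduplicate so that several authors from one institution count once.
        for a, b in combinations(sorted(set(institutions)), 2):
            pairs[(a, b)] += 1
    return pairs


def stable_collaborations(papers: List[Sequence[str]],
                          min_joint_papers: int = 2) -> Dict[Pair, int]:
    """Assumed proxy for stable cooperation: pairs recurring in several papers."""
    return {
        pair: n
        for pair, n in institution_pairs(papers).items()
        if n >= min_joint_papers
    }


# Hypothetical affiliations per paper, for illustration only.
papers = [["U1", "U2"], ["U1", "U2", "U3"], ["U3"], ["U1", "U1"]]
pairs = institution_pairs(papers)
stable = stable_collaborations(papers)
```

When almost every pair appears only once, as the text reports for the sample, the resulting co-authorship network is too sparse for a meaningful social network analysis.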

Oracle Theory
Considering first the subjects pertaining to oracle theory, oracle architecture comprises many different studies. A group of studies is dedicated to investigating common patterns that emerge from oracle architectures, with the aim of classification and improvement [42], [67]-[69]. While the studies of Macquarie University and UNSW are theoretical, the study presented by Vienna University also uses case distinctions based on oracle characteristics and gas usage. However, compared to the research of Macquarie University, the papers of UNSW and Vienna University seem to follow a similar approach to distinguishing oracle patterns.
Two other studies, from UNSW and Tatu University, investigate a theoretical framework for choosing the most suitable oracle for a blockchain application in terms of security and data management [30], [70]. Similar in methodology, they reach similar results.
Another central subject in oracle theory is the oracle problem, for which many contributions are retrieved. One group of papers focuses on explaining the oracle problem, while others investigate the subject empirically in order to overcome the issue. Two papers from the University of Ljubljana and the Max Planck Institute introduce the oracle problem from a legal point of view [27], [28]. The role of oracles as legal actors is investigated, as well as their responsibility as trusted entities. Other papers from the University of Verona focus on investigating the consequences of the oracle problem in various sectors such as IPRs, Healthcare, Supply Chain, and so on. It emerges that, due to the amount of money managed by DeFi platforms, the financial implications are the most alarming [4], [49]. On this aspect, Singapore University has also produced a similar work investigating the reliability of DeFi applications in light of their oracle dependency [1]. Finally, studies from the Chiba Institute of Technology and the University of Dallas explore with empirical data the incentives of oracles to cheat or fail to transmit information [71], [72].
The last subject of oracle theory pertains to oracle proposals. By definition, proposals are original; therefore, similar works should not be found for this category. This hypothesis is confirmed by reviewing the literature. Furthermore, apart from the University of Toronto, which seems to have a dedicated team in developing research on that aspect, other proposals seem to be unique works and are retrieved in a balanced distribution among institutions of different countries.

Oracle Applied
Oracle applied research is focused on various sectors. As expected, due to the resonance and hype that cryptocurrencies attracted, finance applications constitute the widest sample. On the one hand, every paper belongs to a different institution; on the other hand, there are some similarities in their focus. Two studies from Concordia University and Delhi University focus theoretically on the role of the oracle as a means to manipulate the market, showing the possible risks connected with its use and misuse [50], [73]. Other research from the Oxford-Hainan Blockchain Research Institute and Singapore University of Technology and Design undertook empirical work focusing on price oracle failures and possible attacks. The first proposed BLOCKEYE, a system able to hunt attacks on DeFi, including oracle manipulation, for which the research team has already presented some experimental results [74]. The second, using primary data, shows the deviation rates of four oracle services to shed light on oracle reliability and possible malfunctions [1]. Another group of studies focuses on specific financial applications (e.g., loans, exchanges); however, apart from two papers from Khalifa University and University of Clermont Auvergne that both investigate E-Auctions, the rest have heterogeneous aims. Both studies on E-Auctions take an experimental approach and propose a new auction service based on the Ethereum blockchain, specifying the role of oracles and how to overcome possible security issues [75], [76].
Most of the contributions in business process management applications are from the University of Insubria and the University of Potsdam. Their focus, however, diverges, since the first concentrates on the privacy of business process transactions, while the second is oriented toward the timeliness of registered transactions [77]-[81]. Both institutions, on the other hand, strike a good balance between theoretical and empirical works.
As for the supply chain & traceability field, the works seem heterogeneous, although six entries are retrieved. Construction and food supply chains are investigated, as well as the traceability of vehicles and COVID-19 infections [82]-[84]. COVID-19 traceability is the only application with two contributions, but they are both from the same institution [85], [86]. Convergence between papers from different institutions can instead be found in the works of Etemadi et al. [87] and Sanchez-Gomez et al. [88]. The focus of those papers is to highlight the dependency of traceability systems on oracles and the consequences for reliability and security.
For healthcare, only three papers are retrieved, two of which belong to Khalifa University and focus on the security and access control of patients' records [89], [90]. The research of Goncalves et al. [91] pursues the same objective but proposes a specific oracle solution based on the Chainlink oracle provider and the Ethereum blockchain.
Only four entries are retrieved for applications in AI, and although three of them converge on the same underlying idea, they follow different approaches and methods [92]-[94]. The central focus is to exploit automation and oracles to guarantee trust over data gathering and processing. As in the original idea of the software oracle problem, the objective is to reduce external parties' intervention in automated procedures [95].
The IoT sector has seven publications, and the main issues investigated concern ensuring that the data gathered by IoT devices is trustworthy and private. While the work of Gordon [96] outlines the problem of secure data provenance within IoT systems, Shi et al. [97] propose a secure and lightweight triple-trusted architecture (SLTA) to address the issue effectively. On the other hand, contributions from Khalifa University and Insubria University approach the confidentiality of IoT data by granting users different access privileges. With similar approaches, both studies also present a converging roadmap for development [10], [98].
As for cloud computing, only three studies are retrieved. Two were published by the same institution and propose a trustless oracle system [66], [99], focusing on Service Level Agreements (SLA). On the other hand, the work proposed by Khalifa University approaches the problem of ensuring an optimal fee level to balance the needs of cloud providers and users [100].
Lastly, a sizable group of papers applies oracles to data management. Their focus ranges from data privacy to data consistency and data migration. As observed, papers from other categories (e.g., AI, IoT, or Cloud computing) also have a similar scope of investigation. Two works from the Chiba Institute of Technology and the Beijing Institute of Technology propose a system based on crowdsensing to ensure distributed data validation [101], [102]. The first is organized more as a proposal, while the second already shows some experimental results. Other studies from Khalifa University and UNSW investigate how data quality can be managed and improved with multi-party authorization and reputation systems [103]-[105].

Conclusions
This paper undertook a bibliometric analysis of the published studies about blockchain oracles. The aim was to display the publication trend along with preferred outlets and publishers. The most cited papers, most impactful authors, and most contributing institutions are also shown. Finally, by reviewing the selected literature, the focus of the published documents is discussed to highlight convergence among studies and promote cooperation between institutions.
The obtained results show that within six years of academic production, only 111 papers (including non-peer-reviewed ones) are retrieved in scholarly databases. This result supports the view that blockchain oracles are still a widely neglected subject despite their critical importance. Most of the contributions come from Asia (40) and Europe (45), where China (14), UAE (12), Germany (11), and Italy (11) are the most productive countries. The majority of documents are journal publications and conference papers, for which IEEE, Springer, and MDPI appear to be the preferred publishers. It also emerged that multidisciplinary and open access journals and publishers are preferred, given the nascency of the subject. This research also showed that, probably given the technical aspects of the subject, empirical papers are dominant over theoretical ones. Due to the scarcity of publications, reviews are also low in number.
Concerning contributions and metrics, it emerged that Xiwei Xu is the most impactful author in the field and has also published the two most cited papers. Khalifa University, with 12 publications, is the most productive institution. Judging from authorships, research teams do not seem to cooperate with other universities in developing research on blockchain oracles. However, reviewing the literature, it emerged that some studies share similar aims and scope.
The present study also showed that the literature on oracles does not cover many sectors of real-world blockchain application. No contributions are retrieved for entertainment, tourism, insurance, e-government, resource management, etc. Previous studies already underlined the absence of specific papers discussing oracles' role in resource management and, particularly, energy management [2], [4], [12]. Given the growing interest in energy management with blockchain, a discussion of the related oracle use is essential to enable further progress in the field.
The findings of this research are useful for academics as well as for students and practitioners. They give a broad overview of the institutions with advanced knowledge of and competence in real-world blockchain, which can constitute a reference for entrepreneurs undertaking blockchain-based projects. Students and other academics can use it as a resource on the state of the art of related knowledge, investigate emerging gaps (e.g., missing resource management contributions), or develop further research building on existing studies.
This paper also has limitations, given that the scarcity of the retrieved material resulted in low absolute numbers in all the tables and figures. On the other hand, it is arguable that, since the overall numbers are low, the dominance of specific countries or institutions will probably change in the near future. As specified in the methodology section, a degree of subjectivity in the presented results cannot be excluded. While previous studies inspired the method and bibliometric research, the author had to select them arbitrarily. Subjectivity can also be found in the sample classification, since the division into categories and subcategories had to be performed manually. Further studies can build on this bibliometric analysis to investigate the trust models adopted and presented in the published literature and the preferred oracle applications for academic investigations.
Funding: This research was funded by UniCredit Foundation "Fondo Emma Gianesini".
Data Availability Statement: The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The author is aware that the presence of his name in Table 10 may constitute a potential conflict of interest. Ordering the table by number of contributions would have granted the author of this study a higher position in the list. Therefore, the list was ordered by citations, perceived as the most objective and most widely used ordering criterion for an author's impact. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.