New directions in sensor research: a bibliometric analysis for detecting emerging research fields and new technological trajectories

A fundamental question in sensor research concerns the new directions of its scientific fields, which play a vital role in the progress of science and technology. This study addresses the question with a bibliometric analysis that seeks to explain the evolution of sensor research and of new technologies that are critical to science and society. Scopus data on scientific documents and patents are used for the statistical and computational analyses. Results suggest that the emerging technological trajectories in sensors are wireless sensor networks, wearable sensors and biosensors. The main characteristics of these growing research fields and technologies are described to draw implications for research and innovation policy directed at scientific advances and technological change in society.


Introduction
The evolution of sensor research and technology has critical implications for science and human society (Rao et al., 2018; Sensors, 1992; Coccia and Bellitto, 2018). These topics of "the science of science" can clarify the driving factors of the evolution of science in sensors, in order to support scientific discoveries and technological advances in society (Fortunato et al., 2018; Coccia, 2020; Sun et al., 2013). First of all, a brief background of key concepts in sensors is useful to clarify the study design. Broadly, a sensor is a device, module or subsystem that detects events or changes in a specific environment and sends the information to other interrelated technological devices, such as a computer processor (Göpel et al., 1989; National Research Council, 1995; Rao et al., 2018). Sensors are technologies associated with different technologies, generating complex interactions in a perspective of host-parasite technological systems (cf., Coccia, 2018a, 2019, 2019a; Coccia and Watts, 2020). In particular, a sensor system can be considered a parasite technology of other technological systems (Coccia,

Study design for technological trajectories
▪ Sources and Sample
The study uses data from Scopus (2021). In particular, the "Search documents" window of the Scopus (2021) database is used to identify scientific documents and patents having the term "sensors" in the title, abstract or keywords. Scientific products and patents are the basic units of the scientific and technological analyses that explain the evolution of science and technology in the field of sensors, with fruitful policy implications. Each keyword is inserted in the "Search documents" window to obtain a specific time series; these series are used for a comparative analysis between the sensor technologies in the list just mentioned, to analyze the rate of growth and, as a consequence, new directions in sensor research. The study applies the model by Sahal (1981) for scientific and technological analysis of time series in sensors.
Two models are specified as follows.

Firstly,
$$\log y_{i,t} = a + b\,t + u_t \qquad [1]$$
where $y_{i,t}$ is the dependent variable (scientific products or patents of sensor technology $i$ at time $t$); $a$ is a constant; $b$ is the coefficient of regression; $t$ is time; $u_t$ is the error term in the equation. The parameters $a$ and $b$ are unknown and are estimated using the sample of data; log has base e = 2.7182818.
Secondly, if we consider the ratio: The specification of the model is: The equation [2] also has a' = constant; b' = coefficient of regression (a' and b' are the parameters to be estimated); t=time; εt = error term in equation.
The relationships under study here for scientific and technology analysis are investigated using the Ordinary Least Squares (OLS) method for estimating the unknown parameters in regression models [1] and [2]. Statistical analyses are performed with the IBM SPSS Statistics 26 ®.
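As an illustration only, the OLS estimation of model [1] can be sketched in Python with NumPy (the study itself used IBM SPSS Statistics). The function name `fit_log_linear_trend`, the centring of time on the first year, and the synthetic series are assumptions made for this sketch, not data or code from the study.

```python
import numpy as np

def fit_log_linear_trend(years, counts):
    """Estimate a and b in log(y_t) = a + b*t + u_t by ordinary least squares.

    Time is measured from the first year of the series (an assumption of this
    sketch), so `a` is the fitted log-count in that first year.
    """
    years = np.asarray(years, dtype=float)
    counts = np.asarray(counts, dtype=float)
    if np.any(counts <= 0):
        raise ValueError("counts must be positive to take logarithms")
    t = years - years[0]
    log_y = np.log(counts)  # natural logarithm (base e)
    # Design matrix: intercept column plus the time regressor
    X = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    a, b = coef
    residuals = log_y - X @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((log_y - log_y.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    return float(a), float(b), r_squared

# Synthetic, hypothetical series: 50 documents in 2000, growing exponentially
years = np.arange(2000, 2010)
counts = 50 * np.exp(0.2 * (years - 2000))
a, b, r_squared = fit_log_linear_trend(years, counts)
```

Under this specification, a larger estimate of b indicates faster growth of the series, which is how the coefficients are compared across sensor technologies in the text.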

▪ Research settings
The methodology here aims to investigate the structure of emerging research fields in sensor technology. These fields are detected, in the previous statistical analysis, by the highest coefficients of regression of the relationships estimated on publication and patent data (equations [1] and [2]); in particular, the magnitude of the coefficient of regression is a proxy of high evolutionary growth of sensor research over time. The emerging research fields under study, having the highest coefficients of regression, are:
• wireless sensor network. A wireless sensor network is a group of objects that transfer the gathered data through multiple nodes and wireless infrastructure to cooperatively sense and control the environment (Yick, 2008). These devices are deployed in large numbers, so they need the ability to assist each other in transferring data back to a centralized collection point (Rajaravivarma, 2003).
• wearable sensors. Wearable sensors are integrated into wearable objects attached to the body for health monitoring or the collection of physically relevant data. They have diagnostic and monitoring applications, including physiological and biochemical sensing and motion sensing (Teng, 2008). The adaptation of sensors to wearable use has involved miniaturizing sensing technologies, making them conformal and flexible, and developing companion software that increases the value of the measured data (Heikenfeld, 2018).
• biosensors. A biosensor is an analytical device that measures biological or chemical reactions by means of sensing elements. Biosensors are generally employed in monitoring, pollutant detection, and biomarker discovery (Kissinger, 2005). They harness biology's great sensitivity and specificity in conjunction with physicochemical transducers to provide detailed bioanalytical measurements in easy-to-use and straightforward formats (Turner, 2013).
After an initial review of these articles, the abstracts were used as input to the LDA technique to explore the topics under study. Measures are similar and described in the previous section.
Secondly, for textual data pre-processing, we conducted a topic modelling analysis using the Python programming language (version 3.7.7), first collecting all abstracts of publications and then concatenating them into one set of strings for each field. We created a corpus of the documents of each field, from which the model learns the 'topics'. Prior to topic modelling, the data are pre-processed with the GenSim library (Rehurek, 2010) to convert each publication's abstract into a bag-of-words representation. We treated each word as a token and eliminated the words in the stop-word list provided in the MALLET software (McCallum, 2002). Then, words with a low frequency or with fewer than three characters were removed. We applied tokenization by splitting the text into a set of words, removing punctuation, and converting uppercase terms into lowercase. In addition, we applied lemmatization to reduce verbs in various tenses to the present tense and to change them to the first person. In the end, we removed all terms that appear fewer than ten times across all documents, or that appear in more than 70 percent of records.
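The pre-processing steps above can be sketched with the Python standard library alone. This is a simplified illustration under stated assumptions, not the study's pipeline: the stop-word set is a hypothetical stand-in for the MALLET list, lemmatization is omitted, the function names are invented here, and the "fewer than ten times" rule is read as a document-frequency threshold (GenSim's `Dictionary.filter_extremes` applies a comparable document-frequency filter).

```python
import re
from collections import Counter

# Hypothetical stop-word set; the study uses the list shipped with MALLET.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "was"}

def tokenize(abstract, stopwords=STOPWORDS, min_chars=3):
    """Lowercase, strip punctuation, and drop stop words and short tokens."""
    tokens = re.findall(r"[a-z]+", abstract.lower())
    return [w for w in tokens if len(w) >= min_chars and w not in stopwords]

def filter_vocabulary(docs, min_docs=10, max_doc_share=0.7):
    """Drop terms found in fewer than `min_docs` documents or in more than
    `max_doc_share` of all documents (document-frequency reading, assumed)."""
    doc_freq = Counter(term for doc in docs for term in set(doc))
    n_docs = len(docs)
    keep = {term for term, df in doc_freq.items()
            if df >= min_docs and df / n_docs <= max_doc_share}
    return [[w for w in doc if w in keep] for doc in docs]

def to_bag_of_words(doc):
    """Represent a tokenized document as a term -> count mapping."""
    return Counter(doc)

# Toy abstracts, invented for illustration
abstracts = [
    "Wireless sensor nodes save energy in the network.",
    "Energy harvesting for wearable sensor devices.",
    "A biosensor for glucose detection with a sensor array.",
]
docs = [tokenize(a) for a in abstracts]
# Thresholds lowered because the toy corpus has only three documents
filtered = filter_vocabulary(docs, min_docs=2, max_doc_share=0.7)
bows = [to_bag_of_words(d) for d in filtered]
```

In the toy corpus, "sensor" occurs in every abstract and is dropped by the upper threshold, which mirrors the reason for removing terms that appear in more than 70 percent of records: they do not help to separate topics.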

− Step 2: topic construction
A topic can be regarded as a probability distribution over terms. Terms with a high probability in the same topic are more likely to appear together frequently in the same documents. To construct the topics, we trained the model using MALLET, a Java-based package for statistical NLP developed by McCallum (2002), to build a Latent Dirichlet Allocation (LDA) model. This model requires a fixed number of topics, which is not known precisely in advance for a corpus.
Accordingly, we chose an optimal number of topics for the topic modelling technique following the study by Mifrah and Benlahmar (2020). In this respect, we calculated the topic coherence score for each number of topics to identify the most suitable one. We used the C_v coherence measure, which retrieves co-occurrence counts of the respective word sets based on a sliding window. We calculated the normalized pointwise mutual information (NPMI) of every top word with every other top word to obtain a vector for each top word. Afterwards, we measured the similarity between the sum vector of the top words and each top-word vector in one-set segmentation. We used cosine similarity and calculated the coherence score as the arithmetic mean of all similarities (Mifrah and Benlahmar, 2020). We calculated the coherence of several models with different numbers of topics, according to the approach of Röder (2015), to identify the best number of topics for the model applied in the present study. Figure 1 shows the coherence score of the model for the different numbers of topics.
For wearable sensors, results show that the highest coherence value (0.5546) occurs with 22 topics; for biosensors, the highest coherence value (0.5687) occurs with 32 topics; and for wireless sensor network, the highest coherence value (0.5260) occurs with 38 topics.
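The coherence computation described above (sliding-window co-occurrence, NPMI vectors, cosine similarity against the sum vector, arithmetic mean) can be illustrated with a small pure-Python sketch. It is a simplified reading of the C_v idea rather than the implementation used in the study: the function names, the default window size, and the toy corpus are assumptions, and the limiting cases of NPMI are handled by convention.

```python
import math

def sliding_windows(docs, window_size):
    """Return the set of words seen in each fixed-size window of each document."""
    windows = []
    for doc in docs:
        if len(doc) <= window_size:
            windows.append(set(doc))
            continue
        for start in range(len(doc) - window_size + 1):
            windows.append(set(doc[start:start + window_size]))
    return windows

def npmi(w1, w2, windows):
    """Normalized pointwise mutual information from window co-occurrence."""
    n = len(windows)
    p1 = sum(w1 in win for win in windows) / n
    p2 = sum(w2 in win for win in windows) / n
    p12 = sum(w1 in win and w2 in win for win in windows) / n
    if p12 == 0.0:
        return -1.0  # convention: the words never co-occur
    if p12 == 1.0:
        return 0.0   # convention: both words occur in every window
    return math.log(p12 / (p1 * p2)) / -math.log(p12)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def coherence(top_words, docs, window_size=10):
    """Mean cosine similarity between each top-word NPMI vector and their sum."""
    windows = sliding_windows(docs, window_size)
    vectors = [[npmi(w, v, windows) for v in top_words] for w in top_words]
    sum_vector = [sum(col) for col in zip(*vectors)]
    sims = [cosine(vec, sum_vector) for vec in vectors]
    return sum(sims) / len(sims)

# Toy corpus, invented for illustration: two thematic groups of documents
docs = [
    ["sensor", "network", "node", "energy"],
    ["sensor", "network", "node", "energy"],
    ["glucose", "enzyme", "blood", "detection"],
    ["glucose", "enzyme", "blood", "detection"],
]
coherent = coherence(["sensor", "network", "node"], docs, window_size=4)
mixed = coherence(["sensor", "glucose", "node"], docs, window_size=4)
```

A word set drawn from one thematic group scores higher than a set that mixes the two groups. Repeating the calculation for models with different numbers of topics and keeping the number with the highest average score corresponds to the selection step reported in the text.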

− Step 3: investigation
In this step, the study investigated the topics of the emerging research fields in sensor technology described before: wireless sensor network, wearable sensors and biosensors. This section presents topic modelling findings using a word-cloud representation in which the size of each word in a specific topic reflects its frequency in that topic. Afterward, we classified all the topics of each field into two categories: technological characteristics and applications. In the second part of the results, a trend analysis was conducted separately to show the evolutionary growth of topics based on their popularity over time. The evolutionary growth of topics within each research field under study (wireless sensor network, wearable sensors, and biosensors) has been categorized as Positive Evolutionary Growth, Stable Evolutionary Growth, or Negative Evolutionary Growth, to classify the emerging subfields of sensor research. In particular,
- Positive Evolutionary Growth indicates that the topic popularity has been increasing, and the occurrence frequency of the topic words has been rising.
- Stable Evolutionary Growth indicates that the topic popularity has been fluctuating without following a rising or falling trend; the occurrence frequency of the words in the topic has a stable evolution.
- Negative Evolutionary Growth indicates that the topic popularity has been decreasing, and the occurrence frequency of the topic words has been falling.
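The text does not state the rule used to assign a topic to one of the three categories. One possible operationalization, offered purely as an assumption, fits a straight line to a topic's yearly popularity and compares its slope with a small tolerance; the function name, the tolerance value, and the example series below are hypothetical.

```python
import numpy as np

def classify_growth(years, popularity, tolerance=0.001):
    """Label a topic by the slope of a linear fit to its yearly popularity.

    `tolerance` (an assumed threshold, in popularity units per year) sets how
    large the slope must be before the trend counts as rising or falling.
    """
    years = np.asarray(years, dtype=float)
    t = years - years[0]
    # polyfit with degree 1 returns [slope, intercept]
    slope, _intercept = np.polyfit(t, np.asarray(popularity, dtype=float), 1)
    if slope > tolerance:
        return "Positive Evolutionary Growth"
    if slope < -tolerance:
        return "Negative Evolutionary Growth"
    return "Stable Evolutionary Growth"

# Hypothetical yearly popularity series for three topics
years = [2015, 2016, 2017, 2018, 2019]
rising = classify_growth(years, [0.02, 0.03, 0.04, 0.05, 0.06])
flat = classify_growth(years, [0.05, 0.04, 0.05, 0.04, 0.05])
falling = classify_growth(years, [0.06, 0.05, 0.04, 0.03, 0.02])
```

A fluctuating series with no net slope falls into the stable category, which matches the description of a topic whose popularity does not follow a rising or falling trend.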

Growth of research fields in sensors
The parametric estimates of models [1] and [2], based on scientific production, are presented in Table 1. In a majority of cases, the coefficients of regression and the explanatory power of the equations are significant with p-value < .001. The R² values are high and in general the models explain more than 80% of the variance in the data.
Table 2 shows the parametric estimates of models [1] and [2] based on patents. Table 2 also reveals that in a majority of cases, the coefficients of regression and the explanatory power of the equations are significant with p-value < .001, except model [2] for remote sensing. The R² values are also high and in a majority of cases the models explain more than 70% of the variance in the data.
Results also suggest that wireless sensors, a restriction of wireless sensor networks, have a high evolutionary growth in the field of sensor technology. All these research fields are the younger ones among research fields in sensors. This result is consistent with studies by Coccia (2018, 2020) showing that higher growth rates of scientific production occur in new research fields rather than in old ones.
[Figure: Technological trajectories of sensors using patents (log scale)]
The next section investigates these research fields and technologies to clarify their structure and drivers in science dynamics, in order to detect critical technological characteristics and applications for progress in society.

Structure, characteristics, and applications of critical research fields in sensors
The results of the topic modelling analysis show the top 15 high-frequency terms in each topic. These topics contain the words reflecting the content and terms of documents with the highest score. The topics are related to significant issues in each growing subfield of sensor technology. We illustrated 38 topics in wireless sensor network, 22 topics in wearable sensors, and 32 topics in biosensors through a word-cloud analysis; the size of each word indicates the relative frequency weight of a term in a specific topic. The larger the word, the higher its frequency in the parent topic. Accordingly, this visualization summarizes the information of each topic and partially explains the included documents. Finally, this study analyzes and explores the evolution of these topics over time. Topic modelling analysis can also show the increasing or decreasing popularity of topics, which can better explain how a field of research has been changing over time. We normalized the proportion of each topic per year and obtained the annual trends.
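The per-year normalization mentioned above can be sketched as follows, assuming that each document carries a publication year and a vector of topic weights from the fitted model. The function name `yearly_topic_shares` and the toy values are hypothetical; the study does not spell out its exact normalization.

```python
def yearly_topic_shares(doc_years, doc_topic_weights):
    """Sum topic weights over the documents of each year, then rescale each
    year's totals so that the topic shares of that year add up to one."""
    sums = {}
    for year, weights in zip(doc_years, doc_topic_weights):
        acc = sums.setdefault(year, [0.0] * len(weights))
        for k, w in enumerate(weights):
            acc[k] += w
    shares = {}
    for year, acc in sorted(sums.items()):
        total = sum(acc)
        shares[year] = [v / total for v in acc] if total > 0 else list(acc)
    return shares

# Toy input, invented for illustration: three documents and two topics
doc_years = [2019, 2019, 2020]
doc_topic_weights = [[0.8, 0.2], [0.4, 0.6], [0.1, 0.9]]
shares = yearly_topic_shares(doc_years, doc_topic_weights)
```

Normalizing within each year keeps the trend of a topic comparable across years with very different publication volumes, which matters because the corpora grow quickly over time.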
o wireless sensor network
Figure 4 shows the 20 most frequent words that appeared in publications on wireless sensor network. Our results show that the terms "network", "node", "wireless", and "energy" have been used more than 100,000 times across the corpus. Words are grouped into topics according to the similarity of their co-occurrence.
Table 4 shows the evolutionary growth of topics in wireless sensor network. From this classification, it can be concluded that studies of smart sensors associated with the Internet of Things are growing; studies of environmental monitoring and health care are also increasing over time.
o Wearable Sensor
Figure 6 shows the top 20 words with the highest frequency in wearable sensor publications. These findings reveal that the terms "system", "device", "datum", "time", and "human" have appeared more than 6,000 times across the corpus.
Table 5 shows topics of wearable sensor technologies with a positive popularity rate over time, such as sensor particles, machine learning, and pressure sensing. The growing application topics are mainly physical activities and body motion measuring. Haque et al., 2021). In this context, results here suggest that future developments are directed to improving the material flexibility, softness, and comfort of wearable technologies (e.g., artificial legs and hands devices) so that they can be used properly by patients (Chheng and Wilson, 2021).
o biosensors
Figure 8 shows the 20 words with the highest occurrence across biosensor publications. Our findings reveal that the terms "biosensor", "detection", "base", "sensor", "surface", "cell", and "high" have the highest frequency, appearing more than 30,000 times in the corpus. The similarity of these high-frequency words in their co-occurrence matrix has been considered in topic creation. Instead, the main application characteristics of biosensors are (cf., Figure 9):
• genetic
• DNA sequence
• vital sign measurement
• cancer detection
• patient monitoring
Figure 9. Cloud words of biosensor
Table 6 shows that material science, nanotechnology, and the detection process have been growing in sensor research to expand the technological aspects of biosensors. On the contrary, this analysis shows that the glucose sensors topic faced a considerable reduction in its popularity. The results here also show that biosensor studies are growing over time, especially in topics associated with detecting and monitoring applications in medical systems (Nejadmansouri et al., 2021).