Research on the Visualization of Ocean Big Data Based on the Cite-Space Software

Ocean big data is the application of big data technology in the marine field. Data from satellites, manned spacecraft, space stations, airships, unmanned aerial vehicles, shore-based radar and observation stations, exploration platforms, buoys, underwater gliders, submersibles, and submarine observation networks are combined into ocean big data. An increasing number of scholars have tried to analyze ocean big data comprehensively. To explore the knowledge graphs of key research technologies related to ocean big data, articles published between 1990 and 2020 were collected from the “Web of Science”. After comparing bibliometric software, the visualization software Cite-Space was used to visualize and identify the pivotal literature related to ocean big data, as well as the countries, institutions, categories, and keywords involved. Journal co-citation analysis networks can help determine the national distribution of core journals. Document co-citation analysis networks show the authors who are influential at key technical levels. Keyword co-occurrence analysis networks can identify research hot spots and research frontiers. The three supporting elements of marine big data research shown in the co-citation network are author, institution, and country. By examining the co-occurrence of keywords, the key technology research directions for future marine big data were determined.


Introduction
The term "big data" is mentioned increasingly often in our society. As oceans occupy about 70% of the earth's surface area, ocean research has also entered the era of big data. Today, there are many available marine observation and survey methods, such as offshore mapping, island surveillance, underwater detection, marine fishery operations, marine buoy monitoring, marine scientific research, oil and gas platform environment monitoring, and satellite remote sensing monitoring, which together comprise a very large marine observation and monitoring system. This system has accumulated a large amount of marine natural science data, including on-site observation and monitoring data, marine remote sensing data, and numerical model data. With the development of the field of big data, more and more researchers have become engaged in research on big data in the marine field. Document analysis has been used to evaluate the quantity and content of the scientific literature and to study the topics of a subject or area [1]. A knowledge graph is a visual analysis of important knowledge that shows this knowledge using key nodes connected with other clusters and simple nodes [2,3]. This process can mine valuable information in fields such as computer science [5,6,7], medical science [8,9], and library science [10]. The main objectives of this study are to identify the current development status, trends, and frontiers of marine big data research, to discover core technologies and their co-citations, and to find key literature in the field of marine big data research from 1990 to 2020.

Data Collection
To guarantee the accuracy and comprehensiveness of the data source, the analysis data for this study were taken from the Science Citation Index Expanded (SCI-Expanded) edition of Thomson ISI's Web of Science. The search mode was set to "advanced search" with the following formula: TS = (ocean data technology) AND LANGUAGE = (ENGLISH) AND DOCUMENT TYPES = (Article OR Review). The time interval was set to 1990-2019. A total of 1444 records were retrieved, each including the authors, title, keywords, abstract, and cited references.
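As an illustration of how such exported records can be read before analysis, the following minimal Python sketch parses field-tagged plain-text records. It assumes the usual Web of Science plain-text export layout (two-letter field tags such as AU, TI, DE, and PY, indented continuation lines, ER closing a record, and EF closing the file). The function names and the two sample records are hypothetical and are not taken from the data of this study.

```python
from collections import defaultdict

# Hypothetical sample in the assumed field-tagged layout (not real study data).
SAMPLE = """PT J
AU Doe, J
   Roe, R
TI Example title on ocean data
DE ocean; model; data assimilation
PY 2015
ER

PT J
AU Poe, A
TI Another example
DE ocean; remote sensing
PY 2018
ER

EF
"""


def parse_wos_records(text):
    """Return a list of records, each a dict mapping a field tag to its lines."""
    records = []
    current = defaultdict(list)
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        head, body = line[:2], line[3:].strip()
        if head == "ER":  # end of one record
            records.append(dict(current))
            current = defaultdict(list)
            tag = None
        elif head == "EF":  # end of file
            break
        elif head.strip():  # a new two-letter field tag
            tag = head
            current[tag].append(body)
        elif tag is not None:  # indented continuation of the previous tag
            current[tag].append(body)
    return records


def author_keywords(record):
    """Split the DE (author keywords) field on semicolons and lowercase it."""
    joined = " ".join(record.get("DE", []))
    return [k.strip().lower() for k in joined.split(";") if k.strip()]


records = parse_wos_records(SAMPLE)
print(len(records), author_keywords(records[0]))
```

The keyword lists produced in this way are the kind of input that a co-word analysis works on; Cite-Space performs this import internally.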

Research Tools
Besides general complex network software (Gephi), bibliometric software (Cite-Space and SCI2) is also used for this kind of analysis due to its powerful data processing and visualization functions [11,12]. Both SCI2 and Cite-Space provide time analysis and burst detection, but the co-occurrence networks in Cite-Space can simultaneously represent time, frequency, and betweenness centrality, whereas SCI2 and Gephi put more emphasis on layout algorithms. Therefore, due to its simple operation, powerful functions, and high objectivity, the Cite-Space software developed by Professor Chen Chaomei was selected as the scientific map drawing tool for this study.
A co-word analysis in the Cite-Space software provides a choice of 11 types of analysis objects, and different objects produce different graphs. A co-word analysis can use "Title", "Abstract", "Descriptors", and "Identifiers" as the sources of words to obtain knowledge maps of citations, co-citations, and co-words. Moreover, the Cite-Space software provides clustering as well as time zone and timeline visualization functions [13,14]. Co-occurrence analysis can be applied to keywords, authors, cooperating institutions, and countries. The circle size in the co-occurrence diagram represents the frequency of a keyword: the greater the frequency, the larger the circle. The thickness of the lines represents the strength of the relationship between keywords. The different colors in the figure represent time slices. A purple ring around a circle indicates the high centrality of that keyword. Emergent (burst) words are keywords whose frequency rises sharply over a period of time. Such keywords can be used as indicators of research fronts.
Cite-Space was used as the knowledge graph analysis software for this study. It provides strong support for searching for and displaying research hot spots, development trends, and novel phenomena in the field of marine big data, and it helps to clearly understand and determine the development trends of the discipline.
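The two quantities that drive the co-occurrence diagram described above, node frequency (circle size) and pairwise co-occurrence strength (line thickness), can be sketched in a few lines of Python. This is a simplified illustration under the assumption that each document contributes each keyword at most once; the function name and the three sample keyword lists are hypothetical, and Cite-Space's own normalization and layout are not reproduced.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count keyword frequencies and pairwise co-occurrences per document."""
    freq = Counter()   # how many documents contain each keyword (node size)
    pairs = Counter()  # how many documents contain both keywords (edge weight)
    for keywords in keyword_lists:
        unique = sorted(set(keywords))  # count each keyword once per document
        freq.update(unique)
        pairs.update(combinations(unique, 2))  # pairs in alphabetical order
    return freq, pairs


# Hypothetical keyword lists for three documents (not real study data).
docs = [
    ["ocean", "model", "data assimilation"],
    ["ocean", "remote sensing"],
    ["ocean", "model"],
]
freq, pairs = cooccurrence(docs)
print(freq.most_common(2))
print(pairs[("model", "ocean")])
```

Sorting the keywords before pairing keeps each undirected pair under a single key, so ("model", "ocean") and ("ocean", "model") are never counted separately.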

Results and Discussion
We imported the downloaded data into Cite-Space, selected the time range from 1990 to 2020, and set the years per slice to 2. We also selected the term sources and the top 50 most cited or most frequently occurring items in each slice. Pruning was done with the Pathfinder algorithm, and "cluster view" was selected for visualization. Then, we selected the node types to analyze the countries, institutions, authors, and keywords of ocean big data technology and obtained a scientific knowledge map indicating the research status and future trends of ocean big data technology.
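The time slicing and per-slice selection described above can be illustrated with a short Python sketch. It assumes that slices are consecutive two-year windows starting in 1990 and that "top 50" is approximated by simple occurrence counts within a slice; the function names and sample data are hypothetical, and the selection criteria actually applied by Cite-Space may differ in detail.

```python
from collections import Counter, defaultdict


def slice_start(year, first_year=1990, years_per_slice=2):
    """Return the first year of the time slice that contains `year`."""
    offset = (year - first_year) // years_per_slice
    return first_year + offset * years_per_slice


def top_items_per_slice(items_by_year, top_n=50, first_year=1990, years_per_slice=2):
    """Group (year, item) pairs into slices and keep the top_n items per slice."""
    counts = defaultdict(Counter)
    for year, item in items_by_year:
        counts[slice_start(year, first_year, years_per_slice)][item] += 1
    return {
        start: [item for item, _ in counter.most_common(top_n)]
        for start, counter in sorted(counts.items())
    }


# Hypothetical (publication year, keyword) pairs, not real study data.
sample = [(1990, "ocean"), (1991, "ocean"), (1991, "model"), (2019, "sensor")]
print(slice_start(2019))
print(top_items_per_slice(sample, top_n=1))
```

With two years per slice, 1990 and 1991 fall into the slice starting in 1990, and 2019 falls into the slice starting in 2018.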

Country Co-Occurrence Network
We selected the parameter "Country" in the Cite-Space II analysis software, and the resulting co-occurrence network map of the countries active in ocean big data research is shown in Figure 1. The nodes in the graph represent the countries of the papers and are distinguished by different colors and sizes. The larger the node, the more papers the country has published, and vice versa. The nodes are connected by lines of different colors and thicknesses. The thicker the line, the closer the connection, and the thinner the line, the weaker the connection. A purple ring around a node represents its centrality. The higher the centrality, the more likely the node is to form important links with other nodes [14]. Based on the overall analysis of the map, the 1444 English-language documents in the field of marine big data research came from 50 countries. According to the number of times the documents were cited, the US literature ranked first with 653 citations. China followed with 211 citations. England, Germany, France, Australia, Canada, Italy, and Japan were cited 179, 133, 130, 122, 101, 85, and 83 times, respectively. In terms of centrality, the United States ranks first with 0.32, followed by France with 0.2. The centralities of Britain, Germany, Japan, Canada, and Italy were 0.17, 0.17, 0.15, 0.11, and 0.11, respectively. China's centrality was 0.01, ranking 25th.
From this, it can be seen that the countries with a high degree of centrality occupy an important position in the field of ocean big data research. Although China has been cited frequently, its centrality is low. Even though the citation frequencies of France and the United Kingdom did not exceed that of China, these countries have obvious advantages in terms of centrality. China's low centrality also suggests, to a certain extent, that differences in language, academic fields, and perspectives make it harder for Chinese researchers to publish related academic results in European and American core journals in the field of ocean big data. Therefore, accelerating the progress of the field, improving the language expression level of researchers, and strengthening international academic exchanges are important ways to improve China's centrality in the field.
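To make the notion of centrality used above more concrete, the following Python sketch computes normalized betweenness centrality for a small undirected network with Brandes' algorithm. It assumes that the centrality reported by Cite-Space is betweenness centrality, that is, the share of shortest paths between other nodes that pass through a given node. The graph, node labels, and function name are hypothetical and do not reproduce the country network of Figure 1.

```python
from collections import deque


def betweenness(adj):
    """Normalized betweenness centrality for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours.
    """
    raw = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # w reached for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                raw[w] += delta[w]
    n = len(adj)
    # Every unordered pair was visited twice, so divide by (n - 1)(n - 2).
    denom = (n - 1) * (n - 2) if n > 2 else 1
    return {v: value / denom for v, value in raw.items()}


# Hypothetical star network: the hub A connects the leaves B, C, and D.
star = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
print(betweenness(star))
```

In this toy network the hub lies on every shortest path between the leaves and therefore receives the maximum value, while the leaves receive zero. A node can thus have many links or citations and still show low centrality if it rarely bridges otherwise separate parts of the network.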

Institution Co-Occurrence Network
With the help of the Cite-Space II analysis software, the co-occurrence network map of institutions related to ocean big data research was generated, as shown in Figure 2. It is clear that the National Oceanic and Atmospheric Administration is a leader in ocean big data. This institution has been indexed a total of 93 times, far exceeding the other organizations. The institutions ranked ahead of the University of Chinese Academy of Sciences in the top ten include the University of California, San Diego (64 times), the Woods Hole Oceanographic Institution (55 times), the University of Washington (50 times), and the California Institute of Technology (46 times). Among the various institutions, the Ocean University of China ranked 12th, with a total of 25 citations. The University of Chinese Academy of Sciences and the Ocean University of China are important representatives of China's development in this field and have made outstanding contributions in the field of ocean information.

Figure 2.
Institutional co-occurrence network map.

Author Co-Occurrence Network
As shown in Figure 3, 65 authors co-occur, with 263 connections between them. Because an author's research work is affiliated with his or her organization, the cooperation network of the authors shares many similarities with the co-occurrence map of the institutions. Among the authors, Atmanand MA (Atmanand, Malayath Aravindakshan), Hoteit I (Hoteit, I.), Miloslavich P (Miloslavich, Patricia), and Achterberg EP (Achterberg, Eric P.) have the highest numbers of publications, and their frequencies of cooperation are also among the top four.

Research Hot Spots and Analysis
In an article, the keywords usually describe the core content. Keywords often also reflect the cutting-edge developments of related fields. If a word appears frequently in a certain period, it can be concluded that this word reflects important content of the research field at that time.
In the Cite-Space software, the statistical principles of bibliometrics can be used to extract the frequency of the keywords or topic words in the papers and to display the keywords and their clustering relationships in the form of graphs. Figure 4 illustrates the keyword co-occurrence network of the downloaded documents. Based on the software analysis results, there are 82 keywords in total, including "ocean" (180 times), "model" (108 times), "system" (78 times), "variability" (74 times), "water" (57 times), "technology" (45 times), "climate change" (45 times), "impact" (45 times), "data assimilation" (41 times), "remote sensing" (40 times), and "algorithm" (37 times). In Figure 4, the words "satellite" and "biodiversity" appear at the edges of the map, which indicates that these words are also among the main research directions in the literature. From the time map of the keywords in Figure 5, we can see that "ocean", "model", "variability", "water", "data assimilation", and "algorithm" have high centrality and are also the central keywords from 1990 to 2020. Figure 6 illustrates the network of burst keywords. We can see that the important topics that will continue to appear in the field of marine big data are systems, ocean information management, monitoring, tracking, in situ, sensors, and satellites. Based on these burst words, we can determine the key technologies in the field of marine big data from 2018 to 2020. The specific technologies are introduced in the following sections.

Monitoring
The photon nanoimmunosensor platform developed in [15] can be used for on-site analysis in autonomous buoys. The platform provides a marine monitoring tool that is directly comparable to standard analysis techniques and can save development costs and time. A "hybrid" interactive deep-sea monitoring system that reflects marine climate community data streams and data from three watersheds has been successfully deployed in the Indian Ocean. This application can also be used on the FLUX website to enhance the Argo project [16]. A novel framework for extracting Ulva prolifera regions has been proposed that uses a convolutional neural network (CNN) for marine environment monitoring. This framework combines superpixel segmentation and CNN classification and can process images at their original resolutions [17]. The study in [18] compares groups of members of the joint community research team and marine monitoring scientists with previous scientific methods, using multiple methods to supplement marine information research. The work in [19] notes that monitoring the characteristics of the ice in the Bohai Sea can support ice disaster prevention in marine transportation, oil field operations, and regional climate change research. The study in [20] developed a range-gated camera system (UTOFIA) that enables enhanced high-resolution 3D imaging underwater. This solution can eliminate close-range backscattering, improve image quality, provide distance information for each illuminated object, and give access to real-time 3D measurement data. The authors in [21] examined technological innovations to improve marine monitoring and governance and conducted an experimental analysis of the potential of these new technologies. The passive acoustic monitoring technology proposed in [22] provides a useful alternative to visual monitoring and can be used to monitor marine mammals. The studies in [23,24] propose using mobile devices for monitoring harmful algae. The CyAN application uses data from the Ocean and Land Colour Instrument on the Sentinel-3 satellite to provide a user-friendly platform for water quality managers, reducing complexity and allowing initial lake assessments to be performed quickly and efficiently.

Tracking
Tuna and tuna-like species can be used as indicators to track mercury in the Southwest Atlantic. Mercury is potentially dangerous due to its high toxicity and tendency to bioaccumulate in the body [25]. The authors in [26] proposed a new three-dimensional (3-D) underwater trajectory tracking method for an autonomous underwater vehicle (AUV) using model predictive control (MPC). The work in [27] showed that environmental DNA (eDNA) can be used to detect marine vertebrates. The study in [28] established the distribution of dissolved and particulate isotopes (230Th and 232Th) in samples taken during the BONUS GoodHope (BGH) IPY-GEOTRACES cruise in the South Atlantic Ocean region (36° S-13° E to 57° S-0°, February-March 2008), taking into account the mixing of water along neutral density surfaces. The distribution of total (dissolved + particulate) 232Th was mainly due to inputs from the continental margin. The study in [29] introduced the design of an adaptive trajectory tracking controller based on an improved line-of-sight (LOS) method. An EDO-based adaptive terminal sliding mode control method for dynamic control can improve tracking performance and convergence speed. In addition, the effect of actuator saturation is weakened by an anti-saturation compensator. A strict theoretical analysis and many simulation studies show that the method has good tracking accuracy, stability, and anti-interference abilities. In recent years, with the development of the marine industry, the marine navigation environment has become more complicated. Some artificial intelligence technologies, such as computer vision, can identify, track, and count sailboats to ensure maritime safety and promote the management of intelligent marine systems [30]. Aquatic animals are an integral part of marine and freshwater ecosystems and their resilience, are depended upon globally for food sustainability, and support coastal communities and indigenous peoples. However, the global aquatic environment has undergone profound changes due to human activities and environmental changes. These changes are altering the distribution, movement, and survival of aquatic animals in ways that are poorly understood. The Ocean Tracking Network (OTN) is a global partnership that can fill this knowledge gap [31]. The subsea flight node is a new type of autonomous underwater vehicle. Using an adaptive law, an adaptive trajectory tracking controller with prescribed performance was proposed for it. A simulation example of the subsea flight node system shows that the proposed control scheme can compensate for the effects of general uncertainty and, at the same time, achieve fast transient processes and the expected trajectory tracking accuracy [32]. The study in [33] addressed the verification of hull damping parameters estimated using computational fluid dynamics. The simulation results and the Lyapunov direct method controller proved to be applicable to the non-linear dynamic system of the ship.

In Situ
According to the analysis results of the Cite-Space software, in situ detection technology has received increasing attention in the ocean field in recent years. Since in situ detection is the most effective method for hydrothermal detection in the deep ocean, it is a subject that scientists have been studying in depth. The instruments used for in situ detection can monitor both the spatial and transient changes of hydrothermal information and can also reflect the true dynamic system of hydrothermal activity. From the perspective of information science, data storage and data communication technology can be applied to the real-time updating of in situ detection, and the acquisition of in situ detection data can thus be achieved [34]. The work in [35] noted that this type of in situ detection is different from other in situ detection methods. In the 1960s, the United States scientific research ship "Discovery" found high-temperature brine and abyssal hydrothermal polymetallic soft mud in the Red Sea, which opened the door to in situ detection of the sea floor [36]. In situ detection can not only obtain a large amount of deep-sea hydrothermal oil tank information but can also truly reflect the real-time data of the deep sea. These data can also be used for real-time dynamic display. In situ exploration is a major breakthrough in traditional oceanographic research, as well as a breakthrough in the study of deep-sea organisms and the surrounding ecological environment. At the same time, in situ exploration has greatly advanced marine resource exploration, marine environmental financial control, marine scientific research, and sustainable economic development. Recently, Dr. Zhang Xin, of the Institute of Oceanology of the Chinese Academy of Sciences, was the first to cooperate with the United States MBARI (Monterey Bay Aquarium Research Institute) to successfully develop a deep-sea methane in situ detection system based on a remotely operated vehicle (ROV). The research results of this exploration have been published in Geophysical Research Letters and were reported and reviewed simultaneously by Nature and Science. Using this technology, researchers across the world, for the first time, have access to the true in situ concentrations of methane in deep-sea sediments, with results that are 10-20 times better than those of traditional sampling tests. Moreover, methane exists in large amounts in deep-sea sediments. At the same time, this technology can also obtain a variety of marine chemical parameters in situ, such as dissolved hydrogen sulfide gas, pH value, and sulfate quality [36].

Sensors
Sensors are increasingly used in the marine field. Fiber optic sensors can measure seawater salinity. High-sensitivity temperature sensors based on hollow microspheres can be used to measure ocean temperature. Experiments using different hollow microsphere parameters can be used to determine the effect of the sensor parameters on temperature sensitivity [37]. The study in [38] explored the ability of a geostationary satellite ocean color sensor to detect the chemical properties of the chlorophyll of marine life. The work in [39] noted that coastal and ocean acidification can change marine biogeochemistry and cause economic and cultural losses. Integrated glider pH sensor systems are used for pH sampling throughout coastal waters. Floating wireless sensor networks present unique communication barriers. Compared with other designs, hemispherical antennas have the advantages of higher efficiency and a greater propagation range. The results show that a floating hemispherical antenna can be successfully deployed and used in coastal wireless sensor network applications [40]. The study in [41] mentioned that the real-time inversion of the ocean wave spectrum and elevation from vessel motion sensors has important practical significance but is still in the development stage. The Kalman filtering method has the advantages of real-time estimation, reduced costs, and easy implementation. To date, several algorithms for retrieving cyanophycocyanin (PC) in inland waters from ocean color sensors have been proposed, all of which are considered reliable models [42]. As an important means for multi-dimensional observations of the ocean, the Ocean Sensor Network (OSN) can meet the needs of large-scale, multi-factor, comprehensive observations of the marine environment. However, due to multipath effects, signal blockage by waves, and unintentional or malicious attacks, abnormal measurements occur frequently and inevitably, which directly reduces positioning accuracy. Therefore, improving positioning accuracy in the presence of outliers is a key issue that needs to be urgently addressed in the OSN [43].

Satellites
In May 2002 and April 2007, China launched the HY-1A and HY-1B satellites with independent intellectual property rights. The development of China's ocean satellites subsequently entered a whole new era, and a new chapter in ocean satellite observation began [44]. Because the ocean absorbs carbon dioxide from the atmosphere, long-term absorption will reduce the pH of the ocean, so improving our ability to monitor marine carbonate chemistry has become an urgent task [45]. We must fully explore the use of satellite earth observations as an option for the routine observation of surface ocean carbonate chemistry. The literature demonstrates the applicability of empirical algorithms for calculating total alkalinity (AT) and total dissolved inorganic carbon (CT) and evaluates how well satellite, interpolated in situ, and climatology datasets can reproduce the broad spatial patterns of these two variables [46]. The study in [47] compared simulations with many actual voyages collected from satellite automatic identification system data and attempted a statistical verification of voyage simulation models. The verification results show that the simulation results for navigation in the North Pacific and North Atlantic are similar to those of actual navigation. The use of new satellite ocean color data products and improved algorithms can fill observation gaps in ocean areas that are physically and ecologically important, which include world-famous coral reefs, seagrasses, and fisheries [48]. The study in [49] introduced the use of marine satellite passive microwave observations for inversion research on the tropical cyclone (TC) surface pressure field and proposed a new inversion algorithm. The test results show that this method can estimate the surface pressure field of a TC. The work in [50] studied the selection of interpolation parameters in global and Mediterranean sea level anomaly (SLA) products. Studies have shown that the number of eddies detected in the SLA map exceeds that of the products in the Mediterranean region along the satellite track. The study in [51] proposed a hybrid sea surface temperature (SST) algorithm to analyze the regression process of the incremental value between the brightness and temperature observed by satellites. The work in [52] focuses on mesoscale and submesoscale processes, such as coastal currents and river plumes, and how they affect sedimentary dynamics at basin spatial scales. A new method combining numerical simulation with observational data was developed, in which satellite measurements of suspended sediment are combined with the velocity field obtained from a numerical ocean model to obtain estimates of the sediment flux. The numerical deviation of the flux is then calculated. Marine color satellite remote sensing reflectance data can also be used with bio-optical algorithms to retrieve the biogeochemical characteristics of the ocean [53].

Conclusions
In this article, we introduced and applied the scientific information visualization software Cite-Space. Based on 1444 research articles on ocean big data technology in the Web of Science, we used scientific knowledge mapping methods to analyze marine big data information. A pioneering information system in the field, as well as theoretical research methods, was used to perform a quantitative statistical analysis that shows the basic situation and technological development of the marine big data field, including the distribution of countries, institutions, and major authors, as well as research hot spots and emerging issues.

A. This Study Revealed the Distribution of Ocean Big Data Research among Countries, Institutions, and Authors
• American literature is the most frequently cited, followed by studies from China, the United Kingdom, Germany, and France. China ranks lower in centrality. In the future, China's centrality needs to be improved through efforts in language, communication, culture, and technology.
• The University of Chinese Academy of Sciences and the Ocean University of China are the major marine big data research institutions in China, ranking sixth and twelfth in the world, respectively.
• Of the authors of these 1444 documents, Atmanand, Malayath Aravindakshan; Hoteit, I.; Miloslavich, Patricia; and Achterberg, Eric P. have the highest numbers of publications, and their frequencies of cooperation are also among the top four.

B. This Study Defined Research Hot Spots
• Monitoring: Marine monitoring is widely used in the marine field, including "hybrid" interactive deep-sea monitoring systems. Further, CNN algorithms are applied to marine environment monitoring. Digital image processing technology is also widely used in marine environment monitoring.
• Tracking: Ocean tracking can provide ocean information. Tuna can be used as an indicator for tracking mercury in the Atlantic Ocean, AUVs can track underwater trajectories, and ocean tracking networks can help people understand the distribution of aquatic animals.
• In situ: The principles and application advantages of in situ marine detection and the research progress of in situ detection in China were introduced.
• Sensor: Optical fiber sensors used in the ocean can measure the salinity of seawater, marine color sensors can be used to detect the chemical characteristics of chlorophyll in marine life, AUVs with integrated sensor systems can sample ocean pH, and the Ocean Sensor Network (OSN) can provide the comprehensive information needed to observe the marine environment.
• Satellite: Ocean satellites can make up for the shortcomings of other ocean observations. Satellite ocean observations can be used to observe surface ocean carbonate chemistry. The passive microwave observations of ocean satellites can be used for inversion research on TC surface pressure fields. In one of the reviewed studies, a sea surface temperature algorithm was improved by analyzing the incremental regression process between the brightness and temperature observed by satellites.