TECHNICAL NOTE | doi:10.20944/preprints202211.0220.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Relational Database; Columnar Storage; Bloom Filter; Skip List; Field Level Lock; Read Write Concurrency; OLTP; OLAP; LSM-Tree; Token Bucket Algorithm
Online: 14 November 2022 (03:02:09 CET)
At present, diversified and highly concurrent businesses in the Internet industry often require heterogeneous systems composed of multiple databases to meet their needs. This report introduces SG-ColBase, a database kernel we developed. SG-ColBase provides read-write concurrency control, data rollback, atomic log writing, and redo of data after downtime to ensure complete transaction support. The parallelism of kernel execution is extended through field-level locks and snapshot reads. A Bloom filter, resource cache pool, memory pool, skip list, non-blocking log cache, and asynchronous data-writing mechanism improve the overall execution efficiency of the system. For data storage, column storage, logical keys, and an LSM-tree are introduced: while improving the data compression ratio and reducing data gaps, all disk operations are written in incremental order, and asynchronous batch operation greatly improves data writing speed. Thanks to the vertical contiguity of data brought by column storage, the disk scanning required by vertical traversal is reduced, a qualitative leap in efficiency compared with traditional relational databases in big data analysis scenarios. SG-ColBase can reduce the need for heterogeneous databases in business systems and improve R&D efficiency.
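The abstract names a Bloom filter among the structures used to avoid unnecessary disk reads. The sketch below is a minimal illustration of how such a filter screens point lookups; the hashing scheme and sizing are illustrative assumptions, not SG-ColBase's implementation.

```python
# Minimal Bloom filter sketch: keys known to be absent never trigger a disk
# read. Hash scheme and sizing below are illustrative assumptions.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: bytes):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))

bf = BloomFilter()
bf.add(b"row:42")
print(bf.might_contain(b"row:42"))   # True
print(bf.might_contain(b"row:999"))  # almost certainly False
```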
ARTICLE | doi:10.20944/preprints202211.0190.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: sustainability; smart cities; Internet of Things (IoT); multi-agent deep reinforcement learning; smart waste management; smart sensors
Online: 10 November 2022 (04:49:09 CET)
Driven by the ever-increasing need to improve the livability of cities and outcomes for their residents, the adoption of technology to develop urbanised societies around the world over the last decade has given rise to smart cities. With the speed at which the world population is growing, the use of the Internet of Things in smart cities has substantially advanced the quality of life. One significant area of concern within the smart city framework is waste management. If the waste within a city is not adequately managed, it harms the health of its citizens. Waste management also has a high impact on the environmental footprint, so a smart way of managing waste is of critical importance. Through our research, we analyse the challenges of waste management within a city to understand the impact of the problem on citizens and overall city operations. We then investigate ways to solve these problems using emerging technologies, such as the Internet of Things, to collect large volumes of valuable data arriving at an astronomical rate, and then apply multi-agent deep reinforcement learning algorithms to harness the power of big data and extract meaningful information and actionable insights. We ingest the data generated by our Internet of Things devices into our algorithm for three main purposes: providing notifications to an external system, for example a map navigation engine (out of scope for this project but a future extension for route optimisation and waste vehicle tracking); extracting and reporting actionable insights from the underlying data; and consuming the extracted data for predictive forecasting, to draw out unknown patterns of waste fill levels across geographical locations and again send triggers and notifications to external systems, for example a waste collection authority that can efficiently schedule waste collection vehicles and optimise their routes. To achieve these outcomes, we propose a framework that is agnostic of the hardware it connects to and can effectively interface with a wide variety of hardware while keeping a level of abstraction in the architecture.
REVIEW | doi:10.20944/preprints202211.0161.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: High Performance Computing (HPC); big data; High Performance Data Analytics (HPDA); convergence; data locality; spark; Hadoop; design patterns; process mapping; in-situ data analysis
Online: 9 November 2022 (01:38:34 CET)
Big data has revolutionised science and technology, leading to the transformation of our societies. High Performance Computing (HPC) provides the necessary computational power for big data analysis using artificial intelligence and related methods. Traditionally, HPC and big data have focused on different problem domains and have grown into two different ecosystems. Efforts have been underway for the last few years to bring the best of both paradigms into HPC and big data converged architectures. Designing HPC and big data converged systems is a hard task requiring careful placement of data, analytics, and other computational tasks such that the desired performance is achieved with the least amount of resources. Energy efficiency has become the biggest hurdle in the realisation of HPC, big data, and converged systems capable of delivering exascale and beyond performance. Data locality is a key parameter of High Performance Data Analytics (HPDA) system design, as moving even a byte costs heavily in both time and energy as the size of the system increases. Performance in terms of time and energy are the most important factors for users, particularly energy, because it is the major hurdle in high-performance system design and green energy systems are an increasing focus due to environmental sustainability. Data locality is a broad term that encapsulates different aspects, including bringing computations to data, minimizing data movement by efficient exploitation of cache hierarchies, reducing intra- and inter-node communications, locality-aware process and thread mapping, and in-situ and in-transit data analysis. This paper provides an extensive review of the state of the art on data locality in HPC, big data, and converged systems. We review the literature on data locality in HPC, big data, and converged environments and discuss challenges, opportunities, and future directions. Subsequently, using the knowledge gained from this extensive review, we propose a system architecture for future HPC and big data converged systems. To the best of our knowledge, there is no comparable review of data locality in converged HPC and big data systems.
REVIEW | doi:10.20944/preprints202211.0128.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Cyber security threats; Cyber security threats to educational institutes; growing concern for the new era of cybersecurity; New Era of cybersecurity
Online: 7 November 2022 (14:37:03 CET)
Background: The outbreak of the Covid-19 pandemic has significantly affected the operations of higher education institutions. Although video conferencing and cloud computing had seen only limited use in these institutions, distance learning became the only option available to them. Objective: The study focused on identifying the most common types of attacks that can affect e-learning assets. Results: There was a lack of clear cybersecurity policies for educational institutes and universities in 2020, according to a report by Microsoft Security Intelligence. The report showed that the education industry was the most targeted sector for malware attacks over the preceding 30 days. Conclusion: The study provides recommendations for improving the security of e-learning systems. These include implementing policies that restrict access to resources and applications, updating security patches, and using cryptographic protocols.
ARTICLE | doi:10.20944/preprints202209.0202.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: alive publication; dynamic component of bibliographic reference; latest revision date; Crossref; arXiv.org
Online: 7 November 2022 (11:00:38 CET)
A scientific work posted on the Internet that its author constantly keeps up to date will be called an alive publication. The genre of alive publishing has many attractive features. However, it requires a certain expansion of the publication's meta-attributes: along with the traditional attributes, the date of the appearance of the newest, freshest revision is brought to the fore. This date is placed in a prominent place in the text of the publication. It also becomes highly desirable to include such a dynamically ("on the fly") generated date in a bibliographic reference to an alive publication. The currently used methods of dynamically extracting this date are considered for a simple online publication, for a publication that has received a DOI through Crossref, and for a publication posted on arXiv.org. Thanks to this added meta-attribute, references to alive publications will beautify any bibliographic list.
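For the Crossref case, a freshness date can be retrieved from the public Crossref REST API. The sketch below treats the "deposited" timestamp as the indicator of the latest revision, which is an assumption; the paper's own extraction method is not reproduced here.

```python
# Sketch: fetch a "latest revision" style date for a DOI from the public
# Crossref REST API. Using the "deposited" timestamp as the freshness
# indicator is an assumption; other fields (e.g. "indexed") may be preferred.
import urllib.request
import json

def latest_crossref_date(doi: str) -> str:
    url = f"https://api.crossref.org/works/{doi}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        message = json.load(resp)["message"]
    # "date-parts" looks like [[2022, 11, 14]]
    year, month, day = message["deposited"]["date-parts"][0]
    return f"{year:04d}-{month:02d}-{day:02d}"

print(latest_crossref_date("10.20944/preprints202209.0202.v2"))
```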
ARTICLE | doi:10.20944/preprints202211.0111.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Recommendation; GNN; Preference
Online: 7 November 2022 (08:38:06 CET)
With the rapid development of technology and the advancement of the Internet, various social networking platforms have gradually come into people's view and occupy an increasingly important position. In the recommendation scenario, user-item interactions naturally form a bipartite heterogeneous graph, and with the development of deep-learning-based graph embedding and graph neural network technologies for processing graph-domain information, the combination of graph information and recommendation systems shows strong research potential and application prospects. Methodological improvements to collaborative-filtering recommendation algorithms take advantage of the fact that users and items form a bipartite graph in the recommendation scenario. Existing methods still have shortcomings: methods that only use weights or convolutional recurrent neural networks to implicitly model different historical behaviors lack explicit modeling of video-switching relationships in serialized behaviors. A user's interest changes all the time, so recommendations cannot rely on the user's history alone; both the long-term and short-term interests of the user must be considered together with the video content in order to achieve accurate recommendation of short videos. In this paper, we design a recommendation model based on a graph neural network, which models users' long-term and short-term interests by two vector propagation methods, respectively.
ARTICLE | doi:10.20944/preprints202211.0093.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: automatic typesetting; media-neutral publishing; open access; open source; scholarly publishing; XML/HTML conversion
Online: 4 November 2022 (13:17:34 CET)
Due to resource constraints, most Diamond Open Access journals publish fewer than 25 articles per year, and 75% of journals are not able to provide their content in XML and HTML, primarily providing only PDFs (Bosman et al., 2021, p. 7-8). In order to keep up with larger commercial publishers, a high degree of automation and streamlining of processes is necessary. The Open Source Academic Publishing Suite (OS-APS) project, funded by the German Federal Ministry of Education and Research, aims to achieve this. OS-APS automatically extracts the underlying XML from Word manuscripts and offers optimization and export options in various formats (PDF, HTML, EPUB). The professional corporate design, e.g., of the PDFs, is handled automatically by using templates or by creating one's own with a Template Development Kit. OS-APS will also connect to scholarly-led and community-driven publishing platforms such as Open Journal Systems (OJS), Open Monograph Press (OMP), and DSpace: the software will be able to be integrated into a wide range of publication processes, whether at small, low-resource commercial Open Access publishers or institutional and Diamond Open Access publishers. References: Bosman, J., Frantsvåg, J. E., Kramer, B., Langlais, P.‑C., & Proudman, V. (2021). OA Diamond Journals Study. Part 1: Findings. https://doi.org/10.5281/zenodo.4558703
ARTICLE | doi:10.20944/preprints202211.0034.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Blockchain; Smart Contract; Point Cloud; Security; Privacy Preservation; Software-Defined Network (SDN); Big Data; Assurance; Resilience.
Online: 2 November 2022 (02:18:50 CET)
The rapid development of three-dimensional (3D) acquisition technology based on 3D sensors provides a large volume of data, which is often represented in the form of point clouds. Point cloud representation can preserve the original geometric information along with associated attributes in a 3D space. Therefore, it has been widely adopted in many scene-understanding-related applications such as virtual reality (VR) and autonomous driving. However, the massive amount of point cloud data aggregated from distributed 3D sensors also poses challenges for secure data collection, management, storage, and sharing. Thanks to its decentralized and secure nature, blockchain has great potential to improve point cloud services and enhance security and privacy preservation. Inspired by the rationales behind Software Defined Network (SDN) technology, this paper envisions SAUSA, a blockchain-based authentication network that is capable of recording, tracking, and auditing the access, usage, and storage of 3D point cloud datasets over their life-cycle in a decentralized manner. SAUSA adopts an SDN-enabled point cloud service architecture, which allows for efficient data processing and delivery to satisfy diverse Quality-of-Service (QoS) requirements. A blockchain-based authentication framework is proposed to ensure security and privacy preservation in point cloud data acquisition, storage, and analytics. By leveraging smart contracts to digitize access control policies and point cloud data on the blockchain, data owners have full control of their 3D sensors and point clouds. In addition, anyone can verify the authenticity and integrity of point clouds in use without relying on a third party. Moreover, SAUSA integrates a decentralized storage platform to store encrypted point clouds while recording references to the raw data on the distributed ledger. Such a hybrid on-chain and off-chain storage strategy not only improves robustness and availability but also ensures privacy preservation for sensitive information in point cloud applications. A proof-of-concept prototype is implemented and tested on a physical network. The experimental evaluation validates the feasibility and effectiveness of the proposed SAUSA solution.
ARTICLE | doi:10.20944/preprints202211.0015.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Unmanned Aerial Vehicle (UAV); Lightweight Blockchain; Drone Security; assurance; authentication; resilience
Online: 1 November 2022 (04:07:30 CET)
Rapid advancements in fifth generation (5G) communication technology and the mobile edge computing (MEC) paradigm have led to the proliferation of unmanned aerial vehicles (UAV) in urban air mobility (UAM) networks, which provide intelligent services for diversified smart city scenarios. Meanwhile, the widely deployed internet of drones (IoD) in smart cities also raises new concerns about performance, security, and privacy. The centralized framework adopted by conventional UAM networks is not adequate to handle high mobility and dynamicity. Moreover, it is necessary to ensure device authentication, data integrity, and privacy preservation in UAM networks. Thanks to its decentralization, traceability, and unalterability, blockchain is recognized as a promising technology to enhance security and privacy for UAM networks. In this paper, we introduce LightMAN, a lightweight microchained fabric for data assurance and resilience-oriented UAM networks. LightMAN is tailored for small-scale permissioned UAV networks, in which a microchain acts as a lightweight distributed ledger for security guarantees. Thus, participants are enabled to authenticate drones and verify the genuineness of data that is sent to/from drones without relying on a third-party agency. In addition, a hybrid on-chain and off-chain storage strategy is adopted that not only improves performance (e.g., latency and throughput) but also ensures privacy preservation for sensitive information in UAM networks. A proof-of-concept prototype is implemented and tested on a Micro Air Vehicle Link (MAVLink) simulator. The experimental evaluation validates the feasibility and effectiveness of the proposed LightMAN solution.
ARTICLE | doi:10.20944/preprints202210.0229.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: medical knowledge graphs; knowledge graphs reuse; ontology modularization
Online: 17 October 2022 (05:13:53 CEST)
During the creation and integration of a health care system based on medical knowledge graphs, it is necessary to review and select the vocabularies and definitions that best fit the information requirements of the system being developed. This implies the reuse of medical knowledge graphs; however, full importation of knowledge graphs is not a tractable solution in terms of memory requirements. In this paper we present a modularization-based method for knowledge graph reuse. A case study of graph reuse is presented by transforming the original model into a lighter one.
BRIEF REPORT | doi:10.20944/preprints202210.0208.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Quantum neural network; Breast cancer; Classical neural network; Machine learning; Mammography
Online: 14 October 2022 (10:11:54 CEST)
Computer-aided image diagnostics (CAD) has been used in many fields of diagnostic medicine. It relies heavily on classical computer vision and artificial intelligence. The quantum neural network (QNN) has been introduced by many researchers around the world and presented recently by research corporations such as Microsoft, Google, and IBM. In this paper, we investigate the validity of using the QNN algorithm for machine-based breast cancer detection. To validate the learnability of the QNN, a series of learnability tests was performed alongside a classical convolutional neural network (CCNN). The QNN is built using the Cirq library to simulate quantum computation on classical computers. A series of investigations was performed to study the learnability characteristics of the QNN and CCNN under the same computational conditions. The comparison was performed on real mammogram data sets. The investigations showed success in terms of recognizing the data and training. Our work shows better performance of the QNN in terms of successfully training and producing a valid model for smaller data sets compared to the CCNN.
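The abstract states that the QNN was built with Cirq and simulated on classical computers. The sketch below shows a minimal Cirq circuit of the parameterized kind a QNN trains; the gates and the single rotation angle are illustrative assumptions, not the paper's architecture.

```python
# Minimal Cirq sketch: a two-qubit parameterized circuit of the kind a QNN
# would train, simulated on a classical computer. Gates and the rotation
# angle are illustrative assumptions.
import cirq

q0, q1 = cirq.LineQubit.range(2)
theta = 0.3  # a trainable parameter in a real QNN

circuit = cirq.Circuit(
    cirq.H(q0),                 # put the first qubit into superposition
    cirq.CNOT(q0, q1),          # entangle the qubits
    cirq.rz(theta).on(q1),      # parameterized rotation ("weights")
    cirq.measure(q0, q1, key="m"),
)

result = cirq.Simulator().run(circuit, repetitions=200)
print(result.histogram(key="m"))  # counts over basis states 0..3
```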
ARTICLE | doi:10.20944/preprints202210.0129.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: PageRank; Time-Weighted PageRank; collective subjects; citation intensity; scientific research; research productivity; scientometrics
Online: 10 October 2022 (14:04:52 CEST)
This study aims to estimate the scientific productivity of collective subjects. The objective is to build a method for evaluating scientific productivity that allows calculating productivity even for new collective subjects with a small citation network; to this end, the paper proposes the Time-Weighted PageRank method with citation intensity (TWPR-CI). The Citation Network Dataset (ver. 13) has been analyzed to verify the method. The dataset includes more than 5 million scientific publications and 48 million citations. Four classes of collective subjects have been allocated (more than 27,000 collective subjects in total). For each class, scientific productivity estimates from 2000 to 2021 were calculated using the PageRank, Time-Weighted PageRank, and TWPR-CI methods. It is shown that the advantage of the TWPR-CI method is the higher sensitivity of its scientific productivity estimates for new collective subjects, on average, during the first ten years of observation. At the same time, the assessment of scientific productivity for other collective subjects with this method is stable. However, the small citation network of a new collective subject does not allow an adequate assessment of scientific productivity during the first years of its operation. Therefore, the TWPR-CI method can be used to assess the scientific productivity of collective subjects, in particular the productivity of new ones.
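The abstract does not give the TWPR-CI formula, so the sketch below only illustrates the general idea of time-weighted PageRank over a citation graph: recent citations receive larger edge weights before standard PageRank is run. The exponential decay and half-life are hypothetical stand-ins, not the paper's weighting.

```python
# Sketch of a time-weighted PageRank over a tiny citation graph with networkx.
# The exponential decay on citation age is a hypothetical stand-in for the
# paper's TWPR-CI weighting, which is not specified in the abstract.
import math
import networkx as nx

CURRENT_YEAR = 2021
HALF_LIFE = 5.0  # assumed decay half-life in years

citations = [  # (citing paper, cited paper, year of citation)
    ("p1", "p2", 2010),
    ("p3", "p2", 2020),
    ("p3", "p4", 2021),
    ("p4", "p2", 2019),
]

G = nx.DiGraph()
for src, dst, year in citations:
    age = CURRENT_YEAR - year
    weight = math.exp(-math.log(2) * age / HALF_LIFE)  # newer citations count more
    G.add_edge(src, dst, weight=weight)

scores = nx.pagerank(G, alpha=0.85, weight="weight")
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```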
ARTICLE | doi:10.20944/preprints202210.0064.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Homomorphic Encryption; Privacy-preserving Record Linkage; approximate string matching
Online: 6 October 2022 (10:31:25 CEST)
String matching is an important part of many real-world applications. It must be robust against variations in string fields. In record linkage across two different datasets, matching should detect patients common to both in spite of small variations. This becomes difficult in the case of confidential data, because data sharing between organizations is sometimes restricted for privacy reasons. Several techniques have been proposed for privacy-preserving approximate string matching, such as secure hash encoding. Relative to other techniques for approximate string matching, homomorphic encryption is very new. In this paper we propose a homomorphic-encryption-based approximate string matching technique for matching multiple attributes. No solution is currently available for multi-attribute matching using homomorphic encryption. We propose two different methods for multi-attribute matching. Compared to other existing approaches, our proposed method offers security guarantees and greater matching accuracy.
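As background, the sketch below shows a common plaintext primitive for approximate matching in record linkage: Dice similarity over character bigrams. The homomorphic-encryption layer the paper builds on top of such a primitive is omitted here, and the choice of bigram Dice similarity is an illustrative assumption.

```python
# Sketch of a plaintext approximate-matching primitive for record linkage:
# Dice similarity over character bigrams. The homomorphic-encryption layer
# is omitted here.
def bigrams(s: str) -> set:
    s = s.lower().strip()
    return {s[i:i + 2] for i in range(len(s) - 1)}

def dice_similarity(a: str, b: str) -> float:
    ba, bb = bigrams(a), bigrams(b)
    if not ba or not bb:
        return 0.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))

# Small spelling variations still score high, so the two records would link.
print(dice_similarity("catherine", "katherine"))  # 0.875
print(dice_similarity("catherine", "jonathan"))   # ~0.27
```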
ARTICLE | doi:10.20944/preprints202210.0043.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Commodities; Long Short-Term Memory; Machine Learning; Neural Networks; Prediction; Technical analysis
Online: 5 October 2022 (13:39:16 CEST)
This paper presents the development and implementation of a machine learning model to estimate the future price of commodities in the Brazilian market from technical analysis indicators. For this, two databases were obtained for the commodities sugar, cotton, corn, soybean, and wheat, which were submitted to the steps of data cleaning, pre-processing, and subdivision. From the pre-processed data, recurrent neural networks of the long short-term memory type were used to predict prices 1 and 3 days ahead. These models were evaluated using mean squared error, obtaining values between 0.00010 and 0.00037 on the test data for 1 day ahead and 0.00015 to 0.00041 for 3 days ahead. Based on the results obtained, it can be stated that the developed model achieved good prediction performance for all commodities evaluated.
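The sketch below shows a minimal LSTM regressor of the kind described, trained on windowed technical-indicator features to predict the next value. Window length, layer size, and all other hyperparameters are illustrative assumptions, not the paper's configuration, and the data is a random stand-in.

```python
# Minimal sketch of an LSTM regressor for next-day price prediction from
# windowed technical-indicator features. Shapes and hyperparameters are
# illustrative assumptions.
import numpy as np
import tensorflow as tf

WINDOW, N_FEATURES = 20, 6   # 20 past days, 6 technical indicators per day

# Dummy data standing in for the pre-processed commodity series.
X = np.random.rand(500, WINDOW, N_FEATURES).astype("float32")
y = np.random.rand(500, 1).astype("float32")  # normalized next-day price

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, N_FEATURES)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)

print(model.evaluate(X, y, verbose=0))  # mean squared error on the dummy data
```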
ARTICLE | doi:10.20944/preprints202209.0413.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Consortium Blockchain; Ring signature; Blockchain privacy; Blockchain security; Access Control; Blockchain big data
Online: 27 September 2022 (07:35:53 CEST)
Banking sectors are committing to modern working frameworks and models whose smooth development is based on decentralization, as banking confronts new areas and diverse activities. Privacy in consortium blockchains has become a major concern and challenge for most banking sectors. Data privacy requires protection against both insider and outsider threats; ring-signature-based access control (RSBAC) can therefore help secure privacy against inside and outside threats through a secure process built on the CIA triad of confidentiality, integrity, and availability. This paper proposes a ring-signature-based access control mechanism for determining who a user is and then regulating that person's access to and use of a system's resources. In a nutshell, access control restricts who has access to a system. It also restricts access to system resources to users who have been identified as having the necessary privileges and permissions. The proposed paradigm satisfies the needs of both workflow and non-workflow systems in an enterprise setting. The traits of conditional purposes, roles, responsibilities, and policies provide the foundation for it. It ensures that internal risks, such as those posed by database administrators, are protected against. Finally, it provides the necessary protection in the event that the data is published.
ARTICLE | doi:10.20944/preprints202209.0358.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Quantum Search; Qubit Management; Iterative Search
Online: 23 September 2022 (05:25:28 CEST)
Recent advances in quantum computing systems have attracted tremendous attention. Commercial companies, such as IBM, Amazon, and IonQ, have started to provide access to noisy intermediate-scale quantum computers. Researchers and entrepreneurs attempt to deploy applications that aim to achieve a quantum speedup. Grover's algorithm and quantum phase estimation are the foundations of many applications with the potential for such a speedup. While these algorithms, in theory, obtain marvelous performance, deploying them on existing quantum devices is a challenging task. For example, quantum phase estimation requires extra qubits and a large number of controlled operations, which are impractical due to low-qubit and noisy hardware. To fully utilize the limited onboard qubits, we develop a distributed application with a key-value data structure based on Grover's algorithm, called IQuCS. Consider a database with duplicates. By encoding each element as a binary type with a unique key and forming a key-value pair, we can count the number of occurrences of each element in the database using quantum computing. We have optimized the operation process by filtering data points to make it more efficient. To determine the effect of this optimization, we evaluate it with datasets of different sizes and with different numbers of duplicates. With the assistance of classical computers, IQuCS can reduce the problem set for each query. Due to this reduction, IQuCS requires fewer qubits. Through its iterative management, IQuCS achieves a reduction in virtualized qubit consumption of up to 66.2% with reasonable accuracy.
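IQuCS builds on Grover's algorithm. The sketch below is a textbook two-qubit Grover search in Cirq that amplifies a single marked state; it is a didactic illustration of the underlying primitive, not IQuCS's key-value encoding or its iterative qubit management.

```python
# Minimal two-qubit Grover search in Cirq that amplifies the marked state |11>.
# One Grover iteration suffices for 2 qubits. Didactic sketch only.
import cirq

q0, q1 = cirq.LineQubit.range(2)

circuit = cirq.Circuit(
    cirq.H(q0), cirq.H(q1),          # uniform superposition over 4 states
    cirq.CZ(q0, q1),                 # oracle: flips the phase of |11>
    cirq.H(q0), cirq.H(q1),          # diffusion operator begins
    cirq.X(q0), cirq.X(q1),
    cirq.CZ(q0, q1),
    cirq.X(q0), cirq.X(q1),
    cirq.H(q0), cirq.H(q1),          # diffusion operator ends
    cirq.measure(q0, q1, key="m"),
)

result = cirq.Simulator().run(circuit, repetitions=100)
print(result.histogram(key="m"))     # all shots land on state 3, i.e. |11>
```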
REVIEW | doi:10.20944/preprints202209.0338.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Road; Accidents; Black spots; spatial analysis; Factor analysis
Online: 22 September 2022 (09:41:07 CEST)
This paper deals with identifying accident black spots and the influencing factors causing accidents using factor analysis in a medium-sized city (Tirunelveli) in India. Following the literature review, a geospatial technique was used to identify the black spots and the factors causing accidents. The most influential factors driving accidents were identified and ranked based on the repeated occurrence of accidents in the black spot areas. Spearman's rank correlation was used to obtain the correlation among the factors causing accidents. The factor analysis technique was then used to identify and group the key factors driving the repeated accidents. This study will help transportation planners understand the factors causing accidents and take appropriate measures to reduce casualties both in the road construction planning stage and under existing conditions.
DATA DESCRIPTOR | doi:10.20944/preprints202209.0323.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: COVID-19; Open-source dataset; Drug Repurposing; Database system; Web application development; software development; Drug fingerprints; Bulk upload
Online: 21 September 2022 (10:14:11 CEST)
Although various vaccines are now commercially available, they have not been able to stop the spread of COVID-19 infection completely. An excellent strategy to quickly obtain safe, effective, and affordable COVID-19 treatment is to repurpose drugs that are already approved for other diseases as adjuvants alongside the ongoing vaccine regime. The process of developing an accurate and standardized drug repurposing dataset requires a considerable level of resources and expertise, due to the commercial availability of an extensive array of drugs that could potentially be used to address the SARS-CoV-2 infection. To address this bottleneck, we created the CoviRx platform. CoviRx is a user-friendly interface that provides access to manually curated COVID-19 drug repurposing data. Through CoviRx, the curated data has been made open-source to help advance drug repurposing research. CoviRx also encourages users to submit their findings, which are thoroughly validated and then merged under uniformity- and integrity-preserving constraints. This article discusses the various features of CoviRx and its design principles. CoviRx has been designed so that its functionality is independent of the data it displays. Thus, in the future, this platform can be extended to include any other disease X beyond COVID-19. CoviRx can be accessed at www.covirx.org.
ARTICLE | doi:10.20944/preprints202209.0306.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: AIoT; Artificial Intelligence; Assistive Technology; Deep Learning; Machine Learning
Online: 20 September 2022 (10:45:15 CEST)
According to the World Health Organization, about 15% of the world's population has some form of disability. Assistive Technology, in this context, contributes directly to overcoming the difficulties encountered by people with disabilities in their daily lives, allowing them to receive education and become part of the labor market and society in a worthy manner. Assistive Technology has made great advances in its integration with Artificial Intelligence of Things (AIoT) devices. AIoT processes and analyzes the large amount of data generated by IoT devices and applies Artificial Intelligence models, specifically Machine Learning, to discover patterns for generating insights and assisting in decision making. Based on a systematic literature review, this article aims to identify the Machine Learning models used in research on the Artificial Intelligence of Things applied to Assistive Technology. The survey of the topics approached in this article also highlights the context of such research, its applications, the IoT devices used, and gaps and opportunities for further development. Survey results show that 50% of the analyzed research addresses visual impairment, and for this reason, most of the topics cover issues related to computer vision. Portable devices, wearables, and smartphones constituted the majority of IoT devices. Deep Neural Networks represent 81% of the Machine Learning models applied in the reviewed research.
ARTICLE | doi:10.20944/preprints202209.0295.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Authentication; Encryption; Blockchain
Online: 20 September 2022 (05:51:42 CEST)
In this work we present a new algorithm that achieves perfect Shannon secrecy by means of the XOR function and a method that we call multiple key reuse. The algorithm has two execution modes: message authentication and data encryption. The XOR encryption scheme allows for batch encryption and exhibits Perfect Forward Secrecy (PFS). Furthermore, based on our fundamental algorithm, we have developed a new strategy for blockchain implementation that does not require Proof of Work (PoW), but defines a fair mechanism for miner selection and secure addition of blocks to the chain. Since our method is based mainly on the Boolean XOR function, the strength of the cryptosystem can be established directly thanks to its mathematical properties. Due to the risk that quantum computers represent for current cryptosystems based on prime factorization or the discrete logarithm, we postulate that our method represents a promising alternative in the quantum era for securing communications between Internet of Things devices as well as for blockchain technology.
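The core primitive named in the abstract is XOR encryption with information-theoretic secrecy. The sketch below shows the classic one-time-pad form of that primitive; the paper's multiple-key-reuse, authentication, and blockchain mechanisms are not reproduced here.

```python
# Minimal XOR one-time-pad sketch: with a truly random key as long as the
# message and used only once, the ciphertext is information-theoretically
# secure. The paper's key-reuse and blockchain mechanisms are not shown.
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, key))

message = b"meter reading: 42.7 kWh"
key = secrets.token_bytes(len(message))   # fresh random key, same length

ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)    # XOR is its own inverse

assert recovered == message
print(ciphertext.hex())
```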
ARTICLE | doi:10.20944/preprints202209.0286.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: intrusion detection; vulnerability exploit; machine learning; code-reuse attack; malware detection
Online: 20 September 2022 (03:10:37 CEST)
Recent years have witnessed a rapid growth of code-reuse attacks in advanced persistent threats and cyberspace crimes. Carefully crafted code-reuse exploits circumvent modern protection mechanisms and hijack the execution flow of a program to perform the desired functionality by chaining together existing code. The sophistication and intricacy of code-reuse exploits hinder their scrutiny and dissection. Although the previous literature has introduced some feasible approaches, effectiveness and reliability in practical applications remain severe challenges. To address this issue, we propose Horus, a data-driven framework for effective and reliable detection of code-reuse exploits. In order to raise effectiveness against underlying noise, we comprehensively leverage the strengths of time-series and frequency-domain analysis, and propose a learning-based detector that synthesizes these two complementary kinds of features. We then employ a lightweight interpreter to speculatively and tentatively translate the suspicious bytes, to open the black box and enhance reliability and interpretability. Additionally, a functionality-preserving data augmentation is adopted to increase the diversity of the limited training data and improve generality for real-world deployment. Comparative experiments and ablation studies are conducted on a dataset composed of real-world instances to verify the capabilities of Horus. The experimental results illustrate that Horus outperforms existing methods in identifying code-reuse exploits from data streams with an acceptable overhead. Horus does not rely on any dynamic execution and can be easily integrated into existing defense systems. Moreover, Horus is able to provide tentative interpretations of attack semantics irrespective of the target program, which further improves the system's effectiveness and reliability.
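Horus is described as combining time-series and frequency-domain views of suspicious byte streams. The sketch below illustrates that idea with simple statistics and FFT magnitudes computed from a raw byte buffer; the concrete features and the downstream detector are assumptions for illustration, not the paper's design.

```python
# Sketch: complementary time-domain and frequency-domain features extracted
# from a raw byte stream. The concrete features and the classifier that would
# consume them are illustrative assumptions.
import numpy as np

def extract_features(buf: bytes, n_freq_bins: int = 8) -> np.ndarray:
    x = np.frombuffer(buf, dtype=np.uint8).astype(np.float64)
    # Time-domain view: simple summary statistics of the byte values.
    time_feats = np.array([x.mean(), x.std(), np.abs(np.diff(x)).mean()])
    # Frequency-domain view: magnitudes of the leading FFT coefficients.
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freq_feats = spectrum[:n_freq_bins]
    if freq_feats.size < n_freq_bins:                     # pad short buffers
        freq_feats = np.pad(freq_feats, (0, n_freq_bins - freq_feats.size))
    return np.concatenate([time_feats, freq_feats])

features = extract_features(b"\x90" * 64 + bytes(range(64)))
print(features.shape)  # (11,) feature vector to feed a learned classifier
```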
ARTICLE | doi:10.20944/preprints202209.0212.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Graph Neural Network; Recommendation; Social Relationship
Online: 14 September 2022 (16:09:15 CEST)
There is a considerable amount of research on online social networks, most of which focuses on the structural analysis of social graphs. The interpersonal relationships of social networks, especially friend circles, can solve the cold-start and sparsity problems, and through the relationships between social networks we can effectively recommend users' favorite items, such as music, videos, brands/products, preferred tags, locations, services, etc. User relationships in social networks are diverse, and different social networks offer many different perspectives. Associations among users can form multi-layered composite networks, and multi-layered social networks present new challenges and opportunities. Different relationships can influence users' preferences to different degrees, which in turn affects their behavior. Therefore, fusing multiple social networks is an effective way to improve recommendations. Although some studies have started to address recommendation over multiple social networks, simple linear superposition cannot reflect the coupling and nonlinear associations between multiple social networks. Against this background, we propose a graph neural network recommendation model that incorporates social relationships. We first propose to compute the second-order collaborative signals and their intensities directly from the adjacency matrix for updating the node embeddings of the graph convolution layer. Second, by embedding historical evaluations and the various social networks constituting different dimensions, attention-based integration of user preferences across different social networks is achieved. Theoretical derivation and experimental validation demonstrate its effectiveness and scalability.
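The abstract proposes computing second-order collaborative signals and their intensities directly from the adjacency matrix. The numpy sketch below shows the basic mechanism: the product of a user-item adjacency matrix with its transpose counts two-hop co-interactions; the row normalization chosen to turn counts into weights is an illustrative assumption.

```python
# Sketch: second-order collaborative signals from a user-item adjacency
# matrix. (A @ A.T)[u, v] counts the items users u and v share, i.e. the
# intensity of their two-hop connection; the normalization is illustrative.
import numpy as np

# Rows = users, columns = items; 1 means "user interacted with item".
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
], dtype=float)

second_order = A @ A.T             # user-user co-interaction counts
np.fill_diagonal(second_order, 0)  # drop trivial self-connections

# Row-normalize so each user's second-order neighbors sum to 1.
row_sums = second_order.sum(axis=1, keepdims=True)
weights = np.divide(second_order, row_sums,
                    out=np.zeros_like(second_order), where=row_sums > 0)

print(weights)  # weights[u, v]: strength of the u-v second-order signal
```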
ARTICLE | doi:10.20944/preprints202207.0283.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: information technology; impact; society; future
Online: 13 September 2022 (08:27:14 CEST)
As we are aware, Information Technology had its beginnings in the late sixties of the last century, when the ARPANET was introduced, funded by the Department of Defense of the USA. Since then, the IT industry has come a long way to its current form, in which it plays a dominant role in every sphere of life. It has made revolutionary changes in information gathering and dissemination and in worldwide communication. It is evolving into a truly paperless work environment. We can now send a message easily to anywhere in the world in seconds. From an educational point of view, we can hold a virtual class in which the teacher sits in any part of the world and the students, scattered across different parts of the world, attend via video conference, with presentation of study materials as well as question-and-answer sessions. A doctor sitting in any part of the world can now perform a surgical procedure on a patient lying in another part of the world. These examples show where we stand today compared with half a century ago. But as we know, nothing in this world is purely good, as everything has a dark side. In this paper, we discuss the merits and demerits of implementing IT globally and where we are heading in the future.
ARTICLE | doi:10.20944/preprints202209.0109.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Kalman filter; median filter; impulse noise; estimate prediction; object distance determination; lidar; value calibration; point cloud.
Online: 7 September 2022 (10:20:49 CEST)
The task of determining the distance from one object to another is one of the important tasks solved in robotics systems. Conventional algorithms rely on an iterative process of predicting distance estimates, which results in an increased computational burden. Algorithms used in robotic systems should require minimal computation time and be resistant to the presence of noise. To solve these problems, the paper proposes an algorithm combining Kalman filtering with a Goldschmidt divider and a median filter. Software simulation showed an increase in the accuracy of estimate prediction for the developed algorithm in comparison with the traditional filtering algorithm, as well as an increase in the speed of the algorithm. The results obtained can be effectively applied in various computer vision systems.
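The sketch below illustrates the filtering pipeline named in the abstract in its simplest form: a median pre-filter to suppress impulse noise followed by a scalar Kalman filter over range readings. The Goldschmidt divider the paper uses to speed up the division steps is a hardware-level optimization not reproduced here, and the noise parameters are illustrative assumptions.

```python
# Sketch: median pre-filter (impulse noise) followed by a scalar Kalman
# filter for lidar range readings. Noise parameters are illustrative.
import numpy as np

def median_filter(z, window=3):
    pad = window // 2
    padded = np.pad(z, pad, mode="edge")
    return np.array([np.median(padded[i:i + window]) for i in range(len(z))])

def kalman_1d(z, q=1e-3, r=0.04):
    x, p = z[0], 1.0                 # initial state estimate and covariance
    estimates = []
    for meas in z:
        p = p + q                    # predict (constant-distance model)
        k = p / (p + r)              # Kalman gain (the division a Goldschmidt divider speeds up)
        x = x + k * (meas - x)       # update with the measurement
        p = (1 - k) * p
        estimates.append(x)
    return np.array(estimates)

true_dist = 5.0
readings = true_dist + 0.2 * np.random.randn(50)
readings[[10, 30]] = 25.0            # impulse-noise outliers

smoothed = kalman_1d(median_filter(readings))
print(round(smoothed[-1], 2))        # close to 5.0
```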
ARTICLE | doi:10.20944/preprints202209.0094.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Blockchain; Cryptography; DApp; Health Data; Privacy.
Online: 7 September 2022 (03:06:09 CEST)
With the fast development of blockchain technology in recent years, its application in scenarios that require privacy, such as the health area, has become encouraged and widely discussed. This paper presents an architecture to ensure the privacy of health-related data, which are stored and shared within a blockchain network in a decentralized manner, through the use of encryption with the RSA, ECC, and AES algorithms. Evaluation tests were performed to verify the impact of cryptography on the proposed architecture in terms of computational effort, memory usage, and execution time. The results demonstrate an impact mainly on the execution time and on the increase in computational effort for sending data to the blockchain, which is, however, justifiable considering the privacy and security provided by the architecture and encryption.
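The sketch below shows only the symmetric (AES) part of the encryption the abstract mentions, using AES-GCM from the `cryptography` package to protect a record before it is sent to the chain. The RSA/ECC layer of the paper's architecture and its key management are omitted, and the record fields are illustrative assumptions.

```python
# Minimal sketch of the symmetric (AES) layer: encrypt a health record with
# AES-GCM before it is sent to the blockchain. RSA/ECC key exchange and key
# management are omitted; field names are illustrative.
import json
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

record = json.dumps({"patient": "anon-17", "heart_rate": 72}).encode()

key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)                       # 96-bit nonce, unique per message
aesgcm = AESGCM(key)

ciphertext = aesgcm.encrypt(nonce, record, associated_data=None)
plaintext = aesgcm.decrypt(nonce, ciphertext, associated_data=None)

assert plaintext == record
print(len(ciphertext), "bytes stored on-chain instead of the raw record")
```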
REVIEW | doi:10.20944/preprints202209.0032.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: cybersecurity; machine learning; deep learning; artificial intelligence; data-driven decision making; automation; cyber analytics; intelligent systems;
Online: 2 September 2022 (03:32:48 CEST)
Due to the digitization and Internet of Things revolutions, the present electronic world has a wealth of cybersecurity data. Efficiently resolving cyber anomalies and attacks is becoming a growing concern in today's cyber security industry all over the world. Traditional security solutions are insufficient to address contemporary security issues due to the rapid proliferation of many sorts of cyber-attacks and threats. Utilizing artificial intelligence knowledge, especially machine learning technology, is essential to providing a dynamically enhanced, automated, and up-to-date security system through analyzing security data. In this paper, we provide an extensive view of machine learning algorithms, emphasizing how they can be employed for intelligent data analysis and automation in cybersecurity through their potential to extract valuable insights from cyber data. We also explore a number of potential real-world use cases where data-driven intelligence, automation, and decision-making enable next-generation cyber protection that is more proactive than traditional approaches. The future prospects of machine learning in cybersecurity are finally emphasized based on our study, along with relevant research directions. Overall, our goal is to explore not only the current state of machine learning and relevant methodologies but also their applicability for future cybersecurity breakthroughs.
ARTICLE | doi:10.20944/preprints202209.0009.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Digital Twin; Internet-of-Medical-Things (IoMT); Security; Privacy; Blockchain; Non-fungible Token (NFT); Virtual Healthcare Services; Access Control; Data Sharing
Online: 1 September 2022 (07:21:25 CEST)
Seniors' safety is a compelling need, which necessitates 24/7 real-time monitoring and timely recognition of dangerous actions. Being able to mirror characteristics of physical objects (PO) to corresponding logical objects (LO), seamlessly monitor their footprints, and thus detect anomalous parameters, Digital Twins (DT) have been considered a practical way to provide virtual health services for seniors' safety. Meanwhile, the widely adopted Internet of Medical Things (IoMT), consisting of wearable sensors and non-contact optical cameras for self and remote health data monitoring, also raises concerns about information security and privacy violations. Therefore, securing POs and LOs and enabling reliable data sharing among healthcare professionals are challenging when constructing trusted and privacy-preserving virtual health services. Thanks to its decentralization, traceability, and unalterability, blockchain is promising for enhancing security and privacy properties in many areas, such as data analysis, finance, and healthcare. This paper envisions a lightweight authentication framework (LAF) to enable secure and privacy-preserving virtual healthcare services. By leveraging Non-Fungible Token (NFT) technology to tokenize LOs and data streams on the blockchain, anyone can certify the authenticity of a digital LO along with the data synchronized with its PO without relying on a third-party agency. In addition, NFT-based tokenization not only allows owners to fully control their IoMT devices and data, but also enables verifiable ownership and traceable transferability during the data sharing process. Moreover, an NFT contains only references to encrypted raw data that are saved in off-chain storage such as local files or distributed databases; such a hybrid storage strategy ensures privacy preservation for sensitive information. A proof-of-concept prototype is implemented and tests are conducted on a case study of seniors' safety. The experimental evaluation shows the feasibility and effectiveness of the proposed LAF solution.
ARTICLE | doi:10.20944/preprints202208.0427.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: cloud-native; observability; cloud computing; logging; structured logging; logs; metrics; traces; distributed tracing; log aggregation; log forwarding; log consolidation
Online: 25 August 2022 (07:32:18 CEST)
Background: Cloud-native software systems often have a much more decentralized structure and many independently deployable and (horizontally) scalable components, making it more complicated to create a shared and consolidated picture of the overall decentralized system state. Today, observability is often understood as a triad of collecting and processing metrics, distributed tracing data, and logging. The result is often a complex observability system composed of three stovepipes whose data is difficult to correlate. Objective: This study analyzes whether these three historically emerged observability stovepipes of logs, metrics, and distributed traces could be handled in a more integrated way and with a more straightforward instrumentation approach. Method: This study applied an action research methodology used mainly in industry-academia collaboration and common in software engineering. The research design utilized iterative action research cycles, including one long-term use case. Results: This study presents a unified logging library for Python and a unified logging architecture that uses the structured logging approach. The evaluation shows that several thousand events per minute are easily processable. Conclusion: The results indicate that a unification of the current observability triad is possible without the necessity to develop utterly new toolchains.
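The study builds on structured logging. As a generic illustration (the paper's own unified logging library is not reproduced here), the sketch below emits each log event as one JSON object using the Python standard library, so log text, metric-like fields, and trace identifiers travel in a single record that downstream forwarders can correlate; the field names are assumptions.

```python
# Minimal structured-logging sketch with the Python standard library: every
# event is one JSON object, so logs, metric-like fields, and trace IDs can be
# forwarded and correlated together. Field names are illustrative.
import json
import logging
import sys
import time

class JsonFormatter(logging.Formatter):
    def format(self, record):
        event = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(record.created)),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        event.update(getattr(record, "fields", {}))   # structured extras
        return json.dumps(event)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("checkout-service")
log.addHandler(handler)
log.setLevel(logging.INFO)

# One event carrying a log message, a metric-like value, and a trace id.
log.info("order placed", extra={"fields": {"latency_ms": 42, "trace_id": "abc123"}})
```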
ARTICLE | doi:10.20944/preprints202208.0382.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: IP traceback; smart mesh Microgrid; NS-3; real secure testbed
Online: 22 August 2022 (11:16:04 CEST)
Today's major challenge for smart Microgrids is to ensure the security of communications over a large number of changing data sets that are vulnerable to constantly evolving denial-of-service attacks. Internet Protocol Traceback defines a set of methods that help identify the source of an attack with minimal requirements for memory and processing. However, the concept of Traceback is not yet being used in smart Microgrids. As a result, the main challenge of this article is to incorporate a new Traceback approach into the cybernetic system of a smart mesh Microgrid, which can be tested using a network simulator (NS-3) based on delay, throughput, and packet loss rate parameters. The simulation results show the efficacy of this approach compared to others existing in the literature. Furthermore, using the proposed Traceback technique and the mesh nodes, we were able to create a smart meshed Microgrid. Using the proposed Traceback approach to combine Intel Galileo Gen. 1 nodes with Compex WLE200NX 802.11a/b/g/n modules, we established a secure test bench, deployed as a prototype at the Sfax Digital Research Center in Tunisia. By identifying all attack vectors and revealing their origins, we could boost the efficiency of our operation by 100%.
ARTICLE | doi:10.20944/preprints202208.0353.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: recommender; multimodal; context-aware
Online: 19 August 2022 (03:07:04 CEST)
The advent of the era of big data brings more convenience to people and greater development to society. At the same time, it also brings the problem of 'information overload': when people are faced with huge amounts of data, much of it is redundant and worthless, and this redundant, worthless data seriously interferes with the accurate selection of information. Even though people can use Internet search engines to access information, search engines cannot meet the personalized needs of a particular user in a particular context. Therefore, how to find useful and valuable information quickly has become one of the key issues in the development of big data. With the advent of the era of big data, recommendation systems, as an important technology to alleviate information overload, have been widely used in the field of e-commerce. Recommender systems suffer from a key problem: data sparsity. The sparsity of users' historical rating data causes insufficient training of collaborative filtering recommendation models, which leads to a significant decrease in the accuracy of recommendations. In fact, traditional recommendation systems tend to focus on rating information and ignore the context in which users interact. Various kinds of contextual modal information exist in people's real lives, and they also play an important role in the recommendation process. In this paper we perform data reduction and feature extraction, addressing the problem of sparse data in the recommendation process. An interaction context-aware sub-model is constructed based on a tensor decomposition model with interaction context information, to model the specific influence of the interaction context in the recommendation process. Then an attribute context-aware sub-model is constructed based on the matrix decomposition model, using attribute context information to model the influence of user attribute contexts and item attribute contexts on recommendations. In the process of building the model, the method not only utilizes the explicit feedback rating information of users in the original dataset, but also utilizes the interaction context and attribute context information of the implicit feedback and the unlabeled rating data. We evaluate our model with extensive experiments. The results illustrate the effectiveness of our recommender model.
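The attribute context-aware sub-model is described as building on matrix decomposition. The sketch below shows the plain matrix-factorization core that such a sub-model extends: user and item latent vectors learned from explicit ratings with SGD. The context-aware terms of the paper's model are omitted, and the dimensions, learning rate, and ratings are illustrative assumptions.

```python
# Sketch of a matrix-factorization core: learn user and item latent vectors
# from explicit ratings with SGD. Context-aware terms are omitted; the
# hyperparameters and ratings below are illustrative.
import numpy as np

ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 0, 1.0), (2, 2, 5.0)]
n_users, n_items, k = 3, 3, 4

rng = np.random.default_rng(0)
P = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
Q = 0.1 * rng.standard_normal((n_items, k))   # item latent factors

lr, reg = 0.05, 0.02
for _ in range(200):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

print(round(P[0] @ Q[2], 2))   # predicted rating for an unseen user-item pair
```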
ARTICLE | doi:10.20944/preprints202208.0331.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Drug-Target Binding Affinity; Multi-Instance Learning; Transformer
Online: 18 August 2022 (03:58:34 CEST)
The prediction of drug-target interactions plays a fundamental role in facilitating drug discovery, where the goal is to find prospective drug candidates. With the increase in the number of known drug-protein interactions, machine learning techniques, especially deep learning methods, have become applicable to drug-target interaction discovery because they significantly reduce the required experimental workload. In this paper, we present a natural formulation of the drug-target interaction prediction problem as an instance of multi-instance learning. We address the problem in three stages: first organizing the given drug and target sequences into instances via a private-public mechanism, then computing predicted scores for all instances in the same bag, and finally combining all the predicted scores into the output prediction. A comprehensive evaluation demonstrates that the proposed method outperforms other state-of-the-art methods on three benchmark datasets.
ARTICLE | doi:10.20944/preprints202208.0023.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: IoT; IoE; Blockchain; Rollup; Zero-Knowledge; Zk-Rollup; Scalability
Online: 1 August 2022 (11:46:06 CEST)
The Internet of Things includes all connected objects, from small embedded systems with low computational power and storage capacities to efficient ones, as well as moving objects like drones and autonomous vehicles. The concept of the Internet of Everything expands upon this idea by adding people, data, and processing. The adoption of such systems is exploding and becoming ever more significant, bringing with it questions related to the security and privacy of these objects. A natural solution to data integrity, confidentiality, and single-point-of-failure vulnerability is the use of blockchains. Blockchains can be used as an immutable data layer for storing information, avoiding single-point-of-failure vulnerability via decentralization and providing strong security and cryptographic tools for the IoE. However, the adoption of blockchain technology in such heterogeneous systems, containing light devices, presents several challenges and practical issues that need to be overcome. Indeed, most of the solutions proposed to adapt blockchains to devices with low resources have difficulty maintaining decentralization or security. The most interesting are probably the Layer 2 solutions, which build off-chain systems strongly connected to the blockchain. Among these, the zk-rollup is a promising new generation of Layer 2/off-chain schemes which can remove the last obstacles to blockchain adoption in IoT or, more generally, in IoE. Despite their promise, illustrated by recent systems proposed by startups and private companies, very few scientific publications explaining or applying this barely-known technology have been published, especially for non-financial systems. In this context, the objective of our paper is to fill this gap for IoE systems in two steps. We first propose a synthetic review of recent proposals to improve scalability, including on-chain (consensus, blockchain organization, ...) and off-chain (sidechain, rollups) solutions, and we demonstrate that zk-rollups are the most promising ones. In a second step, we focus on IoE by describing several interesting features (scalability, dynamicity, data management, ...) that are illustrated with various general IoE use cases.
ARTICLE | doi:10.20944/preprints202207.0461.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Malaria; digital; epidemic; mixed infections; reinforcement
Online: 29 July 2022 (11:25:46 CEST)
Malaria is a long-standing disease and one of the top life-threatening diseases, yet its treatment has not changed even as the world has embraced the Fourth Industrial Revolution (4IR). A wave of research on digitizing the monitoring mechanisms for such a deadly disease has surfaced. Automated malaria screening is one of the detection processes gaining popularity in the research domain. However, the process needs to be coupled with other processes aimed at a nationally or regionally contextualised malaria monitoring system. This paper proposes a digital malaria monitoring system in the context of an African country or region. One advantage of such a digital system is that it enables a novel disease spread forecasting model based on the dynamics of different malaria types. The architecture of the diagnosis system is described, and the disease spread is mathematically modelled as a SPITR (Susceptible-Protected-Infected-Treated-Recovered) epidemic model, which is further analysed. The forecasting model is expressed and analysed, and experiments are conducted using a Monte Carlo simulation method. The design of the monitoring system has inspired how predictions can be made in complex cases such as mixed infections. Results show that reinforcing the model parameters makes a significant improvement in disease prediction.
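The abstract names the SPITR compartments but not their transition rates, so the sketch below is only a generic discrete-time compartment simulation with assumed rates; the paper's actual parameters, analysis, and reinforcement scheme are not reproduced.

```python
# Discrete-time sketch of an SPITR (Susceptible-Protected-Infected-Treated-
# Recovered) compartment model. All transition rates are illustrative
# assumptions.
def simulate_spitr(days=120, N=10_000,
                   beta=0.30,    # infection rate from contact with infected
                   p_rate=0.01,  # susceptible -> protected (e.g. bed nets)
                   tau=0.20,     # infected -> treated
                   gamma=0.10):  # treated -> recovered
    S, P, I, T, R = N - 10, 0, 10, 0, 0
    history = []
    for _ in range(days):
        new_inf = beta * S * I / N
        new_prot = p_rate * S
        new_treat = tau * I
        new_rec = gamma * T
        S += -new_inf - new_prot
        P += new_prot
        I += new_inf - new_treat
        T += new_treat - new_rec
        R += new_rec
        history.append((S, P, I, T, R))
    return history

final = simulate_spitr()[-1]
print([round(x) for x in final])   # compartment sizes after 120 days
```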
COMMUNICATION | doi:10.20944/preprints202206.0172.v3
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Monkeypox; monkey pox; Twitter; Dataset; Tweets; Social Media; Big Data; Data Mining; Data Science
Online: 25 July 2022 (09:41:19 CEST)
COMMUNICATION | doi:10.20944/preprints202204.0299.v3
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: elderly; aging population; ambient intelligence; fall detection; indoor localization; real-world implementation; sensors; activities of daily living; assisted living
Online: 21 July 2022 (10:46:08 CEST)
Falls, highly common in the constantly increasing global aging population, can have a variety of negative effects on health, well-being, and quality of life, including restricting the capability to conduct Activities of Daily Living (ADLs), which are crucial for one's sustenance. Timely assistance during falls is highly necessary, and it involves tracking the indoor location of the elderly during the diverse navigational patterns associated with ADLs to detect the precise location of a fall. With the globally decreasing caregiver population, it is important that future intelligent living environments can detect falls during ADLs while also tracking the indoor location of the elderly in the real world. Prior works in these fields have several limitations, such as the lack of functionality to detect both falls and indoor location, high cost of implementation, complicated design, the requirement of multiple hardware components for deployment, and the necessity to develop new hardware for implementation, all of which make the wide-scale deployment of such technologies challenging. To address these challenges, this work proposes a cost-effective and simple design paradigm for an Ambient Assisted Living system that can capture the multimodal components of user behavior during ADLs that are necessary for performing fall detection and indoor localization simultaneously in the real world. Proof-of-concept results from real-world experiments are presented to uphold the effective working of the system. The findings from two comparison studies with prior works in this field are also presented to uphold the novelty of this work. The first comparison study shows how the proposed system outperforms prior works in indoor localization and fall detection in terms of the effectiveness of its software and hardware design. The second comparison study shows that the development cost of this system is the lowest among prior works in these fields that involved real-world development of the underlying systems, thereby upholding its cost-effective nature.
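For readers unfamiliar with how fall detection from motion data works in principle, the sketch below shows a generic threshold-based check on tri-axial accelerometer samples (impact spike followed by low movement). It is not the authors' multimodal method; the thresholds and sampling rate are assumptions used purely for illustration.

```python
import math

# Generic illustration of threshold-based fall detection; the thresholds
# below are assumptions, not values from the paper.
IMPACT_G = 2.5   # acceleration magnitude (in g) suggesting an impact
STILL_G = 1.2    # near-1g readings afterwards suggest the person is lying still

def detect_fall(samples, rate_hz=50):
    """samples: list of (ax, ay, az) in g. Returns index of suspected fall or None."""
    mags = [math.sqrt(ax**2 + ay**2 + az**2) for ax, ay, az in samples]
    for i, m in enumerate(mags):
        if m > IMPACT_G:
            # inspect roughly the two seconds after the impact for low movement
            window = mags[i + rate_hz : i + 3 * rate_hz]
            if window and max(window) < STILL_G:
                return i
    return None
```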
DATA DESCRIPTOR | doi:10.20944/preprints202206.0146.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: COVID-19; COVID; Omicron; online learning; remote learning; online education; Twitter; dataset; Tweets; social media; Big Data
Online: 21 July 2022 (08:05:19 CEST)
COMMUNICATION | doi:10.20944/preprints202206.0383.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Exoskeleton; Twitter; Tweets; Big Data; social media; Data Mining; dataset; Data Science; Natural Language Processing; Information Retrieval
Online: 21 July 2022 (04:06:53 CEST)
Exoskeleton technology has been advancing rapidly in the recent past due to its multitude of applications and diverse use cases in assisted living, military, healthcare, firefighting, and Industry 4.0. The exoskeleton market is projected to grow to several times its current value within the next two years. Therefore, it is crucial to study the degree and trends of user interest, views, opinions, perspectives, attitudes, acceptance, feedback, engagement, buying behavior, and satisfaction towards exoskeletons, for which the availability of Big Data of conversations about exoskeletons is necessary. Today's Internet of Everything lifestyle, characterized by people spending more time on the internet than ever before, particularly on social media platforms, holds the potential for the development of such a dataset through the mining of relevant social media conversations. Twitter, one such social media platform, is highly popular amongst all age groups, and the topics found in its conversation paradigms include emerging technologies such as exoskeletons. To address this research challenge, this work makes two scientific contributions to this field. First, it presents an open-access dataset of about 140,000 tweets about exoskeletons that were posted in a 5-year period from May 21, 2017, to May 21, 2022. Second, based on a comprehensive review of recent works in the fields of Big Data, Natural Language Processing, Information Retrieval, Data Mining, Pattern Recognition, and Artificial Intelligence that may be applied to relevant Twitter data for advancing research, innovation, and discovery in the field of exoskeleton research, a total of 100 Research Questions are presented for researchers to study, analyze, evaluate, ideate, and investigate based on this dataset.
REVIEW | doi:10.20944/preprints202207.0259.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Blockchain; Blockchain Technology; Cryptocurrency; Applications; Challenges; Opportunities
Online: 18 July 2022 (10:10:46 CEST)
Blockchain technology has attracted attention and adoption in various countries and organizations around the world. Many sectors, including finance, medical services, supply chains, security, libraries, and the Internet of Things, are exploring it, and many businesses now incorporate blockchain technology into their systems. Despite its strengths, blockchain still faces challenges in security, privacy, scalability, and other areas. This paper examines recent advances in blockchain technology, as well as its applications and challenges. While many blockchain papers focus on digital currencies, IoT, and security, this paper surveys the overall state of the art of blockchain technology, its recent developments, and its alternatives, particularly in areas other than cryptocurrencies. Our goal is to provide a thorough review of the cryptography underlying blockchain to better understand the technology. We also survey public and enterprise blockchains, as well as future research opportunities and their implications for blockchain technology.
REVIEW | doi:10.20944/preprints202207.0190.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: cloud computing; data storage; users; service provider; software; hardware
Online: 13 July 2022 (04:52:59 CEST)
The popularity of cloud computing is growing owing to its large data storage capacity and high computation power. It provides online, on-demand, scalable application solutions, removes hardware and software barriers for non-specialists, enables rapid integration and deployment of desired facilities, and supports quick upgrading and the addition of features. Users benefit from selecting the appropriate cloud computing platform for their projects. Here, our paper provides a comprehensive overview of the services provided to users by the most common cloud computing service providers. This paper could be used as a reference when selecting the best service provider based on the requirements of a project.
REVIEW | doi:10.20944/preprints202007.0642.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: COVID-19; IoT; Blockchain; Contact Tracing; Mobile Applications
Online: 8 July 2022 (08:04:39 CEST)
COVID-19 is an exponentially spreading disease that has prompted nations to use technology to detect coronavirus infection. Many nations are working hard to fight COVID-19 and have been using a range of devices to combat the pandemic, gathering information about its growth and enabling monitoring, which also raises the risk of leaking residents' confidential information. This research aims to assist infected people online using the Internet of Things (IoT) and Blockchain technologies through smart devices. IoT-based healthcare devices gather useful information, provide additional insight through symptoms and behaviors, allow remote monitoring, and give people better self-determination and healthcare. Blockchain allows the secure transfer of patient health information and regulates the medical distribution network. A four-layer architecture using IoT and Blockchain is proposed to detect and prevent COVID-19 infection in individuals. This research provides a framework for patients with COVID-19 and supports the online recognition of health issues and diagnosis. Smart devices such as smartphones can install mobile apps such as Aarogya Setu and Tawakkalna, which can track COVID-19 patients properly. Installing such apps on smart devices aims to reduce time and cost and to improve the monitoring of an infected patient's condition. Many research works focus on investigating, analyzing, and supporting affected individuals through the course of COVID-19 infection. Finally, various mobile apps are identified and discussed in this paper.
ARTICLE | doi:10.20944/preprints202205.0238.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: COVID-19; SARS-CoV-2; Omicron; Twitter; tweets; sentiment analysis; big data; Natural Language Processing; Data Science; Data Analysis
Online: 7 July 2022 (08:36:40 CEST)
This paper presents the findings of an exploratory study on the continuously generated Big Data on Twitter related to the sharing of information, news, views, opinions, ideas, knowledge, feedback, and experiences about the COVID-19 pandemic, with a specific focus on the Omicron variant, which is the globally dominant variant of SARS-CoV-2 at this time. A total of 12,028 tweets about the Omicron variant were studied, and the specific characteristics of tweets that were analyzed include sentiment, language, source, type, and embedded URLs. The findings of this study are manifold. First, from sentiment analysis, it was observed that 50.5% of tweets had the 'neutral' emotion. The other emotions - 'bad', 'good', 'terrible', and 'great' - were found in 15.6%, 14.0%, 12.5%, and 7.5% of the tweets, respectively. Second, the findings of language interpretation showed that 65.9% of the tweets were posted in English, followed by Spanish or Castilian, French, Italian, Japanese, and other languages, found in 10.5%, 5.1%, 3.3%, 2.5%, and <2% of the tweets, respectively. Third, the findings from source tracking showed that "Twitter for Android" was associated with 35.2% of tweets, followed by "Twitter Web App", "Twitter for iPhone", "Twitter for iPad", "TweetDeck", and all other sources, which accounted for 29.2%, 25.8%, 3.8%, 1.6%, and <1% of the tweets, respectively. Fourth, studying the type of tweets revealed that retweets accounted for 60.8% of the tweets, followed by original tweets and replies, which accounted for 19.8% and 19.4% of the tweets, respectively. Fifth, in terms of embedded URL analysis, the most common domain embedded in the tweets was found to be twitter.com, followed by biorxiv.org, nature.com, wapo.st, nzherald.co.nz, recvprofits.com, science.org, and other URLs. Finally, to support similar research and development in this field centered around the analysis of tweets, we have developed an open-access Twitter dataset that comprises tweets about the SARS-CoV-2 Omicron variant since the first detected case of this variant on November 24, 2021.
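As a minimal sketch of how percentage distributions like those reported above could be computed from a tabular export of the dataset, the snippet below uses pandas. The file name and column names ("sentiment", "lang", "source", "tweet_type") are assumptions; the dataset's actual schema may differ.

```python
import pandas as pd

# Hypothetical CSV export of the tweet dataset with hypothetical column names.
tweets = pd.read_csv("omicron_tweets.csv")

for column in ["sentiment", "lang", "source", "tweet_type"]:
    # percentage share of each category, rounded to one decimal place
    share = tweets[column].value_counts(normalize=True).mul(100).round(1)
    print(f"\n{column} distribution (%):")
    print(share.head(6))
```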
ARTICLE | doi:10.20944/preprints202207.0056.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: deep learning; convolutional neural networks; classification; machine learning; IoT
Online: 5 July 2022 (04:22:49 CEST)
Human actions in videos are three-dimensional (3D) spatiotemporal signals, and 3D convolutional neural networks (CNNs) are a promising way to investigate this spatiotemporal knowledge of human behavior. However, 3D CNNs have not yet matched the performance of their well-established two-dimensional (2D) counterparts on still images, since spatiotemporal fusion and the larger parameter space make them difficult to train and prevent 3D CNNs from achieving remarkable results. In this paper, we implement a hybrid deep learning architecture that combines STIP features with 3D CNN features to effectively enhance performance on 3D videos. The combined representation provides a more detailed and deeper characterization of each stage of space-time fusion, and the trained model further improves results on complex evaluations. An intelligent 3D network protocol for multimedia data classification using deep learning is introduced to better capture space-time associations in human activities. A video classification model is used in this implementation. The proposed hybrid technique is evaluated on the well-known UCF101 dataset, where it substantially outperforms the baseline 3D CNNs and achieves an accuracy of 95%, which compares favourably with state-of-the-art action recognition frameworks from the literature.
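To make the 3D-convolution side of such an architecture concrete, the sketch below defines a minimal 3D CNN over short video clips in PyTorch. The layer sizes and the 16-frame clip shape are assumptions for illustration; the paper's hybrid STIP + 3D CNN design is considerably richer.

```python
import torch
import torch.nn as nn

# Minimal 3D CNN backbone; input shape is (batch, channels, frames, H, W).
class Tiny3DCNN(nn.Module):
    def __init__(self, num_classes=101):   # UCF101 has 101 action classes
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),        # global spatiotemporal pooling
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

clip = torch.randn(2, 3, 16, 112, 112)      # two 16-frame RGB clips
logits = Tiny3DCNN()(clip)
print(logits.shape)                          # torch.Size([2, 101])
```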
ARTICLE | doi:10.20944/preprints202207.0054.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: social engineering; security countermeasures; security awareness; security policies
Online: 5 July 2022 (03:39:36 CEST)
This research paper describes the social engineering concepts, techniques, and security countermeasures. This research aims to study various social engineering techniques to find the best countermeasures that would help to reduce social engineering attacks.
ARTICLE | doi:10.20944/preprints202207.0035.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: QoE; Fairness; SDN; Classification Prediction; DASH; Multimedia
Online: 4 July 2022 (06:08:03 CEST)
Quality of Experience (QoE) metrics can be used to assess user perception and satisfaction with data services delivered over the Internet. QoE is an end-to-end metric because it depends on both the user's perception and the service used. Traditionally, network optimization has focused on improving network properties such as QoS. In this paper we examine adaptive streaming over a software-defined network (SDN) environment. We evaluate and study the media streams, the aspects affecting them, and the network, in order to analyse the network's features and their direct relationship with the perceived QoE. We then use machine learning to build a prediction model based on subjective user experiments. This helps to eliminate future physical experiments and automate the process of predicting QoE.
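A minimal sketch of the prediction step, assuming measured network features and subjective QoE labels are available: the synthetic data and the "good QoE" labeling rule below are placeholders for the subjective experiments described above, and the choice of a random forest classifier is illustrative rather than the paper's exact model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for network features (e.g. throughput, delay, loss,
# buffering events) and subjective QoE labels; real experiments supply these.
rng = np.random.default_rng(42)
X = rng.random((500, 4))
y = (X[:, 0] - 0.5 * X[:, 1] - 0.3 * X[:, 2] > 0.2).astype(int)  # crude "good QoE" rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```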
ARTICLE | doi:10.20944/preprints202206.0429.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Botnets; Bots; bot controller; botnet detection; network; botnet architecture; botnet attacks; detection techniques
Online: 30 June 2022 (14:09:31 CEST)
Botnets are a prominent threat to IoT security. The word 'botnet' is a combination of 'robot' and 'network': a network of compromised machines used to commit cybercrime. A bot is a compromised end-host or device that is a member of a botnet. Governments have become a popular target for malicious attacks because they hold large amounts of confidential data on their networks.
ARTICLE | doi:10.20944/preprints202206.0390.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Object detection; Feature fusion network; Multiple feature selection; Angle prediction; Pixel Attention Mechanism
Online: 29 June 2022 (03:09:52 CEST)
The object detection task is usually affected by complex backgrounds. In this paper, a new image object detection method is proposed, which can perform multi-feature selection on multi-scale feature maps. By this method, a bidirectional multi-scale feature fusion network is designed to fuse semantic features and shallow features to improve the detection effect of small objects in complex backgrounds. When the shallow features are transferred to the top layer, a bottom-up path is added to reduce the number of network layers experienced by the feature fusion network, reducing the loss of shallow features. In addition, a multi-feature selection module based on the attention mechanism is used to minimize the interference of useless information on subsequent classification and regression, allowing the network to adaptively focus on appropriate information for classification or regression to improve detection accuracy. Because the traditional five-parameter regression method has severe boundary problems when predicting objects with large aspect ratios, the proposed network treats angle prediction as a classification task. The experimental results on the DOTA dataset, the self-made DOTA-GF dataset and the HRSC 2016 dataset show that, compared with several popular object detection algorithms, the proposed method has certain advantages in detection accuracy.
ARTICLE | doi:10.20944/preprints202206.0380.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Information System; Information Audit; Audit; Internet Banking
Online: 28 June 2022 (07:30:01 CEST)
This report is on online banking and the information systems used in online banking. It sheds some light on the introduction and historical background of online banking, after which online banking information is explained in detail. We collected several articles related to internet banking on which other researchers have worked and summarized them in a table listing their problems and limitations. We then discuss them in detail, giving an overview of each article. Lastly, recommendations for the online banking sector are also provided.
DATASET | doi:10.20944/preprints202206.0346.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: dataset; NLP; Human Resource Management; classification; Job description
Online: 27 June 2022 (03:43:51 CEST)
We describe a dataset that contains job descriptions published on a popular online job website in the information technology sector. As the website focuses mainly on United Kingdom based jobs, the data have a specific focus on this country. The dataset contains 11,501 job vacancies and 13 related metadata fields. It is suitable for HR analysis using machine learning techniques such as natural language processing and neural networks.
ARTICLE | doi:10.20944/preprints202206.0335.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: metadata; contextual data; harmonization; genomic surveillance; data management
Online: 24 June 2022 (08:46:04 CEST)
DATA DESCRIPTOR | doi:10.20944/preprints202206.0246.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: dataset; twitter; tweets; IMDb ratings; movies; sentiment analysis; NLP
Online: 17 June 2022 (04:39:16 CEST)
In this paper we present a dataset that contains a collection of tweets generated as reactions to the release of 50 different movies. The dataset can be used to gain useful insights into the conversation generated around a particular movie and is particularly suitable for sentiment analysis and other NLP techniques. It contains approximately 2.5 million tweets with their related metadata and covers 50 movies; for each movie, its IMDb rating is included. The movies are the 25 releases with the highest number of votes during 2020 and 2021. The collected tweets represent the reactions of the Twitter community during the first week after the US release date of each movie. The tweets per movie range from 1,000 to approximately 200,000, with an average of 50,000 per release. We used the Internet Archive Wayback Machine to retrieve each movie's IMDb rating one week after its US release date. The tweets and related metadata were collected using the Tweet Downloader tool.
ARTICLE | doi:10.20944/preprints202206.0225.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: heterogeneous network embedding; random walks; non-meta-path; type and node constraints
Online: 15 June 2022 (10:41:23 CEST)
In heterogeneous networks, random walks based on meta-paths require prior knowledge and lack flexibility, while random walks not based on meta-paths only consider the number of node types and ignore the influence of the schema and topology between node types in real networks. To address these problems, this paper proposes a novel model, HNE-RWTIC (Heterogeneous Network Embedding Based on Random Walks of Type & Inner Constraint). First, to realize flexible walks, we design a Type strategy: a node-type selection strategy based on the co-occurrence probability of node types. Second, to achieve uniform node sampling, we design an Inner strategy: a node selection strategy based on the adjacency relationships between nodes. Together, the Type & Inner strategies realize random walks that do not rely on meta-paths, ensure the flexibility of the walks, and sample node types and nodes uniformly in proportion. Third, based on these strategies, a transition probability model is constructed, and node embeddings are obtained from the random walks using Skip-Gram. Finally, we conduct a thorough empirical evaluation of our method on classification and clustering tasks over three real heterogeneous networks. Experimental results show that HNE-RWTIC outperforms state-of-the-art approaches in F1-Score and NMI.
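As a rough sketch of the overall pipeline (type-aware random walks whose output is fed to a Skip-Gram model), the snippet below generates walks where neighbour selection is weighted by local node-type frequency. The weighting rule is a simplified stand-in for the paper's Type & Inner constraint strategies, not a reproduction of them.

```python
import random
from collections import Counter

# graph: node -> list of neighbours; node_type: node -> type label.
# The type-preference weighting below is a simplified stand-in for the
# paper's Type & Inner constraint strategies.
def type_aware_walk(graph, node_type, start, length=20):
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = graph.get(current, [])
        if not neighbours:
            break
        type_freq = Counter(node_type[n] for n in neighbours)
        # weight each neighbour by how rare its type is locally (favours type coverage)
        weights = [1.0 / type_freq[node_type[n]] for n in neighbours]
        walk.append(random.choices(neighbours, weights=weights, k=1)[0])
    return walk

# The resulting walks (lists of node ids) can then be fed to any Skip-Gram
# implementation to obtain node embeddings.
```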
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Intrusion Detection Systems; IDS; Cyber Security; Information Technology; Security Systems; Systems Security
Online: 15 June 2022 (09:28:49 CEST)
Intrusion Detection Systems (IDS) play an important part in modern cyber security. As a result of the increasing need for cyber security systems in the real world, driven by the growing number of cyber attacks, more sophisticated systems are required to prevent these attacks, and an IDS can provide this protection. Because of the sophistication of these systems, they must be properly understood, developed and analyzed, and research papers can be used as a tool to improve IDS. This paper is composed of two main sections, a survey and a taxonomy, providing information, reviews and interpretations of relevant papers, a timeline of important papers, a discussion of the future of IDS, and a classification of IDS and how to apply it.
ARTICLE | doi:10.20944/preprints202206.0205.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: log analysis; monitoring data; anomaly detection; natural language processing; topic modeling; clustering technique; time series anomaly detection
Online: 14 June 2022 (11:10:15 CEST)
Context: Anomaly detection in a data center is a challenging task, having to consider different services on various resources. Current literature shows the application of artificial intelligence techniques to either log files or monitoring data: the former created by services at run time, the latter produced by specific sensors directly on the physical or virtual machine. Objectives: We propose a model that exploits information in both log files and monitoring data to identify patterns and detect anomalies over time. Methods: The key idea is to use, on one side, natural language processing solutions to detect problems at the service level, extracting words that represent anomalies; clustering and topic modeling techniques are used to identify patterns and group them with respect to topics. On the other side, a time series anomaly detection technique is applied to sensor data in order to combine problems found in the log files with problems stored in the monitoring data. Results: We have tested our approach on a real data center equipped with log files and monitoring data that characterize the behaviour of physical and virtual resources in production. We have observed a correspondence between anomalies in log files and monitoring data, e.g. an increase in memory usage or machine load. The results are extremely promising. Conclusion: Our model requires the integration of site administrators' expertise in order to consider all critical scenarios in the data center and to interpret the results properly.
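A minimal sketch of the two sides of such a pipeline, under assumed file names and thresholds: log lines are grouped with TF-IDF plus k-means as a simple stand-in for the clustering/topic-modeling step, and a rolling z-score flags outliers in a monitoring metric as a simple stand-in for the time series anomaly detection step.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# --- log side: group log lines into rough clusters (file name is hypothetical)
lines = open("service.log").read().splitlines()
tfidf = TfidfVectorizer(max_features=2000, stop_words="english")
X = tfidf.fit_transform(lines)
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)

# --- monitoring side: flag metric samples far from their rolling mean
metric = pd.read_csv("memory_usage.csv", parse_dates=["ts"]).set_index("ts")["value"]
rolling = metric.rolling("30min")
zscore = (metric - rolling.mean()) / rolling.std()
anomalies = metric[zscore.abs() > 3]      # threshold chosen for illustration

print("log cluster sizes:", np.bincount(labels))
print("metric anomalies:", len(anomalies))
```

Correlating the timestamps of flagged metric samples with the timestamps of log lines in "anomalous" clusters is then one simple way to combine the two sources, in the spirit of the approach described above.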
ARTICLE | doi:10.20944/preprints202206.0115.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Explainable machine learning; COVID-19; Vaccination uptake; Shapley values; Feature importance.
Online: 8 June 2022 (05:30:18 CEST)
COVID-19 vaccine hesitancy is considered responsible for the lower rate of acceptance of vaccines in many parts of the world. However, the sources of this hesitancy are rooted in many social, political, and economic factors. This paper strives to find the most important variables in predicting COVID-19 vaccination uptake. We introduce an explainable machine learning (ML) framework to understand COVID-19 vaccination uptake around the world. To predict vaccination uptake, we have trained a random forest (RF) regression model using a number of sociodemographic and socioeconomic variables. A traditional decision tree (DT) regression model is also implemented as the baseline. We found that the RF model performed better than the DT model, since RF is more robust in handling nonlinearity and multi-collinearity. We also present feature importance based on impurity measures, permutation, and Shapley values to identify the most significant features in an unbiased manner. It is found that electrification coverage and Gross Domestic Product are the strongest predictors of higher vaccination uptake, whereas the Fragile State Index (FI) contributed to lower vaccination uptake. These findings suggest addressing the issues found responsible for lower vaccination uptake to combat any future public health crisis.
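A minimal sketch of the modeling and feature-importance step, using synthetic data in place of the country-level predictors (electrification, GDP, fragility index, ...) used in the paper. Permutation importance is shown here as one of the unbiased measures mentioned; computing Shapley values would additionally require a package such as shap.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sociodemographic/socioeconomic predictors and uptake.
rng = np.random.default_rng(1)
X = rng.random((300, 5))
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] - 0.4 * X[:, 2] + 0.05 * rng.standard_normal(300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Permutation importance on held-out data avoids the bias of impurity-based scores.
imp = permutation_importance(rf, X_te, y_te, n_repeats=20, random_state=0)
print("mean importance per feature:", imp.importances_mean)
```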
ARTICLE | doi:10.20944/preprints202205.0416.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Hypervisor; Debugging; Kernel-debugger; Fuzzing
Online: 31 May 2022 (09:41:16 CEST)
Software analysis, debugging, and reverse engineering have a crucial impact in today’s software industry. Efficient and stealthy debuggers are especially relevant for malware analysis. However, existing debugging platforms fail to address a transparent, effective, and high-performance low-level debugger due to their detectable fingerprints, complexity, and implementation restrictions. In this paper, we present HyperDbg, a new hypervisor-assisted debugger for high-performance and stealthy debugging of user and kernel applications. To accomplish this, HyperDbg relies on state-of-the-art hardware features available in today’s CPUs, such as VT-x and extended page tables. In contrast to other widely used existing debuggers, we design HyperDbg using a custom hypervisor, making it independent of OS functionality or API. We propose hardware-based instruction-level emulation and OS-level API hooking via extended page tables to increase the stealthiness. Our results of the dynamic analysis of 10,853 malware samples show that HyperDbg’s stealthiness allows debugging on average 22% and 26% more samples than WinDbg and x64dbg, respectively. Moreover, in contrast to existing debuggers, HyperDbg is not detected by any of the 13 tested packers and protectors. We improve the performance over other debuggers by deploying a VMX-compatible script engine, eliminating unnecessary context switches. Our experiment on three concrete debugging scenarios shows that compared to WinDbg as the only kernel debugger, HyperDbg performs step-in, conditional breaks, and syscall recording, 2.98x, 1319x, and 2018x faster, respectively. We finally show real-world applications, such as a 0-day analysis, structure reconstruction for reverse engineering, software performance analysis, and code-coverage analysis.
ARTICLE | doi:10.20944/preprints202205.0398.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Generative Software Development; Code Generation; Complexity Space
Online: 30 May 2022 (11:32:07 CEST)
This survey proposes an evaluation model to analyze and examine different approaches to generativity. In addition to problem-domain concepts, the model is defined using the concepts of complexity and complexity management, together with a systems view, in order to achieve a unified and integrated treatment of otherwise disparate evaluation criteria. The paper first introduces its approach to these concepts and then presents the evaluation model.
ARTICLE | doi:10.20944/preprints202205.0344.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Linked (open) Data; Semantic Interoperability; Data Mapping; Governmental Data; SPARQL; Ontologies
Online: 25 May 2022 (08:18:46 CEST)
In this paper, we present a method to map information about service activity provision residing in governmental portals across the European Commission. To do this, we used as a basis the enriched Greek e-GIF ontology, modeling the concepts and relations of one of the two data portals examined (i.e., Points of Single Contact), since relevant information for the second was not provided. Mapping consisted of transforming the information appearing in governmental portals into RDF format (i.e., Linked Data) so that it can be easily exchanged. Mapping proved a tedious task, since a description of how information is modeled in the second Point of Single Contact is not provided and had to be extracted manually.
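A minimal sketch of what producing such RDF output can look like with rdflib (version 6+ assumed for string serialization). The namespace, class and property names below are hypothetical stand-ins; the actual terms come from the enriched Greek e-GIF ontology used in the paper.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

# Hypothetical namespace standing in for the e-GIF ontology terms.
EGIF = Namespace("http://example.org/egif#")

g = Graph()
g.bind("egif", EGIF)

service = URIRef("http://example.org/psc/service/123")       # hypothetical IRI
g.add((service, RDF.type, EGIF.ServiceActivity))
g.add((service, RDFS.label, Literal("Opening a retail business", lang="en")))
g.add((service, EGIF.competentAuthority, Literal("Ministry of Development")))

print(g.serialize(format="turtle"))   # Linked Data ready to be exchanged or queried via SPARQL
```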
ARTICLE | doi:10.20944/preprints202205.0242.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Virtual Reality; Augmented Reality; Serious Games; IoT; Web App; Blender; Unity3D
Online: 18 May 2022 (10:46:25 CEST)
The use of games for non-ludic purposes, known as serious games, is an interesting branch of science that has shown important results. With the advent of the pandemic, which made access to rehabilitation centres more problematic for patients with cognitive rehabilitation needs, the importance of enabling these patients to exercise safely in their own homes has emerged strongly. Many studies show that immediate action and appropriate, specific rehabilitation can guarantee satisfactory results; appropriate therapy is based on key factors such as the frequency, intensity and specificity of the exercises. The aim of our work is to define a pathway for the creation of open-source digital products that allow access at any moment to a virtual environment through which patients who have suffered limitations in their cognitive abilities can exercise and recover all or part of these abilities in the shortest possible time. In view of the spread of IoT devices capable of easily monitoring various vital parameters, we propose a low-cost and very efficient system that can provide the doctor and therapist not only with quantitative data on the exercises performed (number and type of exercises, time spent, results obtained) but also with an overview of vital parameters, so as to detect any states of agitation or excessive effort in completing the exercise.
ARTICLE | doi:10.20944/preprints202205.0220.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: security; privacy; blockchain; smartcontracts; IoT; encryption; transaction
Online: 18 May 2022 (02:43:20 CEST)
When we talk about the Internet of Things, we are referring to the connecting of things to a physical network that is embedded with software, sensors, and other devices that allow information to be exchanged between devices. It is possible that the interconnection of devices will present issues in terms of security, trustworthiness, reliability, and confidentiality, among other things. The proposed approach is effective at detecting intrusions into the Internet of Things network. Initially, the privacy-preserving technology was deployed utilising a Blockchain-based methodology to ensure that personal information was protected. Patients’ health records (PHR) security is the most crucial component of encryption over the Internet because of the value and importance of these records, particularly in the context of the Internet of Medical Things (IoMT). The search terms access mechanism is one of the most common approaches used to access personal health records from a database, but it is vulnerable to a number of security flaws. However, while blockchain-enabled healthcare systems provide increased security, they may also introduce weaknesses into the current state of the art. Blockchain-enabled frameworks have been proposed in the literature as a means of resolving those challenges. These solutions, on the other hand, are primarily concerned with data storage, with Blockchain serving as a database. To enable secure search and keyword-based access to a distributed database, this study proposes the use of blockchain technology as a distributed database, together with a homomorphic encryption mechanism. Aside from that, the suggested system includes a secure key revocation mechanism that can be used to automatically update various policies. As a result, our proposed approach provides greater security, efficiency, and transparency while also being more cost-effective. We have compared the findings of our proposed models with those of the benchmark models, if appropriate. Our comparison research demonstrates that our suggested framework provides a more secure and searchable mechanism for the healthcare system than the current state of the art.
ARTICLE | doi:10.20944/preprints202205.0185.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: AMS; Face Recognition; AAMS; Face Detector; Python; Monitoring; RFID; Arduino Uno; IoT
Online: 13 May 2022 (09:40:48 CEST)
In the 21st century, where everything depends on technology, almost all human tasks are performed with its help to save time and make life more comfortable. Monitoring students' attendance is a crucial task for faculty in today's world, and there are many opportunities for error when attendance records are entered into the primary system, especially once a class is over. The main concern of this study is to build an IoT-based automated attendance management system for educational institutes that uses biometric recognition to address fake/proxy attendance and entry errors, replacing the old manual method of taking students' attendance by calling their names or roll numbers. The AAS captures an image of the classroom, automatically detects and recognizes the faces of students sitting in the lecture room during lectures, and marks their attendance daily, keeping a record of their presence and maintaining and managing it for the institution's management staff through web services.
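For illustration of the face-detection step only, the sketch below uses OpenCV's bundled Haar cascade on a classroom image; the input file name is hypothetical, and the paper's full pipeline additionally recognises each detected face and records attendance through web services.

```python
import cv2

# Generic face-detection step using OpenCV's bundled Haar cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("classroom.jpg")                 # hypothetical input image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

print(f"{len(faces)} faces detected")
for (x, y, w, h) in faces:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("classroom_annotated.jpg", frame)
```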
ARTICLE | doi:10.20944/preprints202205.0095.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Regression; AI based Tornado Analysis; Decision Support System; Mobile Application
Online: 9 May 2022 (03:15:11 CEST)
Tropical cyclones devastate large areas, take numerous lives, and damage extensive property in Bangladesh. Research on landfalling tropical cyclones affecting Bangladesh has primarily focused on events occurring since AD 1960, with limited work examining earlier historical records. We address this gap by developing a new tornado catalogue that includes present and past records of tornadoes across Bangladesh, maximizing the use of available sources. This new tornado database captures 119 records from 1838 to 2020, covering 8,735 deaths and 97,868 injuries and leaving more than 102,776 people affected in total. Moreover, using this new tornado data, we developed an end-to-end system that allows a user to explore and analyze the full range of tornado data under multiple scenarios. A user of this system can select a date range or search for a particular location, and all the tornado information, along with Artificial Intelligence (AI) based insights within the selected scope, is presented dynamically on a range of devices including iOS, Android, and Windows. Using a set of interactive maps, charts, graphs, and visualizations, the user gains a comprehensive understanding of the historical records of tornadoes, cyclones, and associated landfalls, with detailed data distributions and statistics.
REVIEW | doi:10.20944/preprints202205.0029.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Sleep tracking; Context aware recommender system; Quantified self; Personal informatics; Ubiquitous computing; Mobile computing; mHealth; CBT-I
Online: 5 May 2022 (09:34:09 CEST)
The practice of quantified-self sleep tracking is increasingly common nowadays among healthy individuals as well as patients with sleep problems. However, existing sleep-tracking technologies only support simple data collection and visualization and are incapable of providing actionable recommendations tailored to users' physical, behavioral and environmental context. Here we coin the term context-aware sleep health recommender system (CASHRS) to denote an emerging multidisciplinary research field that bridges ubiquitous sleep computing and context-aware recommender systems. In this paper, we present a narrative review analyzing the types of contextual information, the recommendation algorithms, the context filtering techniques, the behavior change techniques, the system evaluations, and the challenges reported in peer-reviewed publications that meet the characteristics of CASHRS. The analysis identifies current research trends, knowledge gaps, and future research opportunities in CASHRS.
ARTICLE | doi:10.20944/preprints202110.0065.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: access to justice; cost of justice; judicial system; videoconferencing; judicial information system; litigants; internet technologies
Online: 11 April 2022 (10:10:21 CEST)
This project aims to provide a communication system that can improve the performance of, and interaction within, the judicial system, and also make it easy to conduct performance reviews. The study identifies limited access to the right legal information, as well as limited access to courts and court services, as the primary factors that restrict access to justice. The underlying objective is to design an online, user-friendly information system that improves communication and interaction between judicial professionals, such as lawyers, and litigants. The project uses a qualitative methodology focused on content analysis of primary data. Key participants were court workers such as lawyers and clerks, as well as court users such as litigants. Pre-design findings revealed that current legal information systems, primarily court websites, fail to recognize their audience, treating all court users identically with respect to the communication of legal information. In addition, the problem of access to justice involves the high cost of justice and the limited availability of the judicial information that promotes access to justice. Post-design findings from the online survey conducted after the development and implementation of the IT artifact revealed an effective and efficient legal information system. The research findings supported the development of an IT artifact that accounts for significant shortcomings in current judicial information systems, such as their inability to convey appropriate legal information interactively and efficiently. In summary, the study recommends the adoption of perceived control to improve the confidence of court users, and differentiation between court users to deliver the right information to the right audience.
CONCEPT PAPER | doi:10.20944/preprints202204.0074.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: NetDevOps; NetOps; Intent-Based Networking; artificial intelligence; Neural Network; Natural Language Processing; transformer
Online: 8 April 2022 (08:19:09 CEST)
The computer network world is changing, and the NetDevOps approach has brought the dynamics of applications and systems into the field of communication infrastructure. Businesses are changing and are faced with difficulties related to the diversity of the hardware and software that make up those infrastructures. The "Intent-Based Networking - Concepts and Definitions" document describes the different parts of the ecosystem that could be involved in NetDevOps. The recognize, generate-intent, translate, and refine functions require a new way of implementing algorithms, and this is where artificial intelligence comes in.
CONCEPT PAPER | doi:10.20944/preprints202204.0044.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Smart cities; data science; machine learning; Internet of Things; data-driven decision making; intelligent services; cybersecurity
Online: 6 April 2022 (11:35:15 CEST)
Cities are undergoing huge shifts in technology and operations in recent days, and 'data science' is driving the change in the current age of the Fourth Industrial Revolution (Industry 4.0 or 4IR). Extracting insights or actionable knowledge from city data and building a corresponding data-driven model is the key to making a city system automated and intelligent. Data science is typically the study and analysis of actual happenings with historical data using a variety of scientific methodologies, machine learning techniques, processes, and systems. In this paper, we concentrate on and explore "Smart City Data Science", where city data collected from various sources, such as sensors and Internet-connected devices, is mined for insights and hidden correlations to enhance decision-making processes and deliver better and more intelligent services to citizens. To achieve this goal, various machine learning analytical models can be employed to provide deeper knowledge about city data, which makes the computing process more actionable and intelligent in various real-world services of today's cities. Finally, we identify and highlight ten open research issues for future development and research in the context of data-driven smart cities. Overall, we aim to provide an insight into smart city data science conceptualization on a broad scale, which can be used as a reference guide for researchers, professionals, as well as policy-makers of a country, particularly from the technological point of view.
ARTICLE | doi:10.20944/preprints202203.0189.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Smart device; Users behavior; human computer interaction; exploratory analysis; statistical methods
Online: 14 March 2022 (12:28:44 CET)
Purpose: The use of smart devices has increased greatly in the last ten years, with users reaching out to the possibility of doing more with them, especially on the networking front. In this context there is a need to understand the connection between users' socio-demographic factors and the way they relate to their smart devices. Objective: This study was designed to evaluate the sense of belonging to a community, in order to assess the intangible benefits that users may gain from a more immersive relationship with their devices. Method: We used a dataset of 586 anonymous respondents from an existing survey designed to capture the relationships that humans develop with their smart devices. In particular, we investigate the relationships between smart-device use and particular background variables of the respondents using a chi-square test. Results: The study showed that there is a significant relationship between users' sex and smart-device type and their dependency on smart devices. Males tend to think that smart devices (in general) enable them to connect with a larger community, while females using smartphones feel more connected to a large community than when using other smart devices. Conclusion: This study provides several significant findings that confirm and strengthen previous literature on the subject. In addition, socio-demographic variables (like gender), as well as the type of smart device, correlate with smart-device users' tendency to stay in touch with a larger community.
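A minimal sketch of the chi-square test of independence on such survey data, assuming a CSV export with hypothetical column names ("sex", "feels_connected_to_larger_community"); the study's actual variable names and coding may differ.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical survey export with hypothetical column names.
df = pd.read_csv("smart_device_survey.csv")

# Contingency table of sex vs. the attitude item, then chi-square test of independence.
table = pd.crosstab(df["sex"], df["feels_connected_to_larger_community"])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.4f}")
```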
ARTICLE | doi:10.20944/preprints202203.0141.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: rural areas; urban areas; cloud server; monitoring; road monitoring
Online: 10 March 2022 (07:47:15 CET)
We live in a world where people always strive for comfort with less hard work, and smart systems are being developed and deployed all over the world to make life easier. However, in Pakistan, a third-world country, we are unable to achieve good results due to the lack of proper transit from one area to another. The roads are often not in optimal condition, so people face many problems while traveling: in urban areas, people cannot reach their destinations in time, vehicles are sometimes damaged by cracks in the roads, and in medical emergencies patients often die in transit from the rescue point to the hospital. Similarly, in rural areas farmers face countless problems while bringing their seasonal yield to the markets. Having described the severe damage caused by poor road quality, our project proposes a solution that addresses the problems of both the rural and urban population to a certain extent by monitoring the condition of the roads. The sensors in the system calculate values and send them to a cloud server, the Blynk platform, where the data is stored. Moreover, the data can be provided to the government in the future for the timely maintenance of roads, thereby improving the lifestyle of citizens. It is expected that the problems of both the urban and rural population will be largely solved.
REVIEW | doi:10.20944/preprints202203.0087.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Internet of Things; cyber-attacks; anomalies; machine learning; deep learning; IoT data analytics; intelligent decision-making; security intelligence
Online: 7 March 2022 (02:39:58 CET)
The Internet of Things (IoT) is one of the most widely used technologies today, and it has a significant effect on our lives in a variety of ways, including social, commercial, and economic aspects. In terms of automation, productivity, and comfort for consumers across a wide range of application areas, from education to smart cities, present and future IoT technologies hold great promise for improving the overall quality of human life. However, cyber-attacks and threats greatly affect smart applications in the IoT environment. Traditional IoT security techniques are insufficient for the recent security challenges, given the rapid growth of different kinds of attacks and threats. Utilizing artificial intelligence (AI) expertise, especially machine and deep learning solutions, is the key to delivering a dynamically enhanced and up-to-date security system for the next-generation IoT. Throughout this article, we present a comprehensive picture of IoT security intelligence, which is built on machine and deep learning technologies that extract insights from raw data to intelligently protect IoT devices against a variety of cyber-attacks. Finally, based on our study, we highlight the associated research issues and future directions within the scope of our study. Overall, this article aspires to serve as a reference point and guide, particularly from a technical standpoint, for cybersecurity experts and researchers working in the context of IoT.
ARTICLE | doi:10.20944/preprints202202.0238.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: homomorphic; digital signature; IoT; authentication
Online: 18 February 2022 (17:04:53 CET)
In this paper, we address the problem of compatibility between digital signature schemes and in-network aggregation approaches. In the IoT world, gateways alter the signed network flows when performing in-network aggregation. Therefore, existing conventional approaches are not suitable for verifying the authenticity of the original flows. This raises the need for energy-efficient and secure schemes that enable the destination to validate aggregated network flows. In this regard, a lightweight homomorphic signature scheme is proposed which supports the implementation of aggregation procedures without affecting the verification process. We demonstrate the unforgeability and the privacy of our scheme. We also perform an analytical study of its energy efficiency. The results suggest that the proposed scheme considerably decreases the processing overhead of existing set-homomorphic signature schemes. Moreover, it does not add any communication overhead compared to traditional (non-homomorphic) signature schemes. This, in turn, reduces energy consumption by 30% compared to existing homomorphic signature techniques.
ARTICLE | doi:10.20944/preprints202005.0151.v3
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: deep learning; CNN; DenseNet; COVID-19; transfer learning
Online: 18 February 2022 (14:44:55 CET)
COVID-19 has a severe risk of spreading rapidly, the quick identification of which is essential. In this regard, chest radiology images have proven to be a practical screening approach for COVID-19 affected patients. This study proposes a deep learning-based approach using DenseNet-121 to detect COVID-19 patients effectively. We have trained and tested our model on the COVIDx dataset and performed both 2-class and 3-class classification, achieving 96.49% and 93.71% accuracy, respectively. By successfully utilizing transfer learning, we achieve comparable performance to the state-of-the-art method while using 15x fewer model parameters. Moreover, we performed an interpretability analysis using Grad-CAM to highlight the most significant image regions at test time. Finally, we developed a website that takes chest radiology images as input and detects the presence of COVID-19 or pneumonia and a heatmap highlighting the infected regions. Source code for reproducing results and model weights are available.
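A minimal transfer-learning sketch in the spirit of the approach above: a DenseNet-121 backbone with its classifier head replaced for a 3-class setting. The pretrained-weights API assumes a recent torchvision (0.13+), and the training loop and data pipeline are left out; this is not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# DenseNet-121 backbone with a replaced classifier head (3-class setting assumed).
model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
model.classifier = nn.Linear(model.classifier.in_features, 3)

x = torch.randn(1, 3, 224, 224)   # one chest radiograph, resized and normalized
print(model(x).shape)              # torch.Size([1, 3])
```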
ARTICLE | doi:10.20944/preprints202106.0196.v3
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Twitter; Social Media; Social Networking; Social Network Analytic; DistilBERT; Text Similarity; Natural Language Processing; Character Computing
Online: 17 February 2022 (13:15:23 CET)
Social media platforms have been an undeniable part of lifestyle for the past decade. Analyzing the information being shared is a crucial step in understanding human behavior. Social media analysis aims to guarantee a better experience for the user and to raise user satisfaction. Before deriving any further conclusions, it is first necessary to know how to compare users. In this paper, a hybrid model is proposed to measure the similarity of Twitter profiles and quantify their degree of likeness by calculating features that reflect users' behavioral habits. First, the timeline of each profile is extracted using the official Twitter API. Then, three aspects of a profile are considered in parallel. Behavioral ratios are time-series information showing the consistency and habits of the user; dynamic time warping is utilized to compare the behavioral ratios of two profiles. Next, the audience network is extracted for each user, and Jaccard similarity is used to estimate the similarity of the two sets. Finally, for content similarity measurement, the tweets are preprocessed according to the feature extraction method; TF-IDF and DistilBERT are employed for feature extraction and then compared using cosine similarity. Results show that TF-IDF has slightly better performance; therefore, the more straightforward solution is selected for the model. As a case study, a Random Forest classification model trained on almost 20,000 users to predict the similarity level of different profiles achieved 97.24% accuracy. This comparison enables us to find duplicate profiles with nearly the same behavior and content.
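A minimal sketch of two of the similarity components described above, content similarity via TF-IDF plus cosine similarity and audience similarity via the Jaccard index; the toy inputs are illustrative only, and the behavioral (dynamic time warping) component is omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Content similarity: TF-IDF + cosine over the concatenated tweets of two users.
def content_similarity(tweets_a, tweets_b):
    docs = [" ".join(tweets_a), " ".join(tweets_b)]
    vectors = TfidfVectorizer().fit_transform(docs)
    return float(cosine_similarity(vectors[0], vectors[1])[0, 0])

# Audience similarity: Jaccard index over the two follower/friend sets.
def jaccard(audience_a, audience_b):
    a, b = set(audience_a), set(audience_b)
    return len(a & b) / len(a | b) if a | b else 0.0

print(content_similarity(["cats are great", "love my cats"],
                         ["dogs are great", "love my dogs"]))
print(jaccard({"u1", "u2", "u3"}, {"u2", "u3", "u4"}))
```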
ARTICLE | doi:10.20944/preprints202202.0109.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Industry 4.0; industry 5.0 interoperability; Machine Learning; AI; HR; Attrition
Online: 8 February 2022 (12:31:01 CET)
This paper aims to raise awareness of certain interoperability issues as we shape Industry 5.0 to enable a human-centric, resilient society. We advocate that the need to share small and specific datasets will intensify as AI-based solutions become more pervasive; consequently, dataspaces should be carefully designed to address this need. We advance the conversation by presenting a case study from HR demonstrating how to predict the possibility of an employee experiencing attrition. Our experimental results show that more than 500 samples are needed to develop a machine learning model sufficiently capable of generalizing the problem, which demonstrates the feasibility of the idea. However, in small and medium-sized companies this approach cannot be implemented due to the limited number of samples. At the same time, we argue that this obstacle may be overcome if multiple companies join a shared dataspace, which in turn raises interoperability issues.
ARTICLE | doi:10.20944/preprints202202.0093.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: mHealth; App; Stroke; Caregiver; Usability; User Experience; Needs; Design
Online: 7 February 2022 (15:17:42 CET)
(1) Background: Existing research has demonstrated the potential of mHealth apps in improving the caregiving outcomes of stroke. Since several apps were published in commercially available app stores without explaining their design and evaluation processes, it is necessary to identify the usability and user experience issues to promote long-term adherence and usage; (2) Methods: User reviews were extracted from the 47 previously identified apps that support stroke caregiving needs using a Python scraper. The reviews were pre-processed and filtered using Python scripts. The final corpus was classified based on usability and user experience dimensions to highlight issues within the apps; (3) Results: A total of 162,095 reviews were extracted from the two app stores. After filtering, 15,818 reviews were included and classified based on the usability and user experience dimensions. The findings highlight critical issues related to errors/effectiveness, efficiency, and support that contribute to decreased satisfaction and to negative emotion and frustration in using the apps; (4) Conclusion: The study identified several usability and user experience issues arising from the app developers' limited understanding of user needs. Further, the study recommends a participatory design approach to promote an improved understanding of user needs, thereby limiting such issues and ensuring continued use.
REVIEW | doi:10.20944/preprints202202.0001.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Artificial intelligence; machine learning; data science; advanced analytics; intelligent computing; automation; smart systems; industry 4.0 applications
Online: 1 February 2022 (10:26:21 CET)
Artificial Intelligence (AI) is a leading technology of the current age of the Fourth Industrial Revolution (Industry 4.0 or 4IR), with the capability of incorporating human behavior and intelligence into machines or systems. Thus, AI-based modeling is the key to building automated, intelligent, and smart systems according to today's needs. To solve real-world issues, various types of AI, such as analytical, functional, interactive, textual, and visual AI, can be applied to enhance the intelligence and capabilities of an application. However, developing an effective AI model is a challenging task due to the dynamic nature of and variation in real-world problems and data. In this paper, we present a comprehensive view of "AI-based Modeling" with the principles and capabilities of potential AI techniques that can play an important role in developing intelligent and smart systems in various real-world application areas including business, finance, healthcare, agriculture, smart cities, cybersecurity and many more. We also emphasize and highlight the research issues within the scope of our study. Overall, the goal of this paper is to provide a broad overview of AI-based modeling that can be used as a reference guide by academics and industry people, as well as decision-makers, in various real-world scenarios and application domains.
ARTICLE | doi:10.20944/preprints202201.0322.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: NMEA; cybersecurity; anomaly analysis and detection; maritime
Online: 21 January 2022 (12:53:43 CET)
Several disruptive attacks against companies in the maritime industry have led experts to consider the increased risk imposed by cyber threats a major obstacle to digitization. The industry is heading toward increased automation and connectivity, leading to reduced human involvement in the different navigational functions and increased reliance on sensor data and software for more autonomous modes of operation. To meet the objectives of increased automation under the threat of cyber attacks, the software modules involved in the different navigational functions need to be prepared to detect such attacks using suitable detection techniques. Therefore, we propose a systematic approach for analyzing the navigational NMEA messages carrying the data of the different sensors, their possible anomalies, the malicious causes of such anomalies, and the appropriate detection algorithms. The proposed approach is evaluated through two use cases: a traditional Integrated Navigation System (INS) and an Autonomous Passenger Ship (APS). The results reflect the utility of specification- and frequency-based detection in detecting the identified anomalies with high confidence. The analysis is also found to facilitate the communication of threats by indicating the possible impact of the identified anomalies on navigational operations. Moreover, we have developed a testing environment that facilitates conducting the analysis. The environment includes a developed tool, NMEA-Manipulator, that enables the invocation of the identified anomalies through a group of cyber attacks on sensor data. Our work paves the way for future work in the analysis of NMEA anomalies toward the development of an NMEA intrusion detection system.
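As a hedged illustration of the frequency-based detection idea mentioned in the abstract, the sketch below counts how often each NMEA sentence type arrives in a time window and flags windows whose rate deviates from an expected reporting rate; the nominal rates, tolerance, and sample sentences are assumed values, not figures from the paper.

from collections import Counter

EXPECTED_RATE_HZ = {"GPGGA": 1.0, "GPRMC": 1.0, "HEHDT": 10.0}  # assumed nominal rates
WINDOW_S = 10.0
TOLERANCE = 0.5   # allow +/- 50% of the nominal count before flagging

def check_window(sentences: list[str]) -> dict[str, str]:
    """sentences: raw NMEA lines received during one WINDOW_S-second window."""
    counts = Counter(line[1:6] for line in sentences if line.startswith("$"))
    verdict = {}
    for stype, rate in EXPECTED_RATE_HZ.items():
        expected = rate * WINDOW_S
        observed = counts.get(stype, 0)
        ok = abs(observed - expected) <= TOLERANCE * expected
        verdict[stype] = "ok" if ok else f"anomalous ({observed} vs ~{expected:.0f})"
    return verdict

# Example window where the heading sensor (HEHDT) is under-reporting.
window = ["$GPGGA,..."] * 10 + ["$GPRMC,..."] * 10 + ["$HEHDT,..."] * 40
print(check_window(window))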
REVIEW | doi:10.20944/preprints202201.0269.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: ICT in education; ICT equipment; cost
Online: 19 January 2022 (11:43:52 CET)
Information and Communication Technology (ICT) education has attracted attention in recent years. Despite its importance, its spread rate remains low. The reason for this low spread rate is that there are many issues specific to ICT education, such as monetary cost, time, environment, teacher education systems, motivation, curriculum, and health problems. In this paper, we investigated and considered the issues in ICT education in Japan from 10 viewpoints. The main conclusions are as follows: (1) monetary cost is the most important issue; (2) it is necessary to teach not only technology but also theory and common sense; (3) it is necessary to urgently prepare an environment for ICT education after considering these issues.
REVIEW | doi:10.20944/preprints202201.0232.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: artificial intelligence; data science; Kernel trick; machine learning; pedagogy
Online: 17 January 2022 (14:19:52 CET)
The aim of this tutorial is to help students grasp the theory and applicability of support vector machines (SVMs). The contribution is an intuitive-style tutorial that has helped students gain insight into SVMs from a unique perspective. An internet search will reveal many videos and articles on SVMs, but many of them give simplified explanations that leave gaps in the derivations that beginning students cannot fill. Most free tutorials lack guidance on practical applications and considerations. The software wrappers in popular Python and R libraries hide many of the operational complexities. Free software tools often use default parameters that ignore domain knowledge or leave knowledge gaps about the important effects of SVM hyperparameters, resulting in misuse and subpar outcomes. The author uses this tutorial as a course reference for students studying artificial intelligence and machine learning. The tutorial derives the classic SVM classifier from first principles and then derives the practical form that a computer uses to train a classification model. An intuitive explanation of confusion matrices, the F1 score, and the AUC metric extends insight into the inherent tradeoff between sensitivity and specificity. A discussion of cross-validation provides a basic understanding of hyperparameter tuning to maximize generalization by balancing underfitting and overfitting. Even seasoned self-learners with advanced statistical backgrounds have gained insights from this tutorial style of intuitive explanations, with all related considerations for tuning and performance evaluation in one place.
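To accompany the tutorial's themes, here is a short, self-contained scikit-learn sketch of training an RBF-kernel SVM with cross-validated hyperparameter tuning (C and gamma) and reporting a confusion matrix and F1 score; the dataset and grid values are illustrative, not the tutorial's own examples.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, f1_score

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)

# Scaling matters for SVMs; tune C and gamma with 5-fold cross-validation.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
grid = GridSearchCV(pipe,
                    {"svc__C": [0.1, 1, 10],
                     "svc__gamma": ["scale", 0.01, 0.001]},
                    cv=5, scoring="f1")
grid.fit(X_tr, y_tr)

y_pred = grid.predict(X_te)
print("best params:", grid.best_params_)
print("confusion matrix:\n", confusion_matrix(y_te, y_pred))
print("F1:", round(f1_score(y_te, y_pred), 3))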
ARTICLE | doi:10.20944/preprints202201.0070.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Web app; Cloud computing; High Availability; High performance computing; Docker container; Horizontal Scaling
Online: 6 January 2022 (10:33:58 CET)
This study analyses some of the leading technologies for the construction and configuration of IT infrastructures that provide services to users. For modern applications, guaranteeing service continuity, even under very high computational load or network problems, is essential. Our configuration has among its main objectives being highly available (HA) and horizontally scalable, that is, able to increase the computational resources delivered when needed and reduce them when they are no longer necessary. Various architectural possibilities are analysed, and the central schemes used to tackle problems of this type are also described in terms of disaster recovery. The benefits offered by virtualisation technologies are highlighted and combined with modern techniques for managing Docker containers, which are used to build the back-end of a sample infrastructure related to a use case we have developed. In addition, an in-depth analysis is reported on the central autoscaling policies that can help manage high loads of requests from users to the services provided by the infrastructure. The results show an average response time of 21.7 milliseconds with a standard deviation of 76.3 milliseconds, indicating excellent responsiveness. Some peaks are associated with high-stress events for the infrastructure, but even in these cases the response time does not exceed 2 seconds. The results of the considered use case, studied over nine months, are presented and discussed. During the study period, we improved the back-end configuration and defined the main metrics to deploy the web application efficiently.
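As a minimal sketch, and not the authors' configuration, the snippet below expresses the kind of threshold-based horizontal-scaling rule discussed in this abstract; the CPU thresholds and replica bounds are assumed values chosen only for illustration.

from dataclasses import dataclass

@dataclass
class ScalingPolicy:
    min_replicas: int = 2
    max_replicas: int = 10
    scale_out_cpu: float = 0.70   # add a replica above 70% average CPU
    scale_in_cpu: float = 0.30    # remove one below 30%

    def decide(self, replicas: int, avg_cpu: float) -> int:
        """Return the new replica count for the observed average CPU load."""
        if avg_cpu > self.scale_out_cpu and replicas < self.max_replicas:
            return replicas + 1
        if avg_cpu < self.scale_in_cpu and replicas > self.min_replicas:
            return replicas - 1
        return replicas

policy = ScalingPolicy()
print(policy.decide(replicas=3, avg_cpu=0.85))  # -> 4 (scale out)
print(policy.decide(replicas=3, avg_cpu=0.20))  # -> 2 (scale in)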
ARTICLE | doi:10.20944/preprints202201.0024.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: wireless sensor networks; heterogeneous; hazardous environment; energy efficient.
Online: 5 January 2022 (10:17:02 CET)
Wireless Sensor Networks (WSNs) continue to provide essential services for various applications such as surveillance, data gathering, and data transmission from hazardous environments to safer destinations. This has been enhanced by energy-efficient routing protocols that are mostly designed for such purposes. The Gateway-based Energy-Aware Multi-hop Routing protocol (MGEAR) is one of the homogeneous routing schemes recently designed to reduce the energy consumption of distant nodes more efficiently. However, it has been found that the protocol has a high energy consumption rate, a lower stability period, and less data transmission to the Base Station (BS). In this paper, an enhanced Heterogeneous Gateway-based Energy-Aware multi-hop routing protocol (HMGEAR) is proposed. The proposed routing scheme is based on the introduction of heterogeneous nodes into the existing scheme, the selection of cluster heads based on residual energy, the introduction of a multi-hop communication strategy in all regions of the network, and the implementation of an energy-hole elimination technique. Results show that the proposed routing scheme outperforms two existing ones.
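The sketch below illustrates only the residual-energy head-selection criterion named in this abstract: in each region, the node with the most remaining energy becomes the head. Node IDs, energies, and the region grouping are hypothetical, and the full HMGEAR protocol is not reproduced here.

def select_heads(nodes: dict[str, float], region_of: dict[str, str]) -> dict[str, str]:
    """nodes: node_id -> residual energy (J); region_of: node_id -> region name.
    Returns the highest-energy node in each region as that region's cluster head."""
    heads: dict[str, str] = {}
    for node_id, energy in nodes.items():
        region = region_of[node_id]
        if region not in heads or energy > nodes[heads[region]]:
            heads[region] = node_id
    return heads

nodes = {"n1": 0.42, "n2": 0.55, "n3": 0.31, "n4": 0.60}
region_of = {"n1": "A", "n2": "A", "n3": "B", "n4": "B"}
print(select_heads(nodes, region_of))   # {'A': 'n2', 'B': 'n4'}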
ARTICLE | doi:10.20944/preprints202112.0511.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: real sea surface; object detection; performance detection
Online: 31 December 2021 (11:16:15 CET)
The video images captured at long range usually contain low-contrast floating objects of interest on a sea surface. A comparative experimental study of the statistical characteristics of reflections from floating objects and from the agitated sea surface showed differences in the correlation and spectral characteristics of these reflections. The recently proposed modified matched subspace detector (MMSD) is based on separating the observed data spectrum into two subspaces: relatively low and relatively high frequencies. In the literature, MMSD performance has been evaluated only in general terms and only using a sea model (additive Gaussian background clutter). This paper extends the performance evaluation methodology to low-contrast object detection and uses only a real sea dataset. The methodology considers an object to be low contrast if the mean and variance of the object and the surrounding background are the same, while assuming that the energy spectra of the object and the sea differ. The paper investigates a scenario in which an artificially created model of a floating object with specified statistical parameters is placed on the surface of a real sea image, and compares the efficiency of the classical matched subspace detector (MSD) and the MMSD for detecting low-contrast objects on the sea surface. The article analyzes the dependence of the detection probability, at a fixed false-alarm probability, on the difference between the statistical means and variances of a floating object and the surrounding sea.
ARTICLE | doi:10.20944/preprints202112.0356.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: quantum computing; quantum computing framework; quantum simulator; quantum technology; quantum theory; quantum mechanics; quantum circuit; quantum algorithm; ibm qiskit; notebooks jupyter
Online: 22 December 2021 (12:00:48 CET)
The shortage of quantum computers, and their current state of development, constrains research in many fields that could benefit from quantum computing. Although the work of a quantum computer can be simulated with classical computing, personal computers take so long to run quantum experiments that they are not very useful for the progress of research. This manuscript presents an open quantum computing simulation platform that gives quantum computing researchers access to high-performance simulations. This platform, called QUTE, relies on a supercomputer powerful enough to simulate general-purpose quantum circuits of up to 38 qubits, and even more qubits for particular types of simulations. This manuscript describes in depth the characteristics of the QUTE platform and the results achieved in certain classical experiments in this field, which should give readers an accurate idea of the system's capabilities.
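For readers unfamiliar with the kind of circuit such a simulator runs, here is a small GHZ-state example written against the pre-1.0 Qiskit API mentioned in the keywords (qiskit with qiskit-aer installed); QUTE's own interface is not shown and may differ, so this is only an illustrative sketch.

from qiskit import QuantumCircuit, Aer, execute

qc = QuantumCircuit(3, 3)
qc.h(0)                 # put qubit 0 in superposition
qc.cx(0, 1)             # entangle qubits 0 and 1
qc.cx(1, 2)             # extend the GHZ state to qubit 2
qc.measure(range(3), range(3))

backend = Aer.get_backend("qasm_simulator")
counts = execute(qc, backend, shots=2048).result().get_counts()
print(counts)           # expected: roughly even split between '000' and '111'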
ARTICLE | doi:10.20944/preprints202112.0244.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Attention mechanism; CHMM; LSTM; Multi-modal fusion; Human behavior recognition
Online: 14 December 2021 (15:09:15 CET)
The proposed multi-modal data fusion method based on IA-Net and CHMM is designed to solve the problem that incomplete target behavior information in complex family environments leads to low accuracy of human behavior recognition. Two improved neural networks (STA-ResNet50 and STA-GoogLeNet) are combined with LSTM to form two IA-Nets that extract RGB and skeleton modal behavior features from video. The two modal feature sequences are fed into a CHMM to construct a probabilistic fusion model for multi-modal behavior recognition. Experimental results show that the proposed human behavior recognition model achieves higher accuracy than previous fusion methods on the HMDB51 and UCF101 datasets. New contributions: an attention mechanism is introduced to improve the efficiency of video target feature extraction and utilization; a skeleton-based feature extraction framework is proposed that can be used for human behavior recognition in complex environments; and, in the field of human behavior recognition, probability theory and neural networks are combined, providing a new method for multi-modal information fusion.
ARTICLE | doi:10.20944/preprints202112.0196.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Speech Rehabilitation; Speech Quality Assessment; LSTM
Online: 13 December 2021 (10:10:36 CET)
The article treats the problem of assessing speech quality during speech rehabilitation as a classification problem. A classifier is built on the basis of an LSTM neural network to divide speech signals into two classes: before the operation and immediately after it. Speech before the operation serves as the standard that rehabilitation aims to approach. The degree to which the evaluated signal belongs to the reference class acts as the speech assessment. An experimental assessment of rehabilitation sessions was carried out, and the resulting assessments were compared with expert assessments of phrasal intelligibility.
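A hedged sketch of the general idea, not the authors' network: an LSTM classifier trained to separate feature sequences into two classes, whose sigmoid output can then serve as a score of closeness to the reference class. The frame count, feature dimension, and training data here are synthetic placeholders.

import numpy as np
import tensorflow as tf

T, F = 100, 13                      # assumed: 100 frames of 13 MFCC-like features
X = np.random.rand(200, T, F).astype("float32")   # synthetic feature sequences
y = np.random.randint(0, 2, size=(200,))          # 0 = pre-op, 1 = post-op (placeholder)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(T, F)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)

# The sigmoid output acts as a "closeness to the reference class" score.
print(model.predict(X[:1], verbose=0))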
ARTICLE | doi:10.20944/preprints202112.0031.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Satellite Communication; Signal Propagation; Rain Attenuation; Urban area ground station; SNR; ITU-R; LSTM; Neural network
Online: 2 December 2021 (11:18:57 CET)
Free-space communication is a leading component of global communications. Its advantages include a broader signal spread, no wiring, and ease of engagement. Satellite communication services have recently become attractive to mega-companies that foresee an excellent opportunity to connect disconnected remote regions and to serve emerging machine-to-machine communication, Internet-of-Things connectivity, and more. Satellite communication links suffer from arbitrary weather phenomena such as clouds, rain, snow, fog, and dust. In addition, as signals approach the ground station, they have to overcome buildings blocking direct access to it. Therefore, satellites commonly use redundant signal strength to ensure constant and continuous transmission, resulting in excess energy consumption that challenges the limited power capacity generated by solar energy or a fixed amount of fuel. This research proposes using LSTM, a recurrent neural network technology, to provide a time-dependent prediction of the expected attenuation level due to rain and fog and of the signal strength remaining after crossing physical obstacles surrounding the ground station. The satellite transmitter is calibrated accordingly: the outgoing signal strength is based on the predicted strength to ensure it remains strong enough for the ground station to process, and the instant calibration eliminates the excess use of energy, resulting in energy savings.
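A small arithmetic sketch of the calibration step described at the end of the abstract: given a predicted rain/fog fade and a predicted obstruction loss (for example, from an LSTM forecaster, which is not shown here), raise the transmit power just enough to preserve a link margin. All dB figures below are assumed examples, not values from the paper.

def required_tx_power_dbm(base_tx_dbm: float,
                          predicted_rain_loss_db: float,
                          predicted_obstruction_loss_db: float,
                          margin_db: float = 3.0) -> float:
    """Transmit power needed so the ground station still sees the desired margin."""
    return (base_tx_dbm + predicted_rain_loss_db
            + predicted_obstruction_loss_db + margin_db)

# Clear-sky power 40 dBm; forecast 2.5 dB rain fade and 1.2 dB building shadowing.
print(required_tx_power_dbm(40.0, 2.5, 1.2))   # -> 46.7 dBm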
DATA DESCRIPTOR | doi:10.20944/preprints202111.0511.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Social network analysis; Natural language processing; Dataset; Multimode; Opinion Dynamics
Online: 26 November 2021 (14:23:36 CET)
At the end of 2018, a high school student asked a question in the Zhihu community, claiming that he had proved Goldbach's conjecture. The question caused an explosive reaction, a large number of users participated in the discussion, and the event gained widespread influence. On January 1, 2019, the questioner published his "proof", which was soon shown to be wrong. The heated discussion caused by the incident contains a great deal of information of value for social science analysis. Therefore, we followed the event from the outset and built a time-series dataset for it. Taking the questioner's "proof" as the dividing line, all answers, comments, sub-comments, and the information of the users who wrote these texts within two days before and after were recorded. This temporal information can reflect the dynamic features of the interaction between user opinions and the impact of exogenous shocks (the release of the proof) on community opinions. The dataset can be used not only to demonstrate various social network analysis algorithms, but also for natural language processing tasks such as fine-grained sentiment analysis of long texts, as well as multimodal tasks combining natural language processing and social network analysis. This paper introduces the characteristics and structure of the dataset, shows the visualization of the social network, and uses the dataset to train a benchmark sentiment analysis model.
ARTICLE | doi:10.20944/preprints202111.0499.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Social Networks; Data Mining; Graph Structure; Natural Language Processing; Machine Learning
Online: 26 November 2021 (10:45:06 CET)
The herd effect is a common phenomenon in society. Detecting this phenomenon is of great significance in many tasks based on social network analysis, such as recommendation. However, research on social networks and natural language processing has seldom focused on this issue. In this paper, we propose an unsupervised data mining method to detect herding in social networks. Taking shopping reviews as an example, our algorithm can identify reviews that are affected by certain previous reviews and thereby detect a herd-effect chain. From an overall perspective, the cross effects of all reviews form the herd-effect graph. This algorithm can be widely used in various social network analysis methods through the graph structure, providing useful new features for many tasks.
ARTICLE | doi:10.20944/preprints202111.0472.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: cryptocurrency; Benford’s law; anomaly detection; method application
Online: 25 November 2021 (12:02:11 CET)
Blockchain-based currencies, or cryptocurrencies, have become a global phenomenon known to most people as a disruptive technology and a new investment vehicle. However, due to their decentralized nature, regulating these markets has presented regulators with difficulties in finding a balance between nurturing innovation and protecting consumers. Growing concerns about illicit activity have forced regulators to seek new ways of detecting, analyzing, and ultimately policing public blockchain transactions. Extensive research on machine learning and transaction graph analysis algorithms has been done to track suspicious behaviour. However, having a macro view of a public ledger is equally important before pursuing more fine-grained analysis. Benford's law, the law of the first digit, has been extensively used as a tool to discover accounting fraud, among many other use cases. The basic motivation behind the research presented in this paper was to test the applicability of this well-established method in a new domain: identifying anomalous behavior in cryptocurrencies using Benford's law conformity tests. The research focused on transaction values in all major cryptocurrencies. A suitable time period was identified that was long enough to supply a sufficiently large number of observations for Benford's law conformity tests and was also situated far enough in the past that the anomalies had already been identified and well documented. The results show that most of the cryptocurrencies that did not conform to Benford's law had well-documented anomalous incidents, while the first digits of aggregated transaction values of all well-known cryptocurrency projects conformed to Benford's law. Thus, the proposed method is applicable to the new domain.
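A minimal sketch of a first-digit (Benford) conformity check applied to a set of transaction values; the transaction data here is randomly generated for illustration and is not the paper's dataset, and the chi-square statistic shown is only one of several conformity tests one could use.

import math
import random
from collections import Counter

def first_digit(x: float) -> int:
    """Leading significant digit of a positive number."""
    return int(str(abs(x)).lstrip("0.")[0])

def benford_chi_square(values) -> float:
    """Chi-square statistic of observed first digits vs the Benford expectation."""
    digits = [first_digit(v) for v in values if v > 0]
    n = len(digits)
    observed = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)   # Benford probability of digit d
        chi2 += (observed.get(d, 0) - expected) ** 2 / expected
    return chi2   # compare against the chi-square critical value with 8 degrees of freedom

transactions = [random.lognormvariate(3, 2) for _ in range(10_000)]
print(round(benford_chi_square(transactions), 2))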
CONCEPT PAPER | doi:10.20944/preprints202111.0418.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: general theory of information; named set; knowledge structure; structural machine; autopoietic machine; multi-cloud infrastructure.
Online: 23 November 2021 (10:42:36 CET)
The General Theory of Information (GTI) tells us that information is represented, processed, and communicated using physical structures. The physical universe is made up of structures combining matter and energy. According to GTI, "Information is related to knowledge as energy is related to matter." GTI also provides tools to deal with the transformation of information and knowledge. We present here the application of these tools to the design of digital autopoietic machines with higher efficiency, resiliency, and scalability than information processing systems based on Turing machines. We discuss the utilization of these machines for building autopoietic and cognitive applications in a multi-cloud infrastructure.
ARTICLE | doi:10.20944/preprints202111.0369.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: WorkflowDSL; DSL; Domain-Specific Language; Workflow
Online: 19 November 2021 (15:08:02 CET)
This paper aims to provide an overview of the complete process of developing a Domain-Specific Language (DSL). It explains the construction steps, such as preliminary research, language implementation, and evaluation. Moreover, it provides details on key components commonly found in DSLs, such as the abstraction layer, the DSL metamodel, and the applications. It also explains the general limitations of domain-specific languages for workflows.
ARTICLE | doi:10.20944/preprints202111.0208.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Technology analysis; Trend analysis; Patent keyword analysis; Text mining; Natural language processing
Online: 10 November 2021 (15:25:21 CET)
Thanks to the rapid development of artificial intelligence technology in recent years, artificial intelligence now contributes to many parts of society and has a very large impact on fields such as education, the environment, medical care, the military, tourism, the economy, and politics. For example, in education there are artificial intelligence tutoring systems that automatically assign tutors based on a student's level; in economics there are quantitative investment methods that automatically analyze large amounts of data to find investment rules, build investment models, or predict changes in financial markets. Since artificial intelligence technology is used in so many fields, it is very important to know exactly which factors have an important influence on each field and how the fields relate to one another; therefore, it is necessary to analyze artificial intelligence technology in each field. In this paper, we analyze patent documents related to artificial intelligence technology and propose a method for keyword analysis within factors using artificial intelligence patent datasets. The model relies on feature engineering based on the deep learning keyword extraction model KeyBERT and on a vector space model. A case study collecting and analyzing artificial intelligence patent data shows how the proposed model can be applied to real-world problems.
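For orientation, the sketch below shows basic keyword extraction with the KeyBERT library named in the abstract (pip install keybert); the patent-style text, n-gram range, and top_n parameter are illustrative choices, and the authors' full feature-engineering and vector-space pipeline is not reproduced here.

from keybert import KeyBERT

doc = ("An artificial intelligence tutoring system assigns tutors automatically "
       "based on the student's level and learning history ...")   # sample patent-style text

kw_model = KeyBERT()   # defaults to a small sentence-transformers embedding model
keywords = kw_model.extract_keywords(doc,
                                     keyphrase_ngram_range=(1, 2),
                                     stop_words="english",
                                     top_n=5)
for phrase, score in keywords:
    print(f"{score:.3f}  {phrase}")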
ARTICLE | doi:10.20944/preprints202111.0162.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Internet of Things; IoTivity; HEMS; HAN; Cloud; Backend-as-a-Service; RTOS; Contiki-OS
Online: 9 November 2021 (09:22:51 CET)
In developing countries today, population growth and the penetration of higher-standard-of-living appliances in homes have resulted in a rapidly increasing residential load. In South Africa, the recent rolling blackouts and electricity price increases have only highlighted this reality, calling for sustainable measures to reduce overall consumption and peak load. The dawn of the smart grid concept, embedded systems, and ICTs has paved the way to novel HEMS designs. In this regard, the Internet of Things (IoT), an enabler for smart and efficient energy management systems, is seeing increasing attention for optimizing HEMS design and mitigating its deployment cost constraints. In this work, we propose an IoT platform for residential energy management applications focusing on interoperability, low cost, technology availability, and scalability. We focus on the backend complexities of IoT Home Area Networks (HAN) using the OCF IoTivity-Lite middleware. To improve quality and servicing and to reduce cost and complexity, this work leverages open-source cloud technologies from Back4App as a BaaS to provide consumers and utilities with a data communication platform, within an experimental study illustrating time- and space-agnostic "mind-changing" energy feedback, Demand Response Management (DRM), and appliance operation control through a HEM app on an Android smartphone.
ARTICLE | doi:10.20944/preprints202111.0027.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Blockchain; Agriculture; Traceability; Food Supply chain; Crops.
Online: 1 November 2021 (16:04:00 CET)
In recent years, Blockchain has been favorably adopted in the supply chain industry as it provides guaranteed transparency and traceability. The flexibility of Blockchain allows different applications to exchange information, with a significant middleware layer responsible for information transfer in the agricultural sector. Blockchain improves the safety, quality, and criteria validation of products manufactured globally in the agriculture industry. The increasing number of complications associated with food safety and contamination risks requires effective traceability solutions that act as essential quality management tools ensuring the adequate safety of crops. Today's agricultural supply chain is a complex ecosystem consisting of several stakeholders, making it important to authenticate criteria such as crop development stages, monitoring and validation, and compliance with quality standards. In this research, a systematic literature review (SLR) is conducted covering smart contracts, Blockchain, and business transactions specifically for crop production traceability across the agricultural and food supply chain. Using Blockchain in the agriculture sector increases productivity, consistency, safety, reliability, and security: all transactions are kept and recorded on Blockchain's immutable, decentralized ledger, providing greater traceability and clarity in the agriculture system in a safe, trustworthy, and effective way. The selected papers are classified according to defined criteria: crop traceability, contribution type, research type, and approach. The purpose of this study is to fill the gap in the literature by, firstly, gaining complete insight into the integration of Blockchain in the agriculture sector and, secondly, summarizing the present state of research in this area and identifying gaps in existing studies. The findings of the SLR are discussed, and researchers are provided with suggestions on possible directions for future research.
ARTICLE | doi:10.20944/preprints202111.0006.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Digital Twin; Blockchain; Proof-of-Work; Microservices; Singular Spectrum Analysis (SSA); Byzantine Fault Tolerance
Online: 1 November 2021 (11:21:41 CET)
Blockchain technology has been recognized as a promising solution to enhance the security and privacy of Internet of Things (IoT) and Edge Computing scenarios. Taking advantage of the Proof-of-Work (PoW) consensus protocol, which solves a computation-intensive hashing puzzle, Blockchain assures the security of the system by establishing a digital ledger. However, the computation-intensive PoW favors members possessing more computing power. In the IoT paradigm, fairness in highly heterogeneous network edge environments must consider devices with various constraints on computation power. Inspired by the advanced features of Digital Twins (DT), an emerging concept that mirrors the lifespan and operational characteristics of physical objects, we propose a novel Miner-Twins (MinT) architecture to enable a fair PoW consensus mechanism for blockchains in IoT environments. MinT adopts an edge-fog-cloud hierarchy. All physical miners of the blockchain are deployed as microservices on distributed edge devices, while fog/cloud servers maintain digital twins that periodically update miners' running status. By timely monitoring of the miner activity mirrored by the twins, a lightweight Singular Spectrum Analysis (SSA) based detector identifies individual misbehaving miners that violate fair mining. Moreover, we also design a novel Proof-of-Behavior (PoB) consensus algorithm to detect byzantine miners that collude to compromise a fair mining network. A preliminary study is conducted on a proof-of-concept prototype implementation, and the experimental evaluation shows the feasibility and effectiveness of the proposed MinT scheme under a distributed byzantine network environment.
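The sketch below illustrates the general SSA idea behind spotting a misbehaving miner, under stated assumptions: embed a miner's activity series into a trajectory (Hankel) matrix, keep the leading singular components as the "normal" signal, and flag points with large residuals. The window length, rank, threshold, and injected anomaly are all assumed values, not the paper's parameters.

import numpy as np

def ssa_residuals(series: np.ndarray, window: int = 20, rank: int = 3) -> np.ndarray:
    n = len(series)
    k = n - window + 1
    traj = np.column_stack([series[i:i + window] for i in range(k)])  # Hankel matrix
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]                     # low-rank "normal" part
    # Diagonal averaging back to a 1-D reconstruction.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return np.abs(series - recon / counts)

rng = np.random.default_rng(0)
activity = np.sin(np.linspace(0, 20, 200)) + 0.1 * rng.standard_normal(200)
activity[150] += 3.0                                   # injected misbehaviour
res = ssa_residuals(activity)
print("flagged indices:", np.where(res > 5 * res.mean())[0])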
ARTICLE | doi:10.20944/preprints202110.0334.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Nesting; cutting; irregular pattern; genetic algorithm; smart manufacturing
Online: 22 October 2021 (15:41:54 CEST)
In industrial environments, nesting consists of cutting or extracting pieces from a material sheet with the purpose of minimizing the surface of the sheet used. This problem is present in different types of industries, such as shipping, aeronautics, woodworking, footwear, and so on. In this work, the aim is to find an acceptable solution to complex nesting problems. The research is oriented toward sacrificing accuracy for speed so as to obtain robust solutions in less computational time. To achieve this, a greedy method and a genetic algorithm have been implemented, with the latter responsible for generating the sequence in which the pieces are placed; each piece is then placed in its current optimal position with the help of a representation system for both the pieces and the material sheet.
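As a hedged, self-contained illustration of this scheme, the sketch below evolves the order in which pieces are handed to a greedy placer and uses the placer's sheet usage as the fitness. The greedy placer here is a deliberately trivial 1-D stand-in (pieces are just widths) so the example stays short; the paper's 2-D irregular-piece placement is not reproduced.

import random

PIECES = [7, 3, 9, 4, 6, 2, 8, 5]          # hypothetical piece widths
SHEET_ROW = 12                              # the greedy placer packs rows of this width

def greedy_used_rows(order):
    """Place pieces in the given order; fewer rows means less sheet used."""
    rows, current = 1, 0
    for idx in order:
        w = PIECES[idx]
        if current + w > SHEET_ROW:
            rows, current = rows + 1, 0
        current += w
    return rows

def crossover(a, b):                        # order crossover (OX) for permutations
    i, j = sorted(random.sample(range(len(a)), 2))
    child = [None] * len(a)
    child[i:j] = a[i:j]
    fill = [g for g in b if g not in child]
    for k in range(len(a)):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

def mutate(seq, p=0.2):                     # occasional swap mutation
    if random.random() < p:
        i, j = random.sample(range(len(seq)), 2)
        seq[i], seq[j] = seq[j], seq[i]
    return seq

pop = [random.sample(range(len(PIECES)), len(PIECES)) for _ in range(30)]
for _ in range(100):
    pop.sort(key=greedy_used_rows)
    elite = pop[:10]
    pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in range(20)]

best = min(pop, key=greedy_used_rows)
print("best order:", best, "rows used:", greedy_used_rows(best))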
ARTICLE | doi:10.20944/preprints202110.0259.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Blockchain technology; Process authenticity; Tokens; Anchors; Oracles
Online: 18 October 2021 (15:54:39 CEST)
In the last four years, the evolution and adoption of blockchain and, more generally, distributed ledger systems have confirmed many concepts and models with significant differences in system governance and suitable applications. This work aims to update the critical analysis of blockchain technologies carried out in our previous contribution to this journal, extending the focus to distributed ledger components and systems. Starting from the topical concept of decentralization, we introduce concepts and building blocks currently adopted in the available systems, centering on their functional aspects and their impact on possible applications. We present some conceptual framing tools helpful in the application context, and we propose the concept of process authenticity, which we discuss through two use cases: blockchain document dematerialization and e-voting.
ARTICLE | doi:10.20944/preprints202109.0185.v2
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: WiMAX IEEE 802.16e; National Broadband Project; rural area connectivity; Connectivity challenges in developing countries
Online: 18 October 2021 (12:55:20 CEST)
Amongst the advantages of using Worldwide Interoperability for Microwave Access (WiMAX) technology at the last-mile level as an access technology are an extensive range of 50 km Line of Sight (LOS), 5 to 15 km Non-Line of Sight, and few infrastructure installations compared to other wireless broadband access technologies. Despite positive investments in ICT fibre infrastructure by developing countries, including Botswana, servicing end-users is subject to high prices and service disparities. The alternative, the Wi-Fi hotspot initiative by the Botswana government, falls short as a solution for last-mile connectivity and access. This study used OPNET Modeler 14.5 to investigate whether Botswana's national broadband project could adopt WiMAX IEEE 802.16e as an access technology. Using the simulation method, this paper evaluates WiMAX IEEE 802.16e/m over three subscriber locations in Botswana. The results indicate that deployment of the WiMAX IEEE 802.16e standard can solve most of the deployment and access issues at the last-mile level. Although the findings suggest that WiMAX IEEE 802.16e is more suitable for high-density areas, it could also solve rural areas' infrastructure development challenges and provide the required high-speed connectivity. However, unlike the Wi-Fi initiative, which requires more infrastructure deployment and less in terms of institutional and regulatory frameworks, the deployment of WiMAX IEEE 802.16e requires institutional and regulatory standards.
ARTICLE | doi:10.20944/preprints202110.0237.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Software reliability; deep learning; long short-term memory; project similarity and clustering; cross-project prediction
Online: 18 October 2021 (10:33:39 CEST)
Software reliability is an important characteristic for ensuring the quality of software products. Predicting the potential number of bugs from the beginning of a development project allows practitioners to make appropriate decisions regarding testing activities. In the initial development phases, applying traditional software reliability growth models (SRGMs) with limited past data does not always provide reliable prediction results for decision making. To overcome this, we propose a new software reliability modeling method called the deep cross-project software reliability growth model (DC-SRGM). DC-SRGM is a cross-project prediction method that uses features of previous projects' data through project similarity. Specifically, the proposed method applies cluster-based project selection for the training data source and models reliability with a deep learning method. Experiments involving 15 real datasets from a company and 11 open source software datasets show that DC-SRGM can describe the reliability of ongoing development projects more precisely than existing traditional SRGMs and an LSTM model.
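A hedged sketch of just the cross-project selection step described in the abstract: cluster historical projects by simple summary features and take training data only from projects in the same cluster as the ongoing one. The features and values below are hypothetical, and the downstream deep-learning reliability model is not shown.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# rows: [total bugs so far, test duration (days), team size] per historical project
projects = np.array([[120, 60, 8], [30, 20, 3], [150, 75, 10],
                     [40, 25, 4], [200, 90, 12], [25, 15, 3]], dtype=float)
ongoing  = np.array([[35, 22, 4]], dtype=float)

scaler = StandardScaler().fit(projects)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaler.transform(projects))
target_cluster = km.predict(scaler.transform(ongoing))[0]

# Only projects in the ongoing project's cluster feed the reliability model.
selected = np.where(km.labels_ == target_cluster)[0]
print("train the reliability model on projects:", selected.tolist())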
ARTICLE | doi:10.20944/preprints202110.0151.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Health Information Technology; Adoption; Assimilation; Technology; Organisation and Environment; TOE; TAM
Online: 11 October 2021 (08:45:06 CEST)
The adoption of health information technology (HIT) has increased considerably, contributing to better communication between physicians and patients and providing technological bases for learning and institutional improvement. This type of technology brings many challenges; therefore, understanding its adoption and assimilation is important to assess its potential for engendering desirable outcomes in health management. The assimilation of health information systems deserves particular attention, as it is now recognised in health organisations as a key facilitator of better health outcomes. Thus, this study aimed to analyse HIT adoption by combining models such as Technology, Organisation and Environment (TOE), which analyses at the organisational level, with models such as the Technology Acceptance Model (TAM), which analyses at the individual level, together with the assimilation of the adopted technologies.
ARTICLE | doi:10.20944/preprints202110.0103.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Data Analytics; Analytics; Supply Chain Input; Supply Chain; Data Science; Data
Online: 6 October 2021 (10:38:42 CEST)
One of the most remarkable features of the 20th century was the digitalization of technical progress, which changed the output of companies worldwide and became a defining feature of the century. A digital supply chain is distinguished from other supply chains by the growth of information technology systems and the implementation of new technical advances that enhance the integrity, agility, and long-term organizational performance of the supply chain. For example, Internet of Things (IoT)-enabled information exchange and Big Data analysis might be used to regulate the mismatch between supply and demand. This literature investigation was undertaken to assess contemporary ideas and concepts in the field of data analysis in the context of supply chain management. The research was conducted as a comprehensive systematic literature review (SLR), drawing on a total of 71 papers from leading journals. The SLR found that integrating data analytics into supply chain management can have long-term benefits on the input side, i.e., improved strategic development, management, and other areas.
ARTICLE | doi:10.20944/preprints202110.0070.v1
Subject: Mathematics & Computer Science, Information Technology & Data Management Keywords: Online social networks (OSNs); Deep Learning; cyberbullying; Twitter
Online: 5 October 2021 (08:27:41 CEST)
Online social networks (OSNs) play an integral role in facilitating social interaction; however, they also amplify antisocial behavior such as cyberbullying, hate speech, and trolling. Aggression or hate speech that takes place through short message service (SMS) or the Internet (e.g., on social media platforms) is known as cyberbullying. Therefore, automatic detection utilizing natural language processing (NLP) is a necessary first step toward preventing cyberbullying. This research proposes an automatic cyberbullying detection method that identifies aggressive behavior using a consolidated deep learning model. The technique utilizes a multichannel deep learning architecture based on three models, namely the bidirectional gated recurrent unit (BiGRU), a transformer block, and a convolutional neural network (CNN), to classify Twitter comments into two categories: aggressive and not aggressive. Three well-known hate speech datasets were combined to evaluate the performance of the proposed method, which achieved promising results with an accuracy of approximately 88%.
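A hedged sketch of a three-channel text classifier in the spirit of this abstract: a BiGRU branch, a transformer-style self-attention branch, and a CNN branch over a shared embedding, merged into a binary aggressive/not-aggressive output. Vocabulary size, sequence length, and layer sizes are assumed values, not the paper's configuration.

import tensorflow as tf
from tensorflow.keras import layers

VOCAB, MAXLEN, EMB = 20_000, 64, 128   # assumed sizes

inp = layers.Input(shape=(MAXLEN,), dtype="int32")
emb = layers.Embedding(VOCAB, EMB)(inp)

# Channel 1: bidirectional GRU over the token embeddings.
gru = layers.Bidirectional(layers.GRU(64))(emb)

# Channel 2: transformer-style block (self-attention + residual + pooling).
att = layers.MultiHeadAttention(num_heads=4, key_dim=32)(emb, emb)
att = layers.GlobalAveragePooling1D()(layers.LayerNormalization()(att + emb))

# Channel 3: 1-D convolution with max pooling.
cnn = layers.GlobalMaxPooling1D()(layers.Conv1D(128, 5, activation="relu")(emb))

merged = layers.concatenate([gru, att, cnn])
out = layers.Dense(1, activation="sigmoid")(layers.Dense(64, activation="relu")(merged))

model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()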