Search | Preprints.org

Advances in machine learning (ML) and artificial intelligence (AI) are transforming the way we treat patients in ways not even imagined a few years ago. Cancer research is at the forefront of this movement. Infertility, though not a life-threatening condition, affects around 15% of couples trying for a pregnancy. The increasing availability of large datasets from various sources creates an opportunity to introduce ML and AI into infertility prevention and treatment. At present, very little is done in the field of assisted reproduction to prevent infertility from arising; the main focus is on treatment, often at a point when advanced maternal age and low ovarian reserve make it very difficult to conceive. A shift from this disease-centric model to a health-centric model in infertility is already taking place, with more emphasis on the patient as an active participant in the process. Poor-quality and incomplete data, as well as biological variability, remain the main limitations to the widespread and reliable implementation of AI in reproductive medicine. That said, one area where this technology has found a foothold is the identification of developmentally competent embryos. More work is required, however, to learn how to improve natural conception, the detection and diagnosis of infertility, and assisted reproduction treatments (ART), and ultimately to develop clinically useful algorithms able to adjust treatment regimens in order to assure a successful outcome of either fertility preservation or infertility treatment. Progress in genomics, digital technologies, and integrative biology has had a tremendous impact on research and clinical medicine. With the rise of ‘big data’, artificial intelligence, and advances in molecular profiling, there is enormous potential to transform not only scientific research progress but also clinical decision making towards predictive, preventive, and personalized medicine.
In the field of reproductive health, there is now an exciting opportunity to leverage these technologies and develop more sophisticated approaches to diagnose and treat infertility disorders. In this review, we present a comprehensive analysis and interpretation of different innovation forces that are driving the emergence of a system approach to the infertility sector. Here we discuss recent influential work and explore the limitations of the use of Machine Learning models in this rapidly developing area.

Preprint ARTICLE | doi:10.20944/preprints202002.0294.v1

Data Processing and Information Classification: An In-Memory Approach

Milena Andrighetti, Giovanna Turvani, Giulia Santoro, Marco Vacca, Andrea Marchesin, Fabrizio Ottati, Massimo Ruo Roch, Mariagrazia Graziano, Maurizio Zamboni

Subject: Engineering, Electrical And Electronic Engineering Keywords: bitmap indexing; processing in memory; memory wall; Big Data; Internet Of Things

Online: 20 February 2020 (08:24:48 CET)

Preprint ARTICLE | doi:10.20944/preprints201808.0350.v2

Integration of Data Mining Clustering Approach with the Personalized E-Learning System

Samina Kausar, Huahu Xu, Iftikhar Hussain, Wenhau Zhu, Misha Zahid

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: big data; clustering; data mining; educational data mining; e-learning; profile learning

Online: 19 October 2018 (05:58:05 CEST)

Preprint CONCEPT PAPER | doi:10.20944/preprints202111.0117.v1

Competitive Approaches of Strategic Alliance in the Big Data Environment, a Moderating Role of Big Data Predictive Analytics in the Case of Telecommunication Sector of Pakistan

Hassan Abbas, Ye Ze, Waqar Ahmad

Subject: Business, Economics And Management, Business And Management Keywords: Big data predictive analytics; competitive strategies; strategic alliance performance; Telecom sector

Online: 5 November 2021 (11:29:12 CET)

Preprint ARTICLE | doi:10.20944/preprints202406.0091.v1

Open Data and Sustainable Mobility in Slovenia

Klara Žnideršič, Vid Klopčič, Andraž Juvan, Matija Marolt, Matevž Pesek

Subject: Computer Science And Mathematics, Computer Science Keywords: open data; mobility applications; sustainable services; socio-economic impact; big data; real-time data

Online: 4 June 2024 (03:54:58 CEST)

Preprint ARTICLE | doi:10.20944/preprints202301.0415.v2

Psychological Health and Drugs: Data-Driven Discovery of Causes, Treatments, Effects, and Abuses

Sarah Alswedani, Rashid Mehmood, Iyad Katib, Saleh M. Altowaijri

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Psychological Health; Drugs; Twitter; Machine Learning; Big Data; Drug Abuse; Toxicology; Social Factors; Economic Factors; Environmental Factors

Online: 27 February 2023 (13:31:40 CET)

Preprint CONCEPT PAPER | doi:10.20944/preprints202102.0203.v1

An Application Overview of IoT Enabled-Big Data Analytics in Health Sector with Special Reference to Covid-19

Rajib Biswas

Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: Bigdata; IoT; Big Data Analytics; Covid-19; healthcare

Online: 8 February 2021 (12:19:28 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.1724.v1

Data Loss Prevention Method Based on Multiprotocol Connectivity for IoT

Hamza Takrouni, Larbi Talbi, Youcef Fouzar

Subject: Engineering, Telecommunications Keywords: Internet of Things, Cloud Computing, Edge Computing, Big Data, IoT Communications Protocols

Online: 29 February 2024 (11:06:12 CET)

Preprint ARTICLE | doi:10.20944/preprints201811.0339.v1

HOLMeS: eHealth in the Big Data and Deep Learning Era

Flora Amato, Stefano Marrone, Vincenzo Moscato, Gabriele Piantadosi, Antonio Picariello, Carlo Sansone

Subject: Engineering, Electrical And Electronic Engineering Keywords: eHealth; big data; deep learning; watson; spark; decision support system; prevention pathways

Online: 15 November 2018 (04:14:36 CET)

Preprint ARTICLE | doi:10.20944/preprints202208.0083.v1

A Big Data Analysis with Machine Learning Techniques in Accounting Dataset from the Greek Banking System

Leonidas Theodorakopoulos, Georgios Thanasas, Spyridon Lampropoulos

Subject: Business, Economics And Management, Accounting And Taxation Keywords: Ratios; Financial Crisis; Covid-19; Big Data; Accounting Data

Online: 3 August 2022 (10:42:06 CEST)

Preprint ARTICLE | doi:10.20944/preprints201810.0601.v1

Predicting Freeway Travelling Time Using Multiple-Source Data

Kejun Long, Wukai Yao, Jian Gu, Wei Wu

Subject: Engineering, Civil Engineering Keywords: support vector machine; travelling time; intelligent transportation system; artificial fish swarm algorithm; big data

Online: 25 October 2018 (10:48:45 CEST)

Preprint ARTICLE | doi:10.20944/preprints201710.0076.v2

Addressing Complexities of Machine Learning in Big Data: Principles, Trends and Challenges from Systematical Perspectives

Qi Wang, Xia Zhao, Jincai Huang, Yanghe Feng, Jiahao Su, Zhihao Luo

Subject: Computer Science And Mathematics, Information Systems Keywords: big data; machine learning; regularization; data quality; robust learning framework

Online: 17 October 2017 (03:47:41 CEST)

Preprint REVIEW | doi:10.20944/preprints201904.0027.v2

The Pipeline of Processing fMRI Data with Python Based on the Ecosystem NeuroDebian

Qiang Li, Rong Xue

Subject: Computer Science And Mathematics, Analysis Keywords: neuroscience; big data; functional Magnetic Resonance (fMRI); pipeline; one platform system

Online: 8 April 2019 (05:46:55 CEST)

Preprint ARTICLE | doi:10.20944/preprints202401.0912.v2

Intrusion Detection System for Big Data Environment Using Deep Learning

Pooja Potnurwar, Ayush Ainchwar, Rahul Neware, Vrushali Bongirwar

Subject: Computer Science And Mathematics, Security Systems Keywords: intrusion detection system; big data; deep learning; CNN; LSTM; GAN; cybersecurity; network security

Online: 15 January 2024 (08:11:36 CET)

Preprint REVIEW | doi:10.20944/preprints201805.0418.v1

Data Science in Education, Employment, Research: Data Revolution for Sustainable Development

Fionn Murtagh, Keith Devlin

Subject: Computer Science And Mathematics, Computer Science Keywords: big data training and learning; company and business requirements; ethics; impact; decision support; data engineering; open data; smart homes; smart cities; IoT

Online: 29 May 2018 (08:45:52 CEST)

Preprint ARTICLE | doi:10.20944/preprints202007.0078.v1

Data Driven Analytics for Personalized Medical Decision Making

Nataliia Melnykova, Nataliya Shakhovska, Michal Gregus, Volodymyr Melnykov, Mariana Zakharchuk, Olena Vovk

Subject: Computer Science And Mathematics, Information Systems Keywords: personalization; decision making; medical data; artificial intelligence; Data-driving; Big Data; Data Mining; Machine Learning

Online: 5 July 2020 (15:04:17 CEST)

Preprint ARTICLE | doi:10.20944/preprints201609.0027.v1

Optimizing Bus Passenger Complaint Service through Big Data Analysis: Systematized Analysis for Improved Public Sector Management

Weng-Kun Liu, Chia-Chun Yen

Subject: Business, Economics And Management, Business And Management Keywords: customer complaint process improvement; customer complaint service; big data analysis

Online: 7 September 2016 (11:38:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints202209.0413.v2

An Approach for Data Privacy Management for Banking Using Consortium Blockchain

Shady Nabih, Hanan Fahmy, Sayed Abdelgaber

Subject: Engineering, Chemical Engineering Keywords: Consortium Blockchain; Ring signature; Blockchain privacy; Blockchain security; Access Control; Blockchain big data

Online: 25 June 2023 (04:01:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints202106.0654.v1

A Hybrid Deep Learning Model to Predict the Impact of COVID-19 on Mental Health form Social Media Big Data

Tapotosh Ghosh, Md. Hasan Al Banna, Md. Jaber Al Nahian, Kazi Abu Taher, M Shamim Kaiser, Mufti Mahmud

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: COVID-19; Mental Health; Depression; Big data; Social media.

Online: 28 June 2021 (13:50:49 CEST)

Preprint ARTICLE | doi:10.20944/preprints202103.0623.v1

How Schools affected COVID-19 Pandemic in Italy: Data Analysis for Lombardy Region, Campania Region, and Emilia Region

Davide Tosi, Alessandro Siro Campi

Subject: Computer Science And Mathematics, Information Systems Keywords: SARS-CoV-2; Big Data; Data Analytics; Predictive Models; Schools

Online: 25 March 2021 (14:35:53 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.1845.v1

Advanced Analytics and Data Management in the Procurement Function: An Aviation Industry Case Study

Andrea Altundag, Martin Wynn

Subject: Business, Economics And Management, Business And Management Keywords: data analytics; strategic procurement; big data; maturity model; aviation industry; aircraft manufacturer

Online: 29 March 2024 (10:36:02 CET)

Preprint ARTICLE | doi:10.20944/preprints201812.0058.v1

Estimating the Parameters of Dynamical Systems from Big Data Using Sequential Monte Carlo Samplers

Peter Green, Simon Maskell

Subject: Engineering, Mechanical Engineering Keywords: big data; parameter estimation; model updating; system identification; sequential Monte Carlo sampler

Online: 4 December 2018 (11:17:24 CET)

Preprint ARTICLE | doi:10.20944/preprints201810.0273.v1

Russian-German Astroparticle Data Life Cycle Initiative

Igor Bychkov, Andrey Demichev, Julia Dubenskaya, Oleg Fedorov, Andreas Haungs, Andreas Heiss, Yulia Kazarina, Elena Korosteleva, Dmitriy Kostunin, Alexander Kryukov, Andrey Mikhailov, Minh-Duc Nguyen, Stanislav Polyakov, Evgeny Postnikov, Alexey Shigarov, Dmitry Shipilov, Achim Streit, Viktoria Tokareva, Doris Wochele, Jürgen Wochele, Dmitry Zhurov

Subject: Physical Sciences, Astronomy And Astrophysics Keywords: astroparticle physics, cosmic rays, data life cycle management, data curation, meta data, big data, deep learning, open data

Online: 12 October 2018 (14:48:32 CEST)

Preprint SHORT NOTE | doi:10.20944/preprints202211.0056.v1

Big Data Enabled Non-Invasive Rapid Sex Detection of Incubated Chicken Eggs

Suresh Neethirajan

Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Precision Livestock Farming; Digital Agriculture; Smart Farming; In Ovo Sexing; Big Data; Artificial Intelligence

Online: 2 November 2022 (11:03:44 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.0208.v1

Big data reveals the change characteristics of 64 hexagrams and lines

Xinqi Zheng, Yang Cao

Subject: Computer Science And Mathematics, Analysis Keywords: I Ching; Coin toss method; 64 hexagram changes; Big data analysis; Hexagram changes topographic map

Online: 3 August 2023 (08:16:04 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.1570.v1

Consore: A Powerful Federated Data Mining Tool Driving a French Research Network to Accelerate Cancer Research

Julien Guérin, Amine Nahid, Louis Tassy, Marc Deloger, François Bocquet, Simon Thézenas, Emmanuel Desandes, Marie-Cécile Le Deley, Xavier Durando, Anne Jaffré, Ikram Es Saad, Hugo Crochet, Marie Le Morvan, François Lion, Judith Raimbourg, Oussama Khay, Franck Craynest, Alexia Giro, Yec'han Laizet, Aurélie Bertaut, Frédérik Joly, Alain Livartowski, Pierre Etienne Heudel

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: cancer research; cancer; natural language processing; data mining; data warehouse; big data

Online: 26 November 2023 (05:13:14 CET)

Preprint REVIEW | doi:10.20944/preprints202103.0402.v1

Big Data in Studying Acute Pain and Regional Anesthesia

Lukas M. Müller-Wirtz, Thomas Volk

Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: anesthesia; anesthesiology; big data; registries; database research; acute pain; pain management; postoperative pain; regional anesthesia; regional analgesia.

Online: 15 March 2021 (17:45:39 CET)

Preprint ARTICLE | doi:10.20944/preprints201906.0174.v1

Exploring Challenges to Implementation of IT Service Management System ISO 20000: Implications in Managing Big Data in Emerging Economy

Nafis Ahmad, Md. Golam Rabbany, Syed Mithun Ali

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: Business excellence; information technology; implementation challenge; ISO 20000; big data management.

Online: 18 June 2019 (10:56:19 CEST)

Preprint ARTICLE | doi:10.20944/preprints202111.0029.v1

Estimation of Real-world Fuel Consumption Rate of Light-duty Vehicles Based on Big Data

Isabella Yunfei Zeng, Shiqi Tan, Jianliang Xiong, Xuesong Ding, Yawen Li, Tian Wu

Subject: Social Sciences, Decision Sciences Keywords: Real-world fuel consumption rate; machine learning; big data; light-duty vehicle; China

Online: 2 November 2021 (09:40:05 CET)

Preprint ARTICLE | doi:10.20944/preprints202106.0187.v3

The BFP (Benford-Fibonacci-Perez) Method Validates the Consistency of COVID-19 Epidemiological Data in France and Italy

Jean-Claude Perez

Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: SARS-CoV2; Biomathematics; Benford law; trials; Epidemiology; Fibonacci; data analysis; big data

Online: 11 June 2021 (15:47:44 CEST)

Working Paper ARTICLE

Model for the Collection and Analysis of Data from Teachers and Students, Supported by Academic Analytics

Fredys A. Simanca H., Isabel Hernández Arteaga, María Elsa Unriza Puin, Fabian Blanco Garrido, Jaime Paez Paez, Jairo Cortes Méndez

Subject: Computer Science And Mathematics, Information Systems Keywords: Academic Analytics; data storage; education and big data; analysis of data; learning analytics

Online: 19 July 2020 (20:37:39 CEST)

Business Intelligence, defined by [1] as "the ability to understand the interrelations of the facts that are presented in such a way that it can guide the action towards achieving a desired goal", has been used since 1958 for the transformation of data into information, and of information into knowledge, to be used when making decisions in a business environment. But what would happen if we took the same principles of business intelligence and applied them to the academic environment? The answer would be the creation of Academic Analytics, a term defined by [2] as the process of evaluating and analyzing organizational information from university systems for reporting and making decisions. Its characteristics allow it to be used more and more in institutions, since the information they accumulate about their students and teachers includes data such as academic performance, student success, persistence, and retention [5]. Academic Analytics enables an analysis of data that is very important for making decisions in the educational institutional environment, aggregating valuable information in the academic research activity and providing easy-to-use business intelligence tools. This article presents a proposal for an information system based on Academic Analytics, built with ASP.NET technology and using the Microsoft SQL Server database engine for storage, together with a model supported by Academic Analytics for the collection and analysis of data from the information systems of educational institutions. The proposed system is capable of displaying statistics on the historical data of students and teachers taken over academic periods, without having direct access to institutional databases, with the purpose of gathering the information that the director, the teacher, and finally the student need for making decisions.
The model was validated with information taken from students and teachers during the last five years, and the data were exported as PDF, CSV, and XLS files. The findings allow us to state that it is extremely important to analyze the data held in the information systems of educational institutions for making decisions. After the validation of the model, it was established that students must know the reports of their academic performance in order to carry out a process of self-evaluation, that teachers must be able to see the results of the data obtained in order to carry out self-evaluation and to adapt content and classroom dynamics, and finally that the head of the program needs these results to make decisions.

Preprint ARTICLE | doi:10.20944/preprints201810.0469.v1

A Novel Framework and Enhanced QoS Big Data Protocol for Smart City Applications

Shalli Rani, Sajjad Hussain Chauhdary

Subject: Engineering, Control And Systems Engineering Keywords: energy efficiency; big data analytics; QoS-IoT; internet of things; smart city; WSN; green computing

Online: 22 October 2018 (05:27:42 CEST)

Preprint ARTICLE | doi:10.20944/preprints202407.1803.v1

Digital Forensics Readiness in Big Data Wireless Networks: A Novel Framework and Incident Response Script for Linux-Hadoop Environments

Cephas Mpungu, Carlisle George, Glenford Mapp

Subject: Computer Science And Mathematics, Computer Networks And Communications Keywords: Wireless networks; digital forensics; digital forensics readiness; incident response; big data; Hadoop

Online: 23 July 2024 (16:02:46 CEST)

Preprint ARTICLE | doi:10.20944/preprints202407.2058.v1

The Role of Big Data Analytics in Retail Marketing and Supply Chain Optimization

Oliver Johnson, William Brown, George Wilson

Subject: Business, Economics And Management, Business And Management Keywords: Big data analytics, Retail marketing, Supply chain optimization, Customer segmentation, Demand forecasting, Personalized marketing, Ethical considerations

Online: 25 July 2024 (13:41:08 CEST)

Preprint ARTICLE | doi:10.20944/preprints201705.0116.v1

Big-Data-Based Thermal Runaway Prognosis of Battery Systems for Electric Vehicles

Jichao Hong, Zhenpo Wang, Peng Liu

Subject: Engineering, Mechanical Engineering Keywords: thermal runaway; big-data platform; battery systems; electric vehicles; National Service and Management Center for Electric Vehicles

Online: 16 May 2017 (03:18:57 CEST)

Preprint REVIEW | doi:10.20944/preprints202004.0383.v1

Artificial Intelligence (AI) and Big Data for Coronavirus (COVID-19) Pandemic: A Survey on the State-of-the-Arts

Quoc-Viet Pham, Dinh C. Nguyen, Thien Huynh-The, Won-Joo Hwang, Pubudu N. Pathirana

Subject: Medicine And Pharmacology, Other Keywords: COVID-19; coronavirus pandemic; big data; epidemic outbreak; artificial intelligence (AI); deep learning

Online: 21 April 2020 (09:01:45 CEST)

Preprint ARTICLE | doi:10.20944/preprints202002.0143.v1

Research on the Visualization of Ocean Big Data Based on the Cite-Space Software

Jiajing Wu, Dongning Jia, Zhiqiang Wei, Xin Dou

Subject: Computer Science And Mathematics, Information Systems Keywords: ocean; big-data; cite-space; co-authorship analysis; co-citation analysis; keywords co-occurrence analysis; visualization

Online: 11 February 2020 (09:41:17 CET)

Preprint ARTICLE | doi:10.20944/preprints201905.0263.v1

Modeling Natural Gas Compressibility Factor Using a Hybrid Group Method of Data Handling

Abdolhossein Hemmati-Sarapardeh, Sassan Hajirezaie, Mohamad Reza Soltanian, Amir Mosavi, Shahab Shamshirband

Subject: Computer Science And Mathematics, Computational Mathematics Keywords: natural gas; gas compressibility factor; group method of data handling (GMDH); big data; equation of state; correlation

Online: 22 May 2019 (08:29:32 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.0937.v1

Challenges and Opportunities in One Health: Google Trends Search Data

Lauren Wisnieski, Karen Gruszynski, Vina Faulkner, Barbara Shock

Subject: Public Health And Healthcare, Public, Environmental And Occupational Health Keywords: Google Trends; disease prediction; Lyme disease; Lyme; Big Data; One Health; negative binomial; mixed models; zoonotic disease; tick-borne disease

Online: 11 August 2023 (11:01:40 CEST)

Preprint ARTICLE | doi:10.20944/preprints202407.0163.v1

Challenges and Opportunities: Improve Patient Data Security and Privacy in Distributed Systems

Wisnu Uriawan, Sumitra Adriansyah, Siti Jahro Maulidiyah, Sigit Julianto, Wildan Sophal Jamil

Subject: Computer Science And Mathematics, Computer Science Keywords: blockchain technology; data encryption; healthcare security; patient data privacy; Big Data; Internet of Things (IoT); cybersecurity; digital healthcare

Online: 2 July 2024 (14:48:45 CEST)

Preprint ARTICLE | doi:10.20944/preprints202104.0482.v1

System Design and Data Governance for Environmental Disaster Management in Smart Scenic - A Case Study of Huangshan Mountain

Zhong Wang, Zhenjie Liao, Lijuan Zhang

Subject: Business, Economics And Management, Accounting And Taxation Keywords: Smart Scenic; environmental disasters management; organization transformation; system design; Big Data; Internet of Things

Online: 19 April 2021 (13:19:35 CEST)

Preprint ARTICLE | doi:10.20944/preprints202405.0508.v2

HCI and Data: Interacting in a New Era of Virtualization

Iván Durango, José A. Gallud, Victor M. R. Penichet

Subject: Computer Science And Mathematics, Information Systems Keywords: Human-Data Interaction; Human-Computer Interaction; Big Data; Data virtualization; Data Accessibility; Data Management; Data Privacy; Data Ethics; Data-Driven Decision-Making

Online: 1 July 2024 (08:12:01 CEST)

Rapid technological progress has ushered in a new era of human-computer interaction, where the distinction between the physical and virtual realms is becoming increasingly blurred. This research paper explores the profound and multifaceted intersection of Human-Data Interaction (HDI) and Data Virtualization (DV), examining how emerging technologies can significantly enhance the exploration, comprehension, and utilization of complex, multidimensional data sets. Informed by the insights gleaned from prior research in this domain, the present study delves into the potential of DV techniques to improve HDI, with a particular focus on three experimental investigations conducted within the realms of education, healthcare, and retail. The findings reveal the benefits and potential challenges associated with the implementation of DV in these diverse contexts, offering valuable guidance for the design and development of future HDI systems. Drawing upon a diverse array of authoritative sources, this paper presents a holistic, forward-looking perspective on the future of HDI, underscoring the critical role that DV will play in shaping the next generation of human-computer interfaces and facilitating a deeper, more intuitive understanding of the digital world. Furthermore, the paper presents a preliminary framework for integrating HDI principles into standard design practices. This framework outlines key considerations and guidelines to help designers and developers incorporate HDI techniques more effectively into the development of data-driven applications and interfaces. The proposed framework outlines key considerations for enhancing data accessibility and comprehension, empowering users to exercise greater control over their data, and cultivating transparent dialogues between data providers and end-users.
By establishing this conceptual foundation, the paper aims to facilitate the seamless integration of HDI principles into standard design practices, ultimately leading to more intuitive, user-centric, and ethically-grounded approaches to data interaction and utilization.

Preprint ARTICLE | doi:10.20944/preprints202308.1347.v1

Big Data Analyzing the Asymmetry of 64 Hexagrams Based on the Yarrow-stalk Method

Xinqi Zheng, Yang Cao

Subject: Computer Science And Mathematics, Analysis Keywords: Yijing; 64 hexagram changes; number in the great expansion method of divination; yin‐yang asymmetry; big data analysis

Online: 18 August 2023 (11:29:41 CEST)

The divination function of China's Yijing has led to its circulation for thousands of years. In our exploration of Yijing's characteristics using big data, we have discovered variations in results between the coin toss method and the ancient yarrow-stalk method of divination, known as "the number for the great expansion method of divination" (大衍之数). The yarrow-stalk method serves as the fundamental method of divination in Yijing and continues to hold significance in studying the essential characteristics of Yijing. Despite the complexity of yarrow calculations, advancements in computer technology and big data have simplified its application. By employing the yarrow-stalk method, we simulated changes in the 64 hexagrams, calculated probabilities and proportions of hexagram alterations, and derived fundamental characteristics and patterns of hexagrams. Additionally, we constructed the spatial representation of lines and hexagrams. Through a binary system rearrangement, we created a 64x64 matrix illustrating hexagram transformations. Subsequently, we generated 100 million random hexagrams and analyzed line and hexagram changes accordingly. Our findings indicate the following: (1) Big data analysis reveals evident asymmetry in the hexagrams obtained through the yarrow-stalk method, with a triangular fractal characteristic forming the background. (2) Each of the 64 hexagrams exhibits a distinct probability distribution when transforming into other hexagrams, which can be categorized into five types. (3) The occurrence probabilities of Laoyang, Laoyin, Shaoyang, and Shaoyin are 18.61%, 6.387%, 31.38%, and 43.62% respectively. The probabilities of Yin and Yang occurrences are nearly equal, each close to 50%.
However, the probability of Laoyang is approximately three times higher than that of Laoyin. (4) Hexagram changes occurring more than 100000 times were visualized and analyzed using 3D statistical maps and a Sankey diagram. These results demonstrate that the yarrow-stalk method effectively unveils the characteristics and underlying patterns of the 64 hexagrams. This study provides a novel approach for scientifically exploring the internal laws governing the 64 hexagrams in Yijing.
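
The line-type probabilities reported in this abstract can be compared against a textbook idealization of the yarrow-stalk procedure. The sketch below is not the authors' simulation: it assumes 49 working stalks, three rounds of splitting per line, and that the left pile's size is equally likely to fall in each residue class modulo 4. The function names (`count_off`, `stalks_removed`, `line_distribution`) are hypothetical. Under these assumptions the exact probabilities are 3/16 (18.75%) for Laoyang, 1/16 (6.25%) for Laoyin, 5/16 (31.25%) for Shaoyang, and 7/16 (43.75%) for Shaoyin, close to the simulated values quoted above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def count_off(n):
    """Count a pile off by fours; a remainder of 0 is counted as 4."""
    r = n % 4
    return r if r else 4


def stalks_removed(total, left):
    """Stalks set aside in one round: 1 taken from the right pile plus both remainders."""
    right = total - left - 1
    return 1 + count_off(left) + count_off(right)


def line_distribution(start=49):
    """Exact distribution of line values 6, 7, 8, 9 under the idealized model.

    Assumption: in every round the left pile size is uniform over residues mod 4,
    so it is enough to enumerate one representative size per residue class.
    """
    dist = Counter()
    for residues in product(range(4), repeat=3):
        total = start
        for r in residues:
            left = r if r else 4  # representative left-pile size for this residue
            total -= stalks_removed(total, left)
        dist[total // 4] += Fraction(1, 64)  # 4**3 equally likely residue triples
    return dict(dist)


if __name__ == "__main__":
    names = {9: "Laoyang", 8: "Shaoyin", 7: "Shaoyang", 6: "Laoyin"}
    for value, prob in sorted(line_distribution().items(), reverse=True):
        print(f"{names[value]:9s} {prob}  ({float(prob):.2%})")
```

In this model Yin lines (6 and 8) and Yang lines (7 and 9) each total exactly 1/2, and Laoyang is exactly three times as likely as Laoyin, which is consistent with findings (3) in the abstract. A different model of how the stalks are split would shift these figures slightly.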

Preprint ARTICLE | doi:10.20944/preprints202211.0034.v1

SAUSA: Securing Access, Usage, and Storage of 3D Point Clouds Data by a Blockchain-based Authentication Network

Ronghua Xu, Yu Chen, Genshe Chen, Erik Blasch

Subject: Computer Science And Mathematics, Information Systems Keywords: Blockchain; Smart Contract; Point Cloud; Security; Privacy Preservation; Software-Defined Network (SDN); Big Data; Assurance; Resilience.

Online: 2 November 2022 (02:18:50 CET)

The rapid development of three-dimensional (3D) acquisition technology based on 3D sensors provides a large volume of data, which is often represented in the form of point clouds. Point cloud representation can preserve the original geometric information along with associated attributes in a 3D space. Therefore, it has been widely adopted in many scene-understanding-related applications such as virtual reality (VR) and autonomous driving. However, the massive amount of point cloud data aggregated from distributed 3D sensors also poses challenges for secure data collection, management, storage, and sharing. Thanks to its decentralized and secure nature, blockchain has great potential to improve point cloud services and enhance security and privacy preservation. Inspired by the rationales behind Software Defined Network (SDN) technology, this paper envisions SAUSA, a blockchain-based authentication network that is capable of recording, tracking, and auditing the access, usage, and storage of 3D point cloud data sets throughout their life cycle in a decentralized manner. SAUSA adopts an SDN-enabled point cloud service architecture which allows for efficient data processing and delivery to satisfy diverse Quality-of-Service (QoS) requirements. A blockchain-based authentication framework is proposed to ensure security and privacy preservation in point cloud data acquisition, storage, and analytics. By leveraging smart contracts to digitize access control policies and point cloud data on the blockchain, data owners have full control of their 3D sensors and point clouds. In addition, anyone can verify the authenticity and integrity of point clouds in use without relying on a third party. Moreover, SAUSA integrates a decentralized storage platform to store encrypted point clouds while recording references of raw data on the distributed ledger.
Such a hybrid on-chain and off-chain storage strategy not only improves robustness and availability but also ensures privacy preservation for sensitive information in point cloud applications. A proof-of-concept prototype is implemented and tested on a physical network. The experimental evaluation validates the feasibility and effectiveness of the proposed SAUSA solution.

Preprint ARTICLE | doi:10.20944/preprints202401.2139.v1

Cartographic Journals in the Era of Artificial Intelligence: Terra Digitalis and the Emergence of Peer-Reviewed FAIR Geospatial Data

Francisco Javier Osorno-Covarrubias, Stéphane Couturier, Iván Martínez-Zazueta, Penélope López-Quiroz, Luca Ferrari, Manuel Suárez Lastra

Subject: Environmental And Earth Sciences, Geography Keywords: geospatial big data; FAIR; OJS; digital earth; OGC web services; data sharing and re-using

Online: 31 January 2024 (02:53:28 CET)

Preprint REVIEW | doi:10.20944/preprints202211.0161.v1

Data Locality in High Performance Computing, Big Data, and Converged Systems: An Analysis of the Cutting Edge and A Future System Architecture

Sardar Usman, Rashid Mehmood, Iyad Katib, Aiiad Albeshri

Subject: Computer Science And Mathematics, Information Systems Keywords: High Performance Computing (HPC); big data; High Performance Data Analytics (HPDA); convergence; data locality; spark; Hadoop; design patterns; process mapping; in-situ data analysis

Online: 9 November 2022 (01:38:34 CET)

Big data has revolutionised science and technology, leading to the transformation of our societies. High Performance Computing (HPC) provides the necessary computational power for big data analysis using artificial intelligence and methods. Traditionally, HPC and big data focused on different problem domains and grew into two different ecosystems. Efforts have been underway for the last few years to bring the best of both paradigms into HPC and big data converged architectures. Designing HPC and big data converged systems is a hard task requiring careful placement of data, analytics, and other computational tasks such that the desired performance is achieved with the least amount of resources. Energy efficiency has become the biggest hurdle in the realisation of HPC, big data, and converged systems capable of delivering exascale and beyond performance. Data locality is a key parameter of HPDA system design, as moving even a byte costs heavily in both time and energy as the size of the system increases. Performance in terms of time and energy is the most important factor for users, particularly energy, because it is the major hurdle in high performance system design and because environmental sustainability concerns are increasing the focus on green energy systems. Data locality is a broad term that encapsulates different aspects including bringing computations to data, minimizing data movement by efficient exploitation of cache hierarchies, reducing intra- and inter-node communications, locality-aware process and thread mapping, and in-situ and in-transit data analysis. This paper provides an extensive review of the cutting edge of data locality in HPC, big data, and converged systems. We review the literature on data locality in HPC, big data, and converged environments and discuss challenges, opportunities, and future directions. Subsequently, using the knowledge gained from this extensive review, we propose a system architecture for future HPC and big data converged systems. To the best of our knowledge, there is no such review on data locality in converged HPC and big data systems.

Preprint ARTICLE | doi:10.20944/preprints202110.0260.v1

Online System for Power Quality Operational Data Management in Frequency Monitoring using Python and Grafana

Jose-María Sierra-Fernández, Olivia Florencias-Oliveros, Manuel-Jesús Espinosa-Gavira, Juan-José González-de-la-Rosa, Agustín Agüera-Pérez, José-Carlos Palomares-Salas

Subject: Engineering, Electrical And Electronic Engineering Keywords: big data; data acquisition; data visualization; data exchange; dashboard; frequency stability; Grafana lab; Power Quality; GPS reference; frequency measurement.

Online: 18 October 2021 (18:07:43 CEST)

Preprint ARTICLE | doi:10.20944/preprints201805.0353.v1

A Study on the Improvement of Thermal Energy Efficiency for District Thermal Energy Consumer Facility based on Reinforcement Learning

Young-gon Kim, Keol Heo, Ga-Eun You, Hyun-Seo Lim, Jung-In Choi, Jae-Sik Eom

Subject: Computer Science And Mathematics, Computer Science Keywords: big data; big data system; energy; district heating; reinforcement learning

Online: 24 May 2018 (16:05:27 CEST)

Preprint ARTICLE | doi:10.20944/preprints201901.0130.v1

Big Data-Driven Market-Oriented Information System for the Internationalisation and Strategic and Sustainable Management of SMEs

Yoseob Heo, Jungjoon Kim, Jongseok Kang

Subject: Business, Economics And Management, Business And Management Keywords: internationalisation of SMEs; big data; market-oriented information; relational database; supply chain network; optimized database; trade condition; data visualization

Online: 14 January 2019 (10:04:03 CET)

Preprint ARTICLE | doi:10.20944/preprints202210.0472.v1

Data-Driven Deep Journalism to Discover Age Dynamics in Multi-Generational Labour Markets from LinkedIn Media

Abeer Abdullah Alaql, Fahad AlQurashi, Rashid Mehmood

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: media; journalism; deep journalism; labor markets; Great Resignation; Quiet Quitting; Millennials; Generation Z; Big Data Analytics; Natural Language Processing (NLP)

Online: 31 October 2022 (08:33:34 CET)

We live in the information age and, ironically, meeting the core function of journalism (i.e., to provide people access to unbiased information) has never been more difficult. This paper explores deep journalism, our data-driven Artificial Intelligence (AI) based journalism approach, to study how the LinkedIn media could be useful for journalism. Specifically, we apply our deep journalism approach to LinkedIn to automatically extract and analyse big data to provide the public with information about labour markets, people’s skills and education, and businesses and industries from multi-generational perspectives. The Great Resignation and Quiet Quitting phenomena, coupled with rapidly changing generational attitudes, are bringing unprecedented and uncertain changes to labour markets and our economies and societies, and hence the need for journalistic investigations into these topics is highly significant. We combine big data and machine learning to create a whole machine learning pipeline and a software tool for journalism that allows discovering parameters for age dynamics in labour markets using LinkedIn data. We collect a total of 57,000 posts from LinkedIn and use them to discover 15 parameters with the Latent Dirichlet Allocation (LDA) algorithm and group them into five macro-parameters, namely Generations-Specific Issues, Skills & Qualifications, Employment Sectors, Consumer Industries, and Employment Issues. The journalism approach used in this paper can automatically discover objective, cross-sectional, and multi-perspective information and make it available to all. It can bring rigour to journalism by making it easy to generate information using machine learning and can make tools and information available so that anyone can uncover information about matters of public importance. This work is novel since none of the earlier works have reported such an approach and tool and leveraged it to use LinkedIn media for journalism and to discover multigenerational perspectives (parameters) for age dynamics in labour markets. The approach could be extended with additional AI tools and other media.

Preprint ARTICLE | doi:10.20944/preprints202008.0254.v1

Unsupervised Feature Selection Using Recursive k-Means Silhouette Elimination (RkSE): A Two-Scenario Case Study for Fault Classification of High-Dimensional Sensor Data

Ahlam Mallak, Madjid Fathi

Subject: Computer Science And Mathematics, Information Systems Keywords: feature selection; k-means; silhouette measure; clustering; big data; fault classification; sensor data; time-series data

Online: 11 August 2020 (06:26:43 CEST)

Preprint COMMUNICATION | doi:10.20944/preprints202206.0383.v2

Twitter Big Data as A Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions

Nirmalya Thakur

Subject: Computer Science And Mathematics, Information Systems Keywords: Exoskeleton; Twitter; Tweets; Big Data; social media; Data Mining; dataset; Data Science; Natural Language Processing; Information Retrieval

Online: 21 July 2022 (04:06:53 CEST)

Preprint ARTICLE | doi:10.20944/preprints201812.0016.v1

From a Smoking Gun to Spent Fuel: Principled Subsampling Methods for Building Big Language Data Corpora from Monitor Corpora

Jacqueline Hettel Tidwell

Subject: Social Sciences, Library And Information Sciences Keywords: corpus linguistics; language modeling; big data; language data; databases; monitor corpora; documentary analysis; nuclear power; government regulation; tobacco documents

Online: 3 December 2018 (09:16:14 CET)

Preprint ARTICLE | doi:10.20944/preprints202008.0053.v1

Analysis of Environmental Factors Caused by Exposure to Air Pollution in Yeosu Industrial Complex: Analyzing Google Trend Trends in Cancer Generation through Big Data

Kil Yong Choi, BuSoon Son, WonHo Yang

Subject: Physical Sciences, Atomic And Molecular Physics Keywords: Google Trend; Particulate Matter; National Ambient Air Quality Monitoring Information System; Chronic obstructive pulmonary disease; Big Data

Online: 2 August 2020 (18:29:51 CEST)

Preprint ARTICLE | doi:10.20944/preprints202401.2038.v1

Fractal Dimension of the Generalized Z-Entropy of The Rényian Formalism of Stable Queue with Some Potential Applications of Fractal Dimension to Big Data Analytics

Dr Ismail A Mageed

Subject: Computer Science And Mathematics, Geometry And Topology Keywords: Fractal Dimension (D); Generalized Z-Entropy; Google Earth satellite (GEs); GNU Image Manipulation; Big Data Analytics (BDAs).

Online: 29 January 2024 (14:39:18 CET)

Preprint ARTICLE | doi:10.20944/preprints202207.0121.v7

Exploring Theoretical Modifications in Fabric of Spacetime

Amrit Ladhani

Subject: Physical Sciences, Astronomy And Astrophysics Keywords: big bang and big crunch; cosmology; dark energy; gravitational force; the cyclic universe

Online: 11 June 2024 (08:51:38 CEST)

Preprint REVIEW | doi:10.20944/preprints202311.1257.v1

Navigating to Net Zero: Leveraging Big Data, AI, and Benchmarking for Sustainable Climate Action and Emissions Reduction

Suresh Neethirajan

Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: Climate Change; Net Zero Emissions; Dairy Farming; Big Data; Artificial Intelligence (AI); Greenhouse Gas Emissions; Sustainable Agriculture; Technological Innovation; Policy Framework; Environmental Sustainability

Online: 20 November 2023 (16:16:51 CET)

Preprint ARTICLE | doi:10.20944/preprints202105.0226.v2

Frequency Converter as Node for Edge Computing of Big Data, Related to Drive Efficiency, in Industrial Internet of Things

Mariusz Piotr Hetmanczyk, Julian Malaka

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: energy efficiency; electric drive; electric motor control; frequency converter; Industrial Internet of Things; edge computing; Big Data; Key Performance Indicators; KPI; dashboard

Online: 8 September 2021 (13:15:18 CEST)

Preprint ARTICLE | doi:10.20944/preprints202302.0066.v1

Smarter Sustainable Tourism: Data-Driven Multi-Perspective Parameter Discovery for Autonomous Design and Operations

Raniah Alsahafi, Ahmed Alzahrani, Rashid Mehmood

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Smart Tourism; Sustainable Tourism; Natural language Processing (NLP); Big Data Analytics; Deep Learning; Machine Learning; Unsupervised Learning; Bidirectional Encoder Representations from Transformers (BERT); Literature Review; Smart Societies

Online: 3 February 2023 (09:47:55 CET)

Global natural and manmade events are exposing the fragility of the tourism industry and its impact on the global economy. Prior to the COVID-19 pandemic, tourism contributed 10.3% to the global GDP and employed 333 million people but saw a significant decline due to the pandemic. Sustainable and smart tourism requires collaboration from all stakeholders and a comprehensive understanding of global and local issues to drive responsible and innovative growth in the sector. This paper presents an approach for leveraging big data and deep learning to discover holistic, multi-perspective (e.g., local, cultural, national, and international) and objective information on a subject. Specifically, we develop a machine learning pipeline to extract parameters from academic literature and public opinions on Twitter, providing a unique and comprehensive view of the industry from both academic and public perspectives. The academic-view dataset was created from the Scopus database and contains 156,759 research articles from 2000 to 2022, which were modelled to identify 33 distinct parameters in 4 categories: Tourism Types, Planning, Challenges, and Media & Technologies. A Twitter dataset of 485,813 tweets was collected over 18 months from March 2021 to August 2022 to showcase public perception of tourism in Saudi Arabia, which was modelled to reveal 13 parameters categorized into two broader sets: Tourist Attractions and Tourism Services. Discovering system parameters is required to embed autonomous capabilities in systems and for decision-making and problem-solving during system design and operations. The proposed approach improves AI-based information discovery by extending the use of scientific literature, Twitter, and other sources for autonomous, dynamic optimizations of systems, promoting novel research in the tourism sector and contributing to the development of smart and sustainable societies. The paper also presents a comprehensive knowledge structure and literature review of the tourism sector based on over 250 research articles.

Preprint ARTICLE | doi:10.20944/preprints202012.0507.v1

The Demographic, Social, and Economic Correlates of HIV Infection Status in Sub-Saharan Africa

Eran Bendavid, Kajal Claypool, Eric Chow, Jake Chung, Don Mai, Chirag Patel

Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: HIV; big data; Africa; epidemiology

Online: 21 December 2020 (11:14:08 CET)
