Search | Preprints.org

Preprint ARTICLE | doi:10.20944/preprints202007.0078.v1

Data Driven Analytics for Personalized Medical Decision Making

Nataliia Melnykova, Nataliya Shakhovska, Michal Gregus, Volodymyr Melnykov, Mariana Zakharchuk, Olena Vovk

Subject: Computer Science And Mathematics, Information Systems Keywords: personalization; decision making; medical data; artificial intelligence; Data-driving; Big Data; Data Mining; Machine Learning

Online: 5 July 2020 (15:04:17 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.1378.v1

Algorithm-based Data Generation (ADG) Engine for Data Analytics

Iman I. M. Abu Sulayman, Peter Voege, Abdelkader Ouda

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Data Generation; Anomaly Data; User Behavior Generation; Big Data

Online: 19 June 2023 (16:31:37 CEST)

Working Paper COMMUNICATION

Visual Analytics on Biomedical Dark Data

Shashwat Aggarwal, Ramesh Singh

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Data Visualization; Visual Analytics; Natural Language Processing; Dark Data; Pattern Recognition

Online: 28 October 2020 (07:47:26 CET)

Preprint ARTICLE | doi:10.20944/preprints202110.0103.v1

Usage of Data Analytics in Improving Sourcing of Supply Chain Inputs

S M Nazmuz Sakib

Subject: Computer Science And Mathematics, Information Systems Keywords: Data Analytics; Analytics; Supply Chain Input; Supply Chain; Data Science; Data

Online: 6 October 2021 (10:38:42 CEST)

Working Paper ARTICLE

Model for the Collection and Analysis of Data from Teachers and Students, Supported by Academic Analytics

Fredys A. Simanca H., Isabel Hernández Arteaga, María Elsa Unriza Puin, Fabian Blanco Garrido, Jaime Paez Paez, Jairo Cortes Méndez

Subject: Computer Science And Mathematics, Information Systems Keywords: Academic Analytics; data storage; education and big data; analysis of data; learning analytics

Online: 19 July 2020 (20:37:39 CEST)

Business Intelligence, defined by [1] as "the ability to understand the interrelations of the facts that are presented in such a way that it can guide the action towards achieving a desired goal", has been used since 1958 to transform data into information, and information into knowledge, for decision making in a business environment. But what would happen if we took the same principles of business intelligence and applied them to the academic environment? The answer is Academic Analytics, a term defined by [2] as the process of evaluating and analyzing organizational information from university systems for reporting and decision making. Its characteristics allow it to be used more and more in institutions, since the information they accumulate about their students and teachers includes data such as academic performance, student success, persistence, and retention [5]. Academic Analytics enables data analysis that is very important for decision making in the educational institutional environment, adding valuable information to academic research activity and providing easy-to-use business intelligence tools. This article presents a proposal for an information system based on Academic Analytics, built with ASP.Net technology and using the Microsoft SQL Server database engine for storage, and designs a model supported by Academic Analytics for the collection and analysis of data from the information systems of educational institutions. The proposed system is capable of displaying statistics on the historical data of students and teachers taken over academic periods, without having direct access to institutional databases, with the purpose of gathering the information that the director, the teacher, and finally the student need for making decisions. The model was validated with information taken from students and teachers during the last five years, and the data could be exported as PDF, CSV, and XLS files. The findings allow us to state that it is extremely important to analyze the data held in the information systems of educational institutions for making decisions. After validation of the model, it was established that students must know the reports of their academic performance in order to carry out a process of self-evaluation, that teachers must be able to see the results of the data obtained in order to carry out self-evaluation and to adapt content and dynamics in the classroom, and finally that the head of the program needs these results to make decisions.

Preprint ARTICLE | doi:10.20944/preprints202307.1199.v1

Towards Developing Big Data Analytics for Machining Decision-Making

Angkush Kumar Ghosh, Saman Fattahi, Sharifu Ura

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: smart manufacturing; big data; manufacturing process; big data analytics; decision-making; uncertainty

Online: 18 July 2023 (09:38:31 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.1845.v1

Advanced Analytics and Data Management in the Procurement Function: An Aviation Industry Case Study

Andrea Altundag, Martin Wynn

Subject: Business, Economics And Management, Business And Management Keywords: data analytics; strategic procurement; big data; maturity model; aviation industry; aircraft manufacturer

Online: 29 March 2024 (10:36:02 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.1910.v1

A Scalable Data Structure for Efficient Graph Analytics and In-Place Mutations

Soukaina Firmli, Dalila Chiadmi

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: Data Structures; Concurrency; Graph Processing; Graph Mutations

Online: 27 September 2023 (15:10:25 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.0366.v1

Learning Analytics in the Era of Large Language Models

Elisabetta Mazzullo, Okan Bulut, Tarid Wongvorachan, Bin Tan

Subject: Social Sciences, Education Keywords: learning analytics; large language model; artificial intelligence; process data

Online: 4 August 2023 (07:23:55 CEST)

Preprint REVIEW | doi:10.20944/preprints202104.0442.v1

Data Science and Analytics: An Overview from Data-Driven Smart Computing, Decision-Making and Applications Perspective

Iqbal H. Sarker

Subject: Computer Science And Mathematics, Information Systems Keywords: data science; advanced analytics; machine learning; deep learning; smart computing; decision-making; predictive analytics; data science applications

Online: 16 April 2021 (11:28:09 CEST)

Preprint ARTICLE | doi:10.20944/preprints202212.0522.v1

Using HyperLogLog to Prevent Data Retention in Social Media Streaming Data Analytics

Marc Löchner, Dirk Burghardt

Subject: Computer Science And Mathematics, Computer Networks And Communications Keywords: privacy; social media; data retention; hyperloglog

Online: 28 December 2022 (01:25:25 CET)

Preprint ARTICLE | doi:10.20944/preprints201811.0074.v1

The Role of Big Data Analytics in Developing System Dynamic Models

Hamed Kianmehr, Nasim Sabounchi, Lina Begdache

Subject: Medicine And Pharmacology, Psychiatry And Mental Health Keywords: system dynamics modeling; big data; mental distress; diet

Online: 5 November 2018 (02:34:30 CET)

Preprint ARTICLE | doi:10.20944/preprints202105.0698.v1

RoadLytics: Road Accidents Analytics Using Artificial Intelligence to Support Deaths’ Prevention on Highways

Kelvin Luz, João Elison da Rosa Tavares, Jorge Luis Victória Barbosa, Daniel Hernández de la Iglesia, Valderi Reis Quietinho Leithardt

Subject: Computer Science And Mathematics, Information Systems Keywords: Accidents, Data Analysis, Machine Learning, Transport

Online: 28 May 2021 (11:59:24 CEST)

Preprint ARTICLE | doi:10.20944/preprints202307.1583.v1

Environmental Resilience Technology: Sustainable Solutions Using Value-Added Analytics in a Changing World

E. Natasha Stavros, Caroline Gezon, Lise St Denis, Virginia Iglesias, Christina Zapata, Michael Byrne, Laurel Cooper, Maxwell Cook, Ethan Doyle, Jilmarie Stephens, Mario Tapia, Ty Tuff, Evan Thomas, SJ Maxted, Rana Sen, Jennifer K. Balch

Subject: Environmental And Earth Sciences, Sustainable Science And Technology Keywords: innovation; commercialization; decision making; human centered design; information technology; data analytics; resilience; environment

Online: 24 July 2023 (09:36:17 CEST)

Preprint CONCEPT PAPER | doi:10.20944/preprints201811.0599.v1

Taming Disruption? Pervasive Data Analytics, Uncertainty, and Policy Intervention in Disruptive Technology and Its Geographic Spread

Roger C. Brackin, Michael J. Jackson, Andrew Leyshon, Jeremy G. Morley

Subject: Social Sciences, Geography, Planning And Development Keywords: geographies of disruption; data analytics; policy intervention; Uber; disruptive technology; disruptive innovation; path dependency; platform development; platform economics

Online: 27 November 2018 (06:52:38 CET)

Preprint CONCEPT PAPER | doi:10.20944/preprints202102.0203.v1

An Application Overview of IoT Enabled-Big Data Analytics in Health Sector with Special Reference to Covid-19

Rajib Biswas

Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: Bigdata; IoT; Big Data Analytics; Covid-19; healthcare

Online: 8 February 2021 (12:19:28 CET)

Preprint ARTICLE | doi:10.20944/preprints202111.0047.v1

Comparison of Different Image Data Augmentation Approaches

Loris Nanni, Michelangelo Paci, Sheryl Brahnam, Alessandra Lumini

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Data augmentation; Deep Learning; Convolutional Neural Networks; Ensemble.

Online: 2 November 2021 (11:18:23 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.0224.v1

Data Analytics-Driven Selection of Die Material in Multimaterial Co-extrusion of Ti-Mg Alloys

Daniel Fernández, Alvaro Rodríguez-Prieto, Ana María Camacho

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Data analytics, Methodologies, Multi-material; Co-extrusion; FEM; Machine Learning; SVR; MCDM.

Online: 5 February 2024 (05:33:02 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.0602.v1

The Improvement of the Use of Open Data in Public Institutions

Besart Hyseni, Lejla Abazi Bexheti

Subject: Computer Science And Mathematics, Information Systems Keywords: Improving use of open data; data utilization; data optimization; enhancing data access; open data impact; open data government; data transparency; data-driven decision making

Online: 12 February 2024 (09:34:51 CET)

Preprint REVIEW | doi:10.20944/preprints202309.2137.v1

Recent Optimization Methods and Techniques for Medical Image Analysis

Jing Wang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Medical image analysis, Medical image data, Deep learning, Computer vision techniques, Optimisation methods

Online: 30 September 2023 (17:58:32 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.0380.v1

Secure and Fast Image Encryption Algorithm Based on Modified Logistic Map

Mamoon Riaz, Hammad Dilpazir, Sundus Naseer, Hasan Mahmood, Asim Anwar, Junaid Khan, Ian B. Benitez, Tanveer Ahmad

Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Image Encryption; Data Security; Chaotic Logistic Map, Substitution-permutation Network

Online: 6 February 2024 (15:27:14 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.0012.v1

Flexible Techniques to Detect Typical Hidden Errors in Large Longitudinal Datasets

Renato Bruni, Cinzia Daraio, Simone Di Leo

Subject: Computer Science And Mathematics, Computer Science Keywords: big data; information processing; information reconstruction; data quality; longitudinal data sequences

Online: 1 March 2024 (10:33:16 CET)

Preprint ARTICLE | doi:10.20944/preprints202303.0023.v1

Application of Variational AutoEncoder (VAE) Model and Image Processing Approaches in Game Design

Hugo Wai Leung Mak, Runze Han, Hoover H.F. Yin

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: Game Design; Variational AutoEncoder (VAE); Image and Video Generation; Bayesian Algorithm; Loss Function; Data Clustering; Data and Image Analytics; MNIST database; Generator and Discriminator

Online: 1 March 2023 (11:17:12 CET)

Preprint ARTICLE | doi:10.20944/preprints202306.1679.v1

Long-tailed Image Classification Method Based on Enhanced Contrastive Visual-language

Ying Song, Mengxing Li, Bo Wang

Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: long-tailed image classification; contrastive learning; data augmentation

Online: 23 June 2023 (12:17:21 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.0556.v1

Image Data Extraction and Driving Behavior Analysis Based on Geographic Information and Driving Data

Huei-Yung Lin, Jun-Zhi Zhang, Chin-Chen Chang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: image data extraction; driving behavior analysis; geographic information system; global position system

Online: 7 June 2023 (13:09:43 CEST)

Preprint ARTICLE | doi:10.20944/preprints202212.0405.v1

Small Sample Hyperspectral Image Classification Based on the Random Patches Network and Recursive Filtering

Denis Uchaev, Dmitry Uchaev

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: hyperspectral data; few-shot learning; deep features; convolution kernels; edge-preserving filtering

Online: 22 December 2022 (01:44:48 CET)

Preprint ARTICLE | doi:10.20944/preprints202307.1435.v2

Verifiable Privacy Preservation Scheme for Outsourcing Medical Image to Cloud Through ROI Based Crypto-Watermarking

Chuan Zhou, Yi Zhou, Xinghan An, Yan Liu, Min Wang, XiangZhi Liu

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: sharing secret; data outsourcing; reversible watermarking; chaotic map

Online: 21 September 2023 (11:24:52 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.0140.v1

Defining a Metacolor Space Representation to Perform Image Segmentation

Ciro Castiello, Nicoletta Del Buono, Flavia Esposito

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Color Spaces; Low Rank Data Representation; Feature Extraction; Machine Learning algorithm; Nonnegative Matrix Factorization; Image Segmentation

Online: 2 February 2024 (09:29:20 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.0798.v1

A General Framework for Visualizing Machine Learning Models

Ziqian Bi, Raymond Gao, Shiaofen Fang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: machine learning; multi-dimensional data visualization; classification; morphing; scattered data interpolation

Online: 14 February 2024 (10:56:50 CET)

Preprint CONCEPT PAPER | doi:10.20944/preprints202111.0117.v1

Competitive Approaches of Strategic Alliance in the Big Data Environment, a Moderating Role of Big Data Predictive Analytics in the Case of Telecommunication Sector of Pakistan

Hassan Abbas, Ye Ze, Waqar Ahmad

Subject: Business, Economics And Management, Business And Management Keywords: Big data predictive analytics; competitive strategies; strategic alliance performance; Telecom sector

Online: 5 November 2021 (11:29:12 CET)

Preprint ARTICLE | doi:10.20944/preprints201708.0102.v1

A Content-Based Remote Sensing Image Change Information Retrieval Model

Caihong Ma, Wei Xia, Fu Chen, Jianbo Liu, Qin Dai, Liyuan Jiang, Jianbo Duan, Wei Liu

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Content-Based Remote Sensing Image Retrieval; Change Information Detection; Information Management; Remote Sensing Data Service

Online: 29 August 2017 (16:18:20 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.1296.v1

Causal Meta-Reinforcement Learning for Multimodal Remote Sensing Data Classification

Wei Zhang, Xuesong Wang, Haoyu Wang, Yuhu Cheng

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Multimodal data; remote sensing; reinforcement learning; meta-learning; causal learning

Online: 22 February 2024 (15:30:22 CET)

Preprint ARTICLE | doi:10.20944/preprints201608.0232.v2

ODK Scan: Digitizing Data Collection and Impacting Data Management Processes in the Tuberculosis Control Program of Pakistan

Syed Mustafa Ali, Rachel Powers, Jeffrey Beorse, Farah Naureen, Arif Noor, Naveed Anjum, Muhammad Ishaq, Javariya Aamir, Richard Anderson

Subject: Medicine And Pharmacology, Pulmonary And Respiratory Medicine Keywords: mHealth; ODK scan; mobile health application; digitizing data collection; data management processes; paper-to-digital system; technology-assisted data management; treatment adherence

Online: 2 September 2016 (03:17:38 CEST)

Preprint REVIEW | doi:10.20944/preprints202403.0161.v1

4IR Applications in the Transport Industry: Systematic Review of the State of the Art with Respect to Data Collection and Processing Mechanisms

O.O. Ajayi, A.M. Kurien, K. Djouani, L. Dieng

Subject: Engineering, Transportation Science And Technology Keywords: transportation systems; systematic review; industrial revolution; data collection; data processing

Online: 6 March 2024 (04:30:45 CET)

Preprint ARTICLE | doi:10.20944/preprints201611.0010.v1

Attenuation Correction for Ka-band Cloud Radar Using X-band Weather Radar Data

Peng Zhang, Yunjie Chen

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: millimeter-wavelength cloud radar; attenuation correction; dual-radar; data fusion

Online: 1 November 2016 (10:05:18 CET)

Preprint ARTICLE | doi:10.20944/preprints201610.0012.v1

Bio-Resource Exchange: Study of Prevalence of Antibody Donation and Development of a Web Portal to Facilitate it

Sandeep Subramanian, Madhavi Ganapathiraju

Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: data exchange; resource donations; text mining

Online: 5 October 2016 (15:08:32 CEST)

Preprint REVIEW | doi:10.20944/preprints201607.0075.v1

Travel mode detection based on GPS raw data collected by smartphones: a systematic review of the existing methodologies

Linlin Wu, Biao Yang, Peng Jing

Subject: Social Sciences, Behavior Sciences Keywords: travel mode detection; GPS raw data; smartphones

Online: 25 July 2016 (06:34:26 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.0447.v1

Research on the Detection Method of Organic Matter in Tea Garden Soil based on Image Information and Hyperspectral Data Fusion

Haowen Zhang, Chongshan Yang, Min Lu, Zhongyuan Liu, Xiaojia Zhang, Chunwang Dong

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Hyperspectral; machine visualization properties; data fusion; tea plantation soils; organic matter

Online: 8 November 2023 (01:33:37 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.1313.v1

Hearables: In-ear Multimodal Data Fusion for Robust Heart Rate Estimation

Marek Zylinski, Amir Nassibi, Edoardo Occhipinti, Adil Malik, Matteo Bermond, Harry J. Davies, Danilo P. Mandic

Subject: Engineering, Bioengineering Keywords: Data fusion method; Heart rate tracing; Hearables; Wearables

Online: 22 February 2024 (12:21:03 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0169.v1

Visualising Daily PM10 Pollution in an Open-Cut Mining Valley of New South Wales, Australia - Part II: Classification of Synoptic Circulation Types and Local Meteorological Patterns and Their Relation to Elevated Air Pollution in Spring and Summer

Ningbo Jiang, Matthew Riley, Merched Azzi, Giovanni Di Virgilio, Hiep Nguyen Duc, Praveen Puppala

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: PM10 pollution; local meteorological pattern; synoptic circulation type; self-organising map (SOM); air pollution conduciveness; data clustering; data visualisation; open-cut mining valley

Online: 2 April 2024 (07:42:50 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.0227.v1

Industrialization in Construction Companies – a Benchmark Study on Manufacturing Companies

Solmaz Mansoori, Janne Härkönen, Harri Haapasalo, Petteri Annunen

Subject: Engineering, Architecture, Building And Construction Keywords: predefined products; predefined processes; data management; industrialized construction

Online: 3 April 2024 (13:15:17 CEST)

Working Paper ARTICLE

Design Of a Dynamic and Self-Adapting System, Supported With Artificial Intelligence, Machine Learning and Real-Time Intelligence For Predictive Cyber Risk Analytics in Extreme Environments- Cyber Risk in the Colonisation of Mars

Petar Radanliev

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Artificial intelligence; machine learning; real-time probabilistic data; for cyber risk; super forecasting; red teaming;

Online: 12 April 2021 (12:18:14 CEST)

Preprint ARTICLE | doi:10.20944/preprints201612.0091.v2

The China Meteorological Assimilation Driving Datasets for the SWAT Model (CMADS) Application in China: A Case Study in Heihe River Basin

Xian-yong Meng, Hao Wang, Si-yu Cai, Xue-song Zhang, Guo-yong Leng, Xiao-hui Lei, Chun-xiang Shi, Shi-yin Liu

Subject: Environmental And Earth Sciences, Geophysics And Geology Keywords: reanalysis climate data; hydrologic modeling; comparative analysis

Online: 3 February 2017 (03:50:07 CET)

Large-scale hydrological modeling in China is challenging given the sparse meteorological stations and large uncertainties associated with atmospheric forcing data. Here we introduce the development and use of the China Meteorological Assimilation Driving Datasets for the SWAT model (CMADS) in the Heihe River Basin (HRB) for improving hydrologic modeling, by leveraging the datasets from the China Meteorological Administration Land Data Assimilation System (CLDAS), including climate data from nearly 40000 area encryption stations, 2700 national automatic weather stations, the FengYun (FY) 2 satellite, and radar stations. CMADS uses the Space Time Multiscale Analysis System (STMAS) to fuse data based on the ECMWF ambient field and to ensure data accuracy. In addition, compared with CLDAS, CMADS includes relative humidity and climate data of varied resolutions to drive hydrological models such as the Soil and Water Assessment Tool (SWAT) model. Here, we compared climate data from CMADS, Climate Forecast System Reanalysis (CFSR), and traditional weather station (TWS) climate forcing data and evaluated their applicability for driving large-scale hydrologic modeling with SWAT. In general, CMADS has higher accuracy than CFSR when evaluated against observations at TWS; CMADS also provides a spatially continuous climate field to drive distributed hydrologic models, which is an important advantage over TWS climate data, particularly in regions with sparse weather stations. Therefore, SWAT model simulations driven with CMADS and TWS achieved similar performance in terms of monthly and daily stream flow simulations, and both outperformed CFSR. For example, at the three hydrological stations (Ying Luoxia, Qilian Mountain, and ZhaMasheke) in the HRB, the monthly and daily Nash-Sutcliffe efficiencies are in the ranges 0.75-0.95 and 0.58-0.78, respectively, which are much higher than the corresponding efficiency statistics achieved with CFSR (monthly: 0.32-0.49 and daily: 0.26-0.45). The CMADS dataset is available free of charge and is expected to be a valuable addition to the existing climate reanalysis datasets for driving distributed hydrologic modeling in China and other countries in East Asia.
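
The abstract reports model skill as Nash-Sutcliffe efficiency (NSE) without defining it. As a minimal sketch, assuming the standard definition (one minus the ratio of the squared simulation error to the variance of the observations about their mean), the statistic can be computed as follows. The function name and the sample flow values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def nash_sutcliffe_efficiency(observed: Sequence[float],
                              simulated: Sequence[float]) -> float:
    """Standard NSE: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Squared error between observations and the simulation.
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Spread of the observations about their own mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are identical")
    return 1.0 - error / spread


# Hypothetical monthly stream flow values, for illustration only.
observed_flow = [1.0, 2.0, 3.0]
perfect_simulation = [1.0, 2.0, 3.0]
mean_only_simulation = [2.0, 2.0, 2.0]

nse_perfect = nash_sutcliffe_efficiency(observed_flow, perfect_simulation)
nse_mean_only = nash_sutcliffe_efficiency(observed_flow, mean_only_simulation)
```

Under this definition a value of 1 indicates a perfect match, and a value of 0 means the simulation is no better than predicting the observed mean, which is the scale on which the reported ranges should be read.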

Preprint ARTICLE | doi:10.20944/preprints201607.0047.v1

Graphical Diagnostic for Mortality Data Modeling

M.L. Gamiz, M.D. Martinez-Miranda, R. Raya-Miranda

Subject: Business, Economics And Management, Econometrics And Statistics Keywords: SAINT model; SiZer; local linear fitting; mortality data

Online: 18 July 2016 (10:35:40 CEST)

Preprint ARTICLE | doi:10.20944/preprints202304.1077.v1

Multimodal Deep Learning Methods to Predict Radiotherapy Structure Names using Image and Textual Data from DICOM Files

Priyankar Bose, Pratip Rana, William C. Sleeman IV, Sriram Srinivasan, Rishabh Kapoor, Jatinder Palta, Preetam Ghosh

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Multimodal Data Integration; Radiotherapy Standard Name mapping; Radiation Oncology; Machine Learning; Deep Learning; TG-263 Names

Online: 27 April 2023 (11:01:18 CEST)

Physicians often label anatomical structure sets in Digital Imaging and Communications in Medicine (DICOM) images with nonstandard names. As these names vary widely, standardizing the nonstandard names of the Organs at Risk (OARs), Planning Target Volumes (PTVs), and 'Other' organs inside the area of interest is a vital problem. Prior works applied traditional machine learning approaches to structure sets with moderate success. This paper presents deep learning methods for structure sets that integrate multimodal data compiled from the radiotherapy centers administered by the US Veterans Health Administration (VHA) and the Department of Radiation Oncology at Virginia Commonwealth University (VCU). The de-identified radiation oncology data collected from the VHA and VCU radiotherapy centers comprise 16,290 prostate structures. Our method integrates the heterogeneous (textual and imaging) multimodal data with Convolutional Neural Network (CNN)-based deep learning approaches such as CNN, the Visual Geometry Group (VGG) network, and the Residual Network (ResNet). Our model presents improved results in prostate (RT) structure name standardization. Evaluation of our methods with the macro-averaged F1 score shows that our deep learning model with single-modal textual data usually performs better than those in previous studies. We also experimented with various combinations of multimodal data (masked images, masked dose) besides textual data. The models perform well on the textual data alone, while the addition of imaging data shows that deep neural networks can achieve improved performance using information present in the other modalities. Additionally, using masked images and masked doses along with text leads to better overall performance with the various CNN-based architectures than using all the modalities together. Undersampling the majority class leads to further performance enhancement. The VGG network on the masked image-dose data combined with CNNs on the text data performs the best and establishes the state of the art in this domain.
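
The abstract evaluates its models with the macro-averaged F1 score. As a minimal sketch, assuming the usual definition (the per-class F1 score averaged with equal weight over every class that appears in the true or predicted labels), the metric can be computed as below. The function name and the structure labels in the example are hypothetical and do not come from the paper's data.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either sequence."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    labels = set(y_true) | set(y_pred)
    per_class = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive for every
        # label in the union, but the guard keeps the expression safe.
        per_class.append(2 * tp / denominator if denominator else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical structure names, for illustration only.
true_names = ["Bladder", "Bladder", "Rectum", "PTV"]
predicted_names = ["Bladder", "Rectum", "Rectum", "PTV"]
score = macro_f1(true_names, predicted_names)
```

Because every class contributes equally regardless of how many samples it has, macro averaging is sensitive to performance on rare classes, which is relevant when the majority class is undersampled as the abstract describes.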

Preprint ARTICLE | doi:10.20944/preprints202401.2038.v1

Fractal Dimension of the Generalized Z-Entropy of The Rényian Formalism of Stable Queue with Some Potential Applications of Fractal Dimension to Big Data Analytics

Dr Ismail A Mageed

Subject: Computer Science And Mathematics, Geometry And Topology Keywords: Fractal Dimension (D), Generalized Z-Entropy, Google Earth satellite (GEs), GNU Image Manipulation, Big Data Analytics (BDAs).

Online: 29 January 2024 (14:39:18 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0594.v1

A Deep Learning Classification Scheme for PolSAR Image Based on Polarimetric Features

Shuaiying Zhang, Lizhen Cui, Zhen Dong, Wentao An

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Polarimetric Synthetic Aperture Radar (PolSAR); Reflection Symmetric Decomposition (RSD); Data Input Scheme; Land Classification; Polarimetric Scattering Characteristics.

Online: 9 April 2024 (00:29:24 CEST)

Preprint REVIEW | doi:10.20944/preprints202403.1470.v1

Enhancing Understanding Through Data Visualization: What Can Available Data Reveal About Access to Energy in Displacement Contexts on the African Continent?

Tim Ronan Britton, Philipp Baslik, Lena Anna Schmid, Boris Heinz

Subject: Social Sciences, Geography, Planning And Development Keywords: access to energy; displacement contexts; humanitarian settings; data assessment; data visualization; electricity; cooking

Online: 25 March 2024 (11:25:05 CET)

The extent of access to energy of displaced populations in settlements and camps in Africa is largely unknown, given that 94% of displaced persons lack access to electricity and 81% rely on biomass for cooking. A multitude of contextual factors, such as the location and characteristics of housing, legal status, socio-cultural background, and the availability of humanitarian and public services, affect living conditions and the energy services needed. Limitations in accessing energy services have direct, multilayered, and far-reaching implications, including impacts on health, nutrition, education, protection, and livelihood. The objective of this article is to contribute to a more comprehensive understanding of the current state of energy access in displacement contexts on the African continent by identifying and utilizing existing data. After screening the vast and varied available information, setting up a database, consolidating the gathered data, and assessing its quality through a quality assessment method, the currently available information is visualized and discussed. Remarkable differences in access to electricity for displaced persons across countries are found. For both electricity and clean cooking, availability for displaced persons ranges from nearly no access at all up to an access rate of 100%. More strikingly, the results also show that, besides South Africa and the selected countries in the Maghreb region, access to both clean cooking and electricity for displaced persons is remarkably low. At the same time, the poor data quality supports neither solid conclusions nor impactful implementation activities. Novel conceptual frameworks and indicators are needed. Future research needs to focus on a more comprehensive understanding of how energy is interwoven in the lives of displaced persons before a set of energy indicators can be derived. It is essential that the persons concerned, the displaced persons themselves, are included in the research in a meaningful way.

Preprint REVIEW | doi:10.20944/preprints202402.1493.v1

Real-World Data and Evidence in Lung Cancer: A Review of Recent Developments

Eleni Kokkotou, Maximilian Anagnostakis, Georgios Evangelou, Nikolaos K Syrigos, Ioannis Gkiozos

Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: oncology; real-world data; real-world evidence; epidemiology; safety; efficacy; artificial intelligence; machine learning; data quality; lung cancer

Online: 27 February 2024 (08:04:33 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.0411.v1

Methodology for identifying natural phenomena that form large floods using satellite data

Eugeniy Savchenko, Sergey Maklakov

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: floods, climate study, Meteosat satellites, data visualization, remote sensing, satellite sensing.

Online: 7 March 2024 (07:50:58 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0038.v1

Weakly Supervised Semantic Segmentation of Point Cloud Scenes using Boundary-based Feature Aggregation

Yongwei Miao, Guoxiang Ren, Xudong Zhang, Haijian Liu, Fuchang Liu

Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Point cloud; Semantic segmentation; Weakly supervised learning; Boundary feature aggregation; Data augmentation

Online: 1 April 2024 (11:12:43 CEST)

Preprint ARTICLE | doi:10.20944/preprints201609.0088.v1

Multi-range Conditional Random Field for Classifying Railway Electrification System Objects

Jaewook Jung, Leihan Chen, Gunho Sohn, Chao Luo, Jong-Un Won

Subject: Engineering, Civil Engineering Keywords: classification; railway; power line; mobile laser scanning data; conditional random field; layout compatibility

Online: 26 September 2016 (09:33:05 CEST)

Railways are among the most crucial means of transportation for public mobility and economic development. For efficient railway operation, the electrification system, which supplies electric power to trains, is an essential facility for stable train operation. Because of its important role, the electrification system needs to be rigorously and regularly inspected and managed. This paper presents a supervised learning method to classify Mobile Laser Scanning (MLS) data into ten target classes representing overhead wires, movable brackets, and poles, which are recognized as key objects in the electrification system. In general, the layout of a railway electrification system shows strong regularity in the spatial relations among object classes. The proposed classifier is based on a Conditional Random Field (CRF), which characterizes not only labeling homogeneity at short range but also the layout compatibility between different object classes at long range in a probabilistic graphical model. This multi-range CRF model consists of a unary term and three pairwise contextual terms. To gain computational efficiency, the MLS point clouds are converted into a set of line segments to which the labeling process is applied. A Support Vector Machine (SVM) is used as a local classifier that considers only node features to produce the unary potentials of the CRF model. As the short-range pairwise contextual term, the Potts model is applied to enforce local smoothness in the short-range graph. The long-range pairwise potentials, in contrast, are designed to enhance the spatial regularities of both horizontal and vertical layouts among railway objects. We formulate two long-range pairwise potentials as the log posterior probability obtained by a Naïve Bayes classifier. The directional layout compatibilities are characterized in probability look-up tables, which represent the co-occurrence rate of spatial relations in the horizontal and vertical directions. The likelihood function is formulated by multivariate Gaussian distributions. In the proposed multi-range CRF model, the weight parameters that balance the four sub-terms are estimated by Stochastic Gradient Descent (SGD). The results show that the proposed multi-range CRF can effectively classify detailed railway elements, achieving an average recall of 97.66% and an average precision of 97.07% across all classes.
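
As a rough illustration of how a unary term, a Potts short-range term, and look-up-table long-range terms can be combined into one labeling score, the sketch below evaluates a toy energy for a handful of line segments. It is not the authors' implementation: the function name `crf_energy`, the class names, the edge lists, and all cost values are hypothetical, and the costs are assumed to be non-negative penalties (for example negative log-probabilities), so a lower energy means a more plausible labeling.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]
Table = Dict[Tuple[str, str], float]


def crf_energy(
    labels: List[str],
    unary: List[Dict[str, float]],
    short_edges: List[Edge],
    long_terms: List[Tuple[List[Edge], Table, float]],
    w_short: float = 1.0,
) -> float:
    """Energy of one labeling: unary + Potts + weighted look-up-table terms."""
    # Unary term: per-segment cost of the chosen label (from a local classifier).
    energy = sum(unary[i][lab] for i, lab in enumerate(labels))
    # Short-range Potts term: constant penalty when neighbouring labels differ.
    energy += w_short * sum(1.0 for i, j in short_edges if labels[i] != labels[j])
    # Long-range terms: cost read from a class-pair table; missing pairs cost 0.
    for edges, table, weight in long_terms:
        energy += weight * sum(
            table.get((labels[i], labels[j]), 0.0) for i, j in edges
        )
    return energy


# Hypothetical toy data: three segments, two classes.
unary = [
    {"wire": 0.1, "pole": 2.0},
    {"wire": 0.2, "pole": 1.5},
    {"wire": 3.0, "pole": 0.3},
]
short_edges = [(0, 1), (1, 2)]
horizontal = ([(0, 2)], {("wire", "wire"): 1.0}, 1.0)
vertical = ([(1, 2)], {("wire", "wire"): 0.5, ("wire", "pole"): 0.0}, 1.0)

candidates = [["wire", "wire", "pole"], ["wire", "wire", "wire"]]
best = min(
    candidates,
    key=lambda lab: crf_energy(lab, unary, short_edges, [horizontal, vertical]),
)
print(best)
```

In this toy setup the mixed labeling pays one Potts penalty but avoids the high unary cost and the layout penalties of labeling the third segment as a wire, so it has the lower energy. A real system would search over all labelings with an inference algorithm rather than compare two candidates.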

Preprint ARTICLE | doi:10.20944/preprints202306.0123.v1

Generalizability of a Random Forest-Based Model of Maize Lodging Built with Satellite Image Data and Its Application to Monitoring and Evaluating Maize Lodging Risks

Huirong Guo, Bo Ming, Chenwei Nie, Guoqiang Zhang, Hongye Yang, Shang Gao, Beibei Xue, Jiangfeng Xin, Dayun Feng, Biao Jia, Peng Hou, Jun Xue, Ruizhi Xie, Keru Wang, Shaokun Li

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Sentinel-2 multispectral data; Maize lodging; Random Forest classification; Predictive variables; Model generalizability

Online: 2 June 2023 (04:08:42 CEST)

Lodging is a common problem in maize production that seriously impacts yield, quality, and the capacity for mechanical harvesting. Evaluation of site-specific lodging risks requires establishment of a method for multi-year monitoring. In this study, spectral images collected by the Sentinel-2 satellite were processed to obtain three types of data: gray-level co-occurrence matrix (GLCM) texture, vegetation indices (VIs), and spectral reflectance (SR). Lodging classification models were then established with Random Forest (RF) using each of the three data types separately (the GLCM, VI, and SR models) and in combination (the SR+VI, SR+GLCM, VI+GLCM, and SR+VI+GLCM models). By gradually removing features with low importance scores from the SR+VI+GLCM model and analyzing the changes in overall accuracy (OA), the optimal set of predictive variables was identified and used to construct the optimal model. A model built using data from a single timepoint in 2021 was tested on data collected at a similar timepoint in 2019, and vice versa, to assess interannual model generalizability. The results demonstrate that, among models constructed with a single feature type, the GLCM model had significantly lower accuracy in monitoring maize lodging than the VI and SR models. During certain growth stages, the model constructed with combined features had significantly higher accuracy than models constructed with a single feature type. During the selection of the optimal predictive variables, it was found that model accuracy did not increase as the number of predictive variables increased. The positive and negative validation models had accuracies of 96.55% and 95.18%, with kappa values of 0.93 and 0.83, respectively. This indicates that the model generalizes well for the same reproductive stage between years. This study provides a detailed method for large-scale maize lodging monitoring, allowing for identification of optimal planting practices to reduce the probability of lodging and ultimately improve regional maize yield and quality.
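
The abstract reports overall accuracy (OA) and kappa values. As a minimal sketch of how both metrics are commonly derived from a confusion matrix, the example below uses the standard definitions of OA and Cohen's kappa. The function names and the 2x2 matrix are hypothetical illustrations and are not data from the study.

```python
from typing import List


def overall_accuracy(cm: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm: List[List[int]]) -> float:
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    # Chance agreement from the row (reference) and column (predicted) totals.
    expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical matrix: rows = reference (lodged, not lodged), columns = predicted.
cm = [[45, 5], [10, 40]]
print(round(overall_accuracy(cm), 2), round(cohen_kappa(cm), 2))
```

For this made-up matrix, 85 of 100 samples are classified correctly and chance agreement is 0.5, so OA is 0.85 and kappa is 0.70. Kappa is lower than OA because it discounts the agreement expected by chance.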

Preprint ARTICLE | doi:10.20944/preprints202402.1556.v1

Empirical Analysis of the Current Status and Potential of Service-oriented and Data-driven Business Models within the Sheet Metal Working Sector: Insights from Interview-based Research in SMEs

Jonas Wirth, Mirko Schneider, Leon Hanselmann, Kira Fink, Stephan Nebauer, Thomas Bauernhansl

Subject: Business, Economics And Management, Business And Management Keywords: service-oriented business models; data-driven business models; servitization; digital transformation; ecosystem innovation, SME

Online: 27 February 2024 (14:06:12 CET)

Preprint ARTICLE | doi:10.20944/preprints201610.0067.v1

Point Information Gain and Multidimensional Data Analysis

Renata Rychtáriková, Jan Korbel, Petr Macháček, Petr Císař, Jan Urban, Dalibor Štys

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: point information gain; Rényi entropy; data processing

Online: 17 October 2016 (11:35:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.0846.v1

Identifying the Impact Factors on the Land Market in Nepal from Land Use Regulation

Nab Raj Subedi, Kevin McDougall, Dev Raj Paudyal

Subject: Social Sciences, Urban Studies And Planning Keywords: land use regulation, land market, stakeholders, qualitative data analysis, impact factors.

Online: 12 April 2024 (10:25:56 CEST)

Preprint ARTICLE | doi:10.20944/preprints201612.0079.v1

A Broad-Area Method for Estimation of Upwelling Medium Wave Infrared for Fire Detection

Bryan Hally, Luke Wallace, Karin Reinke, Simon Jones

Subject: Environmental And Earth Sciences, Environmental Science Keywords: fire detection; upwelling radiation; diurnal variation; training data; geostationary sensors

Online: 15 December 2016 (09:22:10 CET)

Preprint ARTICLE | doi:10.20944/preprints201710.0166.v2

Using the Quantization Error from Self-Organizing Map (SOM) Output for Fast Detection of Critical Variations in Image Time Series

Birgitta Dresp-Langley, John Mwangi Wandeto, Henry Okola Nyongesa

Subject: Computer Science And Mathematics, Information Systems Keywords: satellite images; image analysis; self organizing maps; quantization error; structural change; demographic data

Online: 20 March 2018 (10:38:43 CET)

Preprint ARTICLE | doi:10.20944/preprints201608.0123.v1

Time Domain Strain/Stress Reconstruction Based on Empirical Mode Decomposition: Numerical Study and Experimental Validation

Jingjing He, Yibin Zhou, Xuefei Guan, Wei Zhang, Wei Fang Zhang, Yongming Liu

Subject: Engineering, Civil Engineering Keywords: limited sensor data; structural health monitoring; strain/stress response reconstruction; empirical mode decomposition

Online: 11 August 2016 (11:06:16 CEST)

Preprint ARTICLE | doi:10.20944/preprints201611.0110.v1

Impacts of Capital Structure on Performance of Banks in a Developing Economy: Evidence from Bangladesh

Md. Nur Alam Siddik, Sajal Kabiraj, Shanmugan Joghee

Subject: Business, Economics And Management, Finance Keywords: capital structure; firm’s performance; panel data; unit root analysis; Bangladesh

Online: 22 November 2016 (09:36:36 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.1446.v1

Hardware in the Loop Simulation of Gas Turbines: How Real Time Control System Design Tools Can Be Exploited Also to Generate Fault Cases to Train and Tune Data Based Diagnostic Systems.

Attilio Brighenti, Chiara Brighenti

Subject: Engineering, Mechanical Engineering Keywords: HIL model(s); Dynamic simulation(s); Data-based modelling; Predictive diagnostics; Fault detection and isolation (FDI)

Online: 25 March 2024 (08:30:18 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.1198.v2

Monitoring the Wear Trend in Wind Turbines by Tracking the Fourier Vibration Spectrum and Base Density Support Vector Machine

Claudiu Bisu, Adrian Olaru, Serban Olaru, Adrian Alexei, Niculae Mihai, Haleema Ushaq

Subject: Engineering, Control And Systems Engineering Keywords: wind turbine; monitoring; wear trend; Fourier vibration spectrum; support vector machine; base density of the collected data; machine learning.

Online: 9 April 2024 (10:23:59 CEST)

Preprint ARTICLE | doi:10.20944/preprints201703.0058.v1

H-Rec²: A Novel Mobile-Social System for Automatic Health Recognition and Recommendation

Huan Li, Kejie Lu, Qi Zhang

Subject: Computer Science And Mathematics, Computer Science Keywords: Smartphone sensing; mobile-social integration; automatic recognition; social data; long-term health monitoring

Online: 10 March 2017 (17:32:31 CET)

Preprint ARTICLE | doi:10.20944/preprints201702.0074.v1

v-Mapper: An Application-Aware Resource Consolidation Scheme for Cloud Data Centres

Aaqif Afzaal Abbasi, Hai Jin

Subject: Computer Science And Mathematics, Information Systems Keywords: network; systems; cloud computing; data centre; performance; software-defined; virtual machine; scheduling; admission control; application-aware;

Online: 20 February 2017 (04:56:24 CET)

Preprint ARTICLE | doi:10.20944/preprints201608.0204.v1

A Grey Forecasting Approach for the Sustainability Performance of Logistics Companies

Min-Chun Yu, Chia-Nan Wang, Nguyen-Nhu-Y Ho

Subject: Business, Economics And Management, Economics Keywords: logistics industry; sustainability; data envelopment analysis (DEA); grey forecasting

Online: 25 August 2016 (10:12:27 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.1130.v1

Bibliography Analysis on Bioremediation on Heavy Metal Pollution

Yuanzhao Ding

Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Heavy metal pollution; Bioremediation; Bibliographic method; Big data; Machine learning

Online: 20 February 2024 (11:51:15 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.1724.v1

Data Loss Prevention Method Based on Multiprotocol Connectivity for IoT

Hamza Takrouni, Larbi Talbi, Youcef Fouzar

Subject: Engineering, Telecommunications Keywords: Internet of Things, Cloud Computing, Edge Computing, Big Data, IoT Communications Protocols

Online: 29 February 2024 (11:06:12 CET)

Preprint REVIEW | doi:10.20944/preprints202403.1485.v1

Review of Fourth-Order Maximum Entropy Based Predictive Modelling and Illustrative Application to a Nuclear Reactor Benchmark. I. Typical High-Order Sensitivity and Uncertainty Analysis

Dan Gabriel Cacuci, Ruixian Fang

Subject: Physical Sciences, Applied Physics Keywords: predictive modeling; sensitivity analysis; uncertainty quantification; data assimilation; model calibration; reducing predicted uncertainties

Online: 25 March 2024 (13:31:57 CET)

This work (in two parts) will review the recently developed predictive modeling methodology called “4th-BERRU-PM” and its applicability to energy systems as exemplified by an illustrative application to the Polyethylene-Reflected Plutonium (acronym: PERP) OECD/NEA reactor physics benchmark. The acronym 4th-BERRU-PM designates the “Fourth-Order Best-Estimate Results with Reduced Uncertainties Predictive Modeling” methodology, which yields best-estimate results with reduced uncertainties for the first four moments (mean values, covariance, skewness, and kurtosis) of the optimally predicted posterior distribution of model results and calibrated model parameters. The 4th-BERRU-PM uses the Maximum Entropy (MaxEnt) principle to incorporate fourth-order experimental and computational information, including fourth (and higher) order sensitivities of computed model responses to model parameters, thus incorporating, as particular cases, the results previously predicted by the second-order predictive modeling methodology 2nd-BERRU-PM, and vastly generalizing the results produced by extant data assimilation and data adjustment procedures. The 4th-BERRU-PM methodology encompasses the scopes of high-order sensitivity analysis (SA), uncertainty quantification (UQ), data assimilation (DA) and model calibration (MC). The application of the 4th-BERRU-PM methodology to energy systems is illustrated by means of the above-mentioned OECD/NEA reactor physics benchmark, which is modeled using the neutron transport Boltzmann equation involving 21976 imprecisely known parameters, the solution of which is representative of “large-scale computations.” The model result (“response”) of interest is the leakage of neutrons through the outer surface of this spherical benchmark, which can be computed numerically and measured experimentally. Part 1 of this work illustrates the impact of high-order sensitivities, in conjunction with parameter standard deviations of various magnitudes, on the determination of the expected value and variance of the computed response in terms of the first four moments of the distribution of the uncertain model parameters. Part 2 of this work will illustrate the capabilities of the 4th-BERRU-PM methodology for combining computational and experimental information, up to and including fourth-order sensitivities and distributional moments, for producing best-estimate values for the predicted responses and model parameters while reducing their accompanying uncertainties.
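
To give a feel for how sensitivities and parameter standard deviations enter the expected value and variance of a computed response, the sketch below applies textbook Taylor-series moment propagation. It is deliberately much simpler than the 4th-BERRU-PM methodology and is not taken from it: it assumes uncorrelated parameters, keeps only unmixed second-order sensitivities in the mean and first-order sensitivities in the variance, and uses a hypothetical function name (`propagate_moments`) and a hypothetical toy response.

```python
from typing import Sequence, Tuple


def propagate_moments(
    r0: float,
    first: Sequence[float],
    second_diag: Sequence[float],
    sigmas: Sequence[float],
) -> Tuple[float, float]:
    """Approximate mean and variance of a response R(alpha).

    r0          : response at the nominal parameter values
    first       : first-order sensitivities dR/d(alpha_i)
    second_diag : unmixed second-order sensitivities d2R/d(alpha_i)^2
    sigmas      : parameter standard deviations (parameters assumed uncorrelated)
    """
    # Mean: nominal value plus the second-order correction 0.5 * S_ii * sigma_i^2.
    mean = r0 + 0.5 * sum(s2 * sd * sd for s2, sd in zip(second_diag, sigmas))
    # Variance: first-order ("sandwich") contribution sum of (S_i * sigma_i)^2.
    variance = sum((s1 * sd) ** 2 for s1, sd in zip(first, sigmas))
    return mean, variance


# Hypothetical toy response R(a, b) = a**2 + 3*b evaluated at a = 2, b = 1.
mean, variance = propagate_moments(
    r0=7.0,
    first=[4.0, 3.0],
    second_diag=[2.0, 0.0],
    sigmas=[0.1, 0.2],
)
print(round(mean, 4), round(variance, 4))
```

With these inputs the second-order term shifts the mean from the nominal 7.0 to about 7.01, and the first-order variance is about 0.52. Higher-order sensitivities and higher parameter moments, which the reviewed methodology accounts for, would add further correction terms that this sketch omits.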

Preprint REVIEW | doi:10.20944/preprints202404.0569.v1

A Review: Tree Species Classification Based on Remote Sensing Data and Classic Deep Learning-based Methods

Lihui Zhong, Zhengquan Dai, Panfei Fang, Yong Cao, Leiguang Wang

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: remote sensing; tree species classification; unimodal and multimodal remote sensing data; classic deep learning-based methods

Online: 8 April 2024 (15:05:04 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.1018.v1

Discovering Data Domains and Products in Data Meshes Using Semantic Blueprints

Michalis Pingos, Andreas S. Andreou

Subject: Computer Science And Mathematics, Computer Science Keywords: Big Data; Data Lakes; Data Meshes; Data Products; Data Blueprints; Metadata Semantic Enrichment

Online: 16 April 2024 (16:26:06 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.0386.v1

Big Data Privacy Protection and Security Provisions of Healthcare SecPri-BGMPOP Method in Cloud Environment

Moorthi K, Jothi Prabha Appadurai, Balasubramanian Prabhu Kavin, Jeeva Selvaraj, Hong-Seng Gan, Wen-Cheng Lai

Subject: Computer Science And Mathematics, Security Systems Keywords: Big Data; Security; Privacy; Boost Graph convolutional network clustering algorithm; Magnify Pinpointing based encryption approach; Hybrid Particle swarm; Grey wolf optimization

Online: 4 April 2024 (14:32:08 CEST)

Preprint ARTICLE | doi:10.20944/preprints201703.0028.v1

A Segment-Based Trajectory Similarity Measure in the Urban Transportation Systems

Yingchi Mao, Haishi Zhong, Xianjian Xiao, Xiaofang Li

Subject: Computer Science And Mathematics, Information Systems Keywords: GPS trajectory; GPS sensor; trajectory similarity measure; spatial-temporal data

Online: 6 March 2017 (06:51:37 CET)

Preprint ARTICLE | doi:10.20944/preprints202003.0268.v1

TEEDA: An Interactive Platform for Matching Data Providers and Users in Data Marketplace

Teruaki Hayashi, Yukio Ohsawa

Subject: Social Sciences, Library And Information Sciences Keywords: matching; data marketplace; data platform; data visualization; call for data

Online: 17 March 2020 (04:10:28 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.0265.v1

Security and Ownership in User Defined Data Meshes

Michalis Pingos, Panayiotis Christodoulou, Andreas S. Andreou

Subject: Computer Science And Mathematics, Computer Science Keywords: Big Data; Smart Data Processing; Systems of Deep Insight; Data Meshes; Data Lakes; Data Products; Blockchain; NFT; Data Blueprints

Online: 5 March 2024 (15:04:49 CET)

Preprint CONFERENCE PAPER | doi:10.20944/preprints201612.0011.v1

Examining the Spatio-temporal Dynamics of PM2.5 in Saudi Arabia Using Satellite-derived Data: A Cluster Study

Yusuf Aina, Elhadi Adam, Fethi Ahmed

Subject: Environmental And Earth Sciences, Environmental Science Keywords: satellite data; fine particulate matter; air pollution; geographic information system; health risks; spatial analysis; Saudi Arabia

Online: 1 December 2016 (15:25:56 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0794.v1

Electronic Nose and GC-MS Analysis to Detect Mango Twig Tip Dieback in Mango (Mangifera indica) and Panama Disease (TR4) in Banana (Musa acuminata)

Wathsala Ratnayake, Stanley E. Bellgard, Hao Wang, Vinuthaa Murthy

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: electronic nose; headspace GC-MS analysis; Linear Discriminant Analysis (LDA); Principal Component Analysis (PCA); Volatile Organic Compounds (VOC); Chemometric Data Analysis (CDA); Panama Disease (TR4); Mango Twig Tip Dieback (MTTD)

Online: 11 April 2024 (11:13:59 CEST)

Preprint ARTICLE | doi:10.20944/preprints201608.0040.v1

Prevalence, Characterization and Mycotoxin Production Ability of Fusarium species on Korean Adlay (Coix lacryma-jobi L.) Seeds

Tae Jin An, Kyu Seop Shin, Narayan Chandra Paul, Young Guk Kim, Seon Woo Cha, Yu Seok Moon, Seung Hun Yu, Sang-Keun Oh

Subject: Biology And Life Sciences, Immunology And Microbiology Keywords: seeds; ELISA; Fusarium; morphological data analysis; mycotoxins; phylogenetic analysis S

Online: 4 August 2016 (10:12:54 CEST)

Preprint ARTICLE | doi:10.20944/preprints202301.0162.v1

MR-Class: A Python Tool for Brain MR Image Classification Utilizing One-Vs-All DCNNs to Deal With the Open-Set Recognition Problem

Patrick Salome, Francesco Sforazzini, Gianluca Grugnara, Andreas Kudak, Matthias Dostal, Christel Herold-Mende, Sabine Heiland, Jürgen Debus, Amir Abdollahi, Maximilian Knoll

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Content-based image classification; Data curation and preparation; Convolutional neural networks (CNN); Deep learning; Artificial intelligence (AI)

Online: 9 January 2023 (10:59:31 CET)

Preprint ARTICLE | doi:10.20944/preprints201612.0002.v1

Change Point Estimation in Panel Data without Boundary Issue

Barbora Peštová, Michal Pešta

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: change point; estimation; consistency; panel data; short panels; boundary issue; structural change; bootstrap; non-life insurance; change in claim amounts

Online: 1 December 2016 (10:02:03 CET)

Preprint REVIEW | doi:10.20944/preprints202402.0424.v1

Forest Supply Chains during Digitalization: Current Implementations and Prospects in Near Future

Teijo Palander, Stelian Alexandru Borz, Timo Tokola, Peter Rauch

Subject: Computer Science And Mathematics, Information Systems Keywords: Data-driven optimization; Dynamics; ERP; Logistics; Simulation; Technology

Online: 7 February 2024 (07:55:41 CET)

Working Paper ARTICLE

Optical Data Transmission beyond 40Tb/s with a Soliton Crystal Micro-Comb

Bill Corcoran, Mengxi Tan, Xingyuan Xu, Andreas Boes, Jiayang Wu, Thach G. Nguyen, Sai T. Chu, Brent E. Little, Roberto Morandotti, Arnan Mitchell, David J. Moss

Subject: Engineering, Electrical And Electronic Engineering Keywords: optical fibre data; transmission; microcomb

Online: 15 March 2020 (15:20:23 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.0470.v1

A New Computational Algorithm for Assessing Overdispersion in Machine Learning Count Models with Python

Luiz Paulo Lopes Fávero, Alexandre Duarte, Helder Prado Santos

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: Count data; Machine learning; Negative binomial regression; Overdispersion; Poisson regression; Python; Vuong Test; Zero inflation

Online: 8 March 2024 (09:30:50 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.1237.v1

A Method to Enable Automatic Extraction of Cost and Quantity Data from Hierarchical Construction Information Documents to Enable Rapid Digital Comparison and Analysis

Daniel Adanza Dopazo, Lamine Mahdjoubi, Bill Gething

Subject: Engineering, Transportation Science And Technology Keywords: data mining; data extraction; data science; cost infrastructure projects

Online: 17 August 2023 (09:25:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202310.1972.v1

Integrating Aerial and 3D Data into a Data-Driven Decision-Making Workflow for Nature-Based Stormwater Solutions

Harry Edelman, Lasse Rosen, Emil Nyman, Piia Leskinen

Subject: Arts And Humanities, Architecture Keywords: aerial; data; drones; urban; nature-based; photogrammetry; design; software; decision-making; stormwater; management

Online: 1 November 2023 (02:43:24 CET)

Preprint REVIEW | doi:10.20944/preprints202309.2113.v1

Navigating the Data Architecture Landscape: A Comparative Analysis of Data Warehouse, Data Lake, Data Lakehouse, and Data Mesh

Benjamin Wong

Subject: Computer Science And Mathematics, Hardware And Architecture Keywords: Data, DWH, Data Warehouse, Architecture, Data Lake, Storage, Analysis, Data Mesh, Analytical, Architectural, Data Vault

Online: 3 October 2023 (03:28:55 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.1324.v1

Spatial-Multitemporal Analysis of Heatwaves in Thailand: Discrepancies between In-Situ Air Temperature and Remote Sensing-Derived Land Surface Temperature

Thitimar Chongtaku, Attaphongse Taparugssanagorn, Hiroyuki Miyazaki, Takuji W Tsusaka

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: heat wave; heatwaves detection; land surface heatwaves; data gap-filling; machine learning algorithm; random forest regression; spatio-temporal databases; geospatial analysis; air temperature; land surface temperature

Online: 23 February 2024 (08:40:08 CET)

Preprint ARTICLE | doi:10.20944/preprints202304.0130.v1

Data Cooperatives as Catalysts for Collaboration, Data Sharing, and the (Trans)Formation of the Digital Commons

Michael Max Bühler, Igor Calzada, Isabel Cane, Thorsten Jelinek, Astha Kapoor, Morshed Mannan, Sameer Mehta, Marina Micheli, Vijay Mookerje, Konrad Nübel, Alex Pentland, Trebor Scholz, Divya Siddarth, Julian Tait, Bapu Vaitla, Jianguo Zhu

Subject: Computer Science And Mathematics, Other Keywords: data; cooperatives; open data; data stewardship; data governance; digital commons; data sovereignty; open digital federation platform

Online: 7 April 2023 (14:14:02 CEST)

Network effects, economies of scale, and lock-in effects increasingly lead to a concentration of digital resources and capabilities, hindering the free and equitable development of digital entrepreneurship (SDG9), new skills, and jobs (SDG8), especially in small communities (SDG11) and their small and medium-sized enterprises (“SMEs”). To ensure the affordability and accessibility of technologies, promote digital entrepreneurship and community well-being (SDG3), and protect digital rights, we propose data cooperatives [1,2] as a vehicle for secure, trusted, and sovereign data exchange [3,4]. In post-pandemic times, community/SME-led cooperatives can play a vital role by ensuring that supply chains to support digital commons are uninterrupted, resilient, and decentralized [5]. Digital commons and data sovereignty provide communities with affordable and easy access to information and the ability to collectively negotiate data-related decisions. Moreover, cooperative commons (a) provide access to the infrastructure that underpins the modern economy, (b) preserve property rights, and (c) ensure that privatization and monopolization do not further erode self-determination, especially in a world increasingly mediated by AI. Thus, governance plays a significant role in accelerating communities’/SMEs’ digital transformation and addressing their challenges. Cooperatives thrive on digital governance and standards such as open trusted Application Programming Interfaces (APIs) that increase the efficiency, technological capabilities, and capacities of participants and, most importantly, integrate, enable, and accelerate the digital transformation of SMEs in the overall process. This policy paper presents and discusses several transformative use cases for cooperative data governance. The use cases demonstrate how platform/data cooperatives and their novel value creation can be leveraged to take digital commons and value chains to a new level of collaboration while addressing the most pressing community issues. The proposed framework for a digital federated and sovereign reference architecture will create a blueprint for sustainable development in both the Global South and the Global North.

Preprint COMMUNICATION | doi:10.20944/preprints202401.0780.v1

Data Reuse in Agricultural Genomics Research: Present Challenges and Future Solutions

Alenka Hafner, Victoria DeLeo, Cecilia Deng, Christine G. Elsik, Damarius Fleming, Peter W. Harrison, Theodore S. Kalbfleisch, Bruna Petry, Boas Pucker, Elsa H. Quezada-Rodríguez, Christopher K. Tuggle, James Koltes

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: data reuse; agriculture; open data; metadata; data standards; equity

Online: 10 January 2024 (10:07:03 CET)

Preprint COMMUNICATION | doi:10.20944/preprints202403.0090.v1

SERVIR West Africa Land Use and Land Cover Task Force: Building a Collaborative Network to Support Capacity Development in the Harmonization of Land Cover and Land Use Mapping and Monitoring in West Africa

Foster Mensah, Paul Bartel, Jacob Abramowitz, Emil A. Cherrington, Mansour Mahamane, Bako Mamane, Matieu Henry, Fatima Mushtaq, Antonio Di Gregorio, Amadou Moctar Dieye, Patrice Sanou, Glory Enaruvbe, Ndeye Fatou Mar

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: land cover and land use; land cover classification; data harmonization; semantic interoperability; ISO-19144-2; West Africa, geospatial.

Online: 4 March 2024 (09:53:33 CET)

Preprint ARTICLE | doi:10.20944/preprints201609.0027.v1

Optimizing Bus Passenger Complaint Service through Big Data Analysis: Systematized Analysis for Improved Public Sector Management

Weng-Kun Liu, Chia-Chun Yen

Subject: Business, Economics And Management, Business And Management Keywords: customer complaint process improvement; customer complaint service; big data analysis

Online: 7 September 2016 (11:38:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.0027.v1

Framework to Create Dataset for Disaster Behavior Analysis using Google Earth Engine: A Case Study in Peninsular Malaysia for Historical Forest Fire Behavior Analysis

Yee Jian Chew, Shih Yin Ooi, Ying Han Pang, Zheng You Lim

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Disaster behavior; forest fire behavior; forest fire dataset; data extraction framework; Google Earth Engine; remote sensing; Malaysia; ChatGPT; Noteable; Large Language Model

Online: 1 April 2024 (10:35:32 CEST)

Preprint ARTICLE | doi:10.20944/preprints202209.0271.v1

Does the Restriction of Human Mobility Significantly Control COVID-19 Transmission in Jakarta, Indonesia? Global Versus Local Regression Models

I Gede Nyoman Mindra Jaya, Anna Chadidjah, Gumgum Darmawan, Jane Christine Princidy, Farah Kristiani

Subject: Computer Science And Mathematics, Analysis Keywords: COVID-19; human mobility; spatial autocorrelation; temporal autocorrelation; Facebook mobility data

Online: 19 September 2022 (09:33:10 CEST)

Preprint ARTICLE | doi:10.20944/preprints202103.0593.v1

Creating a Business and Supporting Digital Transformation

Miguel Ayala, Jorge Portella, Sergio Martinez, Maria Rojas, Luis Jimenez

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Business Intelligence; Data Mining; Data Warehouse

Online: 24 March 2021 (13:47:31 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.0442.v1

Instrumental and Observational Problems of the Earliest Temperature Records in Italy: A Methodology for Data Recovery and Correction

Dario Camuffo, Antonio Della Valle, Francesca Becherini

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Thermometers; Temperature records; Early instrumental meteorological series; Data rescue; Data recovery; Data correction; Climate data analysis

Online: 7 August 2023 (03:01:24 CEST)

A distinction is made between data rescue (i.e., copying, digitizing and archiving) and data recovery, which implies deciphering, interpreting and transforming early instrumental readings and their metadata to obtain high-quality datasets in modern units. This requires a multidisciplinary approach that includes: palaeography and knowledge of Latin and other languages to read the handwritten logs and additional documents; history of science to interpret the original text, data and metadata within the cultural frame of the 17th, 18th and early 19th century; physics and technology to recognize bias of early instruments or calibrations, or to correct for observational bias; astronomy to calculate and transform the original time in canonical hours that started from twilight. The liquid-in-glass thermometer was invented in 1641 and the earliest temperature records started in 1654. Since then, different types of thermometers were invented, based on the thermal expansion of air or selected thermometric liquids with deviation from linearity. Reference points, thermometric scales and calibration methodologies were not comparable, and not always adequately described. Thermometers had various locations and exposures, e.g., indoor, outdoor, on windows, gardens or roofs, facing different directions. Readings were made only one or a few times a day, not necessarily respecting a precise time schedule: this bias is analysed for the most popular combinations of reading times. The time was based on sundials and local Sun, but the hours were counted starting from twilight. In 1789-90 Italy changed system, and all cities counted hours from their lower culmination (i.e., local midnight), so that every city had its local time; in 1866, all the Italian cities followed the local time of Rome; in 1893, the whole of Italy adopted the present-day system, based on the Coordinated Universal Time and the time zones.
In 1873, when the International Meteorological Committee (IMO) was founded, later transformed into the World Meteorological Organization (WMO), a standardization of instruments and observational protocols was established, and all data became fully comparable. In the early instrumental period, from 1654 to 1873, the comparison, correction and homogenization of records are quite difficult, mainly because of the scarcity or even absence of metadata. This paper deals with this confused situation, discussing the main problems and the methodologies needed to recognize missing metadata; distinguish indoor from outdoor readings; correct and transform early datasets in unknown or arbitrary units into modern units; and, finally, determine in which cases it is possible to reach the quality level required by WMO. The focus is to explain the methodology needed to recover early instrumental records, i.e., the operations that should be performed to interpret, correct, and transform the original raw data into a high-quality dataset of temperature, usable for climate studies.

Preprint ARTICLE | doi:10.20944/preprints202402.0946.v1

Finding Negative Associations from Medical Data Streams Based on Frequent and Regular Patterns

RajaRao Budaraju, Sastry Kodanda Rama Jammalamadaka

Subject: Computer Science And Mathematics, Computer Science Keywords: Data streams; Negative associations; Adverse effects; side reactions; Frequent and Regular patterns

Online: 19 February 2024 (11:35:26 CET)

Working Paper ARTICLE

Business Intelligence and Its Big Evolution

Andres Velosa, Gustavo Pabon

Subject: Engineering, Automotive Engineering Keywords: Business Intelligence; Data warehouse; Data Marts; Architecture; Data; Information; cloud; Data Mining; evolution; technologic companies; tools; software

Online: 24 March 2021 (13:06:53 CET)

Preprint COMMUNICATION | doi:10.20944/preprints202303.0453.v1

Analysis of Public Discourse on Twitter involving COVID-19 and MPox: Findings from Sentiment Analysis and Text Analysis

Nirmalya Thakur

Subject: Social Sciences, Media Studies Keywords: COVID-19; MPox; Twitter; Big Data; Data Mining; Data Analysis; Sentiment Analysis; Data Science; Social Media; Monkeypox

Online: 27 March 2023 (08:39:28 CEST)

Mining and analysis of the Big Data of Twitter conversations have been of significant interest to the scientific community in the fields of healthcare, epidemiology, big data, data science, computer science, and their related areas, as can be seen from several works in the last few years that focused on sentiment analysis and other forms of text analysis of Tweets related to Ebola, E-Coli, Dengue, Human papillomavirus (HPV), Middle East Respiratory Syndrome (MERS), Measles, Zika virus, H1N1, influenza-like illness, swine flu, flu, Cholera, Listeriosis, cancer, Liver Disease, Inflammatory Bowel Disease, kidney disease, lupus, Parkinson's, Diphtheria, and West Nile virus. The recent outbreaks of COVID-19 and MPox have served as "catalysts" for Twitter usage related to seeking and sharing information, views, opinions, and sentiments involving both these viruses. While a few works published in the last few months have focused on performing sentiment analysis of Tweets related to either COVID-19 or MPox, none of the prior works in this field thus far involved analysis of Tweets focusing on both COVID-19 and MPox at the same time. To address this research gap, a total of 61,862 Tweets that focused on MPox and COVID-19 simultaneously, posted between May 7, 2022, and March 3, 2023, were studied to perform sentiment analysis and text analysis. The findings of this study are manifold. First, the results of sentiment analysis show that almost half the Tweets (the actual percentage is 46.88%) had a negative sentiment. These were followed by Tweets that had a positive sentiment (31.97%) and Tweets that had a neutral sentiment (21.14%). Second, this paper presents the top 50 hashtags that were used in these Tweets. Third, it presents the top 100 most frequently used words that are featured in these Tweets. The findings of text analysis show that some of the commonly used words involved directly referring to either or both viruses.
In addition to this, the presence of words such as "Polio", "Biden", "Ukraine", "HIV", "climate", and "Ebola" in the list of the top 100 most frequent words indicates that topics of conversations on Twitter in the context of COVID-19 and MPox also included a high level of interest related to other viruses, President Biden, and Ukraine. Finally, a comprehensive comparative study that involves a comparison of this work with 49 prior works in this field is presented to uphold the scientific contributions and relevance of the same.

Preprint ARTICLE | doi:10.20944/preprints202402.1061.v1

Enhancing Reliability in Wind Turbine Power Curve Estimation

Pere Marti-Puig, José Ángel Hernández, Jordi Solé-Casals, Moisès Serra-Serra

Subject: Engineering, Energy And Fuel Technology Keywords: Wind Power Curve Modeling; Artificial neural networks (ANN); Wind Turbine (WT); SCADA Data; Industrial AI

Online: 19 February 2024 (12:24:46 CET)

Preprint REVIEW | doi:10.20944/preprints202401.2194.v2

Cybersecurity & Data Privacy in Fintech

Rajath Karangara, Otilia Manta

Subject: Computer Science And Mathematics, Computer Science Keywords: Fintech; Cybersecurity; Data Privacy; Information Security; Regulatory Compliance

Online: 11 April 2024 (10:40:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.0849.v1

Intellecta Cognitiva: A Comprehensive Dataset for Advancing Academic Knowledge and Machine Reasoning

Ditto PS, Ajmal PS, Jithin VG

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Synthetic data; pretrain data; llm training

Online: 12 April 2024 (12:46:27 CEST)

Search Results

1396 articles found