Search | Preprints.org

Preprint ARTICLE | doi:10.20944/preprints202403.0012.v1

Flexible Techniques to Detect Typical Hidden Errors in Large Longitudinal Datasets

Renato Bruni, Cinzia Daraio, Simone Di Leo

Subject: Computer Science And Mathematics, Computer Science Keywords: big data; information processing; information reconstruction; data quality; longitudinal data sequences

Online: 1 March 2024 (10:33:16 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0227.v1

Industrialization in Construction Companies – a Benchmark Study on Manufacturing Companies

Solmaz Mansoori, Janne Härkönen, Harri Haapasalo, Petteri Annunen

Subject: Engineering, Architecture, Building And Construction Keywords: predefined products; predefined processes; data management; industrialized construction

Online: 3 April 2024 (13:15:17 CEST)

Preprint ARTICLE | doi:10.20944/preprints201608.0204.v1

A Grey Forecasting Approach for the Sustainability Performance of Logistics Companies

Min-Chun Yu, Chia-Nan Wang, Nguyen-Nhu-Y Ho

Subject: Business, Economics And Management, Economics Keywords: logistics industry; sustainability; data envelopment analysis (DEA); grey forecasting

Online: 25 August 2016 (10:12:27 CEST)

Preprint REVIEW | doi:10.20944/preprints202105.0663.v1

MANAGEMENT OF BIG DATA IN THE CONTEMPORARY WORLD

Anjaneyulu Jinugu, Sreechandana Kodimela, Madhavi Laitha V V

Subject: Computer Science And Mathematics, Computer Science Keywords: Big Data, Internet Data Sources (IDS), Internet of Things (IoT), Sustainable Development Goals (SDGs), Big data Technologies, Big data Challenges

Online: 27 May 2021 (10:31:03 CEST)

Preprint ARTICLE | doi:10.20944/preprints202007.0330.v1

Data Privacy Management in Public Environments

Hugo Lopes, Valderi R. Q. Leithardt, Ivan Miguel Pires, Raúl García-Ovejero, María Navarro-Cáceres

Subject: Computer Science And Mathematics, Information Systems Keywords: Data Privacy; Mobile devices; Environment Privacy; General Data Protection Regulation (GDPR).

Online: 15 July 2020 (09:30:42 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.1859.v1

Enhancing Data Preservation and Security in Industrial Control Systems Through Integrated IOTA Implementation

Iuon-Chang Lin, Pai-Ching Tseng, Pin-Hsiang Chen, Shean-Juinn Chiou

Subject: Engineering, Control And Systems Engineering Keywords: DLT; IoT; data security; Docker; Container Technology; IOTA; Tangle

Online: 29 March 2024 (13:45:37 CET)

In the realm of data management, data preservation stands as a critical undertaking aimed at preserving and upholding the integrity of data. Regardless of whether it concerns personal or enterprise data, the detrimental effects of malicious alterations implemented by attackers cannot be overlooked. Particularly in conventional industrial control environments, the prevalent practice involves the transmission of data from sensors to databases for storage purposes. However, it is essential to recognize that this process exposes the data to various vulnerabilities. Thus, to ensure the long-term security and reliability of the data, it becomes imperative to implement robust data preservation strategies within these industrial control systems. However, the reliance of these databases on physical hard disks introduces inherent vulnerabilities, including the potential for data loss due to disk damage or targeted malicious attacks. Consequently, it becomes imperative to prioritize the implementation of robust data preservation measures. These measures are crucial in mitigating the risk of disruptions and protecting critical data from compromise. By establishing effective data backup systems, employing advanced security protocols, and implementing proactive monitoring mechanisms, organizations can bolster their data preservation capabilities and safeguard against potential threats to data integrity and availability. As a result, many enterprises opt to store their data with third-party providers to ensure data integrity. However, this approach carries inherent risks. If the third-party service experiences an attack or if the data is tampered with, it becomes challenging to verify the integrity of the data. To address these concerns and ensure data preservation within the context of the Internet of Things (IoT), a growing number of individuals are integrating IoT with Distributed Ledger Technology (DLT).
By leveraging DLT, the integrity of data can be ensured, reducing reliance on centralized third-party storage and enhancing security in the IoT ecosystem. In this article, the DLT is IOTA, which employs a Directed Acyclic Graph (DAG) to store transaction information. Compared to Ethereum or other blockchain technologies, IOTA offers notable advantages in terms of transaction verification speed, making it highly suitable for real-time IoT environments. However, the conventional transmission path from sensors to IOTA nodes entails a complex route, involving multiple hardware devices before reaching the intended destination. This complexity poses challenges in ensuring data integrity during transmission and introduces vulnerabilities such as man-in-the-middle attacks or SQL injection attacks. To address these issues, we propose a method to streamline the transmission path between sensors and IOTA, specifically tailored for industrial fields with numerous IoT devices. Our approach involves preprocessing the data stored on the server using our method before uploading, ensuring data confidentiality, and leveraging IOTA to guarantee data integrity. To achieve the shortest path between IoT and DLT nodes, it becomes necessary to establish IOTA nodes on lower-level devices, such as Raspberry Pi or IoT controllers. By simplifying the transmission path, we can reduce the potential for tampering and enhance overall data security. Implementing our proposed method enables the assurance of data confidentiality and integrity during both transmission and storage on the server, strengthening the trustworthiness of the IoT and IOTA integration.

Preprint ARTICLE | doi:10.20944/preprints202110.0260.v1

Online System for Power Quality Operational Data Management in Frequency Monitoring using Python and Grafana

Jose-María Sierra-Fernández, Olivia Florencias-Oliveros, Manuel-Jesús Espinosa-Gavira, Juan-José González-de-la-Rosa, Agustín Agüera-Pérez, José-Carlos Palomares-Salas

Subject: Engineering, Electrical And Electronic Engineering Keywords: big data; data acquisition; data visualization; data exchange; dashboard; frequency stability; Grafana lab; Power Quality; GPS reference; frequency measurement.

Online: 18 October 2021 (18:07:43 CEST)

Preprint ARTICLE | doi:10.20944/preprints202209.0094.v1

Architecture using blockchain data privacy for healthcare data management

Anubis Rossetto, Christofer Sega, Valderi Reis Quietinho Leithardt

Subject: Computer Science And Mathematics, Information Systems Keywords: Blockchain; Cryptography; DApp; Health Data; Privacy.

Online: 7 September 2022 (03:06:09 CEST)

Preprint ARTICLE | doi:10.20944/preprints202203.0214.v1

Explainable AI-based Alzheimer’s Prediction and Management Using Multimodal Data

Sobhana Jahan, Kazi Abu Taher, M Shamim Kaiser, Mufti Mahmud, Md. Sazzadur Rahman, A. S. M. Sanwar Hosen, In-Ho Ra

Subject: Engineering, Control And Systems Engineering Keywords: Machine learning; Dementia; Data-level fusion

Online: 15 March 2022 (12:31:40 CET)

Preprint ARTICLE | doi:10.20944/preprints201903.0205.v3

Innovative Data Management in Advanced Characterization: Implications for Materials Design

Nick Romanos, Maritini Kalogerini, Elias P. Koumoulos, Athanasios Morozinis, Marco Sebastiani, Costas Charitidis

Subject: Chemistry And Materials Science, Materials Science And Technology Keywords: characterisation; materials; ontology; data; metadata; nanoindentation

Online: 12 April 2019 (20:48:02 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.1845.v1

Advanced Analytics and Data Management in the Procurement Function: An Aviation Industry Case Study

Andrea Altundag, Martin Wynn

Subject: Business, Economics And Management, Business And Management Keywords: data analytics; strategic procurement; big data; maturity model; aviation industry; aircraft manufacturer

Online: 29 March 2024 (10:36:02 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.1712.v1

Secure Hydrogen Production Analysis and Prediction Based on Blockchain Service Framework for Intelligent Power Management System

Harun Jamil, Faiza Qayyum, Naeem Iqbal, Murad Ali Khan, Syed Shehryar Ali Naqvi, Salabat Khan, Do-Hyeun Kim

Subject: Engineering, Energy And Fuel Technology Keywords: blockchain; IoT; hydrogen production; secure data-driven analysis; historical data management

Online: 26 September 2023 (05:24:51 CEST)

Preprint COMMUNICATION | doi:10.20944/preprints202301.0335.v2

Safe Control of Autonomous Cloud Entities in Distributed Systems

Mostefa Kara

Subject: Computer Science And Mathematics, Information Systems Keywords: Cloud Computing; Data Protection; Secure Communication; Middleware; Protocols

Online: 30 January 2023 (09:24:01 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.1170.v1

Mapping hierarchical file structures to semantic data models for efficient data integration into research data management systems

Henrik Tom Wörden, Florian Spreckelsen, Stefan Luther, Ulrich Parlitz, Alexander Schlemmer

Subject: Computer Science And Mathematics, Information Systems Keywords: research data management; FAIR; file structure; file crawler; semantic data model

Online: 16 August 2023 (11:05:47 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.1613.v1

FIWARE-compatible Smart Data Models for Satellite Imagery and Flood Risk Assessment to Enhance Data Management

Ioannis-Omiros Kouloglou, Gerasimos Antzoulatos, Georgios Vosinakis, Francesca Lombardo, Alberto Abella, Marios Bakratsas, Anastasia Moumtzidou, Evangelos Maltezos, Ilias Gialampoukidis, Eleftherios Ouzounoglou, Stefanos Vrochidis, Angelos Amditis, Ioannis Kompatsiaris, Michele Ferri

Subject: Computer Science And Mathematics, Other Keywords: Smart Data Models; Remote sensing; Satellite Imagery; Flood Monitoring and Mapping; Flood Risk Assessment; Data Sharing; Interoperability; Water Data Management

Online: 24 November 2023 (15:08:26 CET)

Preprint ARTICLE | doi:10.20944/preprints202007.0369.v1

System for Control and Management of Data Privacy of Patients with COVID-19

Arielle Verri Lucca, Rodrigo Luchtenberg, Leonardo Garcez de Paula Conceicao, Luis Augusto Silva, Raúl García Ovejero, María Navarro-Cáceres, Valderi Reis Quietinho Leithardt

Subject: Computer Science And Mathematics, Mathematical And Computational Biology Keywords: Data privacy; Ambient intelligence; COVID-19

Online: 17 July 2020 (08:17:07 CEST)

Preprint ARTICLE | doi:10.20944/preprints201611.0110.v1

Impacts of Capital Structure on Performance of Banks in a Developing Economy: Evidence from Bangladesh

Md. Nur Alam Siddik, Sajal Kabiraj, Shanmugan Joghee

Subject: Business, Economics And Management, Finance Keywords: capital structure; firm’s performance; panel data; unit root analysis; Bangladesh

Online: 22 November 2016 (09:36:36 CET)

Preprint ARTICLE | doi:10.20944/preprints201905.0174.v2

Resource-Aware Network Topology Management Framework

Aaqif Afzaal Abbasi, Shahab Shamshirband, Mohammed A. A. Al-qaness, Almas Abbasi, Nashat T. AL-Jallad, Amir Mosavi

Subject: Computer Science And Mathematics, Information Systems Keywords: cloud computing; big data; fog computing; software-defined; networking; network management; resource management; topology.

Online: 26 February 2020 (15:34:25 CET)

Preprint ARTICLE | doi:10.20944/preprints202303.0391.v1

Feature Selection in the Context of Prognostic Health Management

Indrawata Wardhana

Subject: Medicine And Pharmacology, Veterinary Medicine Keywords: prognosis and health management, preprocessing data, feature extraction, feature selection.

Online: 22 March 2023 (04:31:53 CET)

Preprint ARTICLE | doi:10.20944/preprints201608.0232.v2

ODK Scan: Digitizing Data Collection and Impacting Data Management Processes in the Tuberculosis Control Program of Pakistan

Syed Mustafa Ali, Rachel Powers, Jeffrey Beorse, Farah Naureen, Arif Noor, Naveed Anjum, Muhammad Ishaq, Javariya Aamir, Richard Anderson

Subject: Medicine And Pharmacology, Pulmonary And Respiratory Medicine Keywords: mHealth; ODK scan; mobile health application; digitizing data collection; data management processes; paper-to-digital system; technology-assisted data management; treatment adherence

Online: 2 September 2016 (03:17:38 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.0178.v1

A Blockchain Based Framework for Remote Sensing Data Management

Quan Zou, Wenyang Yu, Ziwei Bao

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Blockchain; Remote sensing data management; Distributed ledger technology; Trusted service; Security

Online: 2 August 2023 (08:43:07 CEST)

Preprint ARTICLE | doi:10.20944/preprints202304.0449.v1

Blockchain-Assisted Reputation Management Scheme for Internet of Vehicles

Qian Liu, JunQuan Gong, Qilie Liu

Subject: Computer Science And Mathematics, Computer Networks And Communications Keywords: IoV; MEC; Data Sharing; Reputation Management; Subjective Logic Trust Model; Blockchain

Online: 17 April 2023 (10:40:19 CEST)

Preprint ARTICLE | doi:10.20944/preprints202204.0016.v1

Unstructured Data Analysis for Risk Management of Electric Power Transmission Lines

Lucas Pereira, Rafael Pereira, Pedro Prado, Felipe Cunha, Fabrício Góes, Roger Fiusa, Lorrany Silva

Subject: Engineering, Electrical And Electronic Engineering Keywords: natural language processing; risk management; transmission lines; unstructured data

Online: 4 April 2022 (11:26:15 CEST)

Preprint ARTICLE | doi:10.20944/preprints202008.0693.v1

Management of the Engineering Data for Manufacturing

Gurcan Atakok, Mufit Cun

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: Industry 4.0; Product Data Management; Product Life Cycle Management; Concurrent Engineering; Validation of Design

Online: 31 August 2020 (04:17:05 CEST)

Preprint ARTICLE | doi:10.20944/preprints201805.0470.v1

Open-Sourced Remote Sensing Data Management with the Irish Earth Observation (IEO) Python Module

Guy Serbin, Stuart Green

Subject: Environmental And Earth Sciences, Environmental Science Keywords: remote sensing; python; data management; landsat; open-source

Online: 31 May 2018 (11:12:27 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.0767.v1

Livestock Disease Data Management for E-Surveillance and Disease Mapping Using Cluster Analysis

Mohammed Kemal Ahmed, Durga Prasad Sharma, Hussein Seid Worku, Amir Ibrahim Tahir

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Data analytics; Cluster analysis; Disease mapping; Distance metrics; livestock Disease

Online: 12 June 2023 (05:10:55 CEST)

Preprint ARTICLE | doi:10.20944/preprints201906.0075.v1

Decision Support System Research on Forest Tending Problems Based on Process Management System

Hui Jing, Wukui Wang, Aline Umutoni

Subject: Engineering, Control And Systems Engineering Keywords: forest tending; group decision support system; process management; data integration

Online: 10 June 2019 (10:32:09 CEST)

Preprint ARTICLE | doi:10.20944/preprints201701.0080.v1

A Parameter Selection Method for Wind Turbine Health Management through SCADA Data

Mian Du, Jun Yi, Peyman Mazidi, Lin Cheng, Jianbo Guo

Subject: Engineering, Electrical And Electronic Engineering Keywords: wind turbine; failure detection; SCADA data; feature extraction; mutual information; copula

Online: 17 January 2017 (11:21:58 CET)

Preprint ARTICLE | doi:10.20944/preprints201901.0130.v1

Big Data-Driven Market-Oriented Information System for the Internationalisation and Strategic and Sustainable Management of SMEs

Yoseob Heo, Jungjoon Kim, Jongseok Kang

Subject: Business, Economics And Management, Business And Management Keywords: internationalisation of SMEs; big data; market-oriented information; relational database; supply chain network; optimized database; trade condition; data visualization

Online: 14 January 2019 (10:04:03 CET)

Preprint ARTICLE | doi:10.20944/preprints202209.0413.v2

An Approach for Data Privacy Management for Banking Using Consortium Blockchain

Shady Nabih, Hanan Fahmy, Sayed Abdelgaber

Subject: Engineering, Chemical Engineering Keywords: Consortium Blockchain; Ring signature; Blockchain privacy; Blockchain security; Access Control; Blockchain big data

Online: 25 June 2023 (04:01:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints201811.0216.v1

Safety Helmet Wearing Management System for Construction Workers Using Three-Axis Accelerometer Sensor

SungHun Kim, Changwon Wang, Se Dong Min, Seung-Hyun Lee

Subject: Engineering, Architecture, Building And Construction Keywords: Construction, worker safety, safety helmet, three-axis accelerometer sensor, data mining

Online: 8 November 2018 (14:03:21 CET)

Preprint ARTICLE | doi:10.20944/preprints202403.0884.v1

Personalized Diabetes Management with Digital Twins: A Patient-Centric Knowledge Graph Approach

Fatemeh Sarani Rad, Rasha Hendawi, Xinyi Yang, Juan Li

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: Digital Twins; Personal Health Knowledge Graph; Data Integration; Ontology; Diabetes Management

Online: 15 March 2024 (08:52:15 CET)

Preprint ARTICLE | doi:10.20944/preprints202306.0713.v1

ClimInonda a Web Application for Management the Climate Data: Case Study of the Flooding Risk in Bayech Transboundary Basin

Zaineb Ali, Amine Saddik, Brahim Erraha, Adnane Labbaci, Mohamed Ouessar

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Web application; climate data; weather station; ClimInonda

Online: 9 June 2023 (11:51:39 CEST)

Preprint ARTICLE | doi:10.20944/preprints201905.0274.v1

Emergent Challenges for Science sUAS Data Management: Fairness through Community Engagement and Best Practices Development

Jane Wyngaard, Lindsay Barbieri, Andrea Thomer, Josip Adams, Don Sullivan, Cynthia Parr, Sudhir Raj Shrestha, Christopher Crosby, Jens Klump, Tom Bell

Subject: Environmental And Earth Sciences, Environmental Science Keywords: sUAS; drone; RPAS; UAV; Data; Management; FAIR; Community; standards; practices

Online: 22 May 2019 (11:42:08 CEST)

Preprint ARTICLE | doi:10.20944/preprints202312.2159.v1

Working Capital Management Strategies and Financial Performance: A Cause-and-Effect Analysis

Ashok Panigrahi

Subject: Business, Economics And Management, Finance Keywords: Working Capital Management; Financial Performance; Indian Cement Companies; Bombay Stock Exchange; Panel Data

Online: 29 December 2023 (01:22:10 CET)

Working Paper ARTICLE

Case Study on Privacy-Aware Social Media Data Processing in Disaster Management

Marc Löchner, Ramian Fathi, David Schmid, Alexander Dunkel, Dirk Burghardt, Frank Fiedrich, Steffen Koch

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: disaster management; virtual operation support teams; privacy; data retention; hyperloglog; focus group discussion

Online: 1 October 2020 (13:58:16 CEST)

Preprint ARTICLE | doi:10.20944/preprints201609.0027.v1

Optimizing Bus Passenger Complaint Service through Big Data Analysis: Systematized Analysis for Improved Public Sector Management

Weng-Kun Liu, Chia-Chun Yen

Subject: Business, Economics And Management, Business And Management Keywords: customer complaint process improvement; customer complaint service; big data analysis

Online: 7 September 2016 (11:38:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints202312.0403.v3

Does Working Capital Management Policies Affect Firm’s Performances? An insight from Indian Cement Companies.

CMA (Dr.) Ashok Panigrahi

Subject: Business, Economics And Management, Finance Keywords: Working Capital Management; Profitability; Indian Cement Industry; Bombay Stock Exchange; Panel Data

Online: 13 December 2023 (05:03:58 CET)

Preprint ARTICLE | doi:10.20944/preprints201906.0174.v1

Exploring Challenges to Implementation of IT Service Management System ISO 20000: Implications in Managing Big Data in Emerging Economy

Nafis Ahmad, Md. Golam Rabbany, Syed Mithun Ali

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: Business excellence; information technology; implementation challenge; ISO 20000; big data management.

Online: 18 June 2019 (10:56:19 CEST)

Preprint ARTICLE | doi:10.20944/preprints202401.1840.v1

Tree-based Modeling for Large-scale Management in Agriculture: Explaining Organic Matter Content in Soil

Woosik Lee, Juhwan Lee

Subject: Environmental And Earth Sciences, Soil Science Keywords: agricultural data analysis; agricultural business management; tree-based models; SHAP; soil organic matter; economic analysis

Online: 26 January 2024 (02:07:22 CET)

Preprint ARTICLE | doi:10.20944/preprints202211.0295.v1

An Information Systems for Infrastructure Asset Management Tailored to Portuguese Water Utilities: Platform Conceptualization and A Prototype Demonstration

Nelson Carriço, Bruno Ferreira, André Antunes, Cedric I. C. Grueau, Raquel Barreira, Ana Mendes, Dídia I. C. Covas, Laura Monteiro, João Filipe Santos, Isabel Sofia Brito

Subject: Engineering, Civil Engineering Keywords: Data integration; Decision Support System; Information Systems; Infrastructure Asset Management; Water supply systems

Online: 16 November 2022 (03:31:31 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.1679.v1

Design of Smart Flood Risk Management System: A Brunei Darussalam Vision 2035 (WAWASAN 2035) for Climate Resilience and Adaptation

Zaharaddeen Karami Lawal, Rufai Yusuf Zakari, Hayati Yassin

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Big Data; Climate Change; Flood Modelling; Internet of Things; Machine Learning; Risk Management

Online: 29 February 2024 (11:41:05 CET)

Preprint ARTICLE | doi:10.20944/preprints202104.0482.v1

System Design and Data Governance for Environmental Disaster Management in Smart Scenic - A Case Study of Huangshan Mountain

Zhong Wang, Zhenjie Liao, Lijuan Zhang

Subject: Business, Economics And Management, Accounting And Taxation Keywords: Smart Scenic; environmental disasters management; organization transformation; system design; Big Data; Internet of Things

Online: 19 April 2021 (13:19:35 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.1556.v1

Empirical Analysis of the Current Status and Potential of Service-oriented and Data-driven Business Models within the Sheet Metal Working Sector: Insights from Interview-based Research in SMEs

Jonas Wirth, Mirko Schneider, Leon Hanselmann, Kira Fink, Stephan Nebauer, Thomas Bauernhansl

Subject: Business, Economics And Management, Business And Management Keywords: service-oriented business models; data-driven business models; servitization; digital transformation; ecosystem innovation, SME

Online: 27 February 2024 (14:06:12 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.1244.v1

Assessing the Impact of Data Sciences and Smart Technologies in Air Conditioning Project Management: A Delphi Method Analysis within the Construction Industry

Bashar Mahmood Ali, Mehmet Akkaş

Subject: Engineering, Mechanical Engineering Keywords: Intelligent Data Analyzing; energy consumption; thermal comfort; inclusion; exclusion criteria; Delphi method

Online: 18 August 2023 (10:45:39 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.0856.v1

Research on the Evaluation Model of School Management Quality in the Compulsory Education Stage Based on Big Data Technology

Guanghui Min, Muhui Lin, Ying Liu, Zhe Li

Subject: Computer Science And Mathematics, Computer Science Keywords: quality evaluation of school management; compulsory education stage; big data technology; visualization techniques; evaluation models

Online: 11 May 2023 (13:26:38 CEST)

Preprint ARTICLE | doi:10.20944/preprints201806.0108.v1

Reaching, Engaging and Advancing Research (REAR); An Assessment of Health Managers’ Skills and Knowledge in Data Management, Analysis, Utilization, and Dissemination Kenya, Tanzania and Rwanda

Peter Memiah, Tristi Ah Mu, Shreya Madhavaram, Caroline Kingori, Courtney Cook, Sarah Dawson, Hannah Funk, Jackson Sebeza, Michelle Mwangi, Mtebe Majigo, Samuel Muhula, Wairimu Mwangi, Vernon Mochache, Kevin Owour, John Paul Oyore, Eric Remera, Sabin Nsanzimana, Claude Kumalija, Carol Ngunu

Subject: Medicine And Pharmacology, Other Keywords: Data Management; Utilization and Analysis; Capacity Building; Health professionals; Workforce Development; Evidence Based

Online: 7 June 2018 (08:54:20 CEST)

Preprint REVIEW | doi:10.20944/preprints202003.0141.v1

Sharing Is Caring – Data Sharing Initiatives in Healthcare

Tim Hulsen

Subject: Medicine And Pharmacology, Other Keywords: data sharing; data management; data science; big data; healthcare

Online: 8 March 2020 (16:46:20 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0834.v1

A Multi-Temporal Analysis on the Dynamics of the Impact of Land Use and Land Cover on NO₂ and CO Emissions in Argentina for Sustainable Environmental Management

Viviana Fernández-Maldonado, Ana Laura Navas, María Paula Fabani, Germán Mazza, Rosa Rodriguez

Subject: Environmental And Earth Sciences, Environmental Science Keywords: atmospheric pollution; remote sensing data; land uses and land cover; gases; environmental management

Online: 12 April 2024 (05:28:49 CEST)

Air quality is a topic of growing relevance on the global agenda. Two key indicators of air quality, nitrogen dioxide (NO2) and carbon monoxide (CO), represent significant hazards to human health and the environment putting the sustainability of natural resources at risk. Therefore, it is imperative to regularly monitor and reduce atmospheric pollutant concentrations to mitigate their harmful consequences. This study presents an analysis of NO2 total and CO emissions in Argentina, utilizing remote sensing data. The research aims to determine the spatiotemporal distribution of NO2 and CO emissions in Argentina from 2019 to 2021. Subsequently, it seeks to establish the influence of land uses and land cover on the emission of NO2 and CO through different climatic, anthropic, and natural indicators. The study was carried out in Argentina during the period 2019-2021, where random points were placed for the different land covers with a total of 800 points surveyed. The year with the highest CO concentration (mol/m²) was 2020. The values were highest for tree covers and herbaceous wetland coverage in the northern part of the country. For total NO2, the highest concentrations were reached during the years 2020 and 2021. Regarding its distribution, throughout the evaluated period, the highest concentrations of total NO2 were found in the built-up and cropland coverages, with the capital of the country and the northern region of the Buenos Aires province being the most affected areas. In addition, the concentration of CO was influenced by climatic variables (atmospheric pressure, wind speed, maximum environment temperature, Palmer index), natural (height, humidity, NDVI), and urban variables (distances to mining extraction, airports, power plants, urban index) for the different uses and land covers.
Finally, the concentrations of total NO2 were influenced by climatic variables (Palmer index and wind speed), natural (height and NDVI), and urban variables (distance to airports, power plants, industries, service stations, and open dumpsites) for the different uses and land cover. This study contributes to sustainable environmental management, enabling the formulation of effective strategies for mitigating emissions and promoting the long-term health and well-being of communities.

Preprint ARTICLE | doi:10.20944/preprints202404.1018.v1

Discovering Data Domains and Products in Data Meshes Using Semantic Blueprints

Michalis Pingos, Andreas S. Andreou

Subject: Computer Science And Mathematics, Computer Science Keywords: Big Data; Data Lakes; Data Meshes; Data Products; Data Blueprints; Metadata Semantic Enrichment

Online: 16 April 2024 (16:26:06 CEST)

Preprint ARTICLE | doi:10.20944/preprints202206.0320.v4

Ten Simple Rules for Using Public Biological Data for Your Research

Vishal Oza, Jordan Whitlock, Elizabeth Wilk, Angelina Uno-Antonison, Brandon Wilk, Manavalan Gajapathy, Timothy Howton, Austyn Trull, Lara Ianov, Elizabeth Worthey, Brittany Lasseigne

Subject: Biology And Life Sciences, Other Keywords: data; reproducibility; FAIR; data reuse; public data; big data; analysis

Online: 2 November 2022 (02:55:49 CET)

Preprint ARTICLE | doi:10.20944/preprints202003.0268.v1

TEEDA: An Interactive Platform for Matching Data Providers and Users in Data Marketplace

Teruaki Hayashi, Yukio Ohsawa

Subject: Social Sciences, Library And Information Sciences Keywords: matching; data marketplace; data platform; data visualization; call for data

Online: 17 March 2020 (04:10:28 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.0602.v1

The Improvement of the Use of Open Data in Public Institutions

Besart Hyseni, Lejla Abazi Bexheti

Subject: Computer Science And Mathematics, Information Systems Keywords: Improving use of open data; data utilization; data optimization; enhancing data access; open data impact; open data government; data transparency; data-driven decision making

Online: 12 February 2024 (09:34:51 CET)

Preprint REVIEW | doi:10.20944/preprints202309.2113.v1

Navigating the Data Architecture Landscape: A Comparative Analysis of Data Warehouse, Data Lake, Data Lakehouse, and Data Mesh

Benjamin Wong

Subject: Computer Science And Mathematics, Hardware And Architecture Keywords: Data, DWH, Data Warehouse, Architecture, Data Lake, Storage, Analysis, Data Mesh, Analytical, Architectural, Data Vault

Online: 3 October 2023 (03:28:55 CEST)


Preprint ARTICLE | doi:10.20944/preprints202403.0265.v1

Security and Ownership in User Defined Data Meshes

Michalis Pingos, Panayiotis Christodoulou, Andreas S. Andreou

Subject: Computer Science And Mathematics, Computer Science Keywords: Big Data; Smart Data Processing; Systems of Deep Insight; Data Meshes; Data Lakes; Data Products; Blockchain; NFT; Data Blueprints

Online: 5 March 2024 (15:04:49 CET)


Preprint ARTICLE | doi:10.20944/preprints202304.0130.v1

Data Cooperatives as Catalysts for Collaboration, Data Sharing, and the (Trans)Formation of the Digital Commons

Michael Max Bühler, Igor Calzada, Isabel Cane, Thorsten Jelinek, Astha Kapoor, Morshed Mannan, Sameer Mehta, Marina Micheli, Vijay Mookerje, Konrad Nübel, Alex Pentland, Trebor Scholz, Divya Siddarth, Julian Tait, Bapu Vaitla, Jianguo Zhu

Subject: Computer Science And Mathematics, Other Keywords: data; cooperatives; open data; data stewardship; data governance; digital commons; data sovereignty; open digital federation platform

Online: 7 April 2023 (14:14:02 CEST)


Network effects, economies of scale, and lock-in effects increasingly lead to a concentration of digital resources and capabilities, hindering the free and equitable development of digital entrepreneurship (SDG9), new skills, and jobs (SDG8), especially in small communities (SDG11) and their small and medium-sized enterprises (“SMEs”). To ensure the affordability and accessibility of technologies, promote digital entrepreneurship and community well-being (SDG3), and protect digital rights, we propose data cooperatives [1,2] as a vehicle for secure, trusted, and sovereign data exchange [3,4]. In post-pandemic times, community/SME-led cooperatives can play a vital role by ensuring that supply chains to support digital commons are uninterrupted, resilient, and decentralized [5]. Digital commons and data sovereignty provide communities with affordable and easy access to information and the ability to collectively negotiate data-related decisions. Moreover, cooperative commons (a) provide access to the infrastructure that underpins the modern economy, (b) preserve property rights, and (c) ensure that privatization and monopolization do not further erode self-determination, especially in a world increasingly mediated by AI. Thus, governance plays a significant role in accelerating communities’/SMEs’ digital transformation and addressing their challenges. Cooperatives thrive on digital governance and standards such as open trusted Application Programming Interfaces (APIs) that increase the efficiency, technological capabilities, and capacities of participants and, most importantly, integrate, enable, and accelerate the digital transformation of SMEs in the overall process. This policy paper presents and discusses several transformative use cases for cooperative data governance. 
The use cases demonstrate how platform/data cooperatives and their novel value creation can be leveraged to take digital commons and value chains to a new level of collaboration while addressing the most pressing community issues. The proposed framework for a digital federated and sovereign reference architecture will create a blueprint for sustainable development both in the Global South and North.

Preprint COMMUNICATION | doi:10.20944/preprints202401.0780.v1

Data Reuse in Agricultural Genomics Research: Present Challenges and Future Solutions

Alenka Hafner, Victoria DeLeo, Cecilia Deng, Christine G. Elsik, Damarius Fleming, Peter W. Harrison, Theodore S. Kalbfleisch, Bruna Petry, Boas Pucker, Elsa H. Quezada-Rodríguez, Christopher K. Tuggle, James Koltes

Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: data reuse; agriculture; open data; metadata; data standards; equity

Online: 10 January 2024 (10:07:03 CET)


Preprint ARTICLE | doi:10.20944/preprints202311.0104.v1

Conceptual Design of a Generic Data Harmonization Process for OMOP CDM

Elisa Henke, Michele Zoch, Yuan Peng, Ines Reinecke, Martin Sedlmayr, Franziska Bathelt

Subject: Public Health And Healthcare, Other Keywords: OMOP; OHDSI; interoperability; data harmonization; clinical data; claims data

Online: 2 November 2023 (07:45:02 CET)


Preprint ARTICLE | doi:10.20944/preprints202308.1237.v1

A Method to Enable Automatic Extraction of Cost and Quantity Data from Hierarchical Construction Information Documents to Enable Rapid Digital Comparison and Analysis

Daniel Adanza Dopazo, Lamine Mahdjoubi, Bill Gething

Subject: Engineering, Transportation Science And Technology Keywords: data mining; data extraction; data science; cost infrastructure projects

Online: 17 August 2023 (09:25:22 CEST)


Preprint ARTICLE | doi:10.20944/preprints202306.1378.v1

Algorithm-based Data Generation (ADG) Engine for Data Analytics

Iman I. M. Abu Sulayman, Peter Voege, Abdelkader Ouda

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Data Generation; Anomaly Data; User Behavior Generation; Big Data

Online: 19 June 2023 (16:31:37 CEST)


Preprint REVIEW | doi:10.20944/preprints202007.0153.v1

A Hitchhiker’s Guide to Working with Large, Open-Source Neuroimaging Datasets

Corey Horien, Stephanie Noble, Abigail Greene, Kangjoo Lee, Daniel Barron, Siyuan Gao, Dave O'Connor, Mehraveh Salehi, Javid Dadashkarimi, Xilin Shen, Evelyn Lake, R. Todd Constable, Dustin Scheinost

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: Open-science; big data; fMRI; data sharing; data management

Online: 8 July 2020 (11:53:33 CEST)


Preprint ARTICLE | doi:10.20944/preprints202312.0851.v1

Revolutionizing Financial Portfolio Management: The NSTD Model’s Fusion of Macroeconomic Indicators and Sentiment Analysis in a Deep Reinforcement Learning Framework

Yuchen Liu, Daniil Mikriukov, Owen Christopher Tjahyadi, Gangmin Li, Terry R. Payne, Yong Yue, Kamran Siddique, Ka Lok Man

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Portfolio Management (PM); Deep Reinforcement Learning (DRL); Non-Stationary Transformer; Sequential Processing; Data Heterogeneity; Market Uncertainty; Diverse Knowledge Integration; Multimodal Learning

Online: 12 December 2023 (09:59:50 CET)


Preprint ARTICLE | doi:10.20944/preprints201810.0273.v1

Russian-German Astroparticle Data Life Cycle Initiative

Igor Bychkov, Andrey Demichev, Julia Dubenskaya, Oleg Fedorov, Andreas Haungs, Andreas Heiss, Yulia Kazarina, Elena Korosteleva, Dmitriy Kostunin, Alexander Kryukov, Andrey Mikhailov, Minh-Duc Nguyen, Stanislav Polyakov, Evgeny Postnikov, Alexey Shigarov, Dmitry Shipilov, Achim Streit, Viktoria Tokareva, Doris Wochele, Jürgen Wochele, Dmitry Zhurov

Subject: Physical Sciences, Astronomy And Astrophysics Keywords: astroparticle physics, cosmic rays, data life cycle management, data curation, meta data, big data, deep learning, open data

Online: 12 October 2018 (14:48:32 CEST)


Preprint ARTICLE | doi:10.20944/preprints202105.0589.v1

A Study on Ways to Extend Public Data for Game Ratings from Korea

HoSeong Kang, JungYoon Kim

Subject: Engineering, Automotive Engineering Keywords: Game Ratings; Public Data; Game Data; Data analysis; GRAC (Korea)

Online: 25 May 2021 (08:32:32 CEST)


Preprint ARTICLE | doi:10.20944/preprints202007.0078.v1

Data Driven Analytics for Personalized Medical Decision Making

Nataliia Melnykova, Nataliya Shakhovska, Michal Gregus, Volodymyr Melnykov, Mariana Zakharchuk, Olena Vovk

Subject: Computer Science And Mathematics, Information Systems Keywords: personalization; decision making; medical data; artificial intelligence; Data-driving; Big Data; Data Mining; Machine Learning

Online: 5 July 2020 (15:04:17 CEST)


Preprint ARTICLE | doi:10.20944/preprints202404.0849.v1

Intellecta Cognitiva: A Comprehensive Dataset for Advancing Academic Knowledge and Machine Reasoning

Ditto PS, Ajmal PS, Jithin VG

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Synthetic data; pretrain data; llm training

Online: 12 April 2024 (12:46:27 CEST)


Preprint ARTICLE | doi:10.20944/preprints202103.0593.v1

Creating a Business and Supporting Digital Transformation

Miguel Ayala, Jorge Portella, Sergio Martinez, Maria Rojas, Luis Jimenez

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Business Intelligence; Data Mining; Data Warehouse.

Online: 24 March 2021 (13:47:31 CET)


Preprint ARTICLE | doi:10.20944/preprints202012.0468.v1

Developing High-Resolution Gridded Rainfall and Temperature Data for Bangladesh: The ENACTS-BMD Dataset

Nachiketa Acharya, Rija Faniriantsoa, Bazlur Rashid, Razia Sultana, Carlo Montes, Tufa Dinku, S.M.Q. Hassan

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: climate data; gridded product; data merging

Online: 18 December 2020 (13:29:38 CET)


Preprint CASE REPORT | doi:10.20944/preprints201801.0066.v1

Data Visualization of European Regional Operational Programmes: Unleashing the Informative Potential of Open Data for Performance Assessment

Emanuele Frontoni, Roberto Palloni

Subject: Engineering, Control And Systems Engineering Keywords: cohesion policy; data visualization; open data

Online: 8 January 2018 (11:11:47 CET)


Preprint COMMUNICATION | doi:10.20944/preprints202309.0047.v1

Analyzing Public Reactions during the MPox Outbreak: Findings from Topic Modeling of Tweets

Nirmalya Thakur, Yuvraj Nihal Duggal, Zihui Liu

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: MPox; big data; data analysis; data science; Twitter; natural language processing

Online: 1 September 2023 (10:23:41 CEST)


In the last decade and a half, the world has experienced the outbreak of a range of viruses such as COVID-19, H1N1, flu, Ebola, Zika Virus, Middle East Respiratory Syndrome (MERS), Measles, and West Nile Virus, just to name a few. During these virus outbreaks, the usage and effectiveness of social media platforms increased significantly as such platforms served as virtual communities, enabling their users to share and exchange information, news, perspectives, opinions, ideas, and comments related to the outbreaks. Analysis of this Big Data of conversations related to virus outbreaks using concepts of Natural Language Processing such as Topic Modeling has attracted the attention of researchers from different disciplines such as Healthcare, Epidemiology, Data Science, Medicine, and Computer Science. The recent outbreak of the MPox virus has resulted in a tremendous increase in the usage of Twitter. Prior works in this field have primarily focused on the sentiment analysis and content analysis of these Tweets, and the few works that have focused on topic modeling have multiple limitations. This paper aims to address this research gap and makes two scientific contributions to this field. First, it presents the results of performing Topic Modeling on 601,432 Tweets about the 2022 Mpox outbreak, which were posted on Twitter between May 7, 2022, and March 3, 2023. The results indicate that the conversations on Twitter related to Mpox during this time range may be broadly categorized into four distinct themes - Views and Perspectives about MPox, Updates on Cases and Investigations about Mpox, MPox and the LGBTQIA+ Community, and MPox and COVID-19. Second, the paper presents the findings from the analysis of these Tweets. The results show that the theme that was most popular on Twitter (in terms of the number of Tweets posted) during this time range was - Views and Perspectives about MPox. 
It is followed by the theme of MPox and the LGBTQIA+ Community, which is followed by the themes of MPox and COVID-19 and Updates on Cases and Investigations about Mpox, respectively. Finally, a comparison with prior works in this field is also presented to highlight the novelty and significance of this research work.

Preprint ARTICLE | doi:10.20944/preprints202205.0344.v1

Transforming Points of Single Contact Data into Linked Data

Pavlina Fragkou, Leandros Maglaras

Subject: Computer Science And Mathematics, Information Systems Keywords: Linked (open) Data; Semantic Interoperability; Data Mapping; Governmental Data; SPARQL; Ontologies

Online: 25 May 2022 (08:18:46 CEST)


Preprint ARTICLE | doi:10.20944/preprints202111.0073.v1

Using the Data Quality Dashboard to Improve the EHDEN Network

Clair Blacketer, Erica A Voss, Frank DeFalco, Nigel Hughes, Martijn J Schuemie, Maxim Moinat, Peter Rijnbeek

Subject: Medicine And Pharmacology, Other Keywords: data quality; OMOP CDM; EHDEN; healthcare data; real world data; RWD

Online: 3 November 2021 (09:12:54 CET)


Preprint ARTICLE | doi:10.20944/preprints202110.0103.v1

Usage of Data Analytics in Improving Sourcing of Supply Chain Inputs

S M Nazmuz Sakib

Subject: Computer Science And Mathematics, Information Systems Keywords: Data Analytics; Analytics; Supply Chain Input; Supply Chain; Data Science; Data

Online: 6 October 2021 (10:38:42 CEST)


Preprint ARTICLE | doi:10.20944/preprints202310.1998.v1

Marburg Virus Outbreak and a New Conspiracy Theory: Findings from a Comprehensive Analysis of Web Behavior

Nirmalya Thakur, Shuqi Cui, Kesha A. Patel, Nazif Azizi, Victoria Knieling, Changhee Han, Audrey Poon, Rishika Shah

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: Marburg virus; big data; data mining; data analysis; google trends; web behavior; data science; conspiracy theory

Online: 31 October 2023 (07:02:07 CET)


During virus outbreaks in the recent past, web behavior mining, modeling, and analysis have served as means to examine, explore, interpret, assess, and forecast the worldwide perception, readiness, reactions, and response linked to these virus outbreaks. The recent outbreak of the Marburg Virus disease (MVD), the high fatality rate of MVD, and the conspiracy theory linking the FEMA alert signal in the United States on October 4, 2023, with MVD and a zombie outbreak, resulted in a diverse range of reactions among the general public, which in turn led to a surge in web behavior in this context. This resulted in “Marburg Virus” featuring in the list of the top trending topics on Twitter on October 3, 2023, and “Emergency Alert System” and “Zombie” featuring in the list of top trending topics on Twitter on October 4, 2023. No prior work in this field has mined and analyzed the emerging trends in web behavior in this context. The work presented in this paper aims to address this research gap and makes multiple scientific contributions to this field. First, it presents the results of performing time series forecasting of the search interests related to MVD emerging from 216 different regions on a global scale using ARIMA, LSTM, and Autocorrelation. The results of this analysis present the optimal model for forecasting web behavior related to MVD in each of these regions. Second, the correlation between search interests related to MVD and search interests related to zombies (in the context of this conspiracy theory) was investigated. The findings show that there were several regions where there was a statistically significant correlation between MVD-related searches and zombie-related searches (in the context of this conspiracy theory) on Google on October 4, 2023. Finally, the correlation between zombie-related searches (in the context of this conspiracy theory) in the United States and other regions was investigated. 
This analysis helped to identify those regions where this correlation was statistically significant.

Preprint ARTICLE | doi:10.20944/preprints202308.0442.v1

Instrumental and Observational Problems of the Earliest Temperature Records in Italy: A Methodology for Data Recovery and Correction

Dario Camuffo, Antonio Della Valle, Francesca Becherini

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: Thermometers; Temperature records; Early instrumental meteorological series; Data rescue; Data recovery; Data correction; Climate data analysis

Online: 7 August 2023 (03:01:24 CEST)


A distinction is made between data rescue (i.e., copying, digitizing and archiving) and data recovery that implies deciphering, interpreting and transforming early instrumental readings and their metadata to obtain high-quality datasets in modern units. This requires a multidisciplinary approach that includes: palaeography and knowledge of Latin and other languages to read the handwritten logs and additional documents; history of science to interpret the original text, data and metadata within the cultural frame of the 17th, 18th and early 19th century; physics and technology to recognize bias of early instruments or calibrations, or to correct for observational bias; astronomy to calculate and transform the original time in canonical hours that started from twilight. The liquid-in-glass thermometer was invented in 1641 and the earliest temperature records started in 1654. Since then, different types of thermometers were invented, based on the thermal expansion of air or selected thermometric liquids with deviation from linearity. Reference points, thermometric scales, calibration methodologies were not comparable, and not always adequately described. Thermometers had various locations and exposures, e.g., indoor, outdoor, on windows, gardens or roofs, facing different directions. Readings were made only one or a few times a day, not necessarily respecting a precise time schedule: this bias is analysed for the most popular combinations of reading times. The time was based on sundials and local Sun, but the hours were counted starting from twilight. In 1789-90 Italy changed its system and all cities counted hours from their lower culmination (i.e., local midnight), so that every city had its local time; in 1866, all the Italian cities followed the local time of Rome; in 1893, the whole of Italy adopted the present-day system, based on the Coordinated Universal Time and the time zones. 
In 1873, when the International Meteorological Committee (IMO) was founded, later transformed into the World Meteorological Organization (WMO), a standardization of instruments and observational protocols was established, and all data became fully comparable. In the early instrumental period, from 1654 to 1873, the comparison, correction and homogenization of records are quite difficult, mainly because of the scarcity or even absence of metadata. This paper deals with this confused situation, discussing the main problems, but also the methodologies to recognize missing metadata; distinguish indoor from outdoor readings; correct and transform early datasets in unknown or arbitrary units into modern units; and, finally, identify the cases in which it is possible to reach the quality level required by WMO. The focus is to explain the methodology needed to recover early instrumental records, i.e., the operations that should be performed to interpret, correct, and transform the original raw data into a high-quality dataset of temperature, usable for climate studies.

Preprint DATA DESCRIPTOR | doi:10.20944/preprints202308.1701.v1

A Dataset of Search Interests Related to Disease X Originating from Different Geographic Regions

Nirmalya Thakur, Kesha A. Patel, Isabella Hall, Yuvraj Nihal Duggal, Shuqi Cui

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: disease X; big data; data science; data analysis; dataset development; database; google trends; data mining; healthcare; epidemiology

Online: 24 August 2023 (05:48:54 CEST)


Preprint COMMUNICATION | doi:10.20944/preprints202303.0453.v1

Analysis of Public Discourse on Twitter involving COVID-19 and MPox: Findings from Sentiment Analysis and Text Analysis

Nirmalya Thakur

Subject: Social Sciences, Media Studies Keywords: COVID-19; MPox; Twitter; Big Data; Data Mining; Data Analysis; Sentiment Analysis; Data Science; Social Media; Monkeypox

Online: 27 March 2023 (08:39:28 CEST)


Mining and analysis of the Big Data of Twitter conversations have been of significant interest to the scientific community in the fields of healthcare, epidemiology, big data, data science, computer science, and their related areas, as can be seen from several works in the last few years that focused on sentiment analysis and other forms of text analysis of Tweets related to Ebola, E-Coli, Dengue, Human papillomavirus (HPV), Middle East Respiratory Syndrome (MERS), Measles, Zika virus, H1N1, influenza-like illness, swine flu, flu, Cholera, Listeriosis, cancer, Liver Disease, Inflammatory Bowel Disease, kidney disease, lupus, Parkinson's, Diphtheria, and West Nile virus. The recent outbreaks of COVID-19 and MPox have served as "catalysts" for Twitter usage related to seeking and sharing information, views, opinions, and sentiments involving both these viruses. While there have been a few works published in the last few months that focused on performing sentiment analysis of Tweets related to either COVID-19 or MPox, none of the prior works in this field thus far involved analysis of Tweets focusing on both COVID-19 and MPox at the same time. With an aim to address this research gap, a total of 61,862 Tweets that focused on Mpox and COVID-19 simultaneously, posted between May 7, 2022, and March 3, 2023, were studied to perform sentiment analysis and text analysis. The findings of this study are manifold. First, the results of sentiment analysis show that almost half the Tweets (the actual percentage is 46.88%) had a negative sentiment. It was followed by Tweets that had a positive sentiment (31.97%) and Tweets that had a neutral sentiment (21.14%). Second, this paper presents the top 50 hashtags that were used in these Tweets. Third, it presents the top 100 most frequently used words that are featured in these Tweets. The findings of text analysis show that some of the commonly used words involved directly referring to either or both viruses. 
In addition to this, the presence of words such as "Polio", "Biden", "Ukraine", "HIV", "climate", and "Ebola" in the list of the top 100 most frequent words indicates that topics of conversations on Twitter in the context of COVID-19 and MPox also included a high level of interest related to other viruses, President Biden, and Ukraine. Finally, a comprehensive comparative study that involves a comparison of this work with 49 prior works in this field is presented to uphold the scientific contributions and relevance of the same.

Working Paper ARTICLE

Business Intelligence and Its Big Evolution

Andres Velosa, Gustavo Pabon

Subject: Engineering, Automotive Engineering Keywords: Business Intelligence; Data warehouse; Data Marts; Architecture; Data; Information; cloud; Data Mining; evolution; technologic companies; tools; software

Online: 24 March 2021 (13:06:53 CET)


Preprint ARTICLE | doi:10.20944/preprints202311.1570.v1

Consore: A Powerful Federated Data Mining Tool Driving a French Research Network to Accelerate Cancer Research

Julien Guérin, Amine Nahid, Louis Tassy, Marc Deloger, François Bocquet, Simon Thézenas, Emmanuel Desandes, Marie-Cécile Le Deley, Xavier Durando, Anne Jaffré, Ikram Es Saad, Hugo Crochet, Marie Le Morvan, François Lion, Judith Raimbourg, Oussama Khay, Franck Craynest, Alexia Giro, Yec'han Laizet, Aurélie Bertaut, Frédérik Joly, Alain Livartowski, Pierre Etienne Heudel

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: cancer research; cancer; natural language processing; data mining; data warehouse; big data

Online: 26 November 2023 (05:13:14 CET)


Preprint ARTICLE | doi:10.20944/preprints202111.0410.v1

Design and Implementation of Efficient Transmission of Cloud Data in Wireless Media

Virendra Pandharipant Nikam, Sheetal S Dhande

Subject: Engineering, Control And Systems Engineering Keywords: Data compression; data hiding; psnr; mse; virtual data; public cloud; quantization error

Online: 22 November 2021 (15:17:12 CET)


Preprint ARTICLE | doi:10.20944/preprints201808.0350.v2

Integration of Data Mining Clustering Approach with the Personalized E-Learning System

Samina Kausar, Huahu Xu, Iftikhar Hussain, Wenhau Zhu, Misha Zahid

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: big data; clustering; data mining; educational data mining; e-learning; profile learning

Online: 19 October 2018 (05:58:05 CEST)


Preprint REVIEW | doi:10.20944/preprints201807.0059.v1

Data Normalization in NMR-based Metabolomics

Helena Zacharias, Michael Altenbuchinger, Wolfram Gronwald

Subject: Biology And Life Sciences, Biophysics Keywords: data normalization; data scaling; zero-sum; metabolic fingerprinting; NMR; statistical data analysis

Online: 3 July 2018 (16:22:31 CEST)


Preprint ARTICLE | doi:10.20944/preprints202404.0357.v2

The Path to Data Protection Governance in China Mainland

Bing Chen, Yongji Liu

Subject: Social Sciences, Law Keywords: data protection; personal privacy; cybersecurity; data security

Online: 9 April 2024 (12:02:20 CEST)


Preprint ARTICLE | doi:10.20944/preprints202402.1372.v1

Leveraging Visualization and Machine Learning Techniques in Education: A Case Study of K-12 State Assessment Data

Loni Taylor, Vibhuti Gupta, Kwanghee Jung

Subject: Computer Science And Mathematics, Analysis Keywords: Data Visualization; Big Data; AI; Machine Learning

Online: 23 February 2024 (10:39:04 CET)


Working Paper ARTICLE

The Analysis and the Measurement of Poverty: An Interval Based Composite Indicator Approach

Carlo Drago

Subject: Business, Economics And Management, Econometrics And Statistics Keywords: poverty; composite indicators; interval data; symbolic data

Online: 24 August 2021 (15:46:09 CEST)


Working Paper ARTICLE

Development of Cost and Schedule Data Integration Algorithm based on Big Data Technology

Daegu Cho, Myungdo Lee, Jihye Shin

Subject: Computer Science And Mathematics, Computer Science Keywords: big data; data integration; EVMS; construction management

Online: 30 October 2020 (15:35:00 CET)


Preprint ARTICLE | doi:10.20944/preprints201701.0090.v1

An Automatic Matcher and Linker for Transportation Datasets

Ali Masri, Karine Zeitouni, Zoubida Kedad, Bertrand Leroy

Subject: Computer Science And Mathematics, Information Systems Keywords: transportation data; data interlinking; automatic schema matching

Online: 20 January 2017 (03:38:06 CET)


Preprint ARTICLE | doi:10.20944/preprints202308.1391.v1

An Automated Method for Extracting and Analyzing Railway Infrastructure Cost Data

Daniel Adanza Dopazo, Lamine Mahdjoubi, Bill Gething

Subject: Engineering, Transportation Science And Technology Keywords: data extraction; data mining; railway infrastructure costs; infrastructure costs data analysis; cost analysis

Online: 18 August 2023 (16:03:08 CEST)


Working Paper ARTICLE

Model for the Collection and Analysis of Data from Teachers and Students, Supported by Academic Analytics

Fredys A. Simanca H., Isabel Hernández Arteaga, María Elsa Unriza Puin, Fabian Blanco Garrido, Jaime Paez Paez, Jairo Cortes Méndez

Subject: Computer Science And Mathematics, Information Systems Keywords: Academic Analytics; data storage; education and big data; analysis of data; learning analytics

Online: 19 July 2020 (20:37:39 CEST)


Business Intelligence, defined by [1] as "the ability to understand the interrelations of the facts that are presented in such a way that it can guide the action towards achieving a desired goal", has been used since 1958 for the transformation of data into information, and of information into knowledge, to be used when making decisions in a business environment. But, what would happen if we took the same principles of business intelligence and applied them to the academic environment? The answer would be the creation of Academic Analytics, a term defined by [2] as the process of evaluating and analyzing organizational information from university systems for reporting and making decisions, whose characteristics allow it to be used more and more in institutions, since the information they accumulate about their students and teachers gathers data such as academic performance, student success, persistence, and retention [5]. Academic Analytics enables an analysis of data that is very important for making decisions in the educational institutional environment, aggregating valuable information in the academic research activity and providing easy to use business intelligence tools. This article shows a proposal for creating an information system based on Academic Analytics, using ASP.Net technology and trusting storage in the database engine Microsoft SQL Server, designing a model that is supported by Academic Analytics for the collection and analysis of data from the information systems of educational institutions. The idea that was conceived proposes a system that is capable of displaying statistics on the historical data of students and teachers taken over academic periods, without having direct access to institutional databases, with the purpose of gathering the information that the director, the teacher, and finally the student need for making decisions. 
The model was validated with information taken from students and teachers during the last five years, and the export format of the data was pdf, csv, and xls files. The findings allow us to state that it is extremely important to analyze the data that is in the information systems of the educational institutions for making decisions. After the validation of the model, it was established that it is a must for students to know the reports of their academic performance in order to carry out a process of self-evaluation, as well as for teachers to be able to see the results of the data obtained in order to carry out processes of self-evaluation, and adaptation of content and dynamics in the classrooms, and finally for the head of the program to make decisions.

Preprint ARTICLE | doi:10.20944/preprints201812.0071.v1

Data Governance and Sovereignty in Urban Data Spaces Based on Standardized ICT Reference Architectures

Silke Cuno, Lina Bruns, Nikolay Tcholtchev, Philipp Lämmel, Ina Schieferdecker

Subject: Engineering, Electrical And Electronic Engineering Keywords: data governance; data sovereignty; urban data spaces; ICT reference architecture; open urban platform

Online: 6 December 2018 (05:09:54 CET)


Preprint DATA DESCRIPTOR | doi:10.20944/preprints202109.0370.v1

The SERL Observatory Dataset: Longitudinal Smart Meter Electricity and Gas Data, Survey, EPC and Climate Data for Over 13,000 GB Households

Ellen Webborn, Jessica Few, Eoghan McKenna, Simon Elam, Martin Pullinger, Ben Anderson, David Shipworth, Tadj Oreszczyn

Subject: Engineering, Energy And Fuel Technology Keywords: smart meter data; household survey; EPC; energy data; energy demand; energy consumption; longitudinal; energy modelling; electricity data; gas data

Online: 22 September 2021 (10:16:05 CEST)

Preprint ARTICLE | doi:10.20944/preprints201807.0038.v1

Towards the Provision of Accurate Atomic Data for Neutral Iron

Andrew Conroy, Catherine Ramsbottom, Connor Ballance, Francis Keenan

Subject: Physical Sciences, Atomic And Molecular Physics Keywords: atomic data

Online: 3 July 2018 (11:25:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.0740.v1

Functional Process Control (FPC): A Methodology to Reduce Variability

Joaquín Sancho, Javier Martínez, Jorge Pastor, Carlos Cajal

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: functional data; quality; non-normal data; variability; outlier

Online: 10 April 2024 (15:52:41 CEST)

Preprint ARTICLE | doi:10.20944/preprints202309.1016.v1

Three-Stage Sampling Algorithm for Highly Imbalanced Multi-Classification Time Series Data Sets

Haoming Wang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Imbalanced data; Data preprocessing; Sampling; Tomek Links; DTW

Online: 14 September 2023 (14:00:42 CEST)

Purpose: To alleviate the data imbalance problem caused by subjective and objective reasons, scholars have developed different data preprocessing algorithms, among which undersampling algorithms are widely used because they are fast and efficient. However, when some categories in a multi-classification dataset have too few samples to be processed by sampling, or a minority class has only 1 to 2 samples, traditional undersampling algorithms lose effectiveness. Methods: This study selects 9 multi-classification time series datasets with extremely few samples, takes the characteristics of time series data into account, and uses a three-stage algorithm to alleviate the data imbalance problem. Stage one: random oversampling with disturbance items increases the number of sample points. Stage two: on this basis, SMOTE (Synthetic Minority Oversampling Technique) oversampling is applied. Stage three: dynamic time warping (DTW) distance is used to calculate the distance between sample points, identify the Tomek Links sample points at the boundary, and clean up the boundary noise. Results: This study proposes a new sampling algorithm. On the 9 multi-classification time series datasets with extremely few samples, the new sampling algorithm is compared with four classic undersampling algorithms, ENN (Edited Nearest Neighbours), NCR (Neighborhood Cleaning Rule), OSS (One Side Selection) and RENN (Repeated Edited Nearest Neighbours), based on macro accuracy, recall rate and F1-score evaluation indicators.
The results show that on FiftyWords, the dataset with the most categories and the fewest minority class samples among the 9 selected, the accuracy of the new sampling algorithm is 0.7156, far beyond ENN, RENN, OSS and NCR; its recall rate, at 0.7261, is also better than that of the four undersampling algorithms used for comparison; and its F1-score is increased by 200.71%, 188.74%, 155.29% and 85.61% relative to ENN, RENN, OSS, and NCR, respectively. On the other 8 datasets, the new sampling algorithm also shows good indicator scores. Conclusion: The new algorithm proposed in this study can effectively alleviate the data imbalance problem of multi-classification time series datasets with many categories and few minority class samples, and at the same time clean up the boundary noise data between classes.
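
The three stages described in the abstract above can be illustrated with a minimal pure-Python sketch. This is not the author's implementation: the function names (`dtw_distance`, `jitter_oversample`, `smote_like`, `tomek_links`), the Gaussian disturbance, the noise scale, the use of a single DTW-nearest neighbour in the SMOTE-like step, and the assumption of equal-length series are all illustrative choices that the abstract does not specify.

```python
import random


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def jitter_oversample(series_list, n_new, scale=0.01, rng=None):
    """Stage one (assumed form): copy random series and add Gaussian noise."""
    rng = rng or random.Random(0)
    return [[x + rng.gauss(0.0, scale) for x in rng.choice(series_list)]
            for _ in range(n_new)]


def smote_like(series_list, n_new, rng=None):
    """Stage two (simplified): interpolate between a series and its
    DTW-nearest neighbour in the same class. Needs >= 2 equal-length series."""
    rng = rng or random.Random(0)
    out = []
    for _ in range(n_new):
        base = rng.choice(series_list)
        others = [s for s in series_list if s is not base]
        neighbour = min(others, key=lambda s: dtw_distance(base, s))
        t = rng.random()
        out.append([p + t * (q - p) for p, q in zip(base, neighbour)])
    return out


def tomek_links(X, y):
    """Stage three: index pairs (i, j) with different labels that are each
    other's nearest neighbour under DTW distance."""
    n = len(X)
    nearest = [min((j for j in range(n) if j != i),
                   key=lambda j: dtw_distance(X[i], X[j]))
               for i in range(n)]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if nearest[i] == j and nearest[j] == i and y[i] != y[j]]


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.2, 5.2]]
    y = [0, 1, 1, 1]
    print(tomek_links(X, y))  # samples 0 and 1 form a cross-class link
```

In a cleaning step, one or both members of each returned pair would be dropped; which member to remove is a design choice the abstract leaves open.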

Preprint ARTICLE | doi:10.20944/preprints202307.1117.v1

Design and Analysis of Query Models Database Preservation Information Systems Digitization of History and Endowments; Case Study of History and Waqf of Sumedang Larang Kingdom Indonesia

R. Sudrajat, Budi Nurani Ruchjana, Atje Setiawan Abdullah, Rahmat Budiarto

Subject: Computer Science And Mathematics, Information Systems Keywords: history; endowments; query model; digital data; physical data

Online: 17 July 2023 (15:11:18 CEST)

Preprint COMMUNICATION | doi:10.20944/preprints202305.1694.v1

Synthetic Data & the Future of Women's Health: A Synergistic Relationship

Gayathri Delanerolle, Peter Phiri, Heitor Cavalini, David Benfield, Ashish Shetty, Yassine Bouchareb, Jian Shi, Alain Zemkoho

Subject: Medicine And Pharmacology, Clinical Medicine Keywords: Womens Health; Data Science; Data Methods; Artificial Intelligence

Online: 24 May 2023 (04:48:58 CEST)

Preprint ARTICLE | doi:10.20944/preprints202206.0335.v1

The Dataharmonizer: a Tool for Faster Data Harmonization, Validation, Aggregation, and Analysis of Pathogen Genomics Contextual Information

Ivan Gill, Emma Griffiths, Damion Dooley, Rhiannon Cameron, Sarah Savić Kallesøe, Nithu Sara John, Anoosha Sehar, Gurinder Gosal, David Alexander, Madison Chapel, Matthew Croxen, Benjamin Delisle, Rachelle Di Tullio, Daniel Gaston, Ana Duggan, Jennifer Guthrie, Mark Horsman, Esha Joshi, Levon Kearney, Natalie Knox, Lynette Lau, Jason LeBlanc, Vincent Li, Pierre Lyons, Keith MacKenzie, Andrew McArthur, Emilie Panousis, John Palmer, Natalie Prystajecky, Kerri Smith, Jennifer Tanner, Christopher Townend, Andrea Tyler, Gary Van Domselaar, William Hsiao

Subject: Computer Science And Mathematics, Information Systems Keywords: metadata; contextual data; harmonization; genomic surveillance; data management

Online: 24 June 2022 (08:46:04 CEST)

Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations, and research. To be useful, pathogen genomics data must be interpreted using contextual data (metadata). Contextual data includes sample metadata, laboratory methods, patient demographics, clinical outcomes, and epidemiological information. However, variability in how contextual information is captured by different authorities and encoded in different databases poses challenges for data interpretation, integration, and use/re-use. The DataHarmonizer is a template-driven spreadsheet application for harmonizing, validating, and transforming genomics contextual data into submission-ready formats for public or private repositories. The tool’s web browser-based JavaScript environment enables validation, and its offline functionality and local installation increase data security. The DataHarmonizer was developed to address the data sharing needs that arose during the COVID-19 pandemic, and was used by members of the Canadian COVID Genomics Network (CanCOGeN) to harmonize SARS-CoV-2 contextual data for national surveillance and for public repository submission. To support coordination of international surveillance efforts, we have partnered with the Public Health Alliance for Genomic Epidemiology to also provide a template conforming to its SARS-CoV-2 contextual data specification for use worldwide. Templates are also being developed for One Health and foodborne pathogens. Overall, the DataHarmonizer tool improves the effectiveness and fidelity of contextual data capture as well as its subsequent usability. Harmonization of contextual information across authorities, platforms and systems globally improves interoperability and reusability of data for concerted public health and research initiatives to fight the current pandemic and future public health emergencies.
While the DataHarmonizer was initially developed for the COVID-19 pandemic, its expansion to other data management applications and pathogens is already underway.

Preprint ARTICLE | doi:10.20944/preprints202108.0471.v1

Identifying the Main Risk Factors for CVD Prediction Using Machine Learning Algorithms

Luis Rolando Guarneros-Nolasco, Nancy Aracely Cruz-Ramos, Giner Alor-Hernández, Lisbeth Rodríguez-Mazahua, José Luis Sánchez-Cervantes

Subject: Computer Science And Mathematics, Information Systems Keywords: Big data; Health prevention; Machine learning; Medical data

Online: 24 August 2021 (14:00:12 CEST)

Preprint ARTICLE | doi:10.20944/preprints202106.0738.v1

Combination of Using Pairwise Comparisons and Composite Reference Series: A New Approach in the Homogenization of Climatic Time Series

Peter Domonkos

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: time series; homogenization; ACMANT; observed data; data accuracy

Online: 30 June 2021 (13:08:39 CEST)

Working Paper ARTICLE

Generating Fake ECGs using GANs for Anonymizing Healthcare Data

Esteban Piacentino, Alvaro Guarner, Cecilio Angulo

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: GAN; ECG; anonymization; healthcare data; sensors; data transformation

Online: 3 September 2020 (05:26:01 CEST)

Search Results

1400 articles found