Search | Preprints.org

Preprint ARTICLE | doi:10.20944/preprints202112.0286.v2

Multisource Spatial Data Integration for Use Cases Applications

Subject: Engineering, Control And Systems Engineering Keywords: data integration; interoperability; harmonization; GeoBIM; metadata

Online: 7 June 2022 (11:10:07 CEST)

Preprint ARTICLE | doi:10.20944/preprints202206.0335.v1

The Dataharmonizer: a Tool for Faster Data Harmonization, Validation, Aggregation, and Analysis of Pathogen Genomics Contextual Information

Ivan Gill, Emma Griffiths, Damion Dooley, Rhiannon Cameron, Sarah Savić Kallesøe, Nithu Sara John, Anoosha Sehar, Gurinder Gosal, David Alexander, Madison Chapel, Matthew Croxen, Benjamin Delisle, Rachelle Di Tullio, Daniel Gaston, Ana Duggan, Jennifer Guthrie, Mark Horsman, Esha Joshi, Levon Kearney, Natalie Knox, Lynette Lau, Jason LeBlanc, Vincent Li, Pierre Lyons, Keith MacKenzie, Andrew McArthur, Emilie Panousis, John Palmer, Natalie Prystajecky, Kerri Smith, Jennifer Tanner, Christopher Townend, Andrea Tyler, Gary Van Domselaar, William Hsiao

Subject: Computer Science And Mathematics, Information Systems Keywords: metadata; contextual data; harmonization; genomic surveillance; data management

Online: 24 June 2022 (08:46:04 CEST)

Pathogen genomics is a critical tool for public health surveillance, infection control, outbreak investigations, and research. Pathogen genomics data can only be put to use when interpreted alongside contextual data (metadata). Contextual data includes sample metadata, laboratory methods, patient demographics, clinical outcomes, and epidemiological information. However, the variability in how contextual information is captured by different authorities and how it is encoded in different databases poses challenges for data interpretation, integration, and use/re-use. The DataHarmonizer is a template-driven spreadsheet application for harmonizing, validating, and transforming genomics contextual data into submission-ready formats for public or private repositories. The tool’s web browser-based JavaScript environment enables validation, and its offline functionality and local installation increase data security. The DataHarmonizer was developed to address the data sharing needs that arose during the COVID-19 pandemic, and was used by members of the Canadian COVID Genomics Network (CanCOGeN) to harmonize SARS-CoV-2 contextual data for national surveillance and for public repository submission. To support coordination of international surveillance efforts, we have partnered with the Public Health Alliance for Genomic Epidemiology to also provide a template conforming to its SARS-CoV-2 contextual data specification for use worldwide. Templates are also being developed for One Health and foodborne pathogens. Overall, the DataHarmonizer tool improves the effectiveness and fidelity of contextual data capture as well as its subsequent usability. Harmonization of contextual information across authorities, platforms, and systems globally improves interoperability and reusability of data for concerted public health and research initiatives to fight the current pandemic and future public health emergencies. While the tool was initially developed for the COVID-19 pandemic, its expansion to other data management applications and pathogens is already underway.
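To make the idea of template-driven validation concrete, here is a minimal sketch in Python. It is not the DataHarmonizer's code or template format (the tool itself runs in a JavaScript environment); the `TEMPLATE` structure, the field names, and the `validate_row` helper are all hypothetical and only illustrate how a template can drive required-field, pattern, and controlled-vocabulary checks.

```python
import re

# Hypothetical template: each field declares its own validation rules.
TEMPLATE = {
    "sample_id": {"required": True},
    "collection_date": {"required": True, "pattern": r"\d{4}-\d{2}-\d{2}"},
    "host": {"required": False, "allowed": {"Homo sapiens", "Not Provided"}},
}


def validate_row(row, template=TEMPLATE):
    """Return a list of human-readable errors for one row of contextual data."""
    errors = []
    for field, rule in template.items():
        value = (row.get(field) or "").strip()
        if not value:
            # Empty values are only a problem for required fields.
            if rule.get("required"):
                errors.append(f"{field}: missing required value")
            continue
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{field}: '{value}' does not match the expected format")
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{field}: '{value}' is not in the controlled vocabulary")
    return errors


good = {"sample_id": "S1", "collection_date": "2021-03-04", "host": "Homo sapiens"}
bad = {"sample_id": "", "collection_date": "03/04/2021", "host": "human"}
for row in (good, bad):
    print(validate_row(row))
```

Keeping the rules in data rather than in code is what would let the same validator serve different specifications by swapping the template.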

Preprint ARTICLE | doi:10.20944/preprints202202.0139.v1

Segmentation Uncertainty Estimation as a Sanity Check for Image Biomarker Studies

Ivan Zhovannik, Dennis Bontempi, Alessio Romita, Elisabeth Pfaehler, Sergey Primakov, Andre Dekker, Johan Bussink, Alberto Traverso, René Monshouwer

Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: uncertainty; prognostic modeling; image biomarkers; radiomics; radiomics harmonization

Online: 9 February 2022 (11:50:10 CET)

Problem. Image biomarker analysis, also known as radiomics, is a tool for tissue characterization and treatment prognosis that relies on routinely acquired clinical images and delineations. Due to the uncertainty in image acquisition, processing, and segmentation (delineation) protocols, radiomics often lacks reproducibility. Radiomics harmonization techniques have been proposed as a solution to reduce these sources of uncertainty and/or their influence on the prognostic model performance. A relevant question is how to estimate the protocol-induced uncertainty of a specific image biomarker, what the effect is on the model performance, and how to optimize the model given the uncertainty. In this manuscript, we show how protocol uncertainty can drastically reduce prognostic model performance. We introduce an effect-size measure η that assesses the protocol-induced uncertainty versus the measurable effect. Methods. Two non-small cell lung cancer (NSCLC) cohorts, composed of 421 and 240 patients respectively, were used for training and testing. Per patient, a Monte Carlo algorithm was used to generate three hundred synthetic contours with a surface dice tolerance measure less than 1.18 mm with respect to the original GTV. These contours were subsequently used to derive 104 radiomic features, which were ranked on their relative sensitivity to contour perturbation, expressed in the parameter η. The top four (low η) and the bottom four (high η) features were selected for two models based on the Cox proportional hazards model. To investigate the influence of segmentation uncertainty on the prognostic model, we trained and tested the setup in 5000 augmented realizations (using a Monte Carlo sampling method); the log-rank test was used to assess the stratification performance and stability to segmentation uncertainty. Results. Although both the low and high η setups showed significant testing set log-rank p-values (p=0.01) in the original GTV delineations (without segmentation uncertainty introduced), in the model with a high uncertainty-to-effect ratio only around 30% of the augmented realizations resulted in model performance with p < 0.05 in the test set. In contrast, the low η setup performed with log-rank p < 0.05 in 90% of the augmented realizations. Moreover, the high η setup classification was uncertain for 50% of the subjects in the testing set (for 80% agreement rate), whereas the low η setup was uncertain only in 10% of the cases. The code and part of the data are available at https://github.com/Maastro-CDS-Imaging-Group/sure. Discussion. Estimating image biomarker model performance based only on the original GTV segmentation without considering segmentation uncertainty may be deceiving. The model might result in a significant stratification performance, but can be unstable for delineation variations, which are inherent to manual segmentation. Simulating segmentation uncertainty using the method described allows for more stable image biomarker estimation, selection, and model development. The segmentation uncertainty estimation method described here is universal and can be extended to estimate other protocol uncertainties (such as image acquisition and pre-processing).
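The "uncertain classification at an 80% agreement rate" figure can be read as follows; this is a sketch of one plausible interpretation, not the authors' code (which is in the repository linked above). It assumes each test subject receives a risk-group label in every augmented realization, and that a subject counts as uncertain when the most frequent label appears in fewer than 80% of realizations. The function name `uncertain_fraction` and the toy labels are hypothetical.

```python
from collections import Counter


def uncertain_fraction(labels_per_subject, agreement=0.8):
    """Fraction of subjects whose majority label falls below the agreement rate.

    labels_per_subject: one list of risk-group labels per subject,
    with one label per augmented realization (assumed layout).
    """
    uncertain = 0
    for labels in labels_per_subject:
        # Count of the most frequent label for this subject.
        top_count = Counter(labels).most_common(1)[0][1]
        if top_count / len(labels) < agreement:
            uncertain += 1
    return uncertain / len(labels_per_subject)


# Toy data: four subjects, ten realizations each (1 = high risk, 0 = low risk).
toy = [
    [1] * 9 + [0],      # 90% agreement: stable
    [1] * 5 + [0] * 5,  # 50% agreement: uncertain
    [0] * 10,           # 100% agreement: stable
    [0] * 7 + [1] * 3,  # 70% agreement: uncertain
]
print(uncertain_fraction(toy))
```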

Preprint ARTICLE | doi:10.20944/preprints202311.0104.v1

Conceptual Design of a Generic Data Harmonization Process for OMOP CDM

Elisa Henke, Michele Zoch, Yuan Peng, Ines Reinecke, Martin Sedlmayr, Franziska Bathelt

Subject: Public Health And Healthcare, Other Keywords: OMOP; OHDSI; interoperability; data harmonization; clinical data; claims data

Online: 2 November 2023 (07:45:02 CET)

Preprint ARTICLE | doi:10.20944/preprints201910.0275.v1

Harmonization of Landsat and Sentinel 2 for Crop Monitoring in Drought Prone Areas: Case Studies of Ninh Thuan (Vietnam) and Bekaa (Lebanon)

Minh D. Nguyen, Oscar B. Villanueva, Duong D. Bui, Phong T. Nguyen, Lars Ribbe

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Landsat; Sentinel 2; harmonization; crop monitoring; Google Earth Engine

Online: 24 October 2019 (06:02:04 CEST)

Proper satellite-based crop monitoring applications at the farm level often require near-daily imagery at medium to high spatial resolution. Combining the ongoing satellite missions of ESA (Sentinel 2) and NASA (Landsat 7/8) provides this unprecedented opportunity at a global scale; nonetheless, this is rarely implemented because these procedures are data demanding and computationally intensive. This study developed a complete processing stream in the Google Earth Engine cloud platform to generate harmonized surface reflectance images from the Landsat 7, Landsat 8, and Sentinel 2 missions. The harmonized images were generated for two agriculture schemes in Bekaa (Lebanon) and Ninh Thuan (Vietnam) during the period 2018-2019. We evaluated the performance of several pre-processing steps needed for the harmonization, including image co-registration, BRDF correction, topographic correction, and band adjustment. This study found that the misregistration between Landsat 8 and Sentinel 2 images varied from 10 meters in Ninh Thuan, Vietnam to 32 meters in Bekaa, Lebanon, and if not treated, had a great impact on the quality of the harmonized dataset. Analysis of a pair of overlapping L8-S2 images over the Bekaa region showed that after the harmonization, all band-to-band spatial correlations were greatly improved from (0.57, 0.64, 0.67, 0.75, 0.76, 0.75, 0.79) to (0.87, 0.91, 0.92, 0.94, 0.97, 0.97, 0.96) in bands (blue, green, red, nir, swir1, swir2, ndvi) respectively. We demonstrated that dense observation of the harmonized dataset can be very helpful for characterizing cropland in highly dynamic areas. We detected unimodal, bimodal and trimodal shapes in the temporal NDVI patterns (likely cycles of paddy rice) in Ninh Thuan province only during the year 2018. We fitted the temporal signatures of the NDVI time series using harmonic (Fourier) analysis. Derived phase (angle from the starting point to the cycle's peak) and amplitude (the cycle's height) were combined with max-NDVI to generate an R-G-B image. This image highlighted croplands as colored pixels (high phase and amplitude) and other types of land as grey/dark pixels (low phase/amplitude). Generated harmonized datasets that contain surface reflectance images (bands blue, green, red, nir, swir1, swir2, and ndvi at 30 meters) over the two studied sites are provided for public usage and testing.
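As an illustration of how phase and amplitude can be derived from an NDVI time series, the sketch below fits a single harmonic by Fourier projection. It is not the study's Google Earth Engine code: it assumes evenly spaced samples that cover exactly one full period (at least three samples per cycle), and the function name `first_harmonic` and the synthetic series are hypothetical.

```python
import math


def first_harmonic(ndvi, cycles=1):
    """Estimate mean, amplitude and phase of one harmonic.

    Models ndvi[k] ~ mean + amplitude * cos(omega * k - phase), so `phase`
    is the angle from the first sample to the cycle's peak.
    Assumes evenly spaced samples spanning a whole period.
    """
    n = len(ndvi)
    omega = 2.0 * math.pi * cycles / n
    mean = sum(ndvi) / n
    # Projections onto the cosine and sine basis functions.
    a = 2.0 / n * sum(v * math.cos(omega * k) for k, v in enumerate(ndvi))
    b = 2.0 / n * sum(v * math.sin(omega * k) for k, v in enumerate(ndvi))
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a) % (2.0 * math.pi)
    return mean, amplitude, phase


# Synthetic single-season pixel: 24 samples, peak a quarter of the way in.
series = [0.5 + 0.3 * math.cos(2 * math.pi * k / 24 - math.pi / 2) for k in range(24)]
mean, amplitude, phase = first_harmonic(series)
print(round(mean, 3), round(amplitude, 3), round(phase, 3))
```

Bimodal or trimodal patterns could be probed the same way by passing `cycles=2` or `cycles=3`; irregularly spaced or cloud-gapped observations would need a least-squares fit instead of this projection.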

Working Paper CONCEPT PAPER

The Hearing Impairment Ontology: A tool for unifying hearing impairment knowledge to enhance collaborative research

Jade Hotchkiss, Noluthando Manyisa, Samuel Mawuli Adadey, Oluwafemi Oluwole, Edmond Wonkam, Khuthala Mnika, Abdoulaye Yalcouye, Victoria Nembaware, Melissa Haendel, Nicole Vasilevsky, Nicola Mulder, Simon Jupp, Ambroise Wonkam, Gaston Kuzamunu Mazandu

Subject: Medicine And Pharmacology, Otolaryngology Keywords: hearing impairment; hearing loss; ontology; data harmonization; meta-analysis

Online: 19 September 2019 (11:37:08 CEST)

Preprint REVIEW | doi:10.20944/preprints202008.0133.v1

A Review on Viral Data Sources and Integration Methods for COVID-19 Mitigation

Anna Bernasconi, Arif Canakoglu, Marco Masseroli, Pietro Pinoli, Stefano Ceri

Subject: Biology And Life Sciences, Virology Keywords: epidemic; viral sequences; genomics; metadata; data harmonization; integration and search

Online: 5 August 2020 (10:58:27 CEST)

Preprint ARTICLE | doi:10.20944/preprints202307.1397.v1

Landsat-7 ETM+, Landsat-8 OLI, and Sentinel-2 MSI Surface Reflectance Cross-Comparison and Harmonization over the Mediterranean Basin Area

Martina Perez, Marcello Vitale

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Harmonization; Surface Reflectance; Landsat-7; Landsat-8; Sentinel-2; Mediterranean basin

Online: 20 July 2023 (10:49:30 CEST)

In the Mediterranean area, vegetation dynamics and phenology analyzed over a long time can play an important role in highlighting changes in land use and cover as well as the effect of climate change. Over the last 30 years, remote sensing has played an essential role in revealing these changes thanks to many types of observations and techniques. Satellite images are an important tool to grasp these dynamics and evaluate them in an inexpensive and multidisciplinary way thanks to the Landsat and Sentinel satellite constellations. The integration of these tools holds a dual potential: on one hand, it allows longer historical series of reflectance data to be obtained, while on the other hand, it makes data available with a higher frequency even within a specific timeframe. The study aims to conduct a comprehensive cross-comparison analysis of long time series of pixel values in the Mediterranean regions. To this end, comparisons between Landsat-7 (ETM+), Landsat-8 (OLI), and Sentinel-2 (MSI) satellite sensors were conducted based on surface reflectance products. We evaluated these differences using Ordinary Least Squares (OLS) and Major Axis linear regression (RMA) analysis on points extracted from over 15,000 images across the Mediterranean basin area from 2017 to 2020. Minor but consistent differences were noted, necessitating the formulation of suitable adjustment equations to better align Sentinel-2 reflectance values with those of Landsat-7 or Landsat-8. The results of the analysis are compared with the most used harmonization coefficients proposed in the literature, revealing significant differences. The root mean square deviation, the mean difference and the orthogonal distance regression (ODR) slope show an improvement of the parameters for both models used (OLS and RMA) in this study. The discrepancies in reflectance values lead to corresponding variations in the estimation of biophysical parameters, such as NDVI, showing an increase in the ODR slope of 0.3. Despite differences in spatial, spectral, and temporal characteristics, we demonstrate that integration of these datasets is feasible through the application of band-wise regression corrections for a sensitive and heterogeneous area like the Mediterranean basin.
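A band-wise regression correction of this kind can be sketched as below. This is not the authors' procedure or their coefficients: it assumes "RMA" denotes reduced major axis regression (slope equal to the ratio of standard deviations, signed by the covariance), and the function names `ols_and_rma` and `adjust`, as well as the reflectance values, are made up for illustration.

```python
import math
from statistics import mean


def ols_and_rma(x, y):
    """Fit y = slope * x + intercept with OLS and with reduced major axis (RMA).

    x: reflectance from the sensor to be adjusted (e.g. Sentinel-2).
    y: reflectance from the reference sensor (e.g. Landsat-8), same band.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ols_slope = sxy / sxx
    # RMA slope: ratio of standard deviations, with the sign of the covariance.
    rma_slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return {
        "ols": (ols_slope, my - ols_slope * mx),
        "rma": (rma_slope, my - rma_slope * mx),
    }


def adjust(values, slope, intercept):
    """Apply a band-wise linear adjustment to a list of reflectance values."""
    return [slope * v + intercept for v in values]


# Made-up paired red-band reflectances for co-located pixels.
s2 = [0.05, 0.08, 0.12, 0.20, 0.27]
l8 = [0.06, 0.08, 0.13, 0.19, 0.27]
fits = ols_and_rma(s2, l8)
print(fits["ols"], fits["rma"])
print(adjust(s2, *fits["rma"]))
```

OLS treats the predictor as error-free, whereas RMA is symmetric in the two variables, which is one common reason to report both when neither sensor is a true reference.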

Preprint ARTICLE | doi:10.20944/preprints202404.1034.v1

Harmonization of Soft Power and Institutional Skills: Montenegro’s Path to Accession to the European Union in the Environmental Sector

Srna Sudar, Vladimir M. Cvetković, Aleksandar Ivanov

Subject: Social Sciences, Geography, Planning And Development Keywords: environment; soft power; accession; alignment; harmonization; institutional skills; governance; Montenegro; European Union

Online: 16 April 2024 (13:52:06 CEST)

This research investigates the alignment of soft power and institutional skills in Montenegro's journey towards accession to the European Union (EU), with a particular focus on the environmental sector. An online survey was conducted among individuals employed in state institutions directly engaged in negotiation processes, notably the Ministry of Sustainable Development and Tourism and the Agency for Nature and Environmental Protection. The survey was distributed before and after the summer recess to accommodate the transition of power following parliamentary elections, and aimed to assess the effectiveness of current personnel and identify areas for improvement in staffing and negotiation strategies within Montenegro's environmental sector. Employing diverse methodologies, the survey's analysis delved into the demographic, social, and professional backgrounds of respondents. It explored their roles within institutions, involvement in environmental negotiations, and possession of relevant skills and expertise. Furthermore, respondents' knowledge of environmental issues, legislation, and challenges facing the country was assessed to gauge institutional capacity for environmental governance. Demographic data, including gender, age, education and regional origin, were collected to understand gender-specific attitudes and regional disparities in environmental perspectives. The sample of 84 individuals, comprising executives and employees from both institutions, provided insights into the age structure and regional diversity of personnel involved in negotiation tasks for Chapter 27. The selection of the Ministry of Sustainable Development and Tourism and the Agency for Nature and Environmental Protection reflects their pivotal roles in shaping Montenegro's environmental policies and addressing climate change challenges. This study aims to illuminate the dynamics of environmental governance within Montenegro's state administration, contributing to the country's path towards EU accession. The research findings highlight the critical need for Montenegro to prioritize strategic initiatives in personnel management, skill development, and institutional capacity-building within its environmental sector. The implications of this research extend beyond academia to inform policymaking and societal action, emphasizing the urgency for Montenegro to bolster its environmental sector capabilities, fostering both EU alignment and sustainable governance practices for the benefit of present and future generations.

Preprint ARTICLE | doi:10.20944/preprints202402.1478.v1

Sustainable VAT harmonization theory for cross-border e-commerce transactions in the European Union

Patrícia Becsky-Nagy, Cserne Panka Póta