Preprint
Article

This version is not peer-reviewed.

Machine Learning-Enhanced Architecture Model for Integrated and FHIR-Based Health Data

  † These authors contributed equally to this work.

A peer-reviewed article of this preprint also exists.

Submitted: 09 September 2025

Posted: 09 September 2025

Abstract
The widespread fragmentation of patient information across disparate systems and the absence of standardized integration mechanisms hinder efficient and comprehensive medical diagnostics. To overcome these limitations, this work presents an architecture model that supports physicians in the diagnostic process, combining clinical and socio-health information (patients’ medical history) with data extracted from diagnostic reports and images. Through the definition of a Decision Support System, the architecture provides a risk assessment for a clinical condition and displays only the information necessary for diagnosis, leveraging the integration of data from diagnostic images, patient-collected data, and data from heterogeneous sources. Furthermore, the architecture includes the standardization of retrieved and processed information using the international HL7 Fast Healthcare Interoperability Resources (FHIR) standard to enable full integration with Health Information Systems (such as Electronic Health Records and Telemedicine Systems). In this context, a case study concerning the clinical condition of breast cancer is described to demonstrate the functionalities of the architecture, and an AI-based Risk Assessment is performed using ultrasound images. We demonstrate the capabilities of the architecture through a patient-centered mobile Android Application specifically developed for this purpose.

1. Introduction

In recent years, digital healthcare has witnessed an exponential growth in the volume and variety of clinical data generated from heterogeneous sources such as Health Information Systems (HISs), wearable devices, imaging systems, laboratories, and mobile applications. However, this abundance of information has not translated into effective clinical integration: data often remain fragmented, inaccessible, or non-interoperable across different platforms and institutions. In addition, missing or incomplete modalities are common, as not all patients undergo the same imaging protocols or complete every form, while clinical reports may vary in detail and structure. Furthermore, differences in data formats, annotation standards, and acquisition devices introduce additional complexity. Addressing these issues often requires advanced preprocessing pipelines, including data harmonization techniques, natural language processing tools to extract information from unstructured text, and robust imputation or domain adaptation strategies to manage incomplete or noisy data. These methods are essential to ensure that multimodal models achieve reliable performance across heterogeneous and real-world clinical datasets.
To address these issues, HL7 Fast Healthcare Interoperability Resources (FHIR) has emerged as a modern and modular standard designed to support interoperable health data exchange through well-known web technologies such as RESTful APIs, JSON, and XML. FHIR organizes clinical information into granular and reusable "resources" (e.g., Patient, Observation, Condition), enabling flexible integration among HISs, clinical systems, and computational models [1], including those based on Artificial Intelligence (AI). Without such standardization, fragmented data hinder the full potential of AI-based tools and Clinical Decision Support Systems (CDSS), which rely on structured, standardized, and semantically rich data to deliver reliable, reproducible, and personalized recommendations [2].
The use of standardized architectures like FHIR not only improves data sharing and quality but also promotes traceability, explainability, and regulatory compliance. Moreover, according to a recent systematic review, over 98% of CDSS tools developed between 2018 and 2021 adopted FHIR as the main interoperability standard [3].
The need for standardized architectures is also evident in companies like Empatica 1, which are developing AI-based wearable devices for continuous monitoring of conditions such as epilepsy and respiratory infections. Although FHIR is not explicitly used there, the company's emphasis on interoperability, regulatory compliance, and real-time processing reflects the same structural need for a unified clinical architecture. In a related effort, the FHIR standard was tested to collect ECG data from wearable devices, enabling the transformation of a privately defined data format into a common data model [4].
In the medical field, AI-based CDSS and Computer-Aided Diagnosis (CAD) tools can support clinicians, e.g., radiologists, in their daily diagnostic process by automating the detection and classification of lesions in diagnostic images. Indeed, systems like the Automated Breast Ultrasound System (ABUS) have shown advantages over handheld ultrasound, improving lesion visualization and reducing operator dependency [5]. Furthermore, the adoption of FHIR to standardize diagnostic metadata and clinical information enables the development of interoperable, reusable, and scalable CDSS, thus accelerating knowledge transfer across diverse clinical environments [6].
To concretely demonstrate the impact of such standards, breast cancer serves as a representative case study. It remains one of the leading causes of cancer mortality among women worldwide, and early diagnosis is crucial for improving clinical outcomes. Ultrasound is frequently employed as a screening modality due to its accessibility and safety, but its interpretation is highly operator-dependent, making the diagnosis prone to variability [7].
Our research aims to enhance the integration of AI-based CDSS into clinical practice by adopting the interoperable HL7 FHIR standard. The case study of automated breast cancer diagnosis via ultrasound images highlights the potential of this architecture as a scalable and extensible model for other clinical domains. Finally, we propose a patient-centered mobile application, namely "InferCare", in order to show the potential of the proposed architecture.

2. Related Works

The growing adoption of the HL7 FHIR standard has stimulated a wide range of initiatives addressing different aspects of healthcare data management, from interoperability with legacy systems to AI-driven clinical applications and oncology-specific profiling. However, these contributions are often fragmented, each addressing a specific challenge, such as data exchange, integration of mobile health applications, or definition of oncology data elements, without providing a comprehensive framework that unifies anamnesis, structured data (e.g., diagnostic imaging, medical reports, ...), and intelligent decision support.
The integration of the FHIR standard into HISs has favored the development of solutions aimed at professionals, enhancing data accessibility, quality, and interoperability.
Within the field of interoperability, one of the most influential initiatives is “SMART on FHIR” [8]. This approach proposes a modular framework that not only fosters interoperability, but also facilitates the secure integration of third-party applications into Health Information Systems (HISs), and in particular into Electronic Health Records (EHRs). Through semantically constrained FHIR profiles, OAuth2 2 authorization, and OpenID Connect 3 authentication, the SMART on FHIR platform enables the development of reusable clinical applications that are easily integrated into healthcare professionals’ workflows. In this context, Drishti [9] extends the "Open mHealth" framework with a modular sense–plan–act architecture, designed to enable personalised behavioural interventions in mobile health (mHealth). It connects data collection, planning, and alert delivery modules via RESTful APIs and FHIR-compatible backends by integrating FHIR resources such as Observation and CarePlan. This supports seamless integration with clinical systems such as "OpenMRS" 4.
Open mHealth uses FHIR as the canonical format for data exchange and storage, promoting interoperability, reusability, and modular development across different mobile health applications. In this way, it is possible to collect and centralize heterogeneous data, both generated by mobile devices and coming from clinical workflows, similar to what happens for our proposed architecture, where the goal is to integrate and standardize heterogeneous patient information.
Beyond application-layer frameworks such as SMART on FHIR and modular platforms such as Drishti, the "MDIRA" initiative [10] offers a vendor-neutral, standards-based reference architecture for clinical device interoperability. Using IEEE 11073 semantics, IHE profiles, and HL7 FHIR messaging, MDIRA enables the seamless and secure integration of medical devices in hospital, home, and hospital-in-home settings. MDIRA supports both peer-to-peer and ICE-style 5 peer-to-aggregator communications and facilitates the development of autonomous, reliable, and reusable systems for critical care delivery.
Complementing these approaches, an ECG stream analysis framework [11] demonstrates how FHIR can support AI-driven healthcare applications in a cloud-native environment. Using the Google Cloud Healthcare API, it securely stores FHIR-encoded ECG data and processes it using tools such as Scikit-Learn and PyTorch, thereby bridging the gap between clinical data interoperability and advanced real-time analytics and personalized monitoring. In contrast, our approach manages heterogeneous clinical data, including anamnesis, breast ultrasound images, and vital signs sourced from medical devices, which are collected, processed with AI to support physician decisions, and structured as FHIR resources. In addition, the secondary use of health data for research purposes, as in our case, has been addressed in [12]. In the cited work, FHIR anonymization and pseudonymization methods allow healthcare institutions to transform identifiable FHIR resources, such as “Patient,” “Visit,” or “Observation”, into de-identified formats suitable for research or data analysis without compromising structural integrity or interoperability.
This enables secure and regulation-compliant reuse of data for purposes such as machine learning, clinical studies, and population health management, while preserving privacy and semantic consistency within FHIR-based infrastructures.
Several studies have explored how FHIR can support advanced clinical workflows. Major et al. [13] demonstrate how a FHIR back-end, integrated with the "Epic" system, which is one of the most widely used EHR systems in the world in hospital settings [14], allows real-time retrieval and analysis of clinical notes, medications, and vital signs to support AI-based models, enhancing timely clinical decision-making. As in our approach, this demonstrates how FHIR-structured data can support decision-making processes across diverse clinical scenarios.
In the mobile environment, Lamprinakos et al. [15] present a FHIR-based health application, similar to ours, that facilitates interaction between clinicians, patients, and pharmacists. The system uses FHIR RESTful APIs to enable secure data access and personalized care plan management.
A more recent mobile architecture proposed by Pallis et al. [16] enables seamless access to electronic medical records (EMRs) from personal health apps by integrating legacy Cross-Enterprise Document Sharing (XDS) systems with modern FHIR-based EMRs. The mobile solution allows citizens to access and manage clinical information from multiple providers, combining the interoperability of FHIR with the document-centric approach of XDS.
In the field of precision medicine, the mCODE initiative [17] defines standardized FHIR profiles for oncology data, promoting reuse in both clinical systems and research. This model has been extended to international contexts [18], showcasing its adaptability to different healthcare systems. Similarly, the OSIRIS project [19] proposes a FHIR-compatible framework to enhance the sharing and analysis of clinical and genomic data in oncology.
Another emerging area is the use of FHIR in distributed cancer research, as seen in Oncology on FHIR [20]. This work proposes a modular FHIR-based data model to enable structured data exchange across institutions, using resources such as Condition, Observation, and Procedure.
To support the FHIR standard in defining precise instructions, Implementation Guides (IGs) are used. A noteworthy IG is the International Patient Summary (IPS) [21], which defines an interoperable structure for summarizing a patient’s clinical information, with the goal of supporting the exchange of essential health data. The IPS IG is designed to capture demographic, medical history, and clinical information, including elements such as medical conditions, allergies, ongoing treatments, and immunization history. This guide serves as a key reference for international interoperability and the standardization of patient history collection.
In the oncological field, there are several attempts at FHIR profiling by defining a series of IGs:
(1) mCODE [22], an initiative of the American Society of Clinical Oncology (ASCO) in collaboration with the MITRE Corporation, aims to establish a core set of structured data elements for the EHR in oncology.
(2) Breast Cancer Data [23], a joint project of the Clinical Information Council (CIC) and the Clinical Information Modeling Initiative (CIMI), manages a dataset used for breast cancer staging.
(3) Breast Imaging Reporting [24], designed to capture, store, and communicate data derived from breast radiology examinations, supporting breast cancer screening, diagnosis, and treatment activities. It includes FHIR artifacts to represent findings obtained through various imaging modalities, such as ultrasound, Magnetic Resonance Imaging, and nuclear medicine.
(4) ICHOM Patient Centered Outcomes Measure Set for Breast Cancer [25], an IG based on the mCODE data model, proposing a structured set of patient-centered measures for breast cancer management. The IG includes scientifically validated questionnaires that explore specific aspects of the patient’s health experience. The responses to each item are quantified using a scoring system, allowing a standardized and comparable assessment of patient-reported outcomes.
Existing works provide significant but partial contributions: some focus on interoperability and integration with EHRs, others illustrate how FHIR can enable AI-driven clinical workflows, while others define oncology-specific profiles. However, none of them combine structured clinical anamnesis, imaging data, and AI modules within a single end-to-end architecture to support the entire care pathway.
Our work advances the state of the art by integrating these aspects into a complete system solution. It encompasses structured anamnesis and imaging data collection, formalization through FHIR-based Implementation Guides, automated machine learning modules for risk assessment, and a patient-centered mobile application (“InferCare”). Based on insights from the literature on FHIR-enabled AI systems for clinical decision support, our architecture is designed to ensure interoperability while supporting both clinicians and patients throughout the care pathway.
Specifically, the Android application developed within this architecture targets the breast cancer care pathway.
The case study, based on data from multiple patients, focuses on extracting and analyzing key morphological and color-related features in ultrasound images that characterize malignant tumors. By structuring heterogeneous clinical and imaging data in FHIR, our system enables AI modules to provide standardized, explainable, and scalable decision support. In this way, our contribution extends previous experiences by combining interoperability, multi-modal clinical data, and AI-driven analytics within a single, end-to-end framework tailored for oncology care.

3. Integrated Patient Decision Support System - Architecture and Methodology

This section defines the architecture of the Integrated Patient Decision Support System, namely “IPDSS”, a solution designed to centralise, integrate, and standardise heterogeneous patient information in order to provide effective decision support to healthcare professionals and facilitate integration with existing HISs. Figure 1 presents a comprehensive view of the proposed modular architecture.
The main objectives of the defined architecture are:
  • Data Aggregation and Collection: aggregating information from different sources, such as: i) information (medical and family history) collected from interviews or through forms filled in by patients; ii) data derived from diagnostic image analysis systems and reports; iii) structured data obtained from HISs.
  • Data Standardization: formalizing all data (source, integrated, and processed) using the international HL7 FHIR standard to realize an interoperable solution.
  • Intuitive and fast visualization of information of interest: providing healthcare professionals with integrated information of interest through a user interface that is easy to navigate and understand. Only the information deemed useful for the specific diagnostic case is displayed.
  • Decision Support: generating a concise summary of the patient’s health status based on the integrated data and the use of AI algorithms, to propose an AI-based risk assessment.
  • Data Integration: communicating with HISs (such as EHR or telemedicine systems) through the use of the HL7 FHIR standard.
In the following, we go into the details of the architecture’s description, with reference to functional requirements, data model, data flow, and Diagnostic Image Analysis Module.

3.1. Architecture

The proposed solution adopts a modular architecture, based on logically separated modules or components that communicate with each other.
In Figure 1, the front-end components are outlined in blue, while the overlapping black boxes indicate the back-end modules of the architecture. The modules include:
  • Front-end Patient
    It allows the patient to independently collect a range of anamnestic information (possibly specific to the clinical condition). The module allows for the management of data obtained by a form presented to the patient for the collection of medical history and other relevant information (allergies, medication, family history, symptoms, etc.). This module also allows the patient to retrieve any information present in the system through interaction with the HL7 FHIR-HIS interface module.
  • Diagnostic Image Analysis Module (DIAM)
    It performs processing and analysis of diagnostic images (e.g., detection, segmentation, classification, feature extraction) using Machine Learning algorithms or specific analysis tools. The aim of this module is to return an AI-based risk assessment.
  • HL7 FHIR Formalization Module
    This module formalizes, through international standard HL7 FHIR profiles, the data provided by the patient during the anamnesis phase (coming from the patient’s anamnesis form) and the results of image analysis (from the DIAM). It uses FHIR profiles and resources (e.g., Patient, Observation, DiagnosticReport, AllergyIntolerance, MedicationStatement), which are appropriately defined to ensure compliance with the FHIR standard. It manages the creation of FHIR resources and carries out their validation for the purposes of the proposed architecture.
  • Clinical Information of Interest Presentation Module (CIIPM)
    This module identifies and collects all and only the clinical information of interest for a specific diagnosis. It thus enables the presentation of an integrated and intuitive view of the information: personal data, structured medical history, and image analysis results.
  • Health Status Summary Module
    It applies logical rules and algorithms to extract and analyze integrated data (history, image results, other available data), and generates a concise summary of the patient’s health status, highlighting key information, potential risks (AI-based risk assessment), and recommendations.
  • HL7 FHIR-HIS Interface
    It enables the CIIPM to communicate with the existing platform (EHR) using the HL7 FHIR standard, and supports FHIR operations such as Create, Read, and Update on Patient, Observation, DiagnosticReport, and other relevant resources.

3.1.1. Data Model

The data used in the architecture presented in the previous section are anamnestic data, i.e., a collection of information about the patients and their medical history, carried out by the physician to better understand the situation and make an accurate diagnosis. These data help the physician to identify possible diseases, risk factors, and family predispositions, which are essential for appropriate treatment. In addition to the medical history data, some data related to patient images are provided to the clinicians to support the diagnosis (Pathological Features). Moreover, other data features are also extracted from the images (Hand-crafted Features) that allow the computation of the AI-based risk assessment. Section 4 explains the process for the computation of AI-based risk assessment.

3.2. Architecture Modules Details

The following subsections describe the individual modules and the flow of data within the proposed IPDSS architecture.

3.2.1. Front-End Patient

This module is designed to enable patients to autonomously provide a wide range of anamnestic information, which can be tailored to their specific clinical condition. Through an intuitive user interface, patients are presented with a customizable form where they can input relevant data regarding their personal and family medical history. This includes, but is not limited to, information about existing or past symptoms, current medications, known allergies, past diagnoses, and hereditary conditions.
The collected data are systematically managed and stored by the module, ensuring that healthcare professionals can access accurate and up-to-date patient information. Furthermore, the module is integrated with HIS/EHR, allowing patients to retrieve any relevant data already present in the system. This bidirectional communication ensures that both patient-provided information and existing clinical records are seamlessly synchronized, enhancing the completeness and reliability of the patient’s health profile.

3.2.2. Diagnostic Image Analysis Module (DIAM)

DIAM aims to provide two types of support to the healthcare professionals:
  • Some specific features related to the pathology, namely Pathological Features (PF), are automatically extracted from the images, by using properly designed Computer Vision algorithms. These features are shown to the healthcare professionals and are useful to support the diagnosis.
  • A risk assessment, computed by using suitably designed ML algorithms. For the computation of AI-based risk assessment, the ML methods use a set of features automatically extracted from ultrasound images, larger than PF, and called Hand-crafted Features (HF).

3.2.3. HL7 FHIR Formalization Module (HFFM)

HFFM transforms clinical and contextual data collected by the user into resources that comply with the HL7 FHIR standard. Specifically, the module processes two main categories of information:
  • Anamnestic data: information provided directly by the patient through the Front-end Patient module. This includes, for example, self-reported medical conditions, self-reported symptoms, etc.
  • Clinical information derived from diagnostic images: includes results obtained through the automated analysis of diagnostic images (e.g., ultrasound images) by the DIAM. This information is then translated into formalized clinical observations according to the HL7 FHIR standard.
Based on this information and referring to the work presented in [26], the profiles, value sets, and system codes for a test IG were defined [27,28]. The Implementation Guide defined in [28] is available to support the structuring of the information used by the proposed solution. Furthermore, the IG enables the automatic validation of the data employed by the solution against the HL7 FHIR standard.
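As an illustration of how a DIAM result could be formalized as a clinical observation, the following minimal sketch builds a FHIR Observation as a plain Python dictionary. It is not the profile defined in the IG [28]: the code system URL, the codes `ai-risk-assessment` and `class-probability`, and the function name `risk_observation` are hypothetical and chosen only for this example. Only the generic Observation elements (`status`, `code`, `subject`, `value[x]`, `component`) come from the base FHIR specification.

```python
import json

# Hypothetical code system URL (assumption); the actual IG defines its own canonical URLs.
LOCAL_SYSTEM = "http://example.org/fhir/CodeSystem/ipdss-demo"


def risk_observation(patient_id, predicted_class, probability):
    """Build a FHIR Observation (as a dict) carrying an AI-based risk assessment."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{"system": LOCAL_SYSTEM, "code": "ai-risk-assessment"}],
            "text": "AI-based risk assessment",
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        # Predicted class (e.g., benign or malignant) as a coded value.
        "valueCodeableConcept": {
            "coding": [{"system": LOCAL_SYSTEM, "code": predicted_class}],
            "text": predicted_class,
        },
        # Probability associated with the predicted class, as a component.
        "component": [
            {
                "code": {
                    "coding": [{"system": LOCAL_SYSTEM, "code": "class-probability"}]
                },
                "valueQuantity": {"value": round(probability, 4), "unit": "probability"},
            }
        ],
    }


observation = risk_observation("example-001", "malignant", 0.93)
print(json.dumps(observation, indent=2))
```

In a real deployment the resulting resource would additionally be validated against the profiles of the IG before being stored or exchanged.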

3.2.4. Clinical Information of Interest Presentation Module (CIIPM)

This module is responsible for identifying and aggregating all the clinical information that is specifically relevant to a given diagnosis, based on the patient’s current clinical condition and medical history. Its goal is to filter out non-essential data and focus only on what is diagnostically significant, thereby reducing information overload for healthcare professionals.
The module presents this curated information through an integrated and user-friendly interface. It combines various data sources, including personal and demographic information, structured and categorized medical history, and the results obtained from medical image analysis, into a coherent and accessible overview. This facilitates faster and more informed clinical decision-making by providing a comprehensive yet focused snapshot of the patient’s health status.
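The filtering step described above can be sketched as a simple selection over FHIR resources. This is only an illustrative assumption about how relevance could be encoded: the mapping `RELEVANT_TYPES`, the condition key `breast-cancer`, and the function `select_relevant` are hypothetical and do not reproduce the actual selection logic of the CIIPM.

```python
# Hypothetical relevance map (assumption): clinical condition -> FHIR resource
# types considered useful for that diagnosis.
RELEVANT_TYPES = {
    "breast-cancer": {"Patient", "Observation", "DiagnosticReport", "FamilyMemberHistory"},
}


def select_relevant(resources, condition):
    """Keep only the resources whose type is relevant for the given condition."""
    wanted = RELEVANT_TYPES.get(condition, set())
    return [r for r in resources if r.get("resourceType") in wanted]


bundle = [
    {"resourceType": "Patient", "id": "example-001"},
    {"resourceType": "Observation", "id": "obs-1"},
    {"resourceType": "Immunization", "id": "imm-1"},
]
shown = select_relevant(bundle, "breast-cancer")
print([r["id"] for r in shown])
```

A finer-grained variant could filter on resource codes rather than resource types, at the cost of maintaining a per-condition code list.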

3.2.5. Health Status Summary Module

This module performs a synthesis of the patient’s overall health condition by applying predefined logical rules and advanced algorithms to the available integrated data (diagnostic image analysis results, and any other clinically relevant data available in the system).
Based on this analysis, the module generates a clear and concise summary of the patient’s health status. It highlights key clinical findings, flags potential health risks through AI-based risk assessment models, and provides actionable recommendations when appropriate. This summary is designed to support healthcare providers by offering a quick yet comprehensive overview, aiding in both diagnosis and treatment planning.

3.2.6. HL7 FHIR - HIS Interface

This module facilitates seamless communication between the CIIPM and the existing HIS, such as the EHR platform. It leverages the HL7 FHIR standard, which is widely adopted for the secure and efficient exchange of healthcare data.
The interface supports a range of core FHIR operations, including the creation, retrieval, and updating of key healthcare resources such as Patient, Observation, DiagnosticReport, and other relevant resource types. By implementing these operations, the module ensures interoperability with other systems and enables real-time synchronization and sharing of clinical information across the healthcare infrastructure. This interoperability enhances the consistency, accuracy, and accessibility of patient data throughout the clinical workflow.
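The create, read, and update operations mentioned above map onto the standard FHIR RESTful interactions (POST on the resource type, GET and PUT on a resource instance). The sketch below only builds the corresponding HTTP requests without sending them. The server address `BASE_URL` and the helper name `build_request` are assumptions made for this example and do not describe the actual interface implementation.

```python
import json
import urllib.request

BASE_URL = "https://fhir.example.org/baseR4"  # hypothetical server address (assumption)
FHIR_JSON = "application/fhir+json"


def build_request(operation, resource_type, resource=None, resource_id=None):
    """Return an unsent HTTP request for a FHIR create, read, or update interaction."""
    if operation == "create":      # POST [base]/[type]
        method, url = "POST", f"{BASE_URL}/{resource_type}"
    elif operation in ("read", "update"):
        if resource_id is None:
            raise ValueError(f"{operation} requires a resource id")
        # read: GET [base]/[type]/[id]; update: PUT [base]/[type]/[id]
        method = "GET" if operation == "read" else "PUT"
        url = f"{BASE_URL}/{resource_type}/{resource_id}"
    else:
        raise ValueError(f"unsupported operation: {operation}")

    headers = {"Accept": FHIR_JSON}
    body = None
    if method in ("POST", "PUT"):
        if resource is None:
            raise ValueError(f"{operation} requires a resource body")
        body = json.dumps(resource).encode("utf-8")
        headers["Content-Type"] = FHIR_JSON
    return urllib.request.Request(url, data=body, headers=headers, method=method)


demo = {"resourceType": "Observation", "id": "obs-1", "status": "final",
        "code": {"text": "demo"}}
request = build_request("update", "Observation", resource=demo, resource_id="obs-1")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) and handling authentication are deliberately left out, since they depend on the target HIS.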

4. Case Study

This work proposes a case study related to the clinical condition of breast cancer. In the following subsections, we describe the two modules, DIAM and HFFM, that need to be customized for the selected case study.

4.1. DIAM

Dataset description
In this work, the Breast Ultrasound Images Dataset 6 (BUSI) has been used to evaluate the classification of ultrasound images. The dataset includes 780 breast ultrasound images of women aged between 25 and 75 years. Of these, 210 images contain malignant lesions, 437 contain lesions annotated as benign, and 133 are normal images. Only malignant and benign cases are included in this study. Each ultrasound image is associated with a ground truth in which the lesion is annotated.
As an example, Figure 2 illustrates two breast ultrasound images: one showing a malignant lesion and the other a benign one. In both cases, the lesions were contoured in red by the expert radiologists, which allowed us to extract the key morphological and color-related features directly in that annotated portion and use them for the classification process.
AI-based Risk assessment calculation
For the selected case study, the DIAM provides the PF set reported in Table 1. These features are among the most used by radiologists to provide a diagnosis of breast cancer [29], so we show them to the clinician to support the diagnostic decision process.
For the AI-based risk assessment, a classification of the lesion as benign or malignant is proposed. For this purpose, the HF set reported in Table 2, composed of morphological and color-related features, is extracted from the annotated lesions.
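To give an idea of what a morphological hand-crafted feature looks like in practice, the following sketch computes solidity from a lesion contour given as a list of (x, y) vertices. It assumes the common definition of solidity as the ratio between the lesion area and the area of its convex hull; the exact definitions used for the HF set are those of Table 2 and may differ. The function names are chosen for this example only.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points):
    """Convex hull via Andrew's monotone chain, in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def solidity(contour):
    """Lesion area divided by convex hull area (assumed definition)."""
    hull_area = polygon_area(convex_hull(contour))
    if hull_area == 0:
        raise ValueError("degenerate contour")
    return polygon_area(contour) / hull_area


regular = [(0, 0), (4, 0), (4, 4), (0, 4)]            # convex outline
notched = [(0, 0), (4, 0), (4, 4), (2, 2), (0, 4)]    # outline with a concavity
print(solidity(regular), solidity(notched))
```

A convex outline yields a solidity of 1, while concavities in the border lower the value, which is why the feature is sensitive to irregular lesion margins.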
A feature selection step has been applied in order to identify the HFs that are most important for the classification. Recursive Feature Elimination (RFE) using Logistic Regression (LR) proved to be the best method for the feature selection phase [30]. After feature selection, a classification pipeline was run with the objective of assessing the predictive capabilities of various ML algorithms. Specifically, four well-established classifiers were considered: Decision Tree (DT) [31], Multi-Layer Perceptron (MLP) [32], Naive Bayes (NB) [33], and Random Forest (RF) [34].
In order to mitigate the issue of class imbalance inherent in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) [35] was applied. Unlike traditional random oversampling methods, which tend to increase the risk of overfitting by merely duplicating existing minority class instances, SMOTE addresses the imbalance by synthetically generating new samples. This is achieved through interpolation between existing minority class samples that are in close proximity within the feature space. Such an approach fosters a more balanced class distribution and enhances the generalization ability of the trained models. For each image, the predicted class is the one associated with the highest probability score assigned by the ML algorithm. The risk assessment proposed to the clinician will consist of the predicted class (benign or malignant) and the associated probability for that class.
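The interpolation idea behind SMOTE and the form of the resulting risk assessment can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the implementation used in the study: base samples are drawn at random rather than iterated over, and the names `smote` and `risk_assessment`, the parameter defaults, and the toy feature vectors are assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by interpolating between neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen sample (excluding itself)
        neighbors = sorted(
            (s for s in minority if s is not base),
            key=lambda s: math.dist(s, base),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (nb - b) for b, nb in zip(base, neighbor)))
    return synthetic


def risk_assessment(class_probabilities):
    """Return the most probable class and its probability."""
    label = max(class_probabilities, key=class_probabilities.get)
    return label, class_probabilities[label]


# Toy minority-class feature vectors (illustrative values only).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote(minority, n_new=6, k=2, seed=1)
print(len(new_samples))
print(risk_assessment({"benign": 0.2, "malignant": 0.8}))
```

Because every synthetic point lies on a segment between two existing minority samples, the new samples stay within the region already covered by the minority class instead of duplicating single points.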
Results
The obtained results highlight the importance of carefully selecting discriminative handcrafted features for breast lesion classification from ultrasound images. The use of RFE proved particularly effective in identifying the most informative subset of features, suggesting that not all morphological descriptors contribute equally to the discrimination task. The fact that only three features (perimeter regularity, axis ratio, and solidity) were sufficient to achieve high performance indicates that these characteristics capture complementary aspects of lesion morphology that are highly relevant for distinguishing benign from malignant patterns. This is also consistent with radiological practice, where border irregularity, asymmetry, and spiculation are well-established hallmarks of malignancy.
Among the classifiers, MLP outperformed the others, achieving almost perfect discrimination with an accuracy close to 98% and an F1-score of 97%. This reinforces the idea that neural networks, even in relatively simple architectures, are well-suited to model nonlinear feature interactions. While perimeter regularity or solidity alone can already provide meaningful information, their joint contribution, along with axis ratio, can be more effectively exploited by MLP compared to other models. In contrast, NB, constrained by its independence assumption, showed limitations in handling such interactions, which likely explains its relatively lower recall.
Decision Trees (DT) performed adequately but were more affected by dataset size and potential noise. This behavior is expected, as DTs are prone to overfitting when trained on small datasets and may fail to generalize well. RF, on the other hand, mitigated some of these issues by averaging multiple decision trees, which resulted in solid performance (97.2% accuracy and 96.8% F1-score). However, RF still did not surpass MLP, suggesting that ensembles of shallow learners might not capture subtle, higher-order nonlinear relationships as effectively as neural networks.
Overall, these findings suggest that the careful combination of feature selection and the use of flexible learning models such as MLP can yield highly accurate breast cancer classification systems, even when relying solely on handcrafted features rather than deep representations. This is a particularly relevant result for settings where computational resources are limited or where large annotated datasets required for deep learning are not available. At the same time, the high performance obtained with a small feature set also improves interpretability, since the decision-making process can be directly related to clinically meaningful morphological descriptors.
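As a rough illustration of the recursive feature elimination idea discussed above, the following pure-Python sketch repeatedly fits a small ranking model and drops the feature with the smallest absolute weight. The logistic-regression ranker, the function names, and the synthetic toy values are all assumptions made for illustration; they are not the estimator, hyperparameters, or data used in the study.

```python
import math


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic regression by batch gradient descent and return the feature weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, names, n_keep):
    """Drop the feature with the smallest |weight| until n_keep features remain."""
    active = list(range(len(names)))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        weights = fit_logistic(X_active, y)
        weakest = min(range(len(active)), key=lambda k: abs(weights[k]))
        active.pop(weakest)
    return [names[j] for j in active]


NAMES = ["perimeter_regularity", "axis_ratio", "solidity", "noise"]
# Synthetic, already-centred toy values (label 1 = malignant, 0 = benign).
# The last column is constructed to carry no information about the label.
X = [
    [1.0, 0.8, -0.9, 1.0], [1.0, 0.8, -0.9, -1.0],
    [0.7, 1.0, -0.6, 1.0], [0.7, 1.0, -0.6, -1.0],
    [-1.0, -0.8, 0.9, 1.0], [-1.0, -0.8, 0.9, -1.0],
    [-0.7, -1.0, 0.6, 1.0], [-0.7, -1.0, 0.6, -1.0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

selected = rfe(X, y, NAMES, n_keep=3)
print(selected)
```

In this toy setting the uninformative column is eliminated first, which mirrors how RFE can reduce a larger descriptor set to a compact subset. A real pipeline would also need feature scaling and cross-validation, which are omitted here.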
Table 3. Results of the different ML algorithms used with the three key features. In bold, the best results for each measure.
Classifier Accuracy Precision Recall F1-score
DT 0.9440 0.9333 0.9439 0.9377
MLP 0.9767 0.9746 0.9729 0.9736
NB 0.9488 0.9596 0.9254 0.9397
RF 0.9720 0.9690 0.9683 0.9684
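For readers less familiar with the four measures reported in Table 3, the sketch below computes them from the counts of a binary confusion matrix. It assumes that "malignant" is the positive class and uses made-up counts; whether the reported values were averaged per class or over folds is not specified here, so this should be read as the textbook definition only.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(example)
```

Because recall penalizes missed malignant lesions (false negatives), it is the measure most directly affected when a classifier fails to capture interactions between features.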

4.2. HFFM

This section presents the FHIR profiles developed to ensure a structured and interoperable representation of the collected clinical and anamnestic information; the profiles are packaged in an IG. Derived by adapting international FHIR resources to the project context, these profiles cover several areas ranging from patient history to oncology risk analysis. Table 4 summarizes each profile of the IG.
Figure 3. Overview of the Information Model and Profiled FHIR Resources

5. “InferCare” Android Application

To facilitate interaction with the IPDSS system by both patients and physicians, the “InferCare” mobile app was developed. The app’s name is derived from the combination of “Inference” (referring to AI inference) and “Care” (patient care). This section explains the interface and main features of the app, which allow guided completion of the medical history form, display of structured information, and support of diagnosis through summary views for physicians.

5.1. IPDSS Functional Requirements

This section specifies the Functional Requirements (FR) of the IPDSS, numbered progressively (FR01, FR02, ...). These requirements define the expected behaviors of the system, both in terms of secure data acquisition and management and in terms of accessibility, user interface, and interoperability with external systems (such as EHR, HIS, and telemedicine systems).
Table 5. Functional Requirements of the IPDSS System
Requirement ID Description
FR01 IPDSS shall enable patients to complete a structured form capturing personal data, teleconsultation history, family medical history, allergies, current medications, lifestyle habits, ongoing symptoms, and other clinically relevant information necessary for initial assessment.
FR02 The form shall support multiple input types, including free text fields, multiple-choice options, checkboxes, and date pickers, to ensure flexibility and completeness of data collection.
FR03 IPDSS shall implement client-side validation mechanisms to ensure data consistency, accuracy, and completeness before submission.
FR04 IPDSS shall allow patients to save their progress during form completion and resume the process at a later time without data loss.
FR05 IPDSS shall ensure the confidentiality and integrity of the submitted information during data transmission through secure communication protocols (e.g., HTTPS, encryption).
FR06 A secure user authentication mechanism shall be provided (where applicable) to control access to the form, particularly when enabling partial form saving or editing features.
FR07 IPDSS shall be responsive and accessible across various devices, including tablets and smartphones, ensuring usability and inclusivity.
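To make FR02 and FR03 more concrete, the following sketch shows one possible shape for the validation step that runs before a form is submitted. It is written in Python for brevity rather than in the app's own language, and the field names, the allowed options, and the error messages are hypothetical.

```python
from datetime import date

REQUIRED_FIELDS = ("full_name", "birth_date", "symptoms")  # hypothetical field names
SMOKING_CHOICES = ("never", "former", "current")           # hypothetical options


def is_blank(value):
    """True for None, empty or whitespace-only strings, and empty lists."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple)):
        return len(value) == 0
    return False


def validate_form(form, today=None):
    """Return a list of problems; an empty list means the form can be submitted."""
    today = today or date.today()
    problems = []
    for field in REQUIRED_FIELDS:
        if is_blank(form.get(field)):
            problems.append(f"{field} is required")
    birth = form.get("birth_date")
    if isinstance(birth, str) and birth.strip():
        try:
            if date.fromisoformat(birth.strip()) > today:
                problems.append("birth_date cannot be in the future")
        except ValueError:
            problems.append("birth_date must be an ISO date (YYYY-MM-DD)")
    smoking = form.get("smoking_status")
    if not is_blank(smoking) and smoking not in SMOKING_CHOICES:
        problems.append("smoking_status must be one of: " + ", ".join(SMOKING_CHOICES))
    return problems


ok_form = {"full_name": "Jane Doe", "birth_date": "1980-05-17",
           "symptoms": ["palpable lump"], "smoking_status": "never"}
bad_form = {"full_name": " ", "birth_date": "17/05/1980", "symptoms": []}
print(validate_form(ok_form, today=date(2025, 9, 9)))
print(validate_form(bad_form, today=date(2025, 9, 9)))
```

Returning a list of problems instead of raising on the first error lets the interface highlight every incomplete field at once, which fits the guided-completion flow described for the patient view.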

5.2. "InferCare"

The mobile app, "InferCare", is designed to provide an intuitive, interactive interface that supports patients and physicians throughout the entire information flow. From the patient’s perspective, an interface has been implemented that allows for the guided completion of a multi-section medical history form covering symptoms, medications, allergies, and habits. The form can be temporarily saved locally and can be resumed later. Once the information is validated through the HFFM module, it is visible to the patient through the interface.
Figure 4. Patient’s View
From the physician’s perspective, an interface has been implemented that provides authenticated and secure access to the patient dashboard, serving as a centralized entry point for all clinical information. Once logged in, the physician can view the list of waiting patients, available appointments, and a summary of each patient’s health status.
Figure 5. Physician’s View
In the next section, the Sequence Diagrams for patient and physician interaction with the app are described.

5.3. Sequence Diagrams

The sequences of actions performed by the Patient are shown in Figure 6:
  • The patient starts the mobile application (StartApp);
  • A request is sent to the Front-end Patient to access the data entry form (RequestCompilationForm);
  • The patient completes a digital form through the Front-end Patient interface;
  • The Front-end Patient form sends the data to the Backend via a REST API (POST(data anamnesis module));
  • The RESTful API forwards the POST(data anamnesis module) request to the HL7 FHIR Formalization Module;
  • The HL7 FHIR Formalization Module transforms the received data into FHIR resources and sends them to the FHIR Server with IG for validation (validate data);
  • The FHIR Server with IG validates and stores the FHIR resources, then sends a "Response FHIR Resources" message back to the HL7 FHIR Formalization Module;
  • Finally, the Front-end Patient receives the "Response Resource FHIR" message and displays it in the mobile application as the anamnesis form view (View Form anamnesis).
Figure 6. Sequence Diagram Patient’s View.
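The transformation step performed by the HL7 FHIR Formalization Module in the patient sequence can be sketched as follows. The example assumes FHIR R4 element names (for instance `medicationCodeableConcept`), a hypothetical form layout with `allergies` and `medications` lists, and a placeholder patient identifier; the constraints added by the project's own profiles are not reproduced.

```python
import json

ALLERGY_CLINICAL_SYSTEM = "http://terminology.hl7.org/CodeSystem/allergyintolerance-clinical"


def allergy_resource(patient_id, substance_text):
    """Minimal R4 AllergyIntolerance for one free-text allergy entry."""
    return {
        "resourceType": "AllergyIntolerance",
        "clinicalStatus": {"coding": [{"system": ALLERGY_CLINICAL_SYSTEM, "code": "active"}]},
        "code": {"text": substance_text},
        "patient": {"reference": f"Patient/{patient_id}"},
    }


def medication_statement_resource(patient_id, medication_text):
    """Minimal R4 MedicationStatement for one free-text medication entry."""
    return {
        "resourceType": "MedicationStatement",
        "status": "active",
        "medicationCodeableConcept": {"text": medication_text},
        "subject": {"reference": f"Patient/{patient_id}"},
    }


def build_transaction_bundle(patient_id, form):
    """Wrap the resources derived from the form in a transaction Bundle."""
    resources = [allergy_resource(patient_id, a) for a in form.get("allergies", [])]
    resources += [medication_statement_resource(patient_id, m)
                  for m in form.get("medications", [])]
    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
            {"resource": r, "request": {"method": "POST", "url": r["resourceType"]}}
            for r in resources
        ],
    }


# Hypothetical form content and patient identifier.
form = {"allergies": ["penicillin"], "medications": ["ibuprofen", "levothyroxine"]}
bundle = build_transaction_bundle("example-patient", form)
print(json.dumps(bundle, indent=2)[:200])
```

Grouping the resources in a single transaction Bundle means the server either stores the whole anamnesis or rejects it as a unit, which keeps a partially saved history from appearing in the physician's summary.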
The sequence in Figure 7 describes the back-end process triggered when a user accesses the mobile application and requests the analysis of diagnostic images. The system handles image retrieval, clinical feature extraction via artificial intelligence, and data transformation into HL7 FHIR format using a structured pipeline supported by RESTful communication and standardized data representation.
  • The user accesses the mobile application (AccessApp).
  • The application sends an image analysis request to the backend via RESTful API (RequestElaborationImage).
  • The RESTful API performs a POST request to the module that queries the ultrasound image database.
  • The database receives the request (RequestDiagnosticImage), retrieves the required ultrasound images (Images recovery), and returns them.
  • The ultrasound images are sent to the AI-based image analysis module.
  • The AI module extracts the clinical features from the ultrasound images (extract features images).
  • The extracted data are sent to the HL7 FHIR formalization module (send data features).
  • The HL7 FHIR module validates and transforms the data into FHIR format (validate data).
  • Finally, the FHIR Server with IG stores the validated features (validation and store features images resource).
Figure 7. Sequence Diagram Images Analysis.
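The step in which extracted image features are turned into FHIR data could look like the sketch below, which maps each feature to a minimal FHIR R4 Observation. The feature names echo the Observation profiles listed in Table 4, but the free-text coding, the dimensionless UCUM unit, the patient identifier, and the numeric values are illustrative assumptions.

```python
def feature_observation(patient_id, feature_name, value):
    """Minimal R4 Observation carrying one numeric image feature."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": feature_name},
        "subject": {"reference": f"Patient/{patient_id}"},
        "valueQuantity": {
            "value": float(value),
            "unit": "1",
            "system": "http://unitsofmeasure.org",
            "code": "1",  # UCUM code for a dimensionless quantity
        },
    }


def features_to_observations(patient_id, features):
    """Convert a {feature name: value} mapping into a list of Observations."""
    return [feature_observation(patient_id, name, value)
            for name, value in features.items()]


# Hypothetical output of the image analysis module.
extracted = {"Perimeter Regularity": 0.62, "Circularity": 0.71, "Orientation": 38.0}
observations = features_to_observations("example-patient", extracted)
print(len(observations), observations[0]["code"]["text"])
```

Keeping one Observation per feature, rather than one composite resource, allows each descriptor to be validated against its own profile and queried independently.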
The sequence of actions performed by the Physician is shown in Figure 8:
  • The physician interacts with the mobile app to request the summary view (RequestViewSummary).
  • The app forwards the request to the Clinical Information of Interest presentation module (RequestSummary).
  • This module queries the FHIR Server through the RESTful API with a query containing the relevant data (Query FHIR Resources (DataAnamnesis, featuresImage)).
  • The FHIR Server with IG retrieves the requested clinical information and returns the FHIR Resources.
  • The Clinical Information of Interest presentation module extracts the main data from the received resources, creates a summary comprising a Health Status Summary and an AI-based risk assessment, and highlights the principal information.
  • The summary view is presented to the physician through the mobile interface (View Summary).
Figure 8. Sequence Diagram Physician’s View.
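The query and summarization steps of the physician sequence can be approximated as follows. The sketch builds a standard FHIR search URL using the `patient` search parameter and condenses a search-result Bundle into a small dictionary. The base URL, the summary layout, and the sample Bundle are hypothetical, and no network call is made.

```python
from urllib.parse import urlencode


def search_url(base_url, resource_type, patient_id):
    """Build a FHIR search URL for all resources of one type for a patient."""
    query = urlencode({"patient": patient_id})
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


def summarize(bundle):
    """Extract allergies, image features and the risk probability from a Bundle."""
    summary = {"allergies": [], "features": {}, "risk": None}
    for entry in bundle.get("entry", []):
        res = entry.get("resource", {})
        kind = res.get("resourceType")
        if kind == "AllergyIntolerance":
            summary["allergies"].append(res.get("code", {}).get("text"))
        elif kind == "Observation" and "valueQuantity" in res:
            summary["features"][res["code"]["text"]] = res["valueQuantity"]["value"]
        elif kind == "RiskAssessment":
            for prediction in res.get("prediction", []):
                if "probabilityDecimal" in prediction:
                    summary["risk"] = prediction["probabilityDecimal"]
    return summary


# Hypothetical server response, reduced to the fields the summary needs.
sample_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "AllergyIntolerance", "code": {"text": "penicillin"}}},
        {"resource": {"resourceType": "Observation", "code": {"text": "Circularity"},
                      "valueQuantity": {"value": 0.71}}},
        {"resource": {"resourceType": "RiskAssessment",
                      "prediction": [{"probabilityDecimal": 0.83}]}},
    ],
}

url = search_url("https://fhir.example.org/base/", "Observation", "example-patient")
summary = summarize(sample_bundle)
print(url)
print(summary)
```

Reducing the resources to a flat summary on the presentation side keeps the mobile view independent of how many profiles the IG defines.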

6. Conclusions

In this work, we used the HL7 FHIR standard and the tools made available by the HL7 community to develop an architecture that integrates data from diagnostic images, data collected by the patient, and data from heterogeneous sources. As a case study, we performed an AI-based risk assessment for breast cancer by analyzing ultrasound images with ML models. In this context, the Android application "InferCare", compliant with the HL7 FHIR standard, was created to simplify the diagnostic process through an interactive graphical interface available to both the patient and the physician.
This application allows the patient to record anamnestic data and gives the physician a summary of the most important information for a fast and effective diagnosis. In future work, it would be interesting to investigate other medical conditions for which this application could make the diagnostic process faster and more efficient.

Author Contributions

Conceptualization, N.B. and M.S.; methodology, N.B. and M.S.; software, T.C. and M.R.; validation, N.B., M.S., T.C., M.R. and S.D.; formal analysis, N.B. and M.S.; investigation, T.C. and M.R.; resources, T.C., M.R. and S.D.; data curation, T.C., M.R. and S.D.; writing (original draft preparation), N.B. and M.S.; writing (review and editing), N.B., M.S., T.C., M.R. and S.D.; supervision, N.B. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The image dataset BUSI, used for the experiments, is publicly available at https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset. The Test IG "Remote Anamnesis" is publicly available at http://remote-anamnesis.na.icar.cnr.

Acknowledgments

The authors gratefully acknowledge the Project "RIGOLETTO: Creation of intelligent management platform for oncology patients" - Bando “Accordi per l’innovazione”, del Ministero delle Imprese e del Made in Italy, mimit.AOO_IAI.REGISTRO INTERNO.R.0001470.08-05-2023, for providing support for this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AI Artificial Intelligence
ABUS Automated Breast Ultrasound System
API Application Programming Interface
CAD Computer-Aided Diagnosis
CDSS Clinical Decision Support System
CIIPM Clinical Information of Interest Presentation Module
CIC Clinical Information Council
CIMI Clinical Information Modeling Initiative
DIAM Diagnostic Image Analysis Module
DT Decision Tree
EHR Electronic Health Record
ECG Electrocardiogram
EMR Electronic Medical Record
FHIR Fast Healthcare Interoperability Resources
FR Functional Requirement
HFFM HL7 FHIR Formalization Module
HF Hand-crafted Features
HIS Health Information System
HL7 Health Level Seven International
ICE Integrated Clinical Environment
ICHOM International Consortium for Health Outcomes Measurement
IG Implementation Guide
IHE Integrating the Healthcare Enterprise
IPDSS Integrated Patient Decision Support System
IPS International Patient Summary
JSON JavaScript Object Notation
LR Logistic Regression
mCODE Minimal Common Oncology Data Elements
ML Machine Learning
MLP Multi-Layer Perceptron
MDIRA Medical Device Interoperability Reference Architecture
mHealth Mobile Health
NB Naive Bayes
OSIRIS Open Standards for Interoperable and Reusable Information in Oncology
PF Pathological Features
RF Random Forest
RFE Recursive Feature Elimination
REST Representational State Transfer
SMOTE Synthetic Minority Oversampling Technique
XML Extensible Markup Language
XDS Cross-Enterprise Document Sharing

References

  1. Sreejith, R.; Senthil, S. Smart Contract Authentication assisted GraphMap-Based HL7 FHIR architecture for interoperable e-healthcare system. Heliyon 2023, 9. [Google Scholar] [CrossRef] [PubMed]
  2. Ramgopal, S.; Sanchez-Pinto, L.N.; Horvat, C.M.; Carroll, M.S.; Luo, Y.; Florin, T.A. Artificial intelligence-based clinical decision support in pediatrics. Pediatric research 2023, 93, 334–341. [Google Scholar] [CrossRef] [PubMed]
  3. Taber, P.; Radloff, C.; Del Fiol, G.; Staes, C.; Kawamoto, K. New standards for clinical decision support: a survey of the state of implementation. Yearbook of medical informatics 2021, 30, 159–171. [Google Scholar] [CrossRef] [PubMed]
  4. Lee, J.; Jang, S.; Park, E. Design of a FHIR interface for wearable healthcare devices. In Proceedings of the 2023 Fourteenth International Conference on Ubiquitous and Future Networks (ICUFN). IEEE; 2023; pp. 730–732. [Google Scholar]
  5. Zhang, X.; Lin, X.; Tan, Y.; Zhu, Y.; Wang, H.; Feng, R.; Tang, G.; Zhou, X.; Li, A.; Qiao, Y. A multicenter hospital-based diagnosis study of automated breast ultrasound system in detecting breast cancer among Chinese women. Chinese Journal of Cancer Research 2018, 30, 231. [Google Scholar] [CrossRef] [PubMed]
  6. Duda, S.N.; Kennedy, N.; Conway, D.; Cheng, A.C.; Nguyen, V.; Zayas-Cabán, T.; Harris, P.A. HL7 FHIR-based tools and initiatives to support clinical research: a scoping review. Journal of the American Medical Informatics Association 2022, 29, 1642–1653. [Google Scholar] [CrossRef] [PubMed]
  7. Seiler, S.J.; Neuschler, E.I.; Butler, R.S.; Lavin, P.T.; Dogan, B.E. Optoacoustic imaging with decision support for differentiation of benign and malignant breast masses: a 15-reader retrospective study. American Journal of Roentgenology 2023, 220, 646–658. [Google Scholar] [CrossRef] [PubMed]
  8. Mandel, J.C.; Kreda, D.A.; Mandl, K.D.; Kohane, I.S.; Ramoni, R.B. SMART on FHIR: a standards-based, interoperable apps platform for electronic health records. Journal of the American Medical Informatics Association 2016, 23, 899–908. [Google Scholar] [CrossRef] [PubMed]
  9. Eapen, B.R.; Archer, N.; Sartipi, K.; Yuan, Y. Drishti: A Sense-Plan-Act Extension to Open mHealth Framework Using FHIR. In Proceedings of the 2019 IEEE/ACM 1st International Workshop on Software Engineering for Healthcare (SEH); 2019; pp. 49–52. [Google Scholar]
  10. Sloane, E.B.; Cooper, T.; Silva, R. MDIRA: IEEE, IHE, and FHIR Clinical Device and Information Technology Interoperability Standards, bridging Home to Hospital to “Hospital-in-Home”. In Proceedings of the SoutheastCon 2021, 2021, pp. 1–4.
  11. Lee, J.; Kim, J. Design of an ECG Stream Analysis Framework Based on FHIR Data Model. In Proceedings of the 2024 Fifteenth International Conference on Ubiquitous and Future Networks (ICUFN); 2024; pp. 567–569. [Google Scholar]
  12. Raso, E.; Loreti, P.; Ravaziol, M.; Bracciale, L. Anonymization and Pseudonymization of FHIR Resources for Secondary Use of Healthcare Data. IEEE Access 2024, 12, 44929–44939. [Google Scholar] [CrossRef]
  13. Major, V.J.; Wang, W.; Aphinyanaphongs, Y. Enabling AI-Augmented Clinical Workflows by Accessing Patient Data in Real-Time with FHIR. In Proceedings of the Proceedings of the 2023 IEEE 11th International Conference on Healthcare Informatics (ICHI). IEEE, 2023, pp. 531–533.
  14. Chishtie, J.; Sapiro, N.; Wiebe, N.; Rabatach, L.; Lorenzetti, D.; Leung, A.A.; Rabi, D.; Quan, H.; Eastwood, C.A. Use of Epic electronic health record system for health care research: scoping review. Journal of medical Internet research 2023, 25, e51003. [Google Scholar] [CrossRef] [PubMed]
  15. Lamprinakos, G.C.; et al. Using FHIR to develop a healthcare mobile application. In Proceedings of the Proceedings of the 2014 4th International Conference onWireless Mobile Communication and Healthcare (MOBIHEALTH). IEEE, 2014, pp. 132–135..
  16. Petrakis, Y.; Kouroubali, A.; Katehakis, D. A Mobile App Architecture for Accessing EMRs Using XDS and FHIR. In Proceedings of the 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE); 2019; pp. 278–283. [Google Scholar]
  17. Osterman, T.J.; Terry, M.; Miller, R.S. Improving Cancer Data Interoperability: The Promise of the Minimal Common Oncology Data Elements (mCODE) Initiative. JCO Clinical Cancer Informatics 2020, 4, 993–1001. [Google Scholar] [CrossRef] [PubMed]
  18. Chen, J.; et al. Applying the Minimal Common Oncology Data Elements (mCODE) to the Asia-Pacific Region. JCO Clinical Cancer Informatics 2021, 5, 252–253. [Google Scholar] [CrossRef] [PubMed]
  19. Guérin, J.; Laizet, Y.; Le Texier, V.; Chanas, L.; Rance, B.; Koeppel, F.; Lion, F.; Gourgou, S.; Martin, A.L.; Tejeda, M.; et al. OSIRIS: A Minimum Data Set for Data Sharing and Interoperability in Oncology. JCO Clinical Cancer Informatics 2021, 5, 256–265. [Google Scholar] [CrossRef]
  20. Lambarki, M.; Kern, J.; Croft, D.; Engels, C.; Deppenwiese, N.; Kerscher, A.; Kiel, A.; Palm, S.; Lablans, M. Oncology on FHIR: A Data Model for Distributed Cancer Research. Studies in Health Technology and Informatics 2021, 278, 203–210. [Google Scholar]
  21. HL7 International. International Patient Summary (IPS) Implementation Guide. URL: https://build.fhir.org/ig/HL7/fhir-ips/. Accessed June 25, 2025.
  22. mCODE Implementation Guide. URL: https://build.fhir.org/ig/HL7/fhir-mCODE-ig/. Accessed June 25, 2025.
  23. HL7 International and The Breast Cancer Work Group. Breast Cancer Data Implementation Guide. URL: https://hl7.org/fhir/us/breastcancer/2018Sep/index.html. Accessed June 26, 2025.
  24. HL7 International. Breast Radiology Implementation Guide. URL: https://build.fhir.org/ig/HL7/fhir-breast-radiology-ig/. Accessed June 26, 2025.
  25. HL7 International and ICHOM. ICHOM Breast Cancer Implementation Guide. URL: https://build.fhir.org/ig/HL7/fhir-ichom-breast-cancer-ig/. Accessed June 30, 2025.
  26. Conte, T.; Sicuranza, M. Sistema FHIR-Based per la Raccolta Strutturata dell’Anamnesi in Remoto: Approccio, Standard e Implementazione. Rapporto Tecnico RT-ICAR-NA-2025-05, CNR-ICAR, 2025.
  27. Teresa Conte and Mario Sicuranza. Remote Anamnesis Implementation Guide for Technical Report. URL: https://anamnesi.na.icar.cnr.it/ . Accessed August 08, 2025.
  28. Teresa Conte and Mario Sicuranza. Remote Anamnesis Implementation Guide. URL: http://remote-anamnesis.na.icar.cnr . Accessed August 08, 2025.
  29. Guo, R.; Lu, G.; Qin, B.; Fei, B. Ultrasound imaging technologies for breast cancer detection and management: a review. Ultrasound in medicine & biology 2018, 44, 37–70. [Google Scholar]
  30. Kuhn, M.; Johnson, K.; et al. Applied predictive modeling; Vol. 26, Springer, 2013.
  31. Magee, J.F. Decision trees for decision making; Harvard Business Review, 1964.
  32. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning internal representations by error propagation. Technical report, California Univ San Diego La Jolla Inst for Cognitive Science, 1985.
  33. Langley, P.; Iba, W.; Thompson, K.; et al. An analysis of Bayesian classifiers. In Proceedings of the Aaai. Citeseer; 1992; Vol. 90, pp. 223–228. [Google Scholar]
  34. Ho, T.K. Random decision forests. In Proceedings of the Proceedings of 3rd international conference on document analysis and recognition. IEEE, 1995, Vol. 1, pp. 278–282.
  35. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
Figure 1. Architecture System.
Figure 2. Ultrasound breast images. Example of two ultrasound images of the breast, one with a malignant lesion (A) and one with a benign lesion (B). The images show the lesions contoured in red by the expert radiologists.
Table 1. Description of PF set.
Feature Description
Major Axis, Minor Axis Provides intuitive size and shape information
Perimeter Regularity Irregular contours may be indicative of malignancy
Orientation An angle greater than 45° may suggest a malignant nature
Circularity Lower circularity can reflect irregular lesion shapes
Table 2. Description of HF set.
Feature Description
Eccentricity Indicates how elliptical the lesion is (0 = perfect circle, 1 = highly elongated ellipse)
Circularity Measures how similar the lesion is to a circle (1 = perfect, <1 = more irregular)
Perimeter Regularity Evaluates the complexity of the lesion boundary
Axis Ratio Relationship between the axes of an ellipse or ellipsoid
Solidity Ratio between the actual area and the convex area (1 = compact, <1 = irregular)
Extent Ratio between the lesion area and its bounding box
Elongation Indicates whether the lesion is stretched along a particular direction
Fractal Dimension Quantifies the complexity of the lesion’s contour
Area Number of pixels comprising the lesion
Perimeter Length of the lesion’s contour
Convex Area Area of the convex hull surrounding the lesion
Equivalent Diameter Diameter of a circle having the same area as the lesion
Kurtosis Measures the "peakedness" of the intensity distribution
Skewness Measures the asymmetry of the intensity distribution
Entropy Indicates the degree of randomness in the pixel intensity distribution
Contrast Measures local intensity variation
Homogeneity Quantifies how similar neighboring pixels are
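Several of the shape descriptors in Table 2 follow directly from the lesion contour. The sketch below computes a few of them for a contour given as a list of (x, y) vertices, using the common textbook definitions (for example, circularity as 4πA/P²). These formulas and the polygon approximation are assumptions; the study may instead compute the descriptors on pixel masks or with different conventions.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    twice = sum(points[i][0] * points[(i + 1) % n][1]
                - points[(i + 1) % n][0] * points[i][1] for i in range(n))
    return abs(twice) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the closed contour."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def convex_hull(points):
    """Convex hull by Andrew's monotone chain; points must be (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def shape_descriptors(contour):
    """Circularity, solidity, extent and equivalent diameter of a contour."""
    area = polygon_area(contour)
    perimeter = polygon_perimeter(contour)
    xs, ys = [p[0] for p in contour], [p[1] for p in contour]
    bbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return {
        "circularity": 4 * math.pi * area / perimeter ** 2,
        "solidity": area / polygon_area(convex_hull(contour)),
        "extent": area / bbox_area,
        "equivalent_diameter": math.sqrt(4 * area / math.pi),
    }


# Two synthetic contours: a compact square and an L-shaped (concave) outline.
square = shape_descriptors([(0, 0), (2, 0), (2, 2), (0, 2)])
l_shape = shape_descriptors([(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)])
print(square)
print(l_shape)
```

The concave outline yields a solidity below 1 while the square yields exactly 1, which matches the interpretation given in the table (1 = compact, <1 = irregular).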
Table 4. Description of Profiles.
Profile Description
AllergyIntolerance_Patient Used to represent the patient’s known allergies and intolerances
Anamnesis_Patient Used to represent the personal identity and demographic information of the patient subject to the anamnesis
Appointment_Patient Used to describe information about scheduled clinical appointments
CancerRiskAssessment_Patient Used to represent the patient’s risk of developing cancer based on anamnesis and extracted clinical features
CarePlan_Patient Used to represent a patient care or treatment plan
Condition_Patient Used to represent a particular clinical condition of the patient
Consent_Patient Used to represent the patient’s informed consent regarding the use and sharing of their clinical data
DiagnosticReport_Patient Used to document diagnostic reports associated with the patient, such as ultrasound images
FamilyMemberHistory_Patient Used to document the patient’s family medical history, with special reference to inherited diseases in family members
Medication_Patient Used to describe the characteristics of drugs such as active ingredient, pharmaceutical form, and dosage
MedicationStatement_Patient Used to represent the set of medications taken by the patient
Observation_Axis Used to represent the orientation axis of a clinical image or structure, extracted from ultrasound or imaging data
Observation_PerimeterRegularity Used to represent the regularity of the perimeter of a lesion or structure observed in diagnostic imaging
Observation_Orientation Used to represent the spatial orientation of a lesion as detected in clinical imaging
Observation_Circularity Used to represent the circularity of a lesion or structure, derived from diagnostic image analysis
Observation_Patient Used to record observations of the patient’s health status, such as vital parameters and symptoms
Practitioner_Anamnesis Used to represent the health professional involved in patient care, e.g., the physician in charge of the examination
Procedure_Patient Used to represent medical procedures undergone by the patient, such as surgeries, biopsies, or diagnostic interventions
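As an indication of how the output of a risk model might be expressed in a resource such as the one profiled by CancerRiskAssessment_Patient, the sketch below builds a minimal FHIR R4 RiskAssessment. The element names follow the base R4 resource; the outcome text, the identifiers, and the probability value are hypothetical, and the additional constraints of the project profile are not shown.

```python
def risk_assessment(patient_id, probability, observation_ids,
                    outcome_text="Malignant breast lesion"):
    """Minimal R4 RiskAssessment linking a probability to its supporting Observations."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    return {
        "resourceType": "RiskAssessment",
        "status": "final",
        "subject": {"reference": f"Patient/{patient_id}"},
        "basis": [{"reference": f"Observation/{oid}"} for oid in observation_ids],
        "prediction": [
            {"outcome": {"text": outcome_text}, "probabilityDecimal": probability}
        ],
    }


# Hypothetical identifiers and model output.
assessment = risk_assessment("example-patient", 0.83, ["obs-1", "obs-2", "obs-3"])
print(assessment["prediction"][0])
```

Referencing the supporting Observations in `basis` keeps the assessment traceable to the morphological descriptors it was derived from, which supports the interpretability argument made in the results.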
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.