Submitted:
28 January 2026
Posted:
29 January 2026
Abstract
Keywords:
1. Summary
2. Data Description
2.1. Dataset Files and Formats
- dataset_case_studies.csv: A UTF-8 encoded tabular file containing the primary data for statistical analysis.
- dataset_case_studies.json: A machine-readable JSON array, ideal for integration into web platforms or NoSQL databases.
- dataset_case_studies.ttl: An RDF serialization in Turtle format. This file links the data instances to the classes and properties defined in the green_nanomaterials_ontology.ttl file.
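As a minimal sketch of how the CSV and JSON serializations might be read with the Python standard library, the snippet below parses a two-row inline sample in both formats and checks that they list the same cases. The sample values are invented for illustration, and the assumption that the JSON objects reuse the CSV column names as keys is not confirmed by the source. In practice the text would come from the files on disk rather than from inline strings.

```python
import csv
import io
import json

# Illustrative stand-ins for dataset_case_studies.csv / .json (values are made up).
SAMPLE_CSV = """case_id,nanomaterial_name,mechanism,removal_efficiency_percent
CS1,Example magnetic nanocomposite,Adsorption,95.0
CS2,Example photocatalyst,Photocatalysis,88.5
"""
SAMPLE_JSON = """[
  {"case_id": "CS1", "nanomaterial_name": "Example magnetic nanocomposite",
   "mechanism": "Adsorption", "removal_efficiency_percent": 95.0},
  {"case_id": "CS2", "nanomaterial_name": "Example photocatalyst",
   "mechanism": "Photocatalysis", "removal_efficiency_percent": 88.5}
]"""


def load_csv_records(text):
    """Parse CSV text into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def load_json_records(text):
    """Parse a JSON array of objects into a list of dicts."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of case-study objects")
    return records


csv_rows = load_csv_records(SAMPLE_CSV)
json_rows = load_json_records(SAMPLE_JSON)

# Both serializations should list the same case identifiers.
csv_ids = sorted(row["case_id"] for row in csv_rows)
json_ids = sorted(row["case_id"] for row in json_rows)
print(csv_ids == json_ids)
```

Note that the CSV reader returns every value as a string, whereas the JSON parser already yields numbers, so numeric columns need an explicit conversion on the CSV side.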
2.2. Tabular Data Structure
2.3. Semantic Mapping and Interpretation
- Logic Links: The synthesis descriptors are mapped to the gsn:SustainabilityProfile class, while performance metrics are linked to gsn:PerformanceIndicator.
- Units: All numerical values use the units defined by the ontology’s datatype properties: temperature in kelvin (K), concentration in g/L, and adsorption capacity in mg/g.
2.4. Validation Resource
3. Methods
3.1. Data Acquisition and Curation
- Synthesis parameters: Solvent type, precursors, and energy indicators.
- Experimental conditions: pH, temperature, and dosage.
- Performance metrics: Removal efficiency and adsorption capacity (qmax).
3.2. Ontology Development
- Modularity: The ontology was organized into five core modules (Material, Synthesis, Process, Performance, and Provenance) to allow for independent updates.
- Reusability: Where possible, classes and properties were aligned with existing vocabularies such as PROV-O for provenance and CHEO or ENM for chemical entities.
- Axiomatization: Logical restrictions (SubClassOf and EquivalentTo) were implemented to enable automatic classification of “green-synthesized” materials based on their sustainability profiles.
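The kind of automatic classification that an EquivalentTo axiom enables can be illustrated outside a reasoner. The sketch below is plain Python, not OWL: it treats a defined class as a necessary-and-sufficient condition over two columns of the dataset. The specific criteria (a renewable precursor and a "Low-toxicity" solvent label) and the names `SustainabilityProfile` fields, `GREEN_SOLVENT_LABELS`, and `is_green_synthesized` are assumptions made for illustration; they are not the axioms of the published ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SustainabilityProfile:
    """Subset of synthesis descriptors; field names follow the CSV columns."""
    renewable_precursor: bool
    solvent_greenness: str


# Hypothetical set of solvent labels accepted by the illustrative rule.
GREEN_SOLVENT_LABELS = {"low-toxicity"}


def is_green_synthesized(profile):
    """Mimic a defined class: membership holds iff every condition is met.

    Assumed rule (not taken from the ontology): the precursor is renewable
    AND the solvent assessment is one of GREEN_SOLVENT_LABELS.
    """
    solvent_ok = profile.solvent_greenness.strip().lower() in GREEN_SOLVENT_LABELS
    return profile.renewable_precursor and solvent_ok


profiles = {
    "CS1": SustainabilityProfile(True, "Low-toxicity"),
    "CS2": SustainabilityProfile(False, "Low-toxicity"),
}
classified = {case_id: is_green_synthesized(p) for case_id, p in profiles.items()}
print(classified)
```

A description-logic reasoner would infer the same membership declaratively from asserted property values, without any procedural code.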
3.3. Data Transformation and RDFization
- Tabular Structuring: The curated data was first organized into a master CSV file.
- Semantic Mapping: Using a custom Python-based mapping script, each CSV row was transformed into an RDF individual (instance).
- Serialization: The data was exported into JSON for web accessibility and Turtle (.ttl) for semantic reasoning. The Turtle version explicitly uses the gsn: namespace defined in the ontology to ensure that the instances are logically bound to their semantic definitions.
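The row-to-individual step can be sketched as follows. This is not the authors' mapping script: it is a minimal standard-library illustration that turns one CSV row into a Turtle description. The namespace IRI `http://example.org/gsn#` and the property names (`gsn:caseId`, `gsn:name`, `gsn:removalEfficiencyPercent`) are hypothetical placeholders; only the `gsn:` prefix and the Nanomaterial class name appear in the paper. The sample row is invented.

```python
import csv
import io

# Assumed namespace IRI; the real one is declared in the ontology file.
GSN_PREFIX = (
    "@prefix gsn: <http://example.org/gsn#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)

SAMPLE_CSV = """case_id,nanomaterial_name,removal_efficiency_percent
CS1,Example magnetic nanocomposite,95.0
"""


def escape_literal(value):
    """Escape backslashes and double quotes for a Turtle string literal.

    Minimal on purpose: newlines and other control characters are not handled.
    """
    return value.replace("\\", "\\\\").replace('"', '\\"')


def row_to_turtle(row):
    """Turn one CSV row into a Turtle description of a gsn: individual.

    Assumes case_id is a simple alphanumeric token usable as a local name.
    The property names below are hypothetical.
    """
    subject = f"gsn:{row['case_id']}"
    efficiency = float(row["removal_efficiency_percent"])
    lines = [
        f"{subject} a gsn:Nanomaterial ;",
        f'    gsn:caseId "{escape_literal(row["case_id"])}" ;',
        f'    gsn:name "{escape_literal(row["nanomaterial_name"])}" ;',
        f'    gsn:removalEfficiencyPercent "{efficiency}"^^xsd:double .',
    ]
    return "\n".join(lines)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
turtle_doc = GSN_PREFIX + "\n" + "\n\n".join(row_to_turtle(r) for r in rows) + "\n"
print(turtle_doc)
```

For production use, a dedicated RDF library would be a safer choice than string formatting, since it handles escaping and datatype serialization.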
3.4. Technical Validation Setup
4. Technical Validation
4.1. Syntactic and Structural Validation
4.2. Logical Consistency and Reasoning
4.3. Semantic Validation (SHACL)
- Each Nanomaterial entry is linked to at least one RemediationMechanism.
- Quantitative indicators (like removal_efficiency_percent) are restricted to numerical ranges (0–100).
- Mandatory provenance metadata (DOI and year) is present for every record.
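The three constraints above can be mirrored in plain Python for a quick check of tabular records before any RDF tooling is involved. This is a sketch of the same rules, not the SHACL shapes themselves. The column name `provenance_publication_year` and the loose DOI pattern are assumptions; the sample records and the placeholder DOI are invented.

```python
import re

# Loose DOI pattern for illustration; accepts a bare DOI or a doi.org URL.
DOI_PATTERN = re.compile(r"^(https?://(dx\.)?doi\.org/)?10\.\d{4,9}/\S+$")


def validate_record(record):
    """Return a list of human-readable violations for one case-study record."""
    violations = []

    # 1. At least one remediation mechanism must be given.
    if not str(record.get("mechanism") or "").strip():
        violations.append("missing mechanism")

    # 2. Removal efficiency must be a number within 0-100.
    try:
        efficiency = float(record.get("removal_efficiency_percent"))
    except (TypeError, ValueError):
        violations.append("removal_efficiency_percent is not numeric")
    else:
        if not 0.0 <= efficiency <= 100.0:
            violations.append("removal_efficiency_percent outside 0-100")

    # 3. Provenance: DOI and publication year must be present.
    doi = str(record.get("provenance_publication_doi") or "").strip()
    if not DOI_PATTERN.match(doi):
        violations.append("missing or malformed DOI")
    # Assumed column name for the year; it is not listed in the column table.
    if not str(record.get("provenance_publication_year") or "").strip():
        violations.append("missing year")

    return violations


good = {
    "mechanism": "Adsorption",
    "removal_efficiency_percent": "95.0",
    "provenance_publication_doi": "10.1234/placeholder",
    "provenance_publication_year": "2023",
}
bad = {
    "mechanism": "",
    "removal_efficiency_percent": "120",
    "provenance_publication_doi": "",
    "provenance_publication_year": "",
}
print(validate_record(good))
print(validate_record(bad))
```

Returning a list of violations rather than raising on the first failure makes it possible to report every problem in a record at once, which is closer to how a SHACL validation report behaves.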
4.4. Competency Question Testing
5. Usage Notes
5.1. Accessing and Exploring the Data
- For Nanotechnologists: The dataset_case_studies.csv file can be opened in any spreadsheet software (Excel, Google Sheets) or R/Python environments for quick benchmarking.
- For Knowledge Engineers: The .ttl file should be loaded into Protégé or a triplestore like Apache Jena Fuseki or GraphDB. Users can then execute the provided SPARQL queries to filter materials by specific green chemistry or performance criteria.
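For quick benchmarking on the tabular side, a filter of the following kind may be enough. The sketch selects cases by remediation mechanism and a minimum removal efficiency, then ranks them. It is a plain-Python analogue of the kind of filtering a SPARQL query would perform and does not reproduce the queries shipped with the dataset. The function name `filter_cases` and the sample records are hypothetical.

```python
def filter_cases(records, mechanism, min_efficiency):
    """Select cases with the given mechanism and efficiency >= threshold,
    ordered from highest to lowest removal efficiency."""
    selected = [
        r for r in records
        if r["mechanism"].lower() == mechanism.lower()
        and float(r["removal_efficiency_percent"]) >= min_efficiency
    ]
    return sorted(
        selected,
        key=lambda r: float(r["removal_efficiency_percent"]),
        reverse=True,
    )


# Invented sample records using column names from the dataset.
records = [
    {"case_id": "CS1", "mechanism": "Adsorption", "removal_efficiency_percent": "95.0"},
    {"case_id": "CS2", "mechanism": "Photocatalysis", "removal_efficiency_percent": "88.5"},
    {"case_id": "CS3", "mechanism": "Adsorption", "removal_efficiency_percent": "72.0"},
    {"case_id": "CS4", "mechanism": "Adsorption", "removal_efficiency_percent": "98.2"},
]

top = filter_cases(records, "adsorption", 90.0)
print([r["case_id"] for r in top])
```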
5.2. Integration and Extensibility
5.3. Software Requirements
- Protégé (v5.5 or higher) is recommended for ontology visualization.
- Python (rdflib library) is suggested for those wishing to programmatically integrate this dataset into machine learning pipelines or larger Knowledge Graphs.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| CSV | Comma-Separated Values |
| DL | Description Logic |
| DOI | Digital Object Identifier |
| FAIR | Findable, Accessible, Interoperable, and Reusable |
| GSN | Green-Synthesized Nanomaterial |
| IRI | Internationalized Resource Identifier |
| JSON | JavaScript Object Notation |
| OWL | Web Ontology Language |
| RDF | Resource Description Framework |
| SHACL | Shapes Constraint Language |
| SPARQL | SPARQL Protocol and RDF Query Language |
| TTL | Terse RDF Triple Language (Turtle) |
| W3C | World Wide Web Consortium |
References
- Recio-Colmenares, C.L.; Recio-Colmenares, R.B.; Castillo-Barrera, F.E.; Garcia-Garcia, C.A. An Ontology-Based Framework for Semantic Integration and Interoperable Assessment of Green-Synthesized Nanomaterials for Environmental Remediation. Appl. Sci. 2026, submitted.
- Arshadi, M.; Faraji, A.R.; Mehravar, M. Green synthesis of magnetic nanoparticles and their application in environmental remediation. J. Clean. Prod. 2023, 410, 137254.
- Schweizer, C.; Thomas, A.; Janka-Ramm, M. Digitalizing Material Knowledge: A Practical Framework for Ontology-Driven Knowledge Graphs in Process Chains. Appl. Sci. 2024, 14, 11683.
- Labra-Gayo, J.E.; Iglesias-Préstamo, Á.; Martín-Fernández, D.; Arnaud, M.A. rudof: A Rust Library for handling RDF data models and Shapes. CEUR Workshop Proc. 2024, 3828, paper 32.
- Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 2016, 3, 160018.
- Recio-Colmenares, C.L.; Recio-Colmenares, R.B.; Castillo-Barrera, F.E.; Garcia-Garcia, C.A. OntoNanoMat: A Semantic Dataset and Ontology for Green-Synthesized Nanomaterials. Zenodo 2026.
- Berners-Lee, T.; Hendler, J.; Lassila, O. The Semantic Web. Sci. Am. 2001, 284, 34–43.
- Titocci, J.; Pulieri, M.; Rosati, I.; Karam, N. Enhancing Trait Thesauri Interoperability Using a Manual and Automated Alignment Approach. Appl. Sci. 2025, 15, 12484.
- Noy, N.F.; McGuinness, D.L. Ontology Development 101: A Guide to Creating Your First Ontology; Stanford Knowledge Systems Laboratory Technical Report KSL-01-05; Stanford University: Stanford, CA, USA, 2001.
| Attribute Group | Column Name | Data Type | Description |
| Identification | case_id | String | Unique identifier for each case study (e.g., CS1, CS2). |
| Identification | nanomaterial_name | String | Common name of the synthesized material. |
| Identification | nanomaterial_type | String | Categorization (e.g., Magnetic nanocomposite, Photocatalyst). |
| Synthesis | synthesis_route | String | Description of the green synthesis procedure. |
| Synthesis | solvent_greenness | String | Qualitative assessment of the solvent (e.g., Low-toxicity). |
| Synthesis | renewable_precursor | Boolean | True if biogenic or renewable reagents were used. |
| Process | mechanism | String | Remediation process (Adsorption or Photocatalysis). |
| Process | contaminant_name | String | Name of the target pollutant (e.g., Methylene blue). |
| Process | pH | Float | Operational acidity/alkalinity during the process. |
| Performance | removal_efficiency_percent | Float | Maximum removal percentage achieved. |
| Performance | qmax_mg_per_g | Float | Maximum adsorption capacity (for adsorption cases). |
| Performance | cycles | Integer | Number of successful recyclability tests reported. |
| Provenance | provenance_publication_doi | String | DOI link to the original source of the data. |
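The column types in the table can be captured in a small typed record, which helps when reading the CSV because every cell arrives as a string. The sketch below is one possible representation, not part of the published dataset. It assumes that missing numeric values are encoded as empty cells and that Booleans are spelled as true/false, 1/0, or yes/no; the helper names and the sample row (including the placeholder DOI) are invented.

```python
from dataclasses import dataclass
from typing import Optional


def parse_bool(text):
    """Accept common CSV spellings of a Boolean (assumed: true/false, 1/0, yes/no)."""
    value = text.strip().lower()
    if value in {"true", "1", "yes"}:
        return True
    if value in {"false", "0", "no"}:
        return False
    raise ValueError(f"not a Boolean: {text!r}")


def parse_optional(text, cast):
    """Return None for an empty cell, otherwise cast the value."""
    text = text.strip()
    return cast(text) if text else None


@dataclass
class CaseStudy:
    case_id: str
    nanomaterial_name: str
    nanomaterial_type: str
    synthesis_route: str
    solvent_greenness: str
    renewable_precursor: bool
    mechanism: str
    contaminant_name: str
    pH: Optional[float]
    removal_efficiency_percent: Optional[float]
    qmax_mg_per_g: Optional[float]  # only reported for adsorption cases
    cycles: Optional[int]
    provenance_publication_doi: str

    @classmethod
    def from_row(cls, row):
        """Build a typed record from a csv.DictReader row (all strings)."""
        return cls(
            case_id=row["case_id"],
            nanomaterial_name=row["nanomaterial_name"],
            nanomaterial_type=row["nanomaterial_type"],
            synthesis_route=row["synthesis_route"],
            solvent_greenness=row["solvent_greenness"],
            renewable_precursor=parse_bool(row["renewable_precursor"]),
            mechanism=row["mechanism"],
            contaminant_name=row["contaminant_name"],
            pH=parse_optional(row["pH"], float),
            removal_efficiency_percent=parse_optional(
                row["removal_efficiency_percent"], float
            ),
            qmax_mg_per_g=parse_optional(row["qmax_mg_per_g"], float),
            cycles=parse_optional(row["cycles"], int),
            provenance_publication_doi=row["provenance_publication_doi"],
        )


# Invented sample row; a photocatalysis case leaves qmax_mg_per_g empty.
row = {
    "case_id": "CS1",
    "nanomaterial_name": "Example nanocomposite",
    "nanomaterial_type": "Photocatalyst",
    "synthesis_route": "Plant-extract-mediated synthesis",
    "solvent_greenness": "Low-toxicity",
    "renewable_precursor": "True",
    "mechanism": "Photocatalysis",
    "contaminant_name": "Methylene blue",
    "pH": "7.0",
    "removal_efficiency_percent": "90.5",
    "qmax_mg_per_g": "",
    "cycles": "5",
    "provenance_publication_doi": "10.1234/placeholder",
}
case = CaseStudy.from_row(row)
print(case)
```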
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.