Computer Science and Mathematics

Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Franc Drobnič, Gregor Starc, Gregor Jurak, Andrej Kos, Matevž Pustišek

Abstract: In the era of ever-greater volumes of data produced and collected, public health research is often limited by the scarcity of data. To improve this, we propose data sharing in the form of Data Spaces, which provide the technical, business, and legal conditions for easier and more trustworthy data exchange among all participants. The data must be described in a commonly understandable way, which can be ensured by machine-readable ontologies. We compared the semantic interoperability technologies used in the European Data Spaces initiatives and applied them to our use case of physical development in children and youth. We propose an ontology describing data from the Analysis of Children’s Development in Slovenia (ACDSi) study in the Resource Description Framework (RDF) format and a corresponding Next Generation Systems Interface-Linked Data (NGSI-LD) data model. For this purpose, we developed a tool that generates an NGSI-LD data model from the information in an RDF ontology. The tool builds on the standard's statement that the NGSI-LD information model follows the RDF graph structure, which makes such a translation feasible. The source RDF ontology is analyzed with the standardized SPARQL Protocol and RDF Query Language (SPARQL), specifically using Property Path queries, and the NGSI-LD data model is generated from the definitions collected in this analysis. Multiple class ancestries are also supported, even across multiple or shared ontologies. These features may allow the tool to be used more broadly in similar contexts.
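
As a minimal illustration of the kind of SPARQL Property Path query such a tool relies on (not the authors' actual implementation), the rdflib sketch below collects a class together with all of its superclasses via rdfs:subClassOf*; the ontology file name and class IRI are hypothetical placeholders.

```python
# Minimal sketch: collect a class and all of its superclasses with a
# SPARQL Property Path query (rdfs:subClassOf*), one ingredient of
# translating an RDF ontology into an NGSI-LD data model.
# The file name and class IRI below are placeholders, not from the paper.
from rdflib import Graph

g = Graph()
g.parse("acdsi_ontology.ttl", format="turtle")  # hypothetical ontology file

query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ancestor WHERE {
    <https://example.org/acdsi#BodyHeightMeasurement> rdfs:subClassOf* ?ancestor .
}
"""

for row in g.query(query):
    print(row.ancestor)
```
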
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Jaba Tkemaladze

Abstract: Sequential data prediction presents a fundamental challenge across domains such as genomics and clinical monitoring, demanding approaches that balance predictive accuracy with computational efficiency. This paper introduces Ze, a novel hybrid system that integrates frequency-based counting with hierarchical Bayesian modeling to address the complex demands of sequential pattern recognition. The system employs a dual-processor architecture with complementary forward and inverse processing strategies, enabling comprehensive pattern discovery. At its core, Ze implements a three-layer hierarchical Bayesian framework operating at individual, group, and context levels, facilitating multi-scale pattern recognition while naturally quantifying prediction uncertainty. Implementation results demonstrate that the hierarchical Bayesian approach achieves an 8.3% accuracy improvement over standard Bayesian methods and 2.3× faster convergence through efficient knowledge sharing. The system maintains practical computational efficiency via sophisticated memory management, including automatic counter reset mechanisms that reduce storage requirements by 45%. Ze's modular, open-source design ensures broad applicability across diverse domains, including genomic sequence annotation, clinical time series forecasting, and real-time anomaly detection, representing a significant advancement in sequential data prediction methodology.
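
The counting layer that Ze combines with its hierarchical Bayesian model can be illustrated, very roughly, by a Laplace-smoothed bigram predictor; the sketch below is a simplification for illustration only and does not reproduce Ze's dual-processor or three-layer hierarchical design.

```python
# Rough sketch of a frequency-based next-symbol predictor with Laplace
# (add-one) smoothing. This illustrates only the counting idea mentioned
# in the abstract, not Ze's three-layer hierarchical Bayesian model.
from collections import defaultdict

class BigramPredictor:
    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, sequence):
        for prev, nxt in zip(sequence, sequence[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, prev):
        # Posterior predictive under a uniform Dirichlet prior (Laplace smoothing).
        total = sum(self.counts[prev].values()) + len(self.alphabet)
        probs = {s: (self.counts[prev][s] + 1) / total for s in self.alphabet}
        return max(probs, key=probs.get), probs

p = BigramPredictor("ACGT")
p.update("ACGTACGTACGA")
print(p.predict("A"))  # 'C' is the most frequent successor of 'A'
```
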
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The triangle finding problem is a cornerstone of complex network analysis, serving as the primitive for computing clustering coefficients and transitivity. This paper presents \texttt{Aegypti}, a practical algorithm for triangle detection and enumeration in undirected graphs. By combining a descending degree-ordered vertex-iterator with a hybrid strategy that adapts to graph density, \texttt{Aegypti} ensures a worst-case runtime of $\mathcal{O}(m^{3/2})$ for full enumeration, matching the theoretical limit for listing algorithms. Furthermore, we analyze the detection variant ($\texttt{first\_triangle}=\text{True}$), proving that sorting by non-increasing degree enables immediate termination in dense instances and sub-millisecond detection in scale-free networks. Extensive experiments confirm speedups of $10\times$ to $400\times$ over NetworkX, establishing \texttt{Aegypti} as the fastest pure-Python approach currently available.
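
For orientation, the classical degree-ordered listing scheme that attains the same $\mathcal{O}(m^{3/2})$ bound can be sketched in a few lines; this is a generic baseline, not the Aegypti implementation or its hybrid density-adaptive strategy.

```python
# Sketch of a classical degree-ordered triangle enumerator (not the Aegypti
# code itself): rank vertices by non-increasing degree, orient every edge
# from the lower-ranked to the higher-ranked endpoint, and intersect
# out-neighbourhoods. Each triangle is listed exactly once, in O(m^{3/2}).
def triangles(adj):
    """adj: dict mapping each vertex to a set of neighbours (undirected)."""
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    rank = {v: i for i, v in enumerate(order)}
    out = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}
    for u in adj:
        for v in out[u]:
            for w in out[u] & out[v]:
                yield (u, v, w)

g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
print(list(triangles(g)))  # two triangles: {0, 1, 2} and {0, 2, 3}
```
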
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The Minimum Vertex Cover (MVC) problem is a fundamental NP-complete problem in graph theory that seeks the smallest set of vertices covering all edges in an undirected graph G = (V, E). This paper presents the find_vertex_cover algorithm, an innovative approximation method that transforms the problem to maximum degree-1 instances via auxiliary vertices. The algorithm computes solutions using weighted dominating sets and vertex covers on reduced graphs, enhanced by ensemble heuristics including maximum-degree greedy and minimum-to-minimum strategies. Our approach guarantees an approximation ratio strictly less than √2 ≈ 1.414, which would contradict known hardness results unless P = NP. This theoretical implication represents a significant advancement beyond classical approximation bounds. The algorithm operates in O(m log n) time for n vertices and m edges, employing component-wise processing and linear-space reductions for efficiency. Implemented in Python as the Hvala package, it demonstrates excellent performance on sparse and scale-free networks, with profound implications for complexity theory. The achievement of a sub-√2 approximation ratio, if validated, would resolve the P versus NP problem in the affirmative. This work enables near-optimal solutions for applications in network design, scheduling, and bioinformatics while challenging fundamental assumptions in computational complexity.
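
One of the ensemble components named above, the maximum-degree greedy heuristic, is a classical routine that can be sketched on its own; the code below illustrates only that heuristic, not the find_vertex_cover reduction or the Hvala package.

```python
# Sketch of the classical maximum-degree greedy vertex-cover heuristic,
# one of the ensemble components named in the abstract (not the full
# find_vertex_cover algorithm or the Hvala package).
import networkx as nx

def greedy_max_degree_cover(G):
    H = G.copy()
    cover = set()
    while H.number_of_edges() > 0:
        v = max(H.degree, key=lambda item: item[1])[0]  # highest-degree vertex
        cover.add(v)
        H.remove_node(v)  # removes all edges incident to v
    return cover

G = nx.path_graph(5)               # edges: 0-1, 1-2, 2-3, 3-4
print(greedy_max_degree_cover(G))  # e.g. {1, 3}
```
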
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Kevin Mallinger, Sebastian Raubitzek, Sebastian Schrittwieser, Edgar Weippl

Abstract: Noise can substantially distort both chaotic and physiological dynamics, obscuring deterministic patterns and altering the apparent complexity of signals. Accurately identifying and characterizing such perturbations is essential for reliable analysis of dynamical and biomedical systems. This study combines complexity-based features with supervised learning to characterize and predict noise perturbations in time series data. Using two chaotic systems (Rössler and Lorenz) and synthetic electrocardiogram (ECG) signals, we generated controlled Gaussian, pink, and low-frequency noise of varying intensities and extracted a diverse set of 18 complexity metrics derived from both raw signals and phase-space embeddings. The analysis systematically evaluates how these metrics behave under different noise regimes and intensities and identifies the most discriminative features for noise classification tasks. Approximate Entropy, Mean Absolute Deviation, and Condition Number emerged as the strongest predictors for noise intensity, while Condition Number, Sample Entropy, and Permutation Entropy most effectively differentiated noise categories. Across all systems, the proposed framework reached an average accuracy of 99.9% for noise presence and type classification and 96.2% for noise intensity, significantly surpassing previously reported benchmarks for noise characterization in chaotic and physiological time series. These results demonstrate that complexity metrics encode both structural and statistical signatures of stochastic contamination.
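
Among the 18 complexity metrics, permutation entropy is simple enough to sketch in full; the version below is a standalone sketch, not the authors' feature pipeline, and computes normalized order-pattern entropy over delay-embedded windows.

```python
# Minimal sketch of permutation entropy, one of the complexity metrics
# named in the abstract (not the authors' full 18-feature pipeline).
import math
from collections import Counter

def permutation_entropy(x, order=3, delay=1, normalize=True):
    patterns = Counter()
    n = len(x) - (order - 1) * delay
    for i in range(n):
        window = [x[i + j * delay] for j in range(order)]
        # Ordinal pattern: the ranks of the values inside the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    probs = [c / n for c in patterns.values()]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(math.factorial(order)) if normalize else h

print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=3))
```
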
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: We propose a new hypothetical framework to explore the relationship between the Birch--Swinnerton-Dyer conjecture (BSD) and computational complexity theory. This paper introduces two central conjectures: a Reduction Conjecture, which posits the existence of a polynomial-time parsimonious reduction from \#P-complete problems to the counting of integer solutions for Tunnell's equations, and a Solution Density Conjecture, which posits that these solution counts are sufficiently well-distributed. We then formally prove that if these two conjectures are true, a surprising conditional result follows: the assumptions of P = NP and the truth of the BSD conjecture would imply the collapse of the counting complexity class \#P to FP. Our main conditional result is that if BSD is true, our conjectures hold, and the widely believed separation \#P $\neq$ FP holds, then P $\neq$ NP. This work reframes the P versus NP problem as a question about two deep, open problems in number theory: the existence of a "complexity-to-arithmetic" reduction and the statistical distribution of solution counts to Tunnell's specific quadratic forms.
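
The counting objects the conjectures refer to are concrete enough to compute directly; as a small illustration only (not the proposed reduction), the brute-force counter below evaluates Tunnell's criterion for odd squarefree n, where n being congruent implies 2A(n) = B(n), with the converse holding under BSD.

```python
# Brute-force count of integer solutions to Tunnell's ternary forms for odd
# squarefree n (illustrating the objects the conjectures talk about, not the
# paper's proposed reduction).  Tunnell's theorem: for odd squarefree n,
# if n is congruent then 2*A(n) = B(n); assuming BSD the converse also holds.
import math

def count_solutions(n, a, b, c):
    """Count (x, y, z) in Z^3 with a*x^2 + b*y^2 + c*z^2 == n."""
    total = 0
    for x in range(-math.isqrt(n // a), math.isqrt(n // a) + 1):
        for y in range(-math.isqrt(n // b), math.isqrt(n // b) + 1):
            rest = n - a * x * x - b * y * y
            if rest < 0 or rest % c:
                continue
            z = math.isqrt(rest // c)
            if c * z * z == rest:
                total += 2 if z else 1  # +z and -z, or just z == 0
    return total

for n in (1, 41):  # 1 is not a congruent number; 41 is
    A = count_solutions(n, 2, 1, 32)  # 2x^2 + y^2 + 32z^2 = n
    B = count_solutions(n, 2, 1, 8)   # 2x^2 + y^2 + 8z^2  = n
    print(n, A, B, "Tunnell criterion 2A == B:", 2 * A == B)
```
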
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Xinqi Zheng, Haiyan Liu

Abstract: This paper introduces a computational framework called the Periodic Pairing Matrix model, which provides a foundation for modeling periodic interactions between cyclic sequences. The model formalizes pairing between two sequences using configurable step sizes, creating a deterministic pattern that repeats over a fixed cycle. We present a comprehensive analysis of its properties, including periodic behavior, coverage conditions, and vacancy patterns. The framework offers a unified approach to resource scheduling, sparse neural network design, and other domains that benefit from structured periodic interactions. Through case studies in distributed computing and deep learning, we show improvements in resource utilization, reaching full utilization under optimal settings, and faster neural network training with gains of about forty percent, while maintaining performance. The Periodic Pairing Matrix model enables principled design of systems with predictable and analyzable periodic behavior.
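
The abstract does not spell out the pairing rule, so the following sketch is only one speculative reading of a "periodic pairing with configurable step sizes": at step t, element (t·s_A mod |A|) of one cyclic sequence is paired with element (t·s_B mod |B|) of the other, and the schedule repeats with the induced period. The rule and all names are assumptions made for illustration, not the paper's formal model.

```python
# Purely speculative sketch of one way a periodic pairing between two cyclic
# sequences with configurable step sizes could look; the rule and names are
# illustrative assumptions, not the Periodic Pairing Matrix model itself.
from math import gcd, lcm

def pairing_schedule(len_a, len_b, step_a=1, step_b=1):
    # The pair ((t*step_a) mod len_a, (t*step_b) mod len_b) repeats once both
    # coordinates return to 0, i.e. after lcm of the two sub-periods.
    period = lcm(len_a // gcd(step_a, len_a), len_b // gcd(step_b, len_b))
    return [((t * step_a) % len_a, (t * step_b) % len_b) for t in range(period)]

print(pairing_schedule(4, 6))        # unit steps: period lcm(4, 6) = 12
print(pairing_schedule(4, 6, 2, 3))  # coarser steps shrink the cycle to 2
```
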
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Ibrahim Mammadov, Pavel Loskot, Thomas Honold

Abstract: Many data processing applications involve binary matrices for storing digital content and employ the methods of linear algebra. One of the frequent tasks is to invert large binary matrices, yet at present there seem to be no documented algorithms for inverting such matrices. This paper fills the gap by reporting three results. First, an efficient and provably correct recursive blockwise algorithm based on pivoted PLU factorization is reported for inverting binary matrices of sizes as large as several thousand bits. Second, assuming the Bruhat matrix decomposition, a fast method is developed for effectively enumerating all elements of the general linear groups. Third, the minimum number of bit-flips is determined that makes any singular binary matrix non-singular, and thus invertible. The proposed algorithms are implemented in C++ and are publicly available on GitHub. These results can be readily generalized to other finite fields, for example, to enable linear algebraic methods for matrices containing quantized values.
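
As a point of reference for the task (and not the recursive blockwise PLU algorithm reported in the paper, which is implemented in C++), a plain Gauss-Jordan inverse over GF(2) can be written in a few lines; row addition modulo 2 is simply XOR.

```python
# Baseline Gauss-Jordan inversion over GF(2), shown only to illustrate the
# task; the recursive blockwise PLU algorithm reported in the paper (C++,
# on GitHub) is the approach meant for matrices with thousands of rows.
def gf2_inverse(A):
    n = len(A)
    # Augment [A | I]; all arithmetic is modulo 2, so row addition is XOR.
    M = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

B = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 1]]
print(gf2_inverse(B))  # [[1, 1, 0], [0, 1, 0], [0, 0, 1]]; singular input raises
```
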
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Isaac Oliva-González, Hugo Jiménez-Hernández

Abstract: The shortest path problem remains a central challenge in graph optimization, particularly for dense or large-scale networks where classical algorithms face scalability limitations. This paper presents a QUBO-based Simulated Annealing (QUBO-SA) approach to solving shortest path instances, employing an existing Quadratic Unconstrained Binary Optimization (QUBO) formulation that jointly encodes path costs and structural constraints. The approach is evaluated on both synthetic graphs, spanning sparse to dense connectivity regimes, and a real-world urban transportation network extracted from the downtown area of Querétaro, Mexico. Performance is quantified through probabilistic reliability metrics, including success probability, Time-to-Solution (TTS), and the relative runtime ratio R(p_target), benchmarked against the deterministic Dijkstra algorithm. Results show that QUBO-SA achieves near-optimal performance for small to medium graphs and maintains competitive efficiency for large-scale and sparse urban networks. In particular, the solver achieves a success probability of 0.57 on a workspace of 443 nodes corresponding to an urban graph, requiring only 35% more runtime than Dijkstra to reach 99% confidence, with this gap halved when the confidence is relaxed to 90%. These results highlight the balance between the method’s exploration and the computational resources needed, demonstrating its potential for scalable path optimization in both synthetic and real-world networks.
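
One common way to cast the shortest path problem as a QUBO (offered here only as an illustrative formulation; the exact encoding used in the paper may differ) assigns a binary variable \( x_{uv} \) to each directed edge and penalizes violations of unit-flow conservation between source \( s \) and target \( t \):

\[
H(\mathbf{x}) \;=\; \sum_{(u,v)\in E} w_{uv}\, x_{uv}
\;+\; \lambda \sum_{v\in V}\Bigl(\sum_{(v,u)\in E} x_{vu} \;-\; \sum_{(u,v)\in E} x_{uv} \;-\; b_v\Bigr)^{2},
\qquad
b_s = 1,\; b_t = -1,\; b_v = 0 \text{ otherwise.}
\]

With strictly positive weights and \( \lambda \) larger than the total edge weight, every ground state of \( H \) selects exactly the edges of a minimum-cost \( s \)-\( t \) path, and simulated annealing samples low-energy assignments \( \mathbf{x} \in \{0,1\}^{|E|} \).
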
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The Minimum Vertex Cover problem, a cornerstone of NP-hard optimization, has long been bounded by approximation hardness thresholds, including the Unique Games Conjecture's assertion that no polynomial-time algorithm can achieve a ratio better than \( 2 - \epsilon \) for any \( \epsilon > 0 \). This paper introduces a novel reduction-based algorithm that computes a vertex cover with an approximation ratio strictly less than 2 for any finite undirected graph with at least one edge, thereby disproving the Unique Games Conjecture. The approach transforms the input graph into an auxiliary graph with maximum degree 1 using degree-weighted auxiliary vertices, solves the minimum weighted vertex cover optimally in linear time, and projects the solution back to yield a valid cover. We provide a rigorous correctness proof, demonstrating that the projection preserves coverage while exploiting structural slack in finite graphs to ensure the strict sub-2 ratio. Runtime analysis confirms \( O(|V| + |E|) \) efficiency, making the algorithm practical for large-scale instances. This breakthrough not only advances approximation techniques for vertex cover but also resolves a major open question in complexity theory, opening avenues for revisiting UGC-dependent hardness results in other optimization domains.
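
The inner subroutine described above, an exact minimum weighted vertex cover on a graph of maximum degree 1, is straightforward in isolation: such a graph is a disjoint union of single edges and isolated vertices, so the lighter endpoint of each edge is kept. The sketch below shows only that step, not the degree-weighted reduction or the projection back to the original graph.

```python
# Exact minimum weighted vertex cover when every vertex has degree <= 1:
# the graph is a disjoint union of edges and isolated vertices, so take the
# lighter endpoint of each edge.  This illustrates only the inner subroutine
# described in the abstract, not the full reduction/projection pipeline.
def min_weight_cover_degree_one(edges, weight):
    cover = set()
    for u, v in edges:
        cover.add(u if weight[u] <= weight[v] else v)
    return cover

edges = [("a", "b"), ("c", "d")]
weight = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 5.0}
print(min_weight_cover_degree_one(edges, weight))  # {'b', 'c'}
```
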
Communication
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Alan Z

Abstract: According to conventional wisdom, the relationship between P and NP must be one of two possibilities: either P=NP or P≠NP. Unlike traditional approaches that base mathematical concepts on equivalent transformations—and, by extension, on the principle that correspondence remains unchanged—this theory is founded on non-equivalent transformations. By constructing a special non-equivalent transformation, I will demonstrate that for a problem P(a) in the complexity class P and its corresponding problem P(b) in the complexity class NP, P(a) is a P non-equivalent transformation of P(b), and P(b) is an NP non-equivalent transformation of P(a). That is, the relationship between P(a) and P(b) is neither P=NP nor P≠NP.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Sascha Eichstädt, Jens Niederhausen

Abstract: Data spaces are digital realms of data and information shared between stakeholders and peer groups. They underpin several developments in sectors ranging from the automotive industry and the social sciences to governmental networks. Digital traceability of information in data spaces is needed to validate statements about metadata, data quality, and data features. In many cases this also translates directly to metrological traceability of measurements to the SI. The concept and development of digital product passports bring these traceability aspects together to form a tool for a digital quality infrastructure. This paper outlines the general principles of digital metrological traceability based on digital certificates, a digital international system of units, and digital product passports.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The Maximum Independent Set (MIS) problem, a core NP-hard problem in graph theory, seeks the largest subset of vertices in an undirected graph $G = (V, E)$ with $n$ vertices and $m$ edges, such that no two vertices are adjacent. We present a hybrid approximation algorithm that combines iterative refinement with greedy selections based on minimum and maximum degrees, plus a low-degree induced subgraph heuristic, implemented using NetworkX. The algorithm preprocesses the graph to handle trivial cases and isolates, computes exact solutions for bipartite graphs using Hopcroft-Karp matching and K\"onig's theorem, and, for non-bipartite graphs, iteratively refines a candidate set via maximum spanning trees and their maximum independent sets, followed by a greedy extension. It also constructs independent sets by selecting vertices in increasing and decreasing degree orders, and computes an independent set on the induced subgraph of low-degree vertices (degree strictly less than maximum), returning the largest of the four sets. An efficient $O(m)$ independence check ensures correctness. The algorithm guarantees a valid, maximal independent set with a worst-case $\sqrt{n}$-approximation ratio, tight for graphs with a large clique connected to a small independent set, and robust for structures like multiple cliques sharing a universal vertex. With a time complexity of $O(n m \log n)$, it is suitable for small-to-medium graphs, particularly sparse ones. While outperformed by $O(n / \log n)$-ratio algorithms for large instances, it aligns with inapproximability results, as MIS cannot be approximated better than $O(n^{1-\epsilon})$ unless $\text{P} = \text{NP}$. Its simplicity, correctness, and robustness make it ideal for applications like scheduling and network design, and an effective educational tool for studying trade-offs in combinatorial optimization, with potential for enhancement via parallelization or heuristics.
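
The bipartite branch described above maps directly onto existing NetworkX primitives: a Hopcroft-Karp maximum matching, König's theorem to turn it into a minimum vertex cover, and complementation to obtain the maximum independent set. A brief sketch of that exact case follows; the toy graph is illustrative only.

```python
# Exact maximum independent set on a bipartite graph via Hopcroft-Karp
# matching and König's theorem, the special case handled exactly in the
# abstract.  The toy graph below is only an example.
import networkx as nx
from networkx.algorithms import bipartite

G = nx.Graph([("u1", "v1"), ("u1", "v2"), ("u2", "v2"), ("u3", "v3")])
top = {"u1", "u2", "u3"}

matching = bipartite.maximum_matching(G, top_nodes=top)        # Hopcroft-Karp
cover = bipartite.to_vertex_cover(G, matching, top_nodes=top)  # König's theorem
independent_set = set(G) - cover                               # complement of min VC
print(independent_set)
```
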
Essay
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Ruixue Zhao

Abstract: This paper presents a general algorithm for rapidly generating all N×N Latin squares, along with a precise counting framework and polynomial-time isomorphism (quasigroup) algorithms. It also introduces efficient algorithms for solving Latin-square filling problems. Numerous combinatorial isomorphism problems, including Steiner triple systems, Mendelsohn triple systems, 1-factorizations, networks, affine planes, and projective planes, can be reduced to Latin square isomorphism. Since groups form a proper subset of quasigroups and group isomorphism is a subproblem of quasigroup isomorphism, this would automatically place group isomorphism in P. A Latin square of order N is an N×N matrix in which each row and each column contains N distinct symbols, each appearing exactly once. A matrix derived from a quasigroup multiplication table forms an N-order Latin square; conversely, a binary operation derived from an N-order Latin square used as a multiplication table constitutes a quasigroup over the set Q. I discovered four new algebraic structures that remain invariant under permutation of rows and columns, called quadrilateral squares. All N×N Latin squares can be constructed using three or all four of these quadrilateral squares. Leveraging the algebraic properties of quadrilateral squares that are unchanged by permutation, we designed an algorithm that generates all N×N Latin squares without repetition under permutation, resulting in the first universal and non-repetitive algorithm for Latin square generation. Building on this, we established a precise counting framework for Latin squares. The generation algorithm further reveals deeper structural aspects of Latin squares (quasigroups). Through studying these structures, we derived a crucial theorem: two Latin squares are isomorphic if their subline modularity structures are identical. Based on this key theorem, combined with other structural connections discussed in this paper, a polynomial-time algorithm for Latin square isomorphism has been designed. This algorithm can also be applied directly to quasigroup isomorphism, with a time complexity of $\frac{5}{16}(n^5 - 2n^4 - n^3 + 2n^2) + 2n^3$. Furthermore, additional symmetry properties of Latin squares (quasigroups) were uncovered. The problem of filling a partial Latin square is a classic NP-complete problem, and solving a fillable Latin grid can be viewed as generating grids that satisfy constraints. By leveraging the connections between parametric group algebra structures revealed in this paper, we have designed a fast and accurate algorithm for solving fillable Latin grids. I believe the ultimate solution to NP-complete problems lies within these connections between parametric group algebra structures, as they directly affect both the speed of solving fillable Latin grids and the derivation of precise counting formulas for Latin grids.
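
For reference, the filling problem the essay targets can be stated with a plain backtracking solver; this is only a baseline for comparison, not the quadrilateral-square generation or isomorphism algorithms proposed above. Empty cells are marked 0 and symbols run from 1 to N.

```python
# Plain backtracking solver for the Latin-square filling problem, given only
# as a point of reference; it is not the quadrilateral-square based generation
# or isomorphism algorithm described in the essay.  0 marks an empty cell.
def fill_latin_square(grid):
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 0:
                used = set(grid[r]) | {grid[i][c] for i in range(n)}
                for s in range(1, n + 1):
                    if s not in used:
                        grid[r][c] = s
                        if fill_latin_square(grid):
                            return True
                        grid[r][c] = 0
                return False
    return True

square = [[1, 0, 0, 0],
          [0, 0, 3, 0],
          [0, 0, 0, 2],
          [0, 3, 0, 0]]
if fill_latin_square(square):
    for row in square:
        print(row)
```
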
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Sungwook Yoon

Abstract: Organizing vast, heterogeneous enterprise documents is a critical challenge, as traditional methods fail to capture the dynamic, multi-dimensional context (e.g., priority, workflow) that defines a document's true utility. This paper introduces FLACON (Flag-Aware Context-sensitive Clustering), a novel system that addresses this gap. FLACON models documents using a six-dimensional flag system that unifies semantic, temporal, priority, workflow, and relational contexts, and organizes them within an information-theoretic framework. The core objective is to minimize clustering entropy while maximizing the preservation of contextual information. The approach addresses gaps where context-aware systems lack domain-specific intelligence and LLM methods require prohibitive computational resources. FLACON provides deterministic, cost-effective organization with a 7-fold performance improvement over LLM approaches while achieving 89% of their clustering quality. Evaluation on nine dataset variations demonstrates significant improvements, with Silhouette Scores of 0.311 versus 0.040 for traditional methods, representing 7.8-fold gains. The system demonstrates O(n log n) scalability and deterministic behavior suitable for compliance requirements.
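
Two of the quantities the abstract relies on, clustering entropy and the Silhouette Score, can be made concrete with a short sketch on synthetic data; the six-dimensional flag features themselves are not reproduced, and the entropy definition used here (Shannon entropy of the cluster-size distribution) is one plausible reading rather than FLACON's exact objective.

```python
# Small sketch of two quantities mentioned in the abstract: the Shannon
# entropy of a clustering's label distribution and the Silhouette Score.
# The flag-aware feature construction is not reproduced; the data is synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def clustering_entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 4)), rng.normal(3, 0.3, (50, 4))])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print("entropy   :", clustering_entropy(labels))   # ~1 bit for two balanced clusters
print("silhouette:", silhouette_score(X, labels))
```
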
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Guillermo De Ita Luna

Abstract: A combinatorial tree \( Pt \) is used to record the valid colors that can be assigned to each vertex of a planar graph \( G \) of order \( n \). The main process consists of a loop that incrementally builds \( G \) vertex by vertex, starting from the innermost triangular face of \( G \); after \( n-3 \) iterations, the paths constructed in \( Pt \) carry the valid color labels assigned to the vertices of \( G \). This method ultimately generates all proper 4-colorings of \( G \). In each iteration, a vertex \( v_i \in V(G) \) is selected and aggregated to the current induced subgraph \( G_i \) of \( G \). This process, together with the use of the \( Pt \) tree (which results in a binary tree of depth \( n-3 \)), yields all proper 4-colorings of \( G \), regardless of the topology of the maximal planar graph \( G \). Additionally, we prove an existence theorem showing that every maximal planar graph \( G \) admits a proper 4-coloring, and we detail a method by which such a 4-coloring can be constructed. We also present the extremal topologies of planar graphs, highlighting those with the maximum and minimum numbers of 4-colorings.
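
To make the counted objects concrete, a plain backtracking enumerator of proper 4-colorings is sketched below; it is not the \( Pt \)-tree construction described in the abstract, and the example graph (the complete graph on four vertices, a small maximal planar graph) is illustrative only.

```python
# Plain backtracking enumeration of all proper 4-colorings of a small graph,
# shown only to make the counted objects concrete; this is not the Pt-tree
# construction described in the abstract.
def four_colorings(adj, colors=4):
    vertices = list(adj)
    def extend(i, assignment):
        if i == len(vertices):
            yield dict(assignment)
            return
        v = vertices[i]
        for c in range(colors):
            if all(assignment.get(u) != c for u in adj[v]):
                assignment[v] = c
                yield from extend(i + 1, assignment)
                del assignment[v]
    yield from extend(0, {})

# K4: a triangular face plus one more vertex, the smallest maximal planar
# graph on four vertices.
k4 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
print(sum(1 for _ in four_colorings(k4)))  # 4! = 24 proper 4-colorings
```
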
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Jaba Tkemaladze

Abstract: The Voynich Manuscript (VM) remains one of history's most perplexing cryptographic and linguistic puzzles (Landini & Foti, 2020). This paper introduces a novel hypothesis: that the VM's text is not a direct encoding of a natural language but represents a compressed data stream utilizing principles analogous to modern LZ77 compression and Huffman coding (Huffman, 1952; Ziv & Lempel, 1977). We propose that the manuscript's unusual statistical properties, including its low redundancy and specific word structure, are artifacts of a sophisticated encoding process rather than features of an unknown language (Montemurro & Zanette, 2013; Reddy & Knight, 2011). To evaluate this, we developed a computational framework that treats VM transliterations as an encoded bitstream. The framework systematically tests decompression parameters, using Shannon entropy as the primary fitness metric to identify outputs resembling natural language (Shannon, 1948; Cover & Thomas, 2006). While a complete decipherment has not yet been achieved, this methodology provides a new, rigorous, and reproducible computational approach to VM analysis, moving beyond traditional linguistic correlation (Hauer & Kondrak, 2011). The framework's architecture and initial proof-of-concept results are presented, outlining a clear pathway for future research with a fully digitized VM corpus.
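
The fitness metric itself is easy to make concrete: character-level Shannon entropy of a candidate decompression, with language-like outputs expected near the roughly 4 bits per character of English letter text rather than the higher values of near-random streams. The sample strings below are placeholders, not VM transliterations.

```python
# Character-level Shannon entropy, the fitness metric the framework uses to
# rank candidate decompressions; language-like values (~4 bits/char for
# English letters) indicate more promising output than near-random streams.
# The sample strings are placeholders, not VM transliterations.
import math
from collections import Counter

def shannon_entropy(text):
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

candidates = {
    "language-like": "the quick brown fox jumps over the lazy dog",
    "near-random":   "qxv9z!k2p@7r#m4w8t&j1s%h6d^f3g0b5n",
}
for name, text in candidates.items():
    print(f"{name:14s} {shannon_entropy(text):.2f} bits/char")
```
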
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Michael Rey

Abstract: Computation has historically been framed in Newtonian terms: time and space as separate, absolute measures of algorithmic cost. Yet modern algorithms—from randomized heuristics to quantum circuits—demand a relativistic view where time, space, energy, entropy, and coherence form a unified manifold. This work introduces computational relativity: a geometric theory of algorithms built on spacetime geodesics, entropy trade-offs, and quantum coherence dynamics. We show how classical complexity results like the Hopcroft-Paul-Valiant theorem and recent improvements by Williams emerge naturally as specific geodesic types in this framework. The theory extends through thermodynamic principles to encompass stochastic algorithms and energy consumption, then to quantum coherence for quantum computing applications. This progression motivates living algorithms—self-monitoring systems that dynamically optimize their computational trajectories in real-time. We demonstrate applications across machine learning, quantum computing, robotics, and optimization, concluding with a comprehensive algorithm compendium that classifies computational methods by their geometric properties and resource trade-offs.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Asila Muxitdinova

Abstract: In our rapidly evolving era, Artificial Intelligence (AI) has emerged as a transformative technology that significantly contributes to various aspects of human life. This paper explores the definition of AI and its true capabilities, tracing its origins and highlighting the early ideas that inspired generations of scientists. The discussion then shifts to the present, emphasizing AI’s importance through examples of its diverse applications. Despite these benefits, public concern and fear about AI’s rapid evolution persist. This paper examines common reasons behind these fears, analyzing them from scientific and logical perspectives to provide insights and recommendations. Finally, it reflects on human adaptation and the valuable role AI plays in our ongoing evolution, with the ultimate aim of demonstrating that many negative assumptions and fears are exaggerated. By offering a broad, personal perspective on these concerns, the paper encourages readers to embrace AI as a partner in progress—coexisting with it and welcoming its integration into daily life with open arms.
Review
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Ali Raza, Fatima Khan, Zhen Bin It, Jovan Bowen Heng, Tee Hui Teo

Abstract: This review synthesizes recent advancements in federated learning (FL) frameworks tailored for sensitive domains such as mental healthcare, medical imaging, and non-IID data simulation. One past study presents a hybrid privacy-preserving FL model that integrates clustered federated learning (CFL) and quantum federated learning (QFL) to enhance accuracy and privacy in stress detection using wearable devices. Other studies introduce FedArtML, a novel tool for generating controlled non-IID datasets, offering quantifiable metrics such as the Jensen–Shannon and Hellinger distances to assess data heterogeneity. Furthermore, a recent paper proposes a transfer learning-based FL architecture for breast cancer classification using mammography images, combining feature extraction with federated averaging to ensure privacy and robust diagnostic accuracy. Collectively, these works address key challenges in FL, including client heterogeneity, data imbalance, privacy preservation, and system performance. This review highlights the complementary strengths of hybrid architectures, synthetic data partitioning, and transfer learning in advancing real-world applications of federated learning in healthcare.
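
The heterogeneity metrics attributed to FedArtML, the Jensen–Shannon and Hellinger distances, can be computed between per-client label distributions as in the sketch below; the distributions shown are toy values, not data from the reviewed studies.

```python
# Jensen-Shannon and Hellinger distances between two clients' label
# distributions, the heterogeneity metrics mentioned for FedArtML.
# The distributions are toy examples, not data from the reviewed studies.
import numpy as np
from scipy.spatial.distance import jensenshannon

def hellinger(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

client_a = [0.70, 0.20, 0.10]   # heavily skewed label distribution
client_b = [0.30, 0.40, 0.30]   # closer to uniform

print("Jensen-Shannon:", jensenshannon(client_a, client_b, base=2))
print("Hellinger     :", hellinger(client_a, client_b))
```
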
