Computer Science and Mathematics

Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Franc Drobnič, Gregor Starc, Gregor Jurak, Andrej Kos, Matevž Pustišek

Abstract: In the era of ever-greater volumes of data produced and collected, public health research is often limited by the scarcity of data. To improve this, we propose data sharing in the form of Data Spaces, which provide the technical, business, and legal conditions for easier and more trustworthy data exchange among all participants. The data must be described in a commonly understandable way, which can be ensured by machine-readable ontologies. We compared the semantic interoperability technologies used in the European Data Spaces initiatives and applied them to our use case of physical development in children and youth. We propose an ontology describing data from the Analysis of Children’s Development in Slovenia (ACDSi) study in the Resource Description Framework (RDF) format and a corresponding Next Generation Systems Interface-Linked Data (NGSI-LD) data model. For this purpose, we developed a tool that generates an NGSI-LD data model from the information in an RDF ontology. The tool builds on the standard's statement that the NGSI-LD information model follows the RDF graph structure, which makes such a translation feasible. The source RDF ontology is analyzed with the standardized SPARQL Protocol and RDF Query Language (SPARQL), specifically using Property Path queries, and the NGSI-LD data model is generated from the definitions collected in this analysis. Multiple class ancestries are also supported, even across multiple or shared ontologies. These features may allow the tool to be used more broadly in similar contexts.
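
As a minimal illustration of the kind of SPARQL Property Path query such a tool relies on (not the authors' actual implementation), the rdflib sketch below collects a class together with all of its superclasses via rdfs:subClassOf*; the ontology file name and class IRI are hypothetical placeholders.

```python
# Minimal sketch: collect a class and all of its superclasses with a
# SPARQL Property Path query (rdfs:subClassOf*), one ingredient of
# translating an RDF ontology into an NGSI-LD data model.
# The file name and class IRI below are placeholders, not from the paper.
from rdflib import Graph

g = Graph()
g.parse("acdsi_ontology.ttl", format="turtle")  # hypothetical ontology file

query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ancestor WHERE {
    <https://example.org/acdsi#BodyHeightMeasurement> rdfs:subClassOf* ?ancestor .
}
"""

for row in g.query(query):
    print(row.ancestor)
```
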
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Jaba Tkemaladze

Abstract: Sequential data prediction presents a fundamental challenge across domains such as genomics and clinical monitoring, demanding approaches that balance predictive accuracy with computational efficiency. This paper introduces Ze, a novel hybrid system that integrates frequency-based counting with hierarchical Bayesian modeling to address the complex demands of sequential pattern recognition. The system employs a dual-processor architecture with complementary forward and inverse processing strategies, enabling comprehensive pattern discovery. At its core, Ze implements a three-layer hierarchical Bayesian framework operating at individual, group, and context levels, facilitating multi-scale pattern recognition while naturally quantifying prediction uncertainty. Implementation results demonstrate that the hierarchical Bayesian approach achieves an 8.3% accuracy improvement over standard Bayesian methods and 2.3× faster convergence through efficient knowledge sharing. The system maintains practical computational efficiency via sophisticated memory management, including automatic counter reset mechanisms that reduce storage requirements by 45%. Ze's modular, open-source design ensures broad applicability across diverse domains, including genomic sequence annotation, clinical time series forecasting, and real-time anomaly detection, representing a significant advancement in sequential data prediction methodology.
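
The counting layer that Ze combines with its hierarchical Bayesian model can be illustrated, very roughly, by a Laplace-smoothed bigram predictor; the sketch below is a simplification for illustration only and does not reproduce Ze's dual-processor or three-layer hierarchical design.

```python
# Rough sketch of a frequency-based next-symbol predictor with Laplace
# (add-one) smoothing. This illustrates only the counting idea mentioned
# in the abstract, not Ze's three-layer hierarchical Bayesian model.
from collections import defaultdict

class BigramPredictor:
    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, sequence):
        for prev, nxt in zip(sequence, sequence[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, prev):
        # Posterior predictive under a uniform Dirichlet prior (Laplace smoothing).
        total = sum(self.counts[prev].values()) + len(self.alphabet)
        probs = {s: (self.counts[prev][s] + 1) / total for s in self.alphabet}
        return max(probs, key=probs.get), probs

p = BigramPredictor("ACGT")
p.update("ACGTACGTACGA")
print(p.predict("A"))  # 'C' is the most frequent successor of 'A'
```
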
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The triangle finding problem is a cornerstone of complex network analysis, serving as the primitive for computing clustering coefficients and transitivity. This paper presents \texttt{Aegypti}, a practical algorithm for triangle detection and enumeration in undirected graphs. By combining a descending degree-ordered vertex-iterator with a hybrid strategy that adapts to graph density, \texttt{Aegypti} ensures a worst-case runtime of $\mathcal{O}(m^{3/2})$ for full enumeration, matching the theoretical limit for listing algorithms. Furthermore, we analyze the detection variant ($\texttt{first\_triangle}=\text{True}$), proving that sorting by non-increasing degree enables immediate termination in dense instances and sub-millisecond detection in scale-free networks. Extensive experiments confirm speedups of $10\times$ to $400\times$ over NetworkX, establishing \texttt{Aegypti} as the fastest pure-Python approach currently available.
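
For orientation, the classical degree-ordered listing scheme that attains the same $\mathcal{O}(m^{3/2})$ bound can be sketched in a few lines; this is a generic baseline, not the Aegypti implementation or its hybrid density-adaptive strategy.

```python
# Sketch of a classical degree-ordered triangle enumerator (not the Aegypti
# code itself): rank vertices by non-increasing degree, orient every edge
# from the lower-ranked to the higher-ranked endpoint, and intersect
# out-neighbourhoods. Each triangle is listed exactly once, in O(m^{3/2}).
def triangles(adj):
    """adj: dict mapping each vertex to a set of neighbours (undirected)."""
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    rank = {v: i for i, v in enumerate(order)}
    out = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}
    for u in adj:
        for v in out[u]:
            for w in out[u] & out[v]:
                yield (u, v, w)

g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
print(list(triangles(g)))  # two triangles: {0, 1, 2} and {0, 2, 3}
```
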
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The Minimum Vertex Cover (MVC) problem is a fundamental NP-complete problem in graph theory that seeks the smallest set of vertices covering all edges in an undirected graph G = (V, E). This paper presents the find_vertex_cover algorithm, an innovative approximation method that transforms the problem to maximum degree-1 instances via auxiliary vertices. The algorithm computes solutions using weighted dominating sets and vertex covers on reduced graphs, enhanced by ensemble heuristics including maximum-degree greedy and minimum-to-minimum strategies. Our approach guarantees an approximation ratio strictly less than √2 ≈ 1.414, which would contradict known hardness results unless P = NP. This theoretical implication represents a significant advancement beyond classical approximation bounds. The algorithm operates in O(m log n) time for n vertices and m edges, employing component-wise processing and linear-space reductions for efficiency. Implemented in Python as the Hvala package, it demonstrates excellent performance on sparse and scale-free networks, with profound implications for complexity theory. The achievement of a sub-√2 approximation ratio, if validated, would resolve the P versus NP problem in the affirmative. This work enables near-optimal solutions for applications in network design, scheduling, and bioinformatics while challenging fundamental assumptions in computational complexity.
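
One of the ensemble components named above, the maximum-degree greedy heuristic, is a classical routine that can be sketched on its own; the code below illustrates only that heuristic, not the find_vertex_cover reduction or the Hvala package.

```python
# Sketch of the classical maximum-degree greedy vertex-cover heuristic,
# one of the ensemble components named in the abstract (not the full
# find_vertex_cover algorithm or the Hvala package).
import networkx as nx

def greedy_max_degree_cover(G):
    H = G.copy()
    cover = set()
    while H.number_of_edges() > 0:
        v = max(H.degree, key=lambda item: item[1])[0]  # highest-degree vertex
        cover.add(v)
        H.remove_node(v)  # removes all edges incident to v
    return cover

G = nx.path_graph(5)               # edges: 0-1, 1-2, 2-3, 3-4
print(greedy_max_degree_cover(G))  # e.g. {1, 3}
```
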
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Kevin Mallinger, Sebastian Raubitzek, Sebastian Schrittwieser, Edgar Weippl

Abstract: Noise can substantially distort both chaotic and physiological dynamics, obscuring deterministic patterns and altering the apparent complexity of signals. Accurately identifying and characterizing such perturbations is essential for reliable analysis of dynamical and biomedical systems. This study combines complexity-based features with supervised learning to characterize and predict noise perturbations in time series data. Using two chaotic systems (Rössler and Lorenz) and synthetic electrocardiogram (ECG) signals, we generated controlled Gaussian, pink, and low-frequency noise of varying intensities and extracted a diverse set of 18 complexity metrics derived from both raw signals and phase-space embeddings. The analysis systematically evaluates how these metrics behave under different noise regimes and intensities and identifies the most discriminative features for noise classification tasks. Approximate Entropy, Mean Absolute Deviation, and Condition Number emerged as the strongest predictors for noise intensity, while Condition Number, Sample Entropy, and Permutation Entropy most effectively differentiated noise categories. Across all systems, the proposed framework reached an average accuracy of 99.9% for noise presence and type classification and 96.2% for noise intensity, significantly surpassing previously reported benchmarks for noise characterization in chaotic and physiological time series. These results demonstrate that complexity metrics encode both structural and statistical signatures of stochastic contamination.
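
Among the 18 complexity metrics, permutation entropy is simple enough to sketch in full; the version below is a standalone sketch, not the authors' feature pipeline, and computes normalized order-pattern entropy over delay-embedded windows.

```python
# Minimal sketch of permutation entropy, one of the complexity metrics
# named in the abstract (not the authors' full 18-feature pipeline).
import math
from collections import Counter

def permutation_entropy(x, order=3, delay=1, normalize=True):
    patterns = Counter()
    n = len(x) - (order - 1) * delay
    for i in range(n):
        window = [x[i + j * delay] for j in range(order)]
        # Ordinal pattern: the ranks of the values inside the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    probs = [c / n for c in patterns.values()]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(math.factorial(order)) if normalize else h

print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=3))
```
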
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: We propose a new hypothetical framework to explore the relationship between the Birch--Swinnerton-Dyer conjecture (BSD) and computational complexity theory. This paper introduces two central conjectures: a Reduction Conjecture, which posits the existence of a polynomial-time parsimonious reduction from \#P-complete problems to the counting of integer solutions for Tunnell's equations, and a Solution Density Conjecture, which posits that these solution counts are sufficiently well-distributed. We then formally prove that if these two conjectures are true, a surprising conditional result follows: the assumptions of P = NP and the truth of the BSD conjecture would imply the collapse of the counting complexity class \#P to FP. Our main conditional result is that if BSD is true, our conjectures hold, and the widely believed separation \#P $\neq$ FP holds, then P $\neq$ NP. This work reframes the P versus NP problem as a question about two deep, open problems in number theory: the existence of a "complexity-to-arithmetic" reduction and the statistical distribution of solution counts to Tunnell's specific quadratic forms.
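
The counting objects the conjectures refer to are concrete enough to compute directly; as a small illustration only (not the proposed reduction), the brute-force counter below evaluates Tunnell's criterion for odd squarefree n, where n being congruent implies 2A(n) = B(n), with the converse holding under BSD.

```python
# Brute-force count of integer solutions to Tunnell's ternary forms for odd
# squarefree n (illustrating the objects the conjectures talk about, not the
# paper's proposed reduction).  Tunnell's theorem: for odd squarefree n,
# if n is congruent then 2*A(n) = B(n); assuming BSD the converse also holds.
import math

def count_solutions(n, a, b, c):
    """Count (x, y, z) in Z^3 with a*x^2 + b*y^2 + c*z^2 == n."""
    total = 0
    for x in range(-math.isqrt(n // a), math.isqrt(n // a) + 1):
        for y in range(-math.isqrt(n // b), math.isqrt(n // b) + 1):
            rest = n - a * x * x - b * y * y
            if rest < 0 or rest % c:
                continue
            z = math.isqrt(rest // c)
            if c * z * z == rest:
                total += 2 if z else 1  # +z and -z, or just z == 0
    return total

for n in (1, 41):  # 1 is not a congruent number; 41 is
    A = count_solutions(n, 2, 1, 32)  # 2x^2 + y^2 + 32z^2 = n
    B = count_solutions(n, 2, 1, 8)   # 2x^2 + y^2 + 8z^2  = n
    print(n, A, B, "Tunnell criterion 2A == B:", 2 * A == B)
```
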
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Xinqi Zheng, Haiyan Liu

Abstract: This paper introduces a computational framework called the Periodic Pairing Matrix model, which provides a foundation for modeling periodic interactions between cyclic sequences. The model formalizes pairing between two sequences using configurable step sizes, creating a deterministic pattern that repeats over a fixed cycle. We present a comprehensive analysis of its properties, including periodic behavior, coverage conditions, and vacancy patterns. The framework offers a unified approach to resource scheduling, sparse neural network design, and other domains that benefit from structured periodic interactions. Through case studies in distributed computing and deep learning, we show improvements in resource utilization, reaching full utilization under optimal settings, and faster neural network training with gains of about forty percent, while maintaining performance. The Periodic Pairing Matrix model enables principled design of systems with predictable and analyzable periodic behavior.
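
The abstract does not spell out the pairing rule, so the following sketch is only one speculative reading of a "periodic pairing with configurable step sizes": at step t, element (t·s_A mod |A|) of one cyclic sequence is paired with element (t·s_B mod |B|) of the other, and the schedule repeats with the induced period. The rule and all names are assumptions made for illustration, not the paper's formal model.

```python
# Purely speculative sketch of one way a periodic pairing between two cyclic
# sequences with configurable step sizes could look; the rule and names are
# illustrative assumptions, not the Periodic Pairing Matrix model itself.
from math import gcd, lcm

def pairing_schedule(len_a, len_b, step_a=1, step_b=1):
    # The pair ((t*step_a) mod len_a, (t*step_b) mod len_b) repeats once both
    # coordinates return to 0, i.e. after lcm of the two sub-periods.
    period = lcm(len_a // gcd(step_a, len_a), len_b // gcd(step_b, len_b))
    return [((t * step_a) % len_a, (t * step_b) % len_b) for t in range(period)]

print(pairing_schedule(4, 6))        # unit steps: period lcm(4, 6) = 12
print(pairing_schedule(4, 6, 2, 3))  # coarser steps shrink the cycle to 2
```
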
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Ibrahim Mammadov, Pavel Loskot, Thomas Honold

Abstract: Many data processing applications involve binary matrices for storing digital content and employ the methods of linear algebra. One of the frequent tasks is to invert large binary matrices, yet at present there seem to be no documented algorithms for inverting such matrices. This paper fills the gap by reporting three results. First, an efficient and provably correct recursive blockwise algorithm based on pivoted PLU factorization is reported for inverting binary matrices of sizes as large as several thousand bits. Second, assuming the Bruhat matrix decomposition, a fast method is developed for effectively enumerating all elements of the general linear groups. Third, the minimum number of bit-flips is determined that makes any singular binary matrix non-singular, and thus invertible. The proposed algorithms are implemented in C++ and are publicly available on GitHub. These results can be readily generalized to other finite fields, for example, to enable linear algebraic methods for matrices containing quantized values.
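
As a point of reference for the task (and not the recursive blockwise PLU algorithm reported in the paper, which is implemented in C++), a plain Gauss-Jordan inverse over GF(2) can be written in a few lines; row addition modulo 2 is simply XOR.

```python
# Baseline Gauss-Jordan inversion over GF(2), shown only to illustrate the
# task; the recursive blockwise PLU algorithm reported in the paper (C++,
# on GitHub) is the approach meant for matrices with thousands of rows.
def gf2_inverse(A):
    n = len(A)
    # Augment [A | I]; all arithmetic is modulo 2, so row addition is XOR.
    M = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                M[r] = [a ^ b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

B = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 1]]
print(gf2_inverse(B))  # [[1, 1, 0], [0, 1, 0], [0, 0, 1]]; singular input raises
```
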
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Isaac Oliva-González, Hugo Jiménez-Hernández

Abstract: The shortest path problem remains a central challenge in graph optimization, particularly for dense or large-scale networks where classical algorithms face scalability limitations. This paper presents a QUBO-based Simulated Annealing (QUBO-SA) approach to solving shortest path instances, employing an existing Quadratic Unconstrained Binary Optimization (QUBO) formulation that jointly encodes path costs and structural constraints. The approach is evaluated on both synthetic graphs, spanning sparse to dense connectivity regimes, and a real-world urban transportation network extracted from the downtown area of Querétaro, Mexico. Performance is quantified through probabilistic reliability metrics, including success probability, Time-to-Solution (TTS), and the relative runtime ratio R(p_target), benchmarked against the deterministic Dijkstra algorithm. Results show that QUBO-SA achieves near-optimal performance for small to medium graphs and maintains competitive efficiency for large-scale and sparse urban networks. In particular, the solver achieves a success probability of 0.57 on a workspace of 443 nodes corresponding to an urban graph, requiring only 35% more runtime than Dijkstra to reach 99% confidence, with this gap halved when the confidence is relaxed to 90%. These results highlight the balance between the method’s exploration and the computational resources needed, demonstrating its potential for scalable path optimization in both synthetic and real-world networks.
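
One common way to cast the shortest path problem as a QUBO (offered here only as an illustrative formulation; the exact encoding used in the paper may differ) assigns a binary variable \( x_{uv} \) to each directed edge and penalizes violations of unit-flow conservation between source \( s \) and target \( t \):

\[
H(\mathbf{x}) \;=\; \sum_{(u,v)\in E} w_{uv}\, x_{uv}
\;+\; \lambda \sum_{v\in V}\Bigl(\sum_{(v,u)\in E} x_{vu} \;-\; \sum_{(u,v)\in E} x_{uv} \;-\; b_v\Bigr)^{2},
\qquad
b_s = 1,\; b_t = -1,\; b_v = 0 \text{ otherwise.}
\]

With strictly positive weights and \( \lambda \) larger than the total edge weight, every ground state of \( H \) selects exactly the edges of a minimum-cost \( s \)-\( t \) path, and simulated annealing samples low-energy assignments \( \mathbf{x} \in \{0,1\}^{|E|} \).
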
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The Minimum Vertex Cover problem, a cornerstone of NP-hard optimization, has long been bounded by approximation hardness thresholds, including the Unique Games Conjecture's assertion that no polynomial-time algorithm can achieve a ratio better than \( 2 - \epsilon \) for any \( \epsilon > 0 \). This paper introduces a novel reduction-based algorithm that computes a vertex cover with an approximation ratio strictly less than 2 for any finite undirected graph with at least one edge, thereby disproving the Unique Games Conjecture. The approach transforms the input graph into an auxiliary graph with maximum degree 1 using degree-weighted auxiliary vertices, solves the minimum weighted vertex cover optimally in linear time, and projects the solution back to yield a valid cover. We provide a rigorous correctness proof, demonstrating that the projection preserves coverage while exploiting structural slack in finite graphs to ensure the strict sub-2 ratio. Runtime analysis confirms \( O(|V| + |E|) \) efficiency, making the algorithm practical for large-scale instances. This breakthrough not only advances approximation techniques for vertex cover but also resolves a major open question in complexity theory, opening avenues for revisiting UGC-dependent hardness results in other optimization domains.
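
The inner subroutine described above, an exact minimum weighted vertex cover on a graph of maximum degree 1, is straightforward in isolation: such a graph is a disjoint union of single edges and isolated vertices, so the lighter endpoint of each edge is kept. The sketch below shows only that step, not the degree-weighted reduction or the projection back to the original graph.

```python
# Exact minimum weighted vertex cover when every vertex has degree <= 1:
# the graph is a disjoint union of edges and isolated vertices, so take the
# lighter endpoint of each edge.  This illustrates only the inner subroutine
# described in the abstract, not the full reduction/projection pipeline.
def min_weight_cover_degree_one(edges, weight):
    cover = set()
    for u, v in edges:
        cover.add(u if weight[u] <= weight[v] else v)
    return cover

edges = [("a", "b"), ("c", "d")]
weight = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 5.0}
print(min_weight_cover_degree_one(edges, weight))  # {'b', 'c'}
```
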
Communication
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Alan Z

Abstract: According to conventional wisdom, the relationship between P and NP must be one of two possibilities: either P=NP or P≠NP. Unlike traditional approaches that base mathematical concepts on equivalent transformations—and, by extension, on the principle that correspondence remains unchanged—this theory is founded on non-equivalent transformations. By constructing a special non-equivalent transformation, I will demonstrate that for a problem P(a) in the complexity class P and its corresponding problem P(b) in the complexity class NP, P(a) is a P non-equivalent transformation of P(b), and P(b) is an NP non-equivalent transformation of P(a). That is, the relationship between P(a) and P(b) is neither P=NP nor P≠NP.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Sascha Eichstädt, Jens Niederhausen

Abstract: Data spaces are digital realms of data and information shared between stakeholders and peer groups. They underpin several developments in sectors ranging from the automotive industry and the social sciences to governmental networks. Digital traceability of information in data spaces is needed to validate statements about metadata, data quality, and data features. In many cases this also translates directly to metrological traceability of measurements to the SI. The concept and development of digital product passports bring these traceability aspects together to form a tool for a digital quality infrastructure. This paper outlines the general principles of digital metrological traceability based on digital certificates, a digital international system of units, and digital product passports.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The Maximum Independent Set (MIS) problem, a core NP-hard problem in graph theory, seeks the largest subset of vertices in an undirected graph $G = (V, E)$ with $n$ vertices and $m$ edges, such that no two vertices are adjacent. We present a hybrid approximation algorithm that combines iterative refinement with greedy selections based on minimum and maximum degrees, plus a low-degree induced subgraph heuristic, implemented using NetworkX. The algorithm preprocesses the graph to handle trivial cases and isolates, computes exact solutions for bipartite graphs using Hopcroft-Karp matching and K\"onig's theorem, and, for non-bipartite graphs, iteratively refines a candidate set via maximum spanning trees and their maximum independent sets, followed by a greedy extension. It also constructs independent sets by selecting vertices in increasing and decreasing degree orders, and computes an independent set on the induced subgraph of low-degree vertices (degree strictly less than maximum), returning the largest of the four sets. An efficient $O(m)$ independence check ensures correctness. The algorithm guarantees a valid, maximal independent set with a worst-case $\sqrt{n}$-approximation ratio, tight for graphs with a large clique connected to a small independent set, and robust for structures like multiple cliques sharing a universal vertex. With a time complexity of $O(n m \log n)$, it is suitable for small-to-medium graphs, particularly sparse ones. While outperformed by $O(n / \log n)$-ratio algorithms for large instances, it aligns with inapproximability results, as MIS cannot be approximated better than $O(n^{1-\epsilon})$ unless $\text{P} = \text{NP}$. Its simplicity, correctness, and robustness make it ideal for applications like scheduling and network design, and an effective educational tool for studying trade-offs in combinatorial optimization, with potential for enhancement via parallelization or heuristics.
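
The bipartite branch described above maps directly onto existing NetworkX primitives: a Hopcroft-Karp maximum matching, König's theorem to turn it into a minimum vertex cover, and complementation to obtain the maximum independent set. A brief sketch of that exact case follows; the toy graph is illustrative only.

```python
# Exact maximum independent set on a bipartite graph via Hopcroft-Karp
# matching and König's theorem, the special case handled exactly in the
# abstract.  The toy graph below is only an example.
import networkx as nx
from networkx.algorithms import bipartite

G = nx.Graph([("u1", "v1"), ("u1", "v2"), ("u2", "v2"), ("u3", "v3")])
top = {"u1", "u2", "u3"}

matching = bipartite.maximum_matching(G, top_nodes=top)        # Hopcroft-Karp
cover = bipartite.to_vertex_cover(G, matching, top_nodes=top)  # König's theorem
independent_set = set(G) - cover                               # complement of min VC
print(independent_set)
```
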
Essay
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Ruixue Zhao

Abstract: This paper presents a general algorithm for rapidly generating all N×N Latin squares, along with a precise counting framework and polynomial-time isomorphism (quasigroup) algorithms. It also introduces efficient algorithms for solving Latin-square filling problems. Numerous combinatorial isomorphism problems, including Steiner triple systems, Mendelsohn triple systems, 1-factorizations, networks, affine planes, and projective planes, can be reduced to Latin square isomorphism. Since groups form a proper subset of quasigroups and group isomorphism is a subproblem of quasigroup isomorphism, this would automatically place group isomorphism in P. A Latin square of order N is an N×N matrix in which each row and each column contains N distinct symbols, each appearing exactly once. A matrix derived from a quasigroup multiplication table forms an N-order Latin square; conversely, a binary operation derived from an N-order Latin square used as a multiplication table constitutes a quasigroup over the set Q. I discovered four new algebraic structures that remain invariant under permutation of rows and columns, called quadrilateral squares. All N×N Latin squares can be constructed using three or all four of these quadrilateral squares. Leveraging the algebraic properties of quadrilateral squares that are unchanged by permutation, we designed an algorithm that generates all N×N Latin squares without repetition under permutation, resulting in the first universal and non-repetitive algorithm for Latin square generation. Building on this, we established a precise counting framework for Latin squares. The generation algorithm further reveals deeper structural aspects of Latin squares (quasigroups). Through studying these structures, we derived a crucial theorem: two Latin squares are isomorphic if their subline modularity structures are identical. Based on this key theorem, combined with other structural connections discussed in this paper, a polynomial-time algorithm for Latin square isomorphism has been designed. This algorithm can also be applied directly to quasigroup isomorphism, with a time complexity of $\frac{5}{16}(n^5 - 2n^4 - n^3 + 2n^2) + 2n^3$. Furthermore, additional symmetry properties of Latin squares (quasigroups) were uncovered. The problem of filling a partial Latin square is a classic NP-complete problem, and solving a fillable Latin grid can be viewed as generating grids that satisfy constraints. By leveraging the connections between parametric group algebra structures revealed in this paper, we have designed a fast and accurate algorithm for solving fillable Latin grids. I believe the ultimate solution to NP-complete problems lies within these connections between parametric group algebra structures, as they directly affect both the speed of solving fillable Latin grids and the derivation of precise counting formulas for Latin grids.
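
For reference, the filling problem the essay targets can be stated with a plain backtracking solver; this is only a baseline for comparison, not the quadrilateral-square generation or isomorphism algorithms proposed above. Empty cells are marked 0 and symbols run from 1 to N.

```python
# Plain backtracking solver for the Latin-square filling problem, given only
# as a point of reference; it is not the quadrilateral-square based generation
# or isomorphism algorithm described in the essay.  0 marks an empty cell.
def fill_latin_square(grid):
    n = len(grid)
    for r in range(n):
        for c in range(n):
            if grid[r][c] == 0:
                used = set(grid[r]) | {grid[i][c] for i in range(n)}
                for s in range(1, n + 1):
                    if s not in used:
                        grid[r][c] = s
                        if fill_latin_square(grid):
                            return True
                        grid[r][c] = 0
                return False
    return True

square = [[1, 0, 0, 0],
          [0, 0, 3, 0],
          [0, 0, 0, 2],
          [0, 3, 0, 0]]
if fill_latin_square(square):
    for row in square:
        print(row)
```
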
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Sungwook Yoon

Abstract: Organizing vast, heterogeneous enterprise documents is a critical challenge, as traditional methods fail to capture the dynamic, multi-dimensional context (e.g., priority, workflow) that defines a document's true utility. This paper introduces FLACON (Flag-Aware Context-sensitive Clustering), a novel system that addresses this gap. FLACON models documents using a six-dimensional flag system that unifies semantic, temporal, priority, workflow, and relational contexts, and organizes them within an information-theoretic framework. The core objective is to minimize clustering entropy while maximizing the preservation of contextual information. The approach addresses gaps where context-aware systems lack domain-specific intelligence and LLM methods require prohibitive computational resources. FLACON provides deterministic, cost-effective organization with a 7-fold performance improvement over LLM approaches while achieving 89% of their clustering quality. Evaluation on nine dataset variations demonstrates significant improvements, with Silhouette Scores of 0.311 versus 0.040 for traditional methods, representing 7.8-fold gains. The system demonstrates O(n log n) scalability and deterministic behavior suitable for compliance requirements.
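
Two of the quantities the abstract relies on, clustering entropy and the Silhouette Score, can be made concrete with a short sketch on synthetic data; the six-dimensional flag features themselves are not reproduced, and the entropy definition used here (Shannon entropy of the cluster-size distribution) is one plausible reading rather than FLACON's exact objective.

```python
# Small sketch of two quantities mentioned in the abstract: the Shannon
# entropy of a clustering's label distribution and the Silhouette Score.
# The flag-aware feature construction is not reproduced; the data is synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def clustering_entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 4)), rng.normal(3, 0.3, (50, 4))])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print("entropy   :", clustering_entropy(labels))   # ~1 bit for two balanced clusters
print("silhouette:", silhouette_score(X, labels))
```
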
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Guillermo De Ita Luna

Abstract: A combinatorial tree \( Pt \) is used to record the valid colors that can be assigned to each vertex of a planar graph \( G \) of order \( n \). The main process consists of a loop that incrementally builds \( G \) vertex by vertex, starting from the innermost triangular face of \( G \); after \( n-3 \) iterations, the paths constructed in \( Pt \) carry the valid color labels assigned to the vertices of \( G \). This method ultimately generates all proper 4-colorings of \( G \). In each iteration, a vertex \( v_i \in V(G) \) is selected and aggregated to the current induced subgraph \( G_i \) of \( G \). This process, together with the use of the \( Pt \) tree (which results in a binary tree of depth \( n-3 \)), yields all proper 4-colorings of \( G \), regardless of the topology of the maximal planar graph \( G \). Additionally, we prove an existence theorem showing that every maximal planar graph \( G \) admits a proper 4-coloring, and we detail a method by which such a 4-coloring can be constructed. We also present the extremal topologies of planar graphs, highlighting those with the maximum and minimum numbers of 4-colorings.
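
To make the counted objects concrete, a plain backtracking enumerator of proper 4-colorings is sketched below; it is not the \( Pt \)-tree construction described in the abstract, and the example graph (the complete graph on four vertices, a small maximal planar graph) is illustrative only.

```python
# Plain backtracking enumeration of all proper 4-colorings of a small graph,
# shown only to make the counted objects concrete; this is not the Pt-tree
# construction described in the abstract.
def four_colorings(adj, colors=4):
    vertices = list(adj)
    def extend(i, assignment):
        if i == len(vertices):
            yield dict(assignment)
            return
        v = vertices[i]
        for c in range(colors):
            if all(assignment.get(u) != c for u in adj[v]):
                assignment[v] = c
                yield from extend(i + 1, assignment)
                del assignment[v]
    yield from extend(0, {})

# K4: a triangular face plus one more vertex, the smallest maximal planar
# graph on four vertices.
k4 = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
print(sum(1 for _ in four_colorings(k4)))  # 4! = 24 proper 4-colorings
```
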
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Jaba Tkemaladze

Abstract: The Voynich Manuscript (VM) remains one of history's most perplexing cryptographic and linguistic puzzles (Landini & Foti, 2020). This paper introduces a novel hypothesis: that the VM's text is not a direct encoding of a natural language but represents a compressed data stream utilizing principles analogous to modern LZ77 compression and Huffman coding (Huffman, 1952; Ziv & Lempel, 1977). We propose that the manuscript's unusual statistical properties, including its low redundancy and specific word structure, are artifacts of a sophisticated encoding process rather than features of an unknown language (Montemurro & Zanette, 2013; Reddy & Knight, 2011). To evaluate this, we developed a computational framework that treats VM transliterations as an encoded bitstream. The framework systematically tests decompression parameters, using Shannon entropy as the primary fitness metric to identify outputs resembling natural language (Shannon, 1948; Cover & Thomas, 2006). While a complete decipherment has not yet been achieved, this methodology provides a new, rigorous, and reproducible computational approach to VM analysis, moving beyond traditional linguistic correlation (Hauer & Kondrak, 2011). The framework's architecture and initial proof-of-concept results are presented, outlining a clear pathway for future research with a fully digitized VM corpus.
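
The fitness metric itself is easy to make concrete: character-level Shannon entropy of a candidate decompression, with language-like outputs expected near the roughly 4 bits per character of English letter text rather than the higher values of near-random streams. The sample strings below are placeholders, not VM transliterations.

```python
# Character-level Shannon entropy, the fitness metric the framework uses to
# rank candidate decompressions; language-like values (~4 bits/char for
# English letters) indicate more promising output than near-random streams.
# The sample strings are placeholders, not VM transliterations.
import math
from collections import Counter

def shannon_entropy(text):
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

candidates = {
    "language-like": "the quick brown fox jumps over the lazy dog",
    "near-random":   "qxv9z!k2p@7r#m4w8t&j1s%h6d^f3g0b5n",
}
for name, text in candidates.items():
    print(f"{name:14s} {shannon_entropy(text):.2f} bits/char")
```
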
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Michael Rey

Abstract: Computation has historically been framed in Newtonian terms: time and space as separate, absolute measures of algorithmic cost. Yet modern algorithms—from randomized heuristics to quantum circuits—demand a relativistic view where time, space, energy, entropy, and coherence form a unified manifold. This work introduces computational relativity: a geometric theory of algorithms built on spacetime geodesics, entropy trade-offs, and quantum coherence dynamics. We show how classical complexity results like the Hopcroft-Paul-Valiant theorem and recent improvements by Williams emerge naturally as specific geodesic types in this framework. The theory extends through thermodynamic principles to encompass stochastic algorithms and energy consumption, then to quantum coherence for quantum computing applications. This progression motivates living algorithms—self-monitoring systems that dynamically optimize their computational trajectories in real-time. We demonstrate applications across machine learning, quantum computing, robotics, and optimization, concluding with a comprehensive algorithm compendium that classifies computational methods by their geometric properties and resource trade-offs.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Asila Muxitdinova

Abstract: In our rapidly evolving era, Artificial Intelligence (AI) has emerged as a transformative technology that significantly contributes to various aspects of human life. This paper explores the definition of AI and its true capabilities, tracing its origins and highlighting the early ideas that inspired generations of scientists. The discussion then shifts to the present, emphasizing AI’s importance through examples of its diverse applications. Despite these benefits, public concern and fear about AI’s rapid evolution persist. This paper examines common reasons behind these fears, analyzing them from scientific and logical perspectives to provide insights and recommendations. Finally, it reflects on human adaptation and the valuable role AI plays in our ongoing evolution, with the ultimate aim of demonstrating that many negative assumptions and fears are exaggerated. By offering a broad, personal perspective on these concerns, the paper encourages readers to embrace AI as a partner in progress—coexisting with it and welcoming its integration into daily life with open arms.
Review
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Ali Raza, Fatima Khan, Zhen Bin It, Jovan Bowen Heng, Tee Hui Teo

Abstract: This review synthesizes recent advancements in federated learning (FL) frameworks tailored for sensitive domains such as mental healthcare, medical imaging, and non-IID data simulation. One past study presents a hybrid privacy-preserving FL model that integrates clustered federated learning (CFL) and quantum federated learning (QFL) to enhance accuracy and privacy in stress detection using wearable devices. Other studies introduce FedArtML, a novel tool for generating controlled non-IID datasets, offering quantifiable metrics such as the Jensen–Shannon and Hellinger distances to assess data heterogeneity. Furthermore, a recent paper proposes a transfer learning-based FL architecture for breast cancer classification using mammography images, combining feature extraction with federated averaging to ensure privacy and robust diagnostic accuracy. Collectively, these works address key challenges in FL, including client heterogeneity, data imbalance, privacy preservation, and system performance. This review highlights the complementary strengths of hybrid architectures, synthetic data partitioning, and transfer learning in advancing real-world applications of federated learning in healthcare.
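
The heterogeneity metrics attributed to FedArtML, the Jensen–Shannon and Hellinger distances, can be computed between per-client label distributions as in the sketch below; the distributions shown are toy values, not data from the reviewed studies.

```python
# Jensen-Shannon and Hellinger distances between two clients' label
# distributions, the heterogeneity metrics mentioned for FedArtML.
# The distributions are toy examples, not data from the reviewed studies.
import numpy as np
from scipy.spatial.distance import jensenshannon

def hellinger(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

client_a = [0.70, 0.20, 0.10]   # heavily skewed label distribution
client_b = [0.30, 0.40, 0.30]   # closer to uniform

print("Jensen-Shannon:", jensenshannon(client_a, client_b, base=2))
print("Hellinger     :", hellinger(client_a, client_b))
```
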
