Computer Science and Mathematics


Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Sheng Zhang,

FuHao Liu,

YuYuan Huang,

ZiQiang Luo,

Ka Sun,

HongMei Mao

Abstract: The research into complex networks has consistently attracted significant attention, with the identification of important nodes within these networks being one of the central challenges in this field of study. Existing methods for identifying key nodes based on effective distance commonly suffer from high time complexity and often overlook the impact of nodes' multi-attribute characteristics on the identification outcomes. To identify important nodes in complex networks more efficiently and accurately, we propose a novel identification method that leverages an improved effective distance fusion model. This method effectively reduces redundant calculations of effective distances by employing an effective influence node set. Furthermore, it incorporates the multi-attribute characteristics of nodes, characterizing their propagation capabilities by considering local, global, positional, and clustering information, thereby providing a more comprehensive assessment of node importance within complex networks.
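The multi-attribute idea, combining local, global, positional, and clustering information into a single score, can be sketched with networkx; the particular centralities and the equal weights below are illustrative assumptions, not the authors' fusion model.

```python
# A hedged sketch (not the authors' model) of fusing several node attributes
# into one importance score; weights and chosen centralities are illustrative.
import networkx as nx

def multi_attribute_importance(g, weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine local (degree), global (closeness), positional (k-core) and
    clustering information into a single normalised score per node."""
    degree = nx.degree_centrality(g)
    closeness = nx.closeness_centrality(g)
    core = nx.core_number(g)
    clustering = nx.clustering(g)
    max_core = max(core.values()) or 1
    w1, w2, w3, w4 = weights
    return {v: w1 * degree[v] + w2 * closeness[v]
               + w3 * core[v] / max_core
               + w4 * (1 - clustering[v])        # low clustering -> bridge-like node
            for v in g}

g = nx.karate_club_graph()
scores = multi_attribute_importance(g)
print(sorted(scores, key=scores.get, reverse=True)[:5])   # top-5 candidate spreaders
```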
Review
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Yawen Bao

Abstract: Transformers have become the backbone of numerous advancements in deep learning, excelling across domains such as natural language processing, computer vision, and scientific modeling. Despite their remarkable performance, the high computational and memory costs of the standard Transformer architecture pose significant challenges, particularly for long sequences and resource-constrained environments. In response, a wealth of research has been dedicated to improving the efficiency of Transformers, resulting in a diverse array of innovative techniques. This survey provides a comprehensive overview of these efficiency-driven advancements. We categorize existing approaches into four major areas: (1) approximating or sparsifying the self-attention mechanism, (2) reducing input or intermediate representation dimensions, (3) leveraging hierarchical and multiscale architectures, and (4) optimizing hardware utilization through parallelism and quantization. For each category, we discuss the underlying principles, representative methods, and the trade-offs involved. We also identify key challenges in the field, including balancing efficiency with performance, scaling to extremely long sequences, addressing hardware constraints, and mitigating the environmental impact of large-scale models. To guide future research, we highlight promising directions such as unified frameworks, dynamic and sparse architectures, energy-aware designs, and cross-domain adaptations. By synthesizing the latest advancements and providing insights into unresolved challenges, this survey aims to serve as a valuable resource for researchers and practitioners seeking to develop or apply efficient Transformer models. Ultimately, the pursuit of efficiency is crucial for ensuring that the transformative potential of Transformers can be realized in a sustainable, accessible, and impactful manner.
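As a concrete illustration of category (1), the following hedged sketch shows one common sparsification idea, fixed-window local attention; the window size and the NumPy formulation are assumptions made for illustration, not a method singled out by the survey.

```python
import numpy as np

def local_window_attention(q, k, v, window=4):
    """Each query attends only to keys within a fixed local window,
    cutting the cost from O(n^2) to roughly O(n * window)."""
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)     # scaled dot-product scores
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                    # softmax over the window
        out[i] = weights @ v[lo:hi]
    return out

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(16, 8))
print(local_window_attention(q, k, v).shape)        # (16, 8)
```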

Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Nikos Karousos,

Despoina Pantazi,

George Vorvilas,

Vassilios S. Verykios

Abstract: Matching problems arise in various settings where two or more entities need to be matched, such as job applicants to positions, students to colleges, organ donors to recipients, and advertisers to ad slots in web advertising platforms. Both offline and online bipartite matching algorithms have been developed for these problems, with online methods being particularly important for real-time applications like internet advertising. This study introduces the Preference Adjustment Matching Algorithm (PAMA), a novel matching framework that pairs elements forming a bipartite graph structure based on rankings and preferences. In particular, the algorithm is applied to tutor-module assignment in academic settings, i.e., the process of assigning tutors to academic modules while balancing tutor preferences, module rankings, and institutional constraints. Grounded in the fundamental theory of matching, PAMA combines maximality and stability principles with fairness and efficiency criteria to achieve flexible and equitable assignments. PAMA operates in iterative rounds, dynamically adjusting module and tutor preferences while addressing capacity and eligibility constraints. The algorithm operates under two distinct scenarios: in the first it achieves a maximal matching, while in the second, even if maximality is lost due to deadlock resolution, it is guaranteed to converge to a stable solution. PAMA was applied to a real dataset provided by the Hellenic Open University (HOU), in which 3,982 tutors competed for 1,906 positions within 620 modules. Its performance was tested through various scenarios and proved capable of effectively handling both single-round and multi-round assignments. The algorithm also adeptly resolved complex situations, offering flexibility for administrative decision-making aligned with institutional policies, making PAMA a powerful solution for matching-related problems.
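The flavour of round-based, capacity-constrained preference matching can be illustrated with a short, hedged sketch; this is a generic deferred-acceptance-style loop, not the PAMA algorithm itself, and the tutor names, ranks, and capacities are invented for illustration.

```python
# Generic sketch of round-based matching with capacities (not PAMA): tutors
# propose to modules in preference order, and each module keeps its best
# proposers up to capacity, releasing the rest for later rounds.
def round_based_match(tutor_prefs, module_rank, capacity):
    """tutor_prefs: {tutor: [modules in order of preference]}
    module_rank:   {module: {tutor: rank, lower is better}}
    capacity:      {module: number of positions}"""
    free = list(tutor_prefs)                 # tutors still to be placed
    next_choice = {t: 0 for t in tutor_prefs}
    held = {m: [] for m in capacity}         # tentatively accepted tutors
    while free:
        t = free.pop()
        prefs = tutor_prefs[t]
        if next_choice[t] >= len(prefs):
            continue                         # tutor has exhausted all options
        m = prefs[next_choice[t]]
        next_choice[t] += 1
        held[m].append(t)
        held[m].sort(key=lambda u: module_rank[m].get(u, float("inf")))
        while len(held[m]) > capacity[m]:    # over capacity: release the worst
            free.append(held[m].pop())
    return {m: tutors for m, tutors in held.items() if tutors}

prefs = {"t1": ["m1", "m2"], "t2": ["m1"], "t3": ["m1", "m2"]}
ranks = {"m1": {"t1": 1, "t2": 2, "t3": 3}, "m2": {"t1": 2, "t3": 1}}
print(round_based_match(prefs, ranks, {"m1": 1, "m2": 2}))
```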
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The P versus NP problem is a cornerstone of theoretical computer science, asking whether problems that are easy to check are also easy to solve. "Easy" here means solvable in polynomial time, where the computation time grows at most polynomially with the input size. While this problem's origins can be traced to John Nash's 1955 letter, its formalization is credited to Stephen Cook and Leonid Levin. Despite decades of research, a definitive answer remains elusive. Central to this question is the concept of NP-completeness. If even one NP-complete problem, like SAT, could be solved efficiently, it would imply that all NP problems could be solved efficiently, proving P=NP. This research proposes a groundbreaking claim: SAT, traditionally considered NP-complete, can be solved in polynomial time, establishing the equivalence of P and NP.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

André Luiz Marques Serrano,

Patricia Helena Santos Martins,

Guilherme Fay Vergara,

Guilherme Dantas Bispo,

Gabriel Arquelau Pimenta Rodrigues,

Letícia Rezende Mosquéra,

Matheus Noschang de Oliveira,

Clovis Neumann,

Maria Gabriela Mendonca Peixoto,

Vinícius Pereira Gonçalves

Abstract: The sustainable management of energy resources is fundamental in addressing global environmental and economic challenges, particularly when considering biofuels such as ethanol and gasoline. This study evaluates advanced forecasting models to predict consumption trends for these fuels in Brazil. The models analyzed include ARIMA/SARIMA, Holt-Winters, ETS, TBATS, Facebook Prophet, Uber Orbit, N-BEATS, and TFT. By leveraging datasets spanning 72, 144, and 263 months, the study aims to assess the effectiveness of these models in capturing complex temporal consumption patterns. Uber Orbit exhibited the highest accuracy in forecasting ethanol consumption among the evaluated models, achieving a mean absolute percentage error (MAPE) of 6.77%. Meanwhile, the TBATS model demonstrated superior performance for gasoline consumption, with a MAPE of 3.22%. These results underline the potential of advanced time-series models to enhance the precision of energy consumption forecasts. This study contributes to more effective resource planning by improving predictive accuracy, enabling data-driven policy-making, optimizing resource allocation, and advancing sustainable energy management practices. These results support Brazil’s energy sector and provide a framework for sustainable decision-making that could be applied globally.
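The mean absolute percentage error (MAPE) quoted above can be reproduced with a short sketch; the numbers in the example are illustrative only, not the study's data.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, the accuracy metric reported above."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

# Illustrative values only (not the study's consumption data)
print(round(mape([100, 120, 130], [104, 113, 135]), 2))   # -> 4.56
```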
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Rosa Molina,

Yasmina Crespo-Cobo,

Francisco J. Esteban,

Ana Victoria Arias,

Javier Rodríguez-Árbol,

Maria Felipa Soriano,

Antonio J. Ibañez-Molina,

Sergio Iglesias-Parro

Abstract: Schizophrenia is characterized by widespread disruptions in neural connectivity and dynamic modulation. Traditional EEG analyses often rely on static or averaged measures, which may overlook the temporal evolution of neural complexity across cognitive demands. This study employed the Higuchi Fractal Dimension, a non-linear measure of signal complexity, to examine the temporal dynamics of EEG activity across five cortical regions (central, frontal, occipital, parietal, and temporal lobes) during an attentional and a memory-based task in individuals diagnosed with schizophrenia and healthy controls. A permutation-based Topographic Analysis of Variance revealed significant differences in neural complexity between tasks and groups. In the control group, results showed a consistent pattern of higher neural complexity during the attentional task across the different brain regions (except during a few moments in temporal and occipital regions). This pattern of differentiation in complexity between the attentional and memory tasks reflects healthy individuals' ability to dynamically modulate neural activity based on task-specific requirements. In contrast, the group of patients with schizophrenia exhibited inconsistent patterns of differences in complexity between tasks over time across all neural regions. That is, differences in complexity between tasks varied across time intervals, being sometimes higher in the attentional task and other times higher in the memory task (especially in central, frontal, and temporal regions). This inconsistent pattern in patients may explain the reduced differentiation in complexity across tasks in schizophrenia and suggests a disruption in the modulation of neural activity as a function of task demands. These findings underscore the importance of analyzing the temporal dynamics of EEG complexity to capture task-specific neural modulation.
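The Higuchi Fractal Dimension used here can be computed in a few lines; the implementation below is a minimal, standard formulation (with an assumed k_max), not the authors' code.

```python
import numpy as np

def higuchi_fd(x, k_max=10):
    """Estimate the Higuchi Fractal Dimension of a 1-D signal.

    For each delay k, the mean curve length L(k) is computed over k
    down-sampled subsequences; the HFD is the slope of log(L(k))
    versus log(1/k)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    log_lk, log_inv_k = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)                 # subsequence x[m], x[m+k], ...
            if idx.size < 2:
                continue
            diff = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / ((idx.size - 1) * k)    # normalisation factor
            lengths.append(diff * norm / k)
        if lengths:
            log_lk.append(np.log(np.mean(lengths)))
            log_inv_k.append(np.log(1.0 / k))
    slope, _ = np.polyfit(log_inv_k, log_lk, 1)      # log-log regression slope
    return slope

# Example: white noise should give an HFD close to 2
print(higuchi_fd(np.random.randn(1024)))
```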
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Zhenrui Chen,

Zhibo Dai,

Huiyan Xing,

Junyu Chen

Abstract: The financial market has been at the forefront of machine learning applications since the 1980s, yet accurate stock price prediction remains a significant challenge due to market complexity and inherent volatility. This paper presents a comprehensive approach to stock market prediction through the integration of Linear Regression (LR), Long Short-Term Memory (LSTM), and Autoregressive Integrated Moving Average (ARIMA) methods. We evaluate these approaches using historical data from five major stocks across different market sectors, demonstrating that traditional time series analysis methods can achieve comparable or superior performance to complex deep learning approaches when properly optimized. To validate our findings, we implement an integrated prediction and trading support system that provides automated data processing and real-time updates, enabling effective decision-making in dynamic market conditions. Our results suggest that the combination of multiple prediction approaches, coupled with automated trading support, can significantly enhance investment decision-making capabilities.
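A minimal sketch of the ARIMA component alone (with illustrative order parameters and synthetic prices, not the paper's configuration or data) might look like this:

```python
# Hedged sketch of an ARIMA forecast on a synthetic price series.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0.1, 1.0, 250))   # synthetic daily closes

model = ARIMA(prices, order=(1, 1, 1))   # (p, d, q) chosen for illustration
fitted = model.fit()
print(fitted.forecast(steps=5))          # next five predicted closes
```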
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Yue Wu,

Carlo Vittorio Cannistraci

Abstract: Scalability is an important aspect of algorithm design. Typical methods for assessing scalability in terms of running time include theoretical time complexity derived from algorithm analysis and model-fitting time complexity derived from simulation results. However, theoretical time complexity often fails to account for real-world conditions in algorithm implementation, such as the influence of compilers, and lacks a simulation-based examination method. Alternatively, model-fitting time complexity is prone to learning a model with incorrect scaling orders. Here, we propose the empirical time complexity (ETC), a data-driven, model-free, and parameter-free method that accounts for the factors influencing an algorithm, for instance the crosstalk between algorithm realization, compilation, and hardware implementation. ETC can be used to guide code optimization and to reach an algorithm's maximum efficiency for specific hardware, programming languages, or compilers. The values of ETC as a function of an input variable form a curve that offers a visual representation of how an algorithm scales with input size. When the ETC curves of different versions of an algorithm are reported together with the theoretical time complexity curve, their comparison allows one to select the versions that are closest to the theoretical complexity. To showcase the utility of ETC in real scenarios, we investigated two sets of algorithms (graph shortest path and matrix multiplication), and we offer evidence of how ETC is crucial for the diagnosis and optimized design of algorithms as close as possible to their theoretical limit.
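A minimal, hedged sketch of how an empirical scaling curve could be measured and compared against a theoretical curve (an assumption of the general procedure, not the authors' ETC implementation):

```python
import time
import numpy as np

def empirical_curve(algorithm, make_input, sizes, repeats=5):
    """Time `algorithm` at several input sizes, keeping the best of
    `repeats` runs to reduce noise from the OS and the interpreter."""
    times = []
    for n in sizes:
        data = make_input(n)
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            algorithm(data)
            best = min(best, time.perf_counter() - t0)
        times.append(best)
    return np.array(sizes), np.array(times)

# Example: sorting should scale close to n log n
sizes, t = empirical_curve(sorted, lambda n: np.random.rand(n).tolist(),
                           [10_000, 20_000, 40_000, 80_000])
theory = sizes * np.log(sizes)
print(t / t[0])                 # measured growth
print(theory / theory[0])       # theoretical growth for comparison
```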
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Isna Ahsan,

Mahmil Butt,

Zaiwar Ali,

Mohamad A. Alawad,

Abdulmajeed M. Alenezi,

Sheroz Khan,

Muhammad Yahya

Abstract: Wireless-powered Mobile Edge Computing (MEC) has proven to be a promising paradigm for enhancing the data processing capability of low-powered networks in light of the increasing need for diagnostic information retrieval. Applications that divide a given load into smaller units and execute each unit independently on different processors form a class of tasks with a pressing need for parallel and distributed processing. However, it is challenging to decide whether the units should be offloaded to the edge of a cloud (MEC) or through a Mobile Device Cloudlet (MDC), in which a User Equipment (UE) with finite resources prefers to offload its units to a foreign UE; the network of the client UE and the foreign UE is known as a cloudlet. Furthermore, existing cost functions assign equal weights to the contributing factors when analysing offloading policies, which is not convincing in practice. There is also a need to improve the energy efficiency of UEs and to address the latency dilemma in cloud computing caused by distant communication between UEs and remote cloud centres. To address these problems, this paper proposes an Offloading and Time Allocation Policy using MDC and MEC (OTPMDC) that decides whether a task should be offloaded through MEC or MDC, in conjunction with the time allocation decision for the UE to harvest energy and transmit information. To address the second issue, our goal is to train an intelligent deep learning-based decision-making algorithm that chooses an optimal set of applications based on the energy available in the UE. We formulate a cost function based on the above policies to generate an extensive dataset, from which the algorithm selects the optimal sets and trains a deep learning network. The simulation results show that the performance of UEs is improved.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Fei Yang,

Xiaopeng Su,

Xuemei Ren

Abstract: The rapid growth of automotive intelligence and automation technology has made it difficult for traditional in-vehicle servo systems to satisfy the demands of modern intelligent systems when facing complex problems such as external disturbances, nonlinearity, and parameter uncertainty. To improve the anti-interference ability and control accuracy of the system, this study proposes a joint control method that combines electro-mechanical braking control with an anti-lock braking system. The method develops a new type of actuator for the electro-mechanical brake control system and introduces a particle swarm optimization algorithm to tune the parameters of the active disturbance rejection control system. At the same time, it incorporates an adaptive inversion algorithm to optimize the anti-lock braking system. The results indicated that the speed response of the developed actuator, together with the actual signal, came to a complete stop at 1.9 seconds. During speed control and deceleration, the actuator could respond quickly and accurately to control commands as expected. On asphalt pavement, the maximum slip-rate error of the optimized control method was 0.0428, while that of the original control method was 0.0492; the optimized method reduced the maximum error by about 12.9%. On icy and snowy roads, the maximum error of the optimized method was 0.0632, significantly lower than the original method's 0.1266. The optimized method can significantly reduce slip-rate fluctuations under extreme road conditions. The proposed method can significantly improve the control performance of the vehicle-mounted servo platform, reduce the sensitivity of the system to external disturbances, and has high practical value.
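A plain particle swarm optimization loop of the kind used to tune controller parameters can be sketched as follows; the objective function here is a stand-in, not the braking model or controller from the study.

```python
# Hedged PSO sketch over a generic parameter vector (illustrative objective).
import numpy as np

def pso(objective, dim, n_particles=30, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (n_particles, dim))      # particle positions
    v = np.zeros_like(x)                            # velocities
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()
    w, c1, c2 = 0.7, 1.5, 1.5                        # inertia and attraction factors
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        if pbest_val.min() < objective(gbest):
            gbest = pbest[pbest_val.argmin()].copy()
    return gbest

# Stand-in objective: distance of candidate controller gains from a target
print(pso(lambda p: np.sum((p - 0.3) ** 2), dim=3))
```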

Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Mahmood Yousaf,

Muhammad Tariq,

Abdul Jabbar,

Syed Qaisar Jalil

Abstract: This comprehensive review paper explores the diverse landscape of cryptocurrency forecasting, tracing its evolution from an alternative to traditional monetary systems to its significant growth in the global financial arena. It consolidates existing research by categorizing and analyzing 234 scholarly articles, organizing them into machine learning, deep learning, deep reinforcement learning, and statistical methodologies, and evaluating the related metrics. The case study titled “Examining the performance differences between backtesting and forward testing” highlights the challenges investors face, as strategies that appear effective in backtesting often fail in practical use. Another case study, “Social Data Exploration in Cryptocurrency Trends,” examines how social media data can provide insights into market movements and investor sentiment, revealing the impact of social trends on cryptocurrency prices. The findings section provides a detailed view, illuminating trends such as yearly publication rates, methodological distributions, input features, training/testing splits, the total number of data samples considered, and forecasting time horizons. This survey paper serves as a valuable resource, providing researchers and investors with a solid foundation for understanding and navigating the dynamic field of cryptocurrency forecasting.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Chuan-Min Lee

Abstract: Domination problems are fundamental problems in graph theory with diverse applications in optimization, network design, and computational complexity. This paper investigates {k}-domination, k-tuple domination, and their total domination variants in weighted strongly chordal graphs and chordal bipartite graphs, two well-studied subclasses of chordal graphs and bipartite graphs. We extend existing theoretical models to explore the less-explored domain of vertex-weighted graphs and establish efficient algorithms for these domination problems. Specifically, the {k}-domination problem in weighted strongly chordal graphs and the total {k}-domination problem in weighted chordal bipartite graphs are shown to be solvable in O(n+m) time. For weighted proper interval graphs and convex bipartite graphs, we solve the k-tuple domination and total k-tuple domination problems in O(n^{2.371552} log^{2}(n) log(n/δ)) time, where δ is the desired accuracy. Furthermore, for weighted unit interval graphs, the k-tuple domination problem achieves a significant complexity improvement, from O(n^{k+2}) to O(n^{2.371552} log^{2}(n) log(n/δ)). These results are achieved through a combination of linear and integer programming techniques, complemented by totally balanced matrices, totally unimodular matrices, and graph-specific matrix representations such as neighborhood and closed neighborhood matrices.
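The linear-programming view underlying such results can be sketched for the {k}-domination relaxation; this is an illustrative SciPy formulation under assumed constraints (every closed neighbourhood must receive total label at least k), not the paper's algorithm, and the step that recovers integral solutions is omitted.

```python
# Hedged sketch of the LP relaxation of weighted {k}-domination.
import numpy as np
import networkx as nx
from scipy.optimize import linprog

def k_domination_lp(g, weights, k):
    n = g.number_of_nodes()
    nodes = list(g.nodes)
    # closed neighbourhood matrix N: N[i, j] = 1 if j equals i or is adjacent to i
    N = nx.to_numpy_array(g) + np.eye(n)
    c = np.array([weights[v] for v in nodes])
    # linprog minimises c @ x subject to A_ub @ x <= b_ub, so negate the covering constraint
    res = linprog(c, A_ub=-N, b_ub=-k * np.ones(n), bounds=[(0, k)] * n)
    return dict(zip(nodes, res.x))

g = nx.path_graph(5)
print(k_domination_lp(g, {v: 1.0 for v in g}, k=2))
```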
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Dmitry Lukyanenko,

Sergei Torbin,

Valentin Shinkarev

Abstract: The paper proposes an algorithm for parallelizing the calculations that arise when using highly optimized minimization functions available in many computing packages. The main idea of the proposed algorithm is based on the fact that, although the "inner workings" of the minimization function may not be known to the user, it inevitably relies on auxiliary functions that compute the minimized functional and its gradient; these are usually implemented by the user and in many cases can be parallelized. The paper discusses in detail both the parallelization algorithm and its software implementation using MPI parallel programming technology, which can act as a template for parallelizing a wide range of applied minimization problems. An example software implementation of the proposed algorithm is demonstrated in the Python programming language, but it can easily be rewritten in C/C++/Fortran.
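A minimal mpi4py sketch of the idea (an assumed interface built around scipy.optimize.minimize, not the paper's template code): the root process runs the black-box minimizer while all ranks share the work of a finite-difference gradient.

```python
import numpy as np
from mpi4py import MPI
from scipy.optimize import minimize

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

def f(x):                                    # the functional to be minimized
    return np.sum((x - 1.0) ** 2)

def local_grad(x, h=1e-6):
    """Each rank computes its round-robin share of a central-difference gradient."""
    n = x.size
    local = np.zeros(n)
    for i in range(rank, n, size):
        e = np.zeros(n); e[i] = h
        local[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    grad = np.zeros(n)
    comm.Reduce(local, grad, op=MPI.SUM, root=0)   # assembled on the root
    return grad

def grad_for_minimizer(x):
    """Gradient callback passed to the black-box minimizer on the root."""
    comm.bcast(x, root=0)                    # wake the workers with the current point
    return local_grad(x)

n = 32
if rank == 0:
    res = minimize(f, np.zeros(n), jac=grad_for_minimizer, method="BFGS")
    comm.bcast(None, root=0)                 # tell the workers to stop
    print("minimum found at", res.x[:4])
else:
    while True:
        x = comm.bcast(None, root=0)
        if x is None:
            break
        local_grad(x)                        # contribute this rank's gradient slice
```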
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Tenzin Chan,

De Wen Soh,

Christopher Hillar

Abstract: Oftentimes in a complex system it is observed that, as a control parameter is varied, there are certain intervals during which the system undergoes dramatic change. Especially in biology, these signatures of criticality are thought to be connected with efficient computation and information processing. Guided by the classical theory of Rate-Distortion (RD) from information theory, we propose a measure for detecting and characterizing such phenomena from data. When applied to RD problems, the measure correctly identifies exact critical trade-off parameters emerging from the theory and allows for the discovery of new conjectures in the field. Other application domains include efficient sensory coding, machine learning generalization, and natural language. Our findings give support to the hypothesis that critical behavior is a signature of optimal processing.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Richard Murdoch Montgomery

Abstract: This article presents an in-depth overview of clustering techniques, which play a vital role in unsupervised learning by uncovering natural groupings in data. We examine six prominent methods: k-Means, k-Medoids, Kohonen Networks and Self-Organizing Maps (SOMs), Fuzzy C-Means, Hierarchical Clustering, and Spectral Clustering. Each technique is described in detail, including its mathematical foundation, operational mechanism, applications, strengths, and limitations. The goal is to provide a thorough understanding of each approach, helping readers select the most appropriate method for their data analysis needs. Practical examples are also provided, demonstrating the application of these clustering techniques in various real-world contexts such as customer segmentation, image processing, and bioinformatics.
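As a taste of the practical side, the first of these methods can be sketched in a few lines of NumPy (Lloyd's algorithm on illustrative data, not an example taken from the article).

```python
import numpy as np

def k_means(points, k, iters=100, seed=0):
    """Minimal k-Means sketch: alternate between assigning each point to
    its nearest centroid and recomputing centroids as cluster means."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # assignment step: index of the closest centroid for every point
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: move each centroid to the mean of its cluster
        new = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Example: two well-separated Gaussian blobs
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centers = k_means(data, k=2)
print(centers)
```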
Technical Note
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Vuong M. Ngo,

Geetika Sood,

Patricia Kearney,

Fionnuala Donohue,

Dongyun Nie,

Mark Roantree

Abstract: The RECONNECT project addresses the fragmentation of Ireland's public healthcare systems, aiming to enhance service planning and delivery for chronic disease management. By integrating complex systems within the Health Service Executive (HSE), it prioritizes data privacy while supporting future digital resource integration. The methodology encompasses structural integration through a Federated Database design to maintain system autonomy and privacy, semantic integration using a Record Linkage module to facilitate integration without individual identifiers, and the adoption of the HL7-FHIR framework for high interoperability with the national electronic health record (EHR) and the Integrated Information Service (IIS). This innovative approach features a unique architecture for loosely coupled systems and a robust privacy layer. A demonstration system has been implemented to utilize synthetic data from the Hospital Inpatient Enquiry (HIPE), Chronic Disease Management (CDM), Primary Care Reimbursement Service (PCRS) and Retina Screen systems for healthcare queries. Overall, RECONNECT aims to provide timely and effective care, enhance clinical decision-making, and empower policymakers with comprehensive population health insights.
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Victor Andres Bucheli,

Mauricio Gaona,

Oswaldo Solarte Pabón

Abstract: This article examines the impact of network degree distribution on the performance of evolutionary algorithms, specifically in solving the Traveling Salesperson Problem (TSP). It evaluates the integration of various network types to regulate population crossovers, enhancing candidate solutions and their refinement over generations. The experimental analysis includes complex networks like Erdős–Rényi and Barabási–Albert, along with regular graphs such as Balanced trees and Harary graphs. The study focuses on how different network structures influence exploration, convergence, and computational efficiency, particularly in terms of CPU time. Results show that highly connected networks, like Barabási–Albert, facilitate faster convergence and improved solution quality by guiding the search process more effectively. The Harary graph with k=2 demonstrated superior performance, reducing execution times by up to sevenfold compared to traditional approaches. This improvement highlights the advantages of specific degree distributions in balancing exploration and exploitation. Networks with hubs, such as Barabási–Albert, accelerate information flow and the evolutionary process, while more regular networks, like Harary graphs, provide controlled structures that maintain high solution quality. These findings emphasize the importance of network topology in evolutionary algorithms and suggest further exploration of hybrid models and other network types to optimize NP-hard problems.
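The core mechanism, restricting crossover partners to graph neighbours so that the degree distribution shapes how solutions mix, can be sketched as follows; the toy fitness and crossover operators are stand-ins, not the study's TSP operators.

```python
# Hedged sketch (not the authors' implementation) of a network-structured
# evolutionary step: each individual occupies a node and recombines only
# with its graph neighbours.
import random
import networkx as nx

def neighbour_crossover_step(graph, population, fitness, crossover):
    """One generation: every node recombines with a random neighbour and
    keeps the child only if it improves on the current solution."""
    new_pop = dict(population)
    for node in graph.nodes:
        neighbours = list(graph.neighbors(node))
        if not neighbours:
            continue
        partner = random.choice(neighbours)
        child = crossover(population[node], population[partner])
        if fitness(child) < fitness(population[node]):   # minimisation
            new_pop[node] = child
    return new_pop

# Toy usage on a Barabasi-Albert topology with small permutations as stand-in tours
g = nx.barabasi_albert_graph(20, 2, seed=0)
pop = {v: random.sample(range(10), 10) for v in g.nodes}
fit = lambda tour: sum(abs(a - b) for a, b in zip(tour, tour[1:]))
xover = lambda p, q: p[:5] + [x for x in q if x not in p[:5]]  # order-crossover sketch
pop = neighbour_crossover_step(g, pop, fit, xover)
```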
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Aaron Hong,

Christina Boucher

Abstract: The burgeoning volume of genomic data, fueled by advances in sequencing technologies, demands efficient data compression solutions. Traditional algorithms like Lempel-Ziv77 (LZ77) have been foundational in offering lossless compression, yet they often fall short when applied to the highly repetitive structures typical of genomic sequences. This review delves into the evolution of LZ77 and its derivatives, exploring specialized algorithms such as prefix-free parsing, AVL grammars, and LZ-based methods tailored for genomic data. Innovations in this field have led to enhanced compression ratios and processing efficiencies by leveraging the intrinsic redundancy within genomic datasets. We critically examine a spectrum of LZ77-based algorithms, including newer adaptations for external and semi-external memory settings, and contrast their efficacy in managing large-scale genomic data. Additionally, we discuss the potential of these algorithms to facilitate the construction of data structures such as compressed suffix trees, crucial for genomic analyses. This paper aims to provide a comprehensive guide on the current landscape and future directions of data compression technologies, equipping researchers and practitioners with insights to tackle the escalating data challenges in genomics and beyond.
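A naive LZ77 factorization illustrates the phrase structure these methods exploit; this quadratic-time sketch is for exposition only, whereas the reviewed algorithms rely on suffix-based indexes and prefix-free parsing to scale to genomic data.

```python
def lz77_factorize(s):
    """Naive LZ77 factorization sketch: each phrase is the longest prefix of
    the remaining text that occurs earlier, reported as (offset, length) or a
    literal character."""
    factors, i, n = [], 0, len(s)
    while i < n:
        best_len, best_pos = 0, -1
        for j in range(i):                      # candidate earlier occurrence
            l = 0
            while i + l < n and s[j + l] == s[i + l]:
                l += 1
            if l > best_len:
                best_len, best_pos = l, j
        if best_len > 0:
            factors.append((i - best_pos, best_len))   # (offset, length) phrase
            i += best_len
        else:
            factors.append(s[i])                        # literal character
            i += 1
    return factors

# Repetitive sequences compress into few phrases
print(lz77_factorize("ACACACGTACACACGT"))
```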
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Wangzhou Luo,

Hailong Wu,

Jiegang Peng

Abstract: The Electric Fish Optimization (EFO) Algorithm is inspired by the predation behavior and communication of weakly electric fish. It is a novel meta-heuristic algorithm that attracts researchers because it has few tunable parameters, high robustness, and strong global search capabilities. Nevertheless, when operating in complex environments, the EFO algorithm encounters several challenges, including premature convergence, susceptibility to local optima, and issues related to passive electric field localization stagnation. To address these challenges, this study introduces an Adaptive Electric Fish Optimization Algorithm Based on Standstill Label and Levy Flight (SLLF-EFO). This hybrid approach incorporates the Golden Sine Algorithm and Good Point Set Theory to augment the EFO algorithm's capabilities, employs a variable-step-size Levy flight strategy to efficiently address passive electric field localization stagnation problems, and utilizes a standstill label strategy to mitigate the algorithm's tendency to fall into local optima during the iterative process. By leveraging multiple solutions to optimize the EFO algorithm, this framework enhances its adaptability in complex environments. Experimental results from benchmark functions reveal that the proposed SLLF-EFO algorithm exhibits improved performance in complex settings, demonstrating enhanced search speed and optimization accuracy. This comprehensive optimization not only enhances the robustness and reliability of the EFO algorithm but also provides valuable insights for its future applications.
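The Levy flight move used here can be sketched with Mantegna's algorithm; the scaling factor and usage below are illustrative assumptions, not the SLLF-EFO parameterization.

```python
import numpy as np
from math import gamma, pi, sin

def levy_step(dim, beta=1.5, rng=None):
    """Mantegna's algorithm for drawing a Levy-distributed step: the ratio
    of two Gaussians yields mostly small moves plus rare heavy-tailed jumps."""
    rng = rng or np.random.default_rng()
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2) /
               (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

# An agent stuck in a stagnant region occasionally takes a long jump
position = np.zeros(5)
position = position + 0.01 * levy_step(5, rng=np.random.default_rng(42))
print(position)
```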
Article
Computer Science and Mathematics
Data Structures, Algorithms and Complexity

Frank Vega

Abstract: The P versus NP problem is a fundamental question in computer science. It asks whether problems whose solutions can be quickly verified can also be quickly solved. Here, "quickly" refers to computational time that grows at most polynomially with the size of the input (polynomial time). While the problem's roots trace back to a 1955 letter from John Nash, its formalization is attributed to Stephen Cook and Leonid Levin. Despite extensive research, a definitive answer remains elusive. Closely tied to this is the concept of NP-completeness. If a single NP-complete problem could be solved efficiently, it would imply that all problems in NP can be solved efficiently, proving that P equals NP. This work posits that MONOTONE ONE-IN-THREE 3SAT, a notoriously difficult NP-complete problem, can be solved efficiently, thereby establishing the equivalence of P and NP.
