Computer Science and Mathematics

Article
Computer Science and Mathematics
Mathematical and Computational Biology

Hua-Lin Xu, Xiu-Jun Gong, Hua Yu, Ying-Kai Wang

Abstract: Accurate identification of promoters is essential for deciphering gene regulation but remains challenging due to the complexity and variability of transcriptional initiation signals. Existing deep learning models often fail to simultaneously capture long-range dependencies and precise local motifs in DNA sequences. To address this, we propose DNABERT2-CAMP, a hybrid deep learning framework that integrates global sequence context with localized feature extraction for enhanced promoter recognition in Escherichia coli. The model leverages a pre-trained DNABERT-2 Transformer to encode evolutionarily conserved patterns across extended contexts, while a novel CAMP (CNN-Attention-Mean Pooling) module detects fine-grained promoter motifs through convolutional filtering, multi-head attention, and mean pooling. By fusing global embeddings with high-resolution local features, our approach achieves robust discrimination between promoter and non-promoter sequences. Under 5-fold cross-validation, DNABERT2-CAMP attained an accuracy of 93.10% and a ROC AUC of 97.28%. It also demonstrated strong generalization on independent external data, achieving 89.83% accuracy and 92.79% ROC AUC. These results underscore the advantage of combining global contextual modeling with targeted local motif analysis for accurate and interpretable promoter identification, offering a powerful tool for synthetic biology and genomic research.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Valentin E. Brimkov

Abstract: In this work, we pose and aim to answer the following questions, among others: Which quantitative characteristics, once satisfied, led to the phase transition from "primordial soup" to living organisms? How can we measure the negentropy of the organic matter that underpinned the appearance of a given species? To what extent do the biosequences of living organisms differ from random sequences? How do we quantitatively distinguish primitive from higher-level organisms? How can we compare the complexity of two living things? Is there an adequate mathematical structure that naturally and appropriately represents each organism's biosequence and all of them as a whole? What are the properties of that structure? How does that structure evolve, and what are the theoretical limits of any further evolution? Is it likely that these bounds will be reached, and what are the "limits of life"? How can we estimate the effect of natural selection on the mechanism of evolution versus that of chance and mutations? To this end, we introduce relevant mathematical structures and use them for modeling purposes. Finally, we also speculate on possible scenarios of the origin of life, evolution, and related issues.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Debnarayan Khatua, Bikash Kumar, Manoranjan K. Singh, Somnath Kumar

Abstract: Hepatitis C Virus (HCV) continues to be a significant worldwide health issue, particularly in resource-limited environments with inadequate diagnostic and therapeutic options. This study formulates a deterministic six-compartment model, predicated on the assumptions that the population undergoes natural birth-death dynamics, awareness initiatives transition individuals from $S_1$ to $S_2$, diagnosis advances $U$ to $I$, recovery is achieved through therapy or immunity, and infection and mortality rates vary among classes. The system is described by coupled nonlinear ODEs that include three time-dependent controls. Analytical examination guarantees the positivity and boundedness of all compartments and calculates the basic reproduction number ($R_0$) using the next-generation matrix. Sensitivity analysis shows that $\beta_1, \beta_2, \tau_1, \tau_2$ are the most important parameters. Using Pontryagin's Maximum Principle, the forward–backward sweep method is employed to determine the optimal controls that minimise both infection and cost. A Mamdani fuzzy logic controller is added to handle parameter uncertainty and generate adaptive responses to infection pressure, awareness level, and hospital load. Simulations reveal that fuzzy control delivers equivalent suppression to the crisp optimum at around two-thirds lower cost, enabling a stable, interpretable, and resource-efficient paradigm for dynamic HCV intervention.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Gabriela Fernandes

Abstract: Relapse in acute myeloid leukemia (AML) is frequently associated with chemoresistance, yet the molecular mechanisms driving this transition remain incompletely understood. To explore relapse-associated epigenetic remodeling, we reanalyzed publicly available Nanopore whole-genome methylation data from three AML patients with matched onset and relapse samples. We focused on CpG-poor transcription factor (TF)-associated regulatory regions, recently implicated as unconventional epigenetic hotspots in leukemia progression. Across all samples, relapse was characterized by a consistent gain in DNA methylation within CpG-poor TF regions, with all ranked loci demonstrating a positive mean Δβ shift. Heatmap visualization of the top-ranked regions revealed distinct clustering of relapse versus onset samples, supporting the presence of a coordinated epigenetic signature rather than random methylation drift. These findings suggest that relapse AML cells may acquire targeted methylation to suppress key regulatory networks involved in DNA repair, apoptosis, and growth control, thereby enabling therapeutic escape. This work highlights the potential utility of Nanopore methylation profiling as a real-time biomarker platform to detect relapse-associated epigenetic rewiring and guide precision treatment strategies.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Maxim Valentinovich Polyakov, Elena Ivanovna Tuchina

Abstract: Developing effective CAR-T cell therapy for solid tumours remains challenging because of biological barriers such as antigen escape and an immunosuppressive microenvironment. The aim of this study is to develop a mathematical model of the spatio-temporal dynamics of tumour processes in order to assess key factors that limit treatment efficacy. We propose a reaction–diffusion model described by a system of partial differential equations for the densities of tumour cells and CAR-T cells, the concentration of immune inhibitors, and the degree of antigen escape. The methods of investigation include stability analysis and numerical solution of the model using a finite-difference scheme. The simulation results show that antigen escape leads to the formation of a persistent core within the tumour and subsequent relapse after an initial regression. We find that the efficacy of therapy critically depends on the balance between the rate of tumour-cell killing and the rate of resistance development, and that repeated administration of CAR-T cells provides deeper and more durable suppression of tumour growth compared with a single infusion. We conclude that the proposed model is a valuable tool for analysing and optimising CAR-T therapy protocols, and that our results highlight the need for combined strategies aimed at overcoming antigen escape.
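As a rough illustration of how a reaction-diffusion system can be advanced with an explicit finite-difference scheme, the sketch below evolves only two of the four fields named in the abstract (tumour density and CAR-T density) on a 1D grid with zero-flux boundaries. The equations, the parameter names (`d_t`, `d_c`, `growth`, `kill`, `decay`), and all numerical values are assumptions chosen for illustration; they are not the authors' model, which also tracks immune inhibitors and antigen escape.

```python
def laplacian(u, dx):
    """Second-order central difference with zero-flux (mirrored) boundaries."""
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else u[1]
        right = u[i + 1] if i < n - 1 else u[n - 2]
        out[i] = (left - 2.0 * u[i] + right) / (dx * dx)
    return out


def step(tumour, cart, dx, dt, d_t=0.01, d_c=0.05, growth=1.0, kill=2.0, decay=0.1):
    """One explicit Euler step of a hypothetical two-field system:
    tumour: diffusion + logistic growth - killing by CAR-T cells
    cart:   diffusion - first-order decay
    """
    lap_t = laplacian(tumour, dx)
    lap_c = laplacian(cart, dx)
    new_t = [
        t + dt * (d_t * lt + growth * t * (1.0 - t) - kill * t * c)
        for t, c, lt in zip(tumour, cart, lap_t)
    ]
    new_c = [c + dt * (d_c * lc - decay * c) for c, lc in zip(cart, lap_c)]
    return new_t, new_c


def simulate(tumour, cart, steps, dx=0.1, dt=0.01, **params):
    """Repeat `step`; dt must satisfy dt <= dx**2 / (2 * max diffusion coefficient)."""
    for _ in range(steps):
        tumour, cart = step(tumour, cart, dx, dt, **params)
    return tumour, cart


# Untreated, spatially uniform tumour: grows towards the carrying capacity 1.
untreated, _ = simulate([0.5] * 20, [0.0] * 20, steps=2000)

# Persistent CAR-T cells (decay switched off): the tumour is driven towards 0.
treated, _ = simulate([0.5] * 20, [1.0] * 20, steps=2000, decay=0.0)
```

With the default step sizes the explicit scheme respects the usual stability bound for diffusion; a finer grid would need a proportionally smaller `dt`.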
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Arnav Gupta, Gatik Goyal

Abstract: Both hereditary and clinical risk factors influence the development of type 2 diabetes (T2D). A rich body of research exists on the effect of clinical factors on T2D, but less is known about how genetic factors influence its development. Therefore, we used a machine learning (ML) algorithm to better understand how genetic variants influence the development of T2D in the presence of high-, moderate-, and low-risk clinical factors. We collected genetic and clinical risk factor datasets from publicly available sources. We probabilistically assigned genetic variants from our genetic dataset to the individuals in the clinical dataset to form a single dataset containing both clinical and genetic risk factors. An XGBoost XGBClassifier was then trained on the combined dataset. SHAP summary plots were also generated for each risk group after model training. The model's predictive performance (AUC) was highest for the low-risk group, while the moderate- and high-risk groups scored lower. According to the SHAP plots, both BMI and family history are key predictors of T2D across all risk groups. However, SNP effect sizes were more influential than other clinical risk factors, indicating that genetic contributions, while secondary, were still relevant. ROC curves assess the model's ability to predict diabetes cases across risk groups. All models performed above the 0.7 AUC threshold, with AUC scores of 0.9116 for the low-risk group, 0.7372 for the medium-risk group, and 0.7366 for the high-risk group, indicating they are clinically applicable and not affected by assignment of genetic variables. While genetic treatments for diabetes remain experimental, our work supports emerging advancements in pharmacogenomics and gene-based therapies by helping to identify which patients may benefit from specific drug regimens, including gene-based interventions.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Natalya Maxutova, Akmaral Kassymova, Kuanysh Kadirkulov, Aisulu Ismailova, Gulkiz Zhidekulova, Zhanar Azhibekova, Jamalbek Tussupov, Quvvatali Rakhimov, Zhanat Kenzhebayeva

Abstract: This paper proposes an intelligent and explainable ensemble system for predicting aspartate aminotransferase (AST) levels based on routine biochemical and demographic data from the NHANES dataset. The framework integrates robust preprocessing, adaptive feature encoding, and multi-level ensemble learning within a nested cross-validation (5×3) structure to ensure reproducibility and prevent data leakage. Several regression models, including Random Forest, XGBoost, CatBoost, and stacking ensembles, were systematically compared using R², RMSE, MAE, and MAPE metrics. The results show that the Stacking v2 architecture, combining CatBoost, LightGBM, and Ridge meta-regression, achieves the highest predictive accuracy and stability. Explainable AI analysis using SHAP revealed key biochemical and lifestyle factors influencing AST variability. The proposed system provides a modular, interpretable, and reproducible foundation for decision-support applications in intelligent healthcare analytics, aligning with the goals of applied system innovation.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Juan Pablo Acuña González, Moisés Sánchez Adame, Oscar Montiel

Abstract: We formalize an inverse, data-conditioned variant of the Variational Quantum Eigensolver (VQE) for clinical biomarker discovery. Given patient-encoded quantum states, we construct a task-specific Hamiltonian whose coefficients are inferred from clinical associations, and interpret its expectation value as a calibrated energy score for prognosis and treatment monitoring. The method integrates principled coefficient estimation, ansatz specification with basis rotations, commuting-group measurements, and a practical shot-budget analysis. Evaluated on public infectious-disease datasets under severe class imbalance, the approach yields consistent gains in balanced accuracy and precision-recall over strong classical baselines, with stability across random seeds and feature ablations. This variational energy-scoring framework bridges Hamiltonian learning and clinical risk modeling, offering a compact, interpretable, and reproducible route to biomarker prioritization and decision support.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Joshua Kim, Sungwoo Yang

Abstract: Background/Objectives: Ionizable lipid nanoparticles (LNPs) are the mainstream delivery mechanisms for mRNA vaccines. However, LNPs are limited in their mRNA transfection efficiency (TE) into target cells. Dendrimersome nanoparticle (DNP) delivery systems, developed using ionizable amphiphilic Janus dendrimers (IAJDs), were designed to overcome the limitations of earlier approaches. Researchers have found this alternative promising due to its comparatively simple, repeating one-component structure and enhanced stability. This study sought to clarify the impact of particular IAJD structural components on mRNA TE and to develop novel IAJD candidates for maximum predicted TE. Methods: Structural constituents (hydrophilic, ionizable amine, and hydrophobic regions) were systematically defined and encoded for computational analysis. Luciferase-induced luminescence was used as a quantitative metric for mRNA transfection. TE prediction models were built using several machine learning algorithms, and the model using eXtreme Gradient Boosting was selected. This prediction model handled the imbalanced datasets and was used to find the optimal IAJD designs and formulation conditions. Results: The IAJD optimization process ultimately yielded three novel optimized IAJD candidates and one existing IAJD, surpassing previously identified IAJDs. Conclusions: To our knowledge, this study presents the first large-scale computational investigation of IAJD structural optimization using machine learning. The design of the IAJD is the primary factor that influences mRNA TE, but other factors also contribute and more work is needed. This study highlights the potential of ML-driven IAJD optimization. Combined with high-throughput in vitro assays, this method could significantly accelerate mRNA therapeutics development with an improved delivery mechanism.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Ritwik Deshpande, Vishal Lakshmanan

Abstract: This research paper provides a comprehensive analysis of the impact of generative artificial intelligence on the diagnosis of Attention-Deficit/Hyperactivity Disorder (ADHD) among high school students in Northern California between 2022 and 2025. This period is marked by two converging phenomena: the explosive, near-universal adoption of generative AI tools like ChatGPT in educational settings and a complex, evolving landscape of adolescent mental health. While national ADHD diagnosis rates for adolescents have remained stable at approximately 14% [19, 20], California has consistently reported significantly lower prevalence, around 6% [18]. Northern California, as a global technology hub and a leader in educational policy, serves as a critical case study for examining the intersection of these trends. This paper synthesizes data on AI adoption rates, student usage patterns, regional educational policies, and the neuropsychological effects of AI on adolescent cognition. The analysis reveals that the primary impact of generative AI is not on the raw prevalence of ADHD but on the fundamental nature of its presentation, assessment, and diagnosis. AI tools function as a double-edged sword, simultaneously offering compensatory support that can mask underlying executive function deficits while also potentially exacerbating ADHD symptoms or inducing ADHD-like cognitive patterns through mechanisms of attention fragmentation and dopamine system dysregulation [54, 87]. This creates a profound diagnostic challenge, complicating clinical assessments and potentially leading to both under-diagnosis and misdiagnosis. The paper concludes that the rapid integration of generative AI necessitates a paradigm shift in clinical and educational approaches to ADHD, requiring updated assessment protocols that account for a student’s digital cognitive ecosystem to ensure accurate and equitable diagnosis.
Concept Paper
Computer Science and Mathematics
Mathematical and Computational Biology

Akhilesh Kaushal

Abstract: Modern biomedical research generates vast, multi-modal datasets (multi-omics) from the same patient cohorts, offering an unprecedented opportunity to understand complex diseases. However, integrating these heterogeneous data views to predict clinical outcomes like patient survival presents significant statistical challenges. These challenges include data heterogeneity, high dimensionality, inherent zero-inflation due to technical dropouts or biological absence, and the need to incorporate prior biological knowledge. We propose the Bayesian Multi-view Graph Convolutional Network (BMGCN), a deep generative framework designed to address these challenges. BMGCN factorizes the data into shared and view-specific latent representations, enabling both data integration and the identification of view-specific signals. It employs graph-convolutional encoders to integrate prior biological network knowledge, a zero-inflated likelihood to accurately model sparse omics data, and a spike-and-slab prior for Bayesian view selection to identify modalities most relevant to the outcome. Finally, a semi-parametric Cox proportional hazards module allows the model to handle right-censored survival data directly. We detail the full generative model, derive the variational inference objective, and outline a comprehensive validation strategy. BMGCN provides a powerful, interpretable, and flexible framework for integrative multi-omics analysis.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Ragavan Murugasan

,

Veeramani Chinnadurai

Abstract: In this research article, we propose a fuzzy fractional-order SEI\(R_iU_i\)HR model to describe the transmission dynamics of COVID-19, comprising susceptible, exposed, infected, reported, unreported, hospitalized, and recovered compartments. The uncertainty in initial conditions is represented using fuzzy numbers, and the fuzzy Laplace transform combined with the Adomian decomposition method is employed to solve the nonlinear differential equations and to derive approximate analytical series solutions. In addition to fuzzy lower and upper bound solutions, is introduced to provide a representative trajectory under uncertainty. Numerical experiments are conducted to compare fuzzy and normal (non-fuzzy) solutions, supported by 3D visualizations. The results reveal the influence of fractional order and fuzzy parameters on epidemic progression, demonstrating the model’s capability to capture realistic variability and to provide a flexible framework for analyzing infectious disease dynamics.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Jianfeng Yao, Mengmeng Yang, Zhuofan Li, Denglong Ha, Wenqiang Gao, Xiao He, Xuefan Hu, Xinyu Song

Abstract: To improve the accuracy of tree age estimation by accounting for variations in radial growth, this study developed a diameter-age model that incorporates radial growth rate for seven typical tree species across subtropical to cold temperate regions. For each tree species, six trees were selected: two dominant, two intermediate, and two suppressed. A total of 646 disks were collected at 1-meter intervals along the stems, starting at a height of 0.3 m. Disk diameters and tree rings were measured, and the radial growth rate of each disk over the past two years was calculated. For each tree species, two-thirds of the data were randomly selected as the modeling dataset, while the remaining third served as the testing dataset. Based on scatter plots, linear, logarithmic, and exponential models were selected as candidate models. A logarithmic function best described the diameter-age relationship, while an exponential model best fit the radial growth rate-age relationship. A dual-factor nonlinear model combining both variables achieved the highest estimation accuracy (77.95%), significantly outperforming single-factor models based solely on diameter (52.72%) or growth rate (70.78%). These results demonstrate that integrating radial growth rate substantially enhances the precision of tree age estimation.
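To make the logarithmic diameter-age relationship concrete, the following minimal sketch fits age = a + b·ln(diameter) by ordinary least squares. The functional form, the function names `fit_log_model` and `predict_age`, and the synthetic data are assumptions for illustration only; the abstract does not give the authors' exact parameterisation, fitting procedure, or the form of the dual-factor model.

```python
import math


def fit_log_model(diameters, ages):
    """Least-squares fit of the assumed form age = a + b * ln(diameter)."""
    xs = [math.log(d) for d in diameters]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ages) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ages))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_age(diameter, a, b):
    """Predicted age for a given diameter under the fitted log model."""
    return a + b * math.log(diameter)


# Synthetic, noise-free data generated from a = 2, b = 12 (not measurements).
diameters = [5.0, 10.0, 20.0, 40.0]
ages = [2.0 + 12.0 * math.log(d) for d in diameters]
a, b = fit_log_model(diameters, ages)
```

On real disk measurements the fit would be made on the modeling subset and evaluated on the held-out testing subset, mirroring the two-thirds/one-third split described above.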
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Meriem Bouzari, Latifa Ait Mahiout, Anastasia Mozokhina, Vitaly Volpert

Abstract: We develop and analyze a reaction-diffusion model describing the early spatial dynamics of viral infection in tissue, incorporating key components of the innate immune system: inflammatory cytokines and circulating macrophages. The system couples three spatial partial differential equations (for uninfected cells, infected cells, and virus particles) with two ordinary differential equations (for cytokines and activated macrophages), and includes time delays related to intracellular viral replication. In the absence of macrophage degradation, we derive analytical expressions for the total viral load and the wave speed, and identify explicit immune control thresholds in terms of the virus replication number and the strength of the immune response. In the presence of macrophage degradation, simulations reveal that increasing macrophage turnover accelerates wave propagation and increases viral burden. These results highlight the critical role of innate immune feedback, modulated by effector degradation, in shaping the spatial outcome of infection. Depending on the values of the virus replication number and the strength of the immune response, infection can be immediately suppressed, it can propagate with gradual extinction due to the time-dependent immune response, or it can persistently propagate in the tissue in the form of a reaction-diffusion wave.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Freddy Patricio Moncayo-Matute, R. Claramunt, Alvaro Guzman-Bautista, Paúl Bolívar Torres-Jara, Enrique Chacón-Tanarro

Abstract: Background/Objectives: Screw loosening and vertebral fractures remain common after vertebral body tethering (VBT). Because tightening torque sets screw preload, its biomechanical effect warrants explicit modeling. In this paper, a finite element (FE) model that incorporates torque-induced pre-tension was developed and validated against ex vivo tests on porcine vertebrae to quantify vertebral stress, with the aim of enabling customizable VBT planning. Methods: An FE model with pre-tension and axial extraction failure was parameterized using ex vivo tests on five porcine vertebrae. A laterally inserted surgical screw in each specimen was tightened to 5.9±0.80 [N·m]. Axial extraction produced failure loads of 2.1±0.31 [kN]. These extraction tests were also simulated in the FE model to validate the failure scenario. Results: Torque alone generated peak von Mises stresses of 16.1 [MPa] (cortical) and 2.1 [MPa] (trabecular), lower than prior reports. With added axial load, peaks rose to 141.1 [MPa] and 19.7 [MPa], exceeding typical ranges. However, predicted failure agreed with experiments, showing 0.58 [mm] displacement and a conical displacement distribution around the washer. Conclusions: Modeling torque-induced pre-tension is essential to reproduce realistic stress states and anchor failure in VBT. The framework enables patient-specific assessment (bone geometry/density) to recommend safe tightening torques, potentially reducing screw loosening and early fractures.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Ilyes Abdelhamid, Yuchi Liu, Armel Lefebvre, Ziheng Liao, Aldo Acevedo, Carlo Vittorio Cannistraci

Abstract: Pathway enrichment analysis (PEA) is fundamental for interpreting omics signatures. Standard PEA practice reduces results to tabular significance lists, in which complex systems-biology insights remain hidden. Here, we present Hyperpathway, an open-access network-based visualization webtool for interpreting PEA results. Given a table of statistically significant pathways and enriched molecules, Hyperpathway transforms this tabular information into a pathway–molecule bipartite network. It then embeds the network into a two-dimensional hyperbolic disk, providing a holistic geometric representation of the nodes' hierarchical organization along the radial coordinates and of the nodes' similarity patterns along the angular coordinates. On genomic, metabolomic, and lipidomic datasets, Hyperpathway allows a deeper understanding of the interplay between pathways and their molecular components, facilitating the visualization and identification of latent functional systems-biology modules not readable in conventional PEA tabular outputs. By bridging statistical enrichment analysis with network geometry, Hyperpathway advances pathway analysis from a list-based to a systems-level visualization paradigm.
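The first step described above, turning an enrichment table into a pathway-molecule bipartite network, can be sketched as follows. This is only an illustration under assumed conventions: the input layout (pairs of a pathway name and its enriched molecules), the function name `build_bipartite`, and the toy labels are hypothetical, and the hyperbolic embedding itself is not attempted here.

```python
from collections import defaultdict


def build_bipartite(rows):
    """Build a bipartite graph from (pathway, [molecules]) rows.

    Nodes are tagged tuples so that a pathway and a molecule sharing a
    name can never collide. Returns the edge set and an adjacency map.
    """
    edges = set()
    adjacency = defaultdict(set)
    for pathway, molecules in rows:
        p_node = ("pathway", pathway)
        for molecule in molecules:
            m_node = ("molecule", molecule)
            edges.add((p_node, m_node))
            adjacency[p_node].add(m_node)
            adjacency[m_node].add(p_node)
    return edges, dict(adjacency)


# Toy enrichment table with made-up labels (not real PEA output).
table = [
    ("pathway_A", ["mol_1", "mol_2"]),
    ("pathway_B", ["mol_2", "mol_3"]),
]
edges, adjacency = build_bipartite(table)

# Node degree: how many pathways a molecule belongs to, and vice versa.
degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
```

A molecule shared by several pathways, such as `mol_2` here, is what links otherwise separate rows of the table into one connected network.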
Review
Computer Science and Mathematics
Mathematical and Computational Biology

Fatemeh Safari, Jai J Tree, Fatemeh Vafaee

Abstract: Machine learning is a powerful approach for analysing RNA sequences, particularly for understanding the function and regulation of non-coding RNAs. A critical step in this process is feature extraction, which transforms biological sequences into numerical representations that allow computational models to capture and interpret complex biological patterns. Despite its central role, the field of RNA feature extraction remains broad and fragmented, with limited standardization and accessibility hindering consistent application. In this comprehensive review, we address the fragmentation of the field by systematically organizing over 25 feature extraction strategies into sequence- and structure-based approaches. We further conduct a comparative analysis highlighting how the choice of feature sets impacts model performance, reinforcing the importance of integrated feature engineering. To facilitate practical adoption, we also provide a curated list of publicly available tools and software packages. By consolidating methodologies and resources, this work seeks to improve reproducibility, scalability, and interpretability in machine learning-driven RNA research.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Amar Nath Chatterjee, Santosh Kumar Sharma, Fahad Al Basir, Aeshah A. Raezah

Abstract: H1N1 influenza, also known as swine flu, is a subtype of the influenza A virus that can infect humans, pigs, and birds. Sensitivity analysis and optimal control studies play a crucial role in understanding the dynamics of infectious diseases such as H1N1 influenza. This study employs a mathematical model incorporating symptomatic and asymptomatic infections as well as vaccination to assess the impact of key parameters on disease transmission. The model assumes density-dependent infection transmission. We determine the basic reproduction number, $R_0$, using the next-generation matrix method and study the stability of the equilibria: the disease-free equilibrium is stable when $R_0 < 1$, and an endemic equilibrium exists when $R_0 > 1$. By performing sensitivity analysis, the most influential factors affecting infection spread are identified, aiding in targeted intervention strategies. Optimal control techniques are then applied to determine the best approaches to minimize infections while considering resource constraints. The findings provide valuable insights for public health policies, offering effective strategies for mitigating H1N1 outbreaks and enhancing disease management efforts.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

Thomas Junier

Abstract: We present termal, a fast, interactive terminal-based viewer for multiple sequence alignments (MSAs), designed for use on remote systems such as high-performance computing (HPC) clusters. Unlike traditional graphical viewers, termal runs entirely within a terminal and offers features such as scrolling, zooming, consensus/conservation visualization, and customizable colour schemes. It is implemented in Rust, ensuring high performance and minimal dependencies.
Article
Computer Science and Mathematics
Mathematical and Computational Biology

José Alberto Rodrigues

Abstract: Tumor growth is driven not only by genetic mutations but also by ecological interactions among heterogeneous cell populations within the tumor microenvironment. In this study, we apply evolutionary game theory (EGT) to model competition between glycolytic (G) and oxidative (O) tumor cells under distinct environmental scenarios. Using a payoff matrix to encode fitness interactions, we implement replicator dynamics to simulate changes in cell population fractions over time. Three numerical scenarios are considered: a baseline balanced competition, an acidic microenvironment, and a pH-buffered therapeutic intervention. Our simulations reveal that environmental acidity strongly favors glycolytic dominance, consistent with aggressive tumor phenotypes, while pH-buffered interventions can restore oxidative prevalence, potentially enhancing susceptibility to conventional therapies. These results provide mechanistic insight into how microenvironmental conditions shape tumor composition and highlight the potential of evolutionary-informed strategies—such as adaptive and ecological therapies—to steer tumor evolution toward less aggressive states. Overall, this work demonstrates the utility of EGT as a quantitative framework for understanding tumor heterogeneity and guiding personalized treatment planning.
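The replicator dynamics mentioned above can be illustrated for two cell types with a 2×2 payoff matrix. In the sketch below, `x` is the glycolytic fraction and the oxidative fraction is `1 - x`; the three payoff matrices are hypothetical values picked only to reproduce the qualitative behaviour described in the abstract (coexistence, glycolytic dominance, oxidative dominance) and are not taken from the paper.

```python
def replicator_step(x, payoff, dt):
    """One Euler step of the two-strategy replicator equation.

    payoff rows are the focal cell type (G, then O); columns are the
    opponent type (G, then O).
    """
    (a, b), (c, d) = payoff
    f_g = a * x + b * (1.0 - x)          # fitness of glycolytic cells
    f_o = c * x + d * (1.0 - x)          # fitness of oxidative cells
    mean_f = x * f_g + (1.0 - x) * f_o   # population-average fitness
    return x + dt * x * (f_g - mean_f)


def simulate(x0, payoff, dt=0.01, steps=5000):
    """Integrate the glycolytic fraction forward, clamped to [0, 1]."""
    x = x0
    for _ in range(steps):
        x = min(1.0, max(0.0, replicator_step(x, payoff, dt)))
    return x


# Hypothetical payoff matrices for the three scenarios (illustrative only).
SCENARIOS = {
    "baseline": ((1.0, 2.0), (2.0, 1.0)),  # each type does best when rare
    "acidic": ((2.0, 2.0), (1.0, 1.0)),    # G always fitter than O
    "buffered": ((1.0, 1.0), (2.0, 2.0)),  # O always fitter than G
}

final = {name: simulate(0.3, payoff) for name, payoff in SCENARIOS.items()}
```

Starting from a glycolytic fraction of 0.3, these toy matrices drive the population towards an even mix in the baseline case, towards almost entirely glycolytic cells in the acidic case, and towards almost entirely oxidative cells in the buffered case.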

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2025 MDPI (Basel, Switzerland) unless otherwise stated