Computer Science and Mathematics


Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Quỳnh Phạm

Abstract: In the context of the rapid growth of e-commerce, understanding customer emotions and needs through online feedback has become a key factor in strategies to improve product and service quality. On platforms such as Shopee, users often leave reviews after experiencing products; these comments not only reflect customer satisfaction but also provide detailed information on aspects such as price, quality, shipping, and customer service. This study leverages over 11,000 customer reviews of shoe-related products from the e-commerce platform Shopee, annotated across 8 aspects and classified into 4 sentiment categories: positive, negative, neutral, and undefined. The main objective is to perform Aspect-Based Sentiment Analysis (ABSA) to uncover insights into user experiences and propose improvements to products and services. By combining advanced text preprocessing techniques, data visualization, and modern machine learning models, this research not only evaluates overall sentiment but also analyzes the detailed factors influencing consumer purchasing decisions and satisfaction. This study contributes to the development of Vietnamese natural language processing and opens future research directions for low-resource languages.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Nisar Hussain,

Amna Qasim,

Gull Mehak,

Muhammad Zain,

Grigori Sidorov,

Alexander Gelbukh,

Olga Kolesnikova

Abstract: Depression is now one of the most common mental health concerns in the digital era, calling for powerful computational tools for detecting it and estimating its severity. This study proposes a multi-level depression severity detection framework for the Reddit social media network, classifying posts into four levels: minimum, mild, moderate, and severe. We take a dual approach using classical Machine Learning (ML) algorithms and recent Transformer-based architectures. For the ML track, we build ten classifiers, including Logistic Regression, SVM, Naive Bayes, Random Forest, XGBoost, Gradient Boosting, K-NN, Decision Tree, AdaBoost, and Extra Trees, with two widely used embedding methods, Word2Vec and GloVe, and we fine-tune them for mental health text classification. Of these, XGBoost yields the highest F1-score of 94.01 using GloVe embeddings. For the deep learning track, we fine-tune ten Transformer models, covering BERT, RoBERTa, XLM-RoBERTa, MentalBERT, BioBERT, RoBERTa-large, DistilBERT, DeBERTa, Longformer, and ALBERT. The highest performance was achieved by the MentalBERT model with an F1-score of 97.31, followed by RoBERTa (96.27) and RoBERTa-large (96.14). To the best of the authors' knowledge, our results demonstrate that domain-transferred Transformers outperform non-Transformer-based ML methods in capturing subtle linguistic cues indicative of different levels of depression, thereby highlighting their potential for fine-grained mental health monitoring in online settings.
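The abstract's ML track pairs pretrained word embeddings with a classifier. As a rough sketch of that pipeline (the tiny word vectors, centroids, and a nearest-centroid classifier standing in for GloVe and XGBoost are all hypothetical):

```python
import math

# Toy 2-D word vectors standing in for pretrained GloVe embeddings (hypothetical values).
WORD_VECS = {
    "hopeless": [0.9, 0.1], "tired": [0.6, 0.3], "fine": [0.1, 0.8],
    "okay": [0.2, 0.9], "worthless": [0.95, 0.05], "happy": [0.05, 0.95],
}

def embed(text):
    """Average the vectors of known words -- the standard document-embedding step."""
    vecs = [WORD_VECS[w] for w in text.lower().split() if w in WORD_VECS]
    if not vecs:
        return [0.0, 0.0]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def nearest_label(vec, centroids):
    """Assign the severity level whose centroid is closest (a stand-in for the trained classifier)."""
    return min(centroids, key=lambda lab: math.dist(vec, centroids[lab]))

# Hypothetical class centroids that a classifier might learn from labeled posts.
CENTROIDS = {"minimum": [0.1, 0.9], "severe": [0.9, 0.1]}

print(nearest_label(embed("feeling hopeless and worthless"), CENTROIDS))  # severe
print(nearest_label(embed("i am fine and happy"), CENTROIDS))             # minimum
```

A real system would use 300-dimensional GloVe vectors and a gradient-boosted classifier, but the embed-then-classify structure is the same.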
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Kevin Tole,

Fullgence Mwakondo

Abstract: This paper presents a novel hybrid optimization framework that combines Quantum Annealing (QA) with Topological Data Analysis (TDA) for solving academic timetabling problems. The proposed model addresses the multi-constraint nature of university scheduling by integrating quantum-based global search capabilities with topological insights that capture structural data complexity. Empirical evaluations were conducted on real-world scheduling data from the Technical University of Mombasa (TUM), encompassing three datasets of increasing complexity: certificate/diploma, undergraduate, and postgraduate program schedules. The performance of four configurations—QA-only, TDA-only, hybrid without refinement, and full hybrid with refinement—was assessed using four key metrics: Conflict-Free Rate (CFR), Resource Utilization (RU), Computation Time (CT), and Energy Function Value (EFV). Results show that the full hybrid configuration significantly outperforms all baselines, achieving a CFR of 94.3% and RU of 91.2% on the most complex dataset, while also yielding the lowest EFV. Clustering and K-Nearest Neighbor (KNN) analyses were conducted to explore configuration similarities and performance consistency, confirming the hybrid model’s robustness across different problem scales.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Satyadhar Joshi

Abstract: The rapid adoption of generative AI in various sectors, particularly in finance, has introduced new challenges and opportunities for model risk management (MRM). This paper provides a comprehensive review of the current state of MRM in the context of generative AI, focusing on the risks, regulatory frameworks, and mitigation strategies. We explore the implications of generative AI on financial institutions, the evolving regulatory landscape, and the role of advanced MRM frameworks in ensuring compliance and mitigating risks. By synthesizing insights from 50+ recent articles, this paper aims to provide a roadmap for future research and practical applications of MRM in the generative AI era. It examines the key risks associated with these models, including bias, lack of transparency, and potential for misuse, and explores the regulatory frameworks and best practices being developed to mitigate these risks. We delve into the specific challenges faced by financial institutions in adapting their MRM strategies to encompass generative AI, and highlight the emerging tools and technologies that can support effective risk management. This paper also discusses quantitative methods for risk quantification, such as probabilistic frameworks, Monte Carlo simulations, and adversarial risk metrics, which are essential for assessing the reliability and robustness of generative AI models. Foundational metrics, including fairness measures like demographic parity and equalized odds, are explored to address bias and ensure ethical AI deployment. Additionally, the paper presents pseudocode for key algorithms, such as risk quantification and adversarial risk calculation, to provide a practical understanding of these methods. A detailed gap analysis identifies critical shortcomings in current MRM frameworks, such as the lack of standardized validation methods and inadequate handling of adversarial robustness.
Based on these gaps, the paper proposes solutions, including the development of advanced validation frameworks, integration of fairness metrics, and alignment with regulatory standards. These findings and proposals aim to guide financial institutions in adopting generative AI responsibly while addressing the unique risks it poses. This paper serves as a valuable resource for professionals and researchers seeking to understand and navigate the complexities of MRM in the age of generative AI.
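The Monte Carlo risk quantification the abstract mentions can be sketched as follows; the Gaussian loss model, trial count, and threshold are hypothetical choices, not the paper's own algorithm:

```python
import random

def monte_carlo_risk(simulate_loss, n_trials=10_000, threshold=1.0, seed=42):
    """Estimate P(loss > threshold) and the mean loss within that tail by simulation."""
    rng = random.Random(seed)
    losses = [simulate_loss(rng) for _ in range(n_trials)]
    tail = [x for x in losses if x > threshold]
    prob_exceed = len(tail) / n_trials
    expected_tail_loss = sum(tail) / len(tail) if tail else 0.0
    return prob_exceed, expected_tail_loss

# Hypothetical loss model: standard normal losses; P(loss > 1.645) is about 5%.
prob, etl = monte_carlo_risk(lambda rng: rng.gauss(0.0, 1.0), threshold=1.645)
print(prob, etl)
```

In an MRM setting, `simulate_loss` would instead sample from a calibrated model of the generative system's error or exposure distribution.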
Hypothesis
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Dong Jun,

Wang Zi Jia

Abstract: The abstraction and formalization of architectural design thinking into structured knowledge representations remains a critical gap in computational design theory. This paper proposes a conceptual framework for translating tacit design knowledge from exemplary architectural projects and expert design processes into machine-interpretable semantic networks, aiming to bridge the epistemological divide between architectural practice and large language models (LLMs). By synthesizing principles from design cognition theory and semantic graph formalisms, the framework establishes a Design Thinking Semantic Framework (DTSF) that constructs relational graph structures mapping architectural lexicons to LLM-compatible semantic units, preserving the nuanced interplay between creative intuition and disciplinary rigor inherent to architectural problem-solving. Through ontological alignment mechanisms, the framework enables the transfer of domain-specific design intentionality—such as the dialectic between form and function or the negotiation of heritage constraints—into generative AI systems. This theoretical advancement provides a structured methodology for embedding architectural epistemology into AI models, fostering cross-disciplinary dialogue while addressing the limitations of current LLM applications in architecture. The work lays foundational principles for future AI-augmented design tools that respect architectural complexity without reducing it to combinatorial optimization.
Article
Computer Science and Mathematics
Discrete Mathematics and Combinatorics

Mila Mursalina,

Yeni Susanti

Abstract: If ϱ and H are simple, connected, undirected graphs, and ϱ admits an H-covering, then for a positive integer p, a total p-labeling φ on ϱ is called a total H-irregular p-labeling if, for every subgraph K of ϱ that is isomorphic to H, the weight of K (the sum of the labels of all vertices and edges of K) is a unique number. The smallest integer p for which graph ϱ can be labeled with a total H-irregular p-labeling is called the total H-irregularity strength of graph ϱ. This paper presents the exact total H-irregularity strength values for some particular graphs, including balloon graphs, double balloon graphs, and double balloon ladder graphs.
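The weight condition in the definition is easy to check computationally. A minimal sketch, using a hypothetical labeling of a path covered by two copies of H = P3 (the graphs and labels are illustrative, not from the paper):

```python
def is_H_irregular(vertex_labels, edge_labels, subgraphs):
    """Check a total labeling: each subgraph (copy of H) must receive a distinct
    weight, where a weight sums the labels of the subgraph's vertices and edges."""
    weights = []
    for verts, edges in subgraphs:
        w = sum(vertex_labels[v] for v in verts) + sum(edge_labels[e] for e in edges)
        weights.append(w)
    return len(weights) == len(set(weights))

# Hypothetical example: path v0-v1-v2-v3 covered by two copies of H = P3.
vlab = {0: 1, 1: 1, 2: 1, 3: 2}
elab = {(0, 1): 1, (1, 2): 1, (2, 3): 1}
copies = [((0, 1, 2), [(0, 1), (1, 2)]), ((1, 2, 3), [(1, 2), (2, 3)])]
print(is_H_irregular(vlab, elab, copies))  # True: the weights 5 and 6 differ
```

The total H-irregularity strength is then the smallest maximum label p for which such a labeling exists.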
Article
Computer Science and Mathematics
Robotics

Yuxin Jiang,

Shengcong Chen,

Siyuan Huang,

Liliang Chen,

Pengfei Zhou,

Yue Liao,

Xindong He,

Chiming Liu,

Hongsheng Li,

Maoqing Yao

Abstract: Robotic imitation learning has advanced from solving static tasks to addressing dynamic interaction scenarios, but testing and evaluation remain costly and challenging due to the need for real-time interaction with dynamic environments. We propose EnerVerse-AC (EVAC), an action-conditional world model that generates future visual observations based on an agent's predicted actions, enabling realistic and controllable robotic inference. Building on prior architectures, EVAC introduces a multi-level action-conditioning mechanism and ray map encoding for dynamic multi-view image generation while expanding training data with diverse failure trajectories to improve generalization. As both a data engine and evaluator, EVAC augments human-collected trajectories into diverse datasets and generates realistic, action-conditioned video observations for policy testing, eliminating the need for physical robots or complex simulations. This approach significantly reduces costs while maintaining high fidelity in robotic manipulation evaluation. Extensive experiments validate the effectiveness of our method. Code, checkpoints, and datasets can be found at <https://annaj2178.github.io/EnerverseAC.github.io>.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Medha Pujari,

Megha Chandra Medde,

Sergey Butakov

Abstract: The growing complexity of cyberattacks has posed significant challenges to network intrusion detection systems (IDSs). Despite being equipped with sophisticated machine learning capabilities, intelligent IDSs have vulnerabilities that can be exploited by adversarial algorithms, which inject subtle perturbations into data passing through the IDS. This paper evaluates the impact of two adversarial black-box attacks, Gaussian Perturbation and Genetic Algorithm, on the performance of a machine learning (ML)-based IDS model, thoroughly investigating how these adversarial algorithms affect data integrity and model performance. Our research contributes to a deeper understanding of how adversarial attacks generate deceptive data and underscores the importance of developing innovative strategies to defend the defenders, i.e., IDSs.
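A Gaussian perturbation attack of the kind evaluated here can be sketched in a few lines; the toy threshold detector, cutoff, and feature values below are hypothetical stand-ins for a real ML-based IDS:

```python
import random

def gaussian_perturb(features, sigma=0.05, seed=0):
    """Black-box evasion sketch: add small Gaussian noise to each numeric feature."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in features]

def threshold_ids(features, cutoff=2.5):
    """Toy IDS: flags traffic whose mean feature value exceeds a cutoff."""
    return sum(features) / len(features) > cutoff

malicious = [2.6, 2.7, 2.55]      # flagged by the toy detector
print(threshold_ids(malicious))   # True
perturbed = gaussian_perturb(malicious, sigma=0.2, seed=1)
# After perturbation the sample may slip under the cutoff -- the evasion effect.
print(threshold_ids(perturbed))
```

Against a trained classifier, the attacker would tune sigma (or use a genetic search, the paper's second attack) to find the smallest perturbation that flips the verdict.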
Article
Computer Science and Mathematics
Signal Processing

Makhanbetov Adilet,

Yesmagambetov Bulat-Batyr,

Balabekova Madina,

Saidakhmetov Murad,

Zhanteli Khassen,

Sarsenbayev Kanat,

Sultanova Gulbanu,

Tursumbayeva Adiya

Abstract: The article is devoted to the issues of signal detection in a radio communication system. As a rule, such tasks are solved in two ways: adaptive and nonparametric. The adaptive approach adjusts the structure and parameters of the detection system in accordance with the parameters of signals and interference. Nonparametric methods are used when the system must be insensitive to changes in the properties of signals and interference, in particular in the absence of a priori information about their probabilistic properties. Nonparametric methods of signal detection are based on nonparametric methods of statistical hypothesis testing theory, and they are effectively used in radio electronic systems because of the need to stabilize the false alarm rate when the interference properties are unknown. The invariant properties of nonparametric procedures rest on the independence of the statistical properties of nonparametric functions from the statistical properties of the measured signals. One of the most effective nonparametric techniques is the use of ranks, formed by ranking the input samples of the measured signal in either ascending or descending order. Ranks have many practically useful properties and are well suited to forming various statistical hypotheses. The article develops a criterion for post-detector rank signal processing based on minimizing the RMS error of measuring the situation parameter, and examines the noise immunity of this criterion in comparison with optimal rank detection methods based on the Neyman–Pearson criterion.
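The basic rank-forming step, and a simple rank-sum detector built on it, can be sketched as follows (the sample values, the two-sample layout, and the threshold are hypothetical; the article's own criterion minimizes RMS error rather than using this plain rank sum):

```python
def ranks(samples):
    """Rank each sample in ascending order (1 = smallest): the basic nonparametric step."""
    order = sorted(range(len(samples)), key=lambda i: samples[i])
    r = [0] * len(samples)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def rank_sum_detect(observed, n_signal, threshold):
    """Declare 'signal present' when the rank sum of the last n_signal samples
    exceeds a threshold -- a sketch of a rank-sum detector."""
    r = ranks(observed)
    return sum(r[-n_signal:]) > threshold

noise = [0.2, 0.5, 0.1, 0.4]
with_signal = noise + [1.3, 1.1]   # signal samples dominate, so they take the top ranks
print(ranks(with_signal))          # [2, 4, 1, 3, 6, 5]
print(rank_sum_detect(with_signal, n_signal=2, threshold=9))  # True: 6 + 5 = 11 > 9
```

The invariance the abstract describes comes from the fact that the ranks depend only on the ordering of the samples, not on the interference's distribution.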
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Kai Aidi,

Danyi Gao

Abstract: To address the challenge of highly volatile and difficult-to-predict memory usage in cloud servers, this paper proposes a memory usage prediction model that integrates Convolutional Neural Networks (CNN) and Long Short-Term Memory networks (LSTM). The approach extracts local spatial correlations from input feature sequences through the CNN module and captures temporal dependencies using the LSTM structure. This enables high-precision prediction of memory usage trends over time. To validate the model's effectiveness, a prediction dataset was constructed using real-world cloud server monitoring data, covering ten key resource indicators. Comparative experiments were conducted with several mainstream deep learning models. The results show that the proposed CNN+LSTM model outperforms traditional models in terms of MSE, MAE, and R² metrics, demonstrating stronger fitting capability and greater stability. Loss convergence analysis and prediction curve comparisons further confirm that the model effectively captures the actual fluctuation patterns of resource usage. It performs particularly well on complex nonlinear sequences, exhibiting both strong predictive performance and practical engineering value.
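The division of labor the abstract describes, local pattern extraction followed by temporal memory, can be illustrated without a deep learning framework; the fixed smoothing kernel and the exponential recurrence below are crude hypothetical stand-ins for the learned CNN filters and LSTM gates:

```python
def conv1d(seq, kernel):
    """Slide a kernel over the sequence -- the CNN step extracting local patterns."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def recurrent_smooth(seq, alpha=0.5):
    """Exponential recurrence h_t = alpha*x_t + (1-alpha)*h_{t-1}: a crude
    stand-in for the LSTM's temporal memory."""
    h, out = 0.0, []
    for x in seq:
        h = alpha * x + (1 - alpha) * h
        out.append(h)
    return out

usage = [0.3, 0.35, 0.4, 0.8, 0.85, 0.9]   # memory-usage samples (fractions)
local = conv1d(usage, [0.25, 0.5, 0.25])   # smoothed local features
pred = recurrent_smooth(local)[-1]         # last hidden state as the forecast
print(round(pred, 3))                      # 0.686
```

The real model learns both the filters and the recurrence weights from data; this sketch only shows why stacking the two stages tracks a rising usage trend.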
Article
Computer Science and Mathematics
Discrete Mathematics and Combinatorics

Dario Galić,

Anita Katić,

Radoslav Galić,

Elvir Čajić

Abstract: This entry presents a formal and algorithmic model for classifying network and semi-network structures using Boolean algebra. The study defines and analyzes classes Gk, Uk, Tk, Hk, and Bk based on their algebraic and logical properties. Utilizing union and intersection interpreted through logical disjunction and conjunction, a systematic classification framework is developed. A MATLAB-based application supports the graphical and logical analysis of function intersections, Boolean expressions, and network behavior. The approach bridges symbolic logic with graphical representation, offering a robust model for digital logic, network theory, and mathematical education.
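The core identification of union with disjunction and intersection with conjunction can be shown directly; the membership predicates chosen for the classes below are hypothetical examples, not the paper's definitions of Gk and Uk:

```python
# Sketch: structures as Boolean membership functions; union/intersection map to or/and.
def union(f, g):
    return lambda x: f(x) or g(x)

def intersection(f, g):
    return lambda x: f(x) and g(x)

# Hypothetical classes: G as even numbers, U as multiples of three.
G = lambda x: x % 2 == 0
U = lambda x: x % 3 == 0

print(union(G, U)(9))         # True: 9 is a multiple of 3
print(intersection(G, U)(6))  # True: 6 is in both classes
print(intersection(G, U)(4))  # False: 4 is even but not a multiple of 3
```

Representing each network class as a predicate like this is what lets a tool (MATLAB in the paper) evaluate Boolean expressions over classes symbolically.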
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Freja Lindholm,

Wyne Nasir,

Emil Sörensen

Abstract: Understanding relationships among entities in visually rich documents is a cornerstone of visually rich document understanding (VrDU) for various industries, including finance, healthcare, and legal services. While the integration of multimodal signals—such as textual content, layout structures, and visual cues—has driven substantial progress in VrDU-related tasks like relation extraction (RE), there remains a gap in comprehensively assessing the predictive effectiveness of each modality. In this paper, we introduce MORAE, a systematic framework designed to dissect and analyze the individual and joint contributions of text, layout, and vision in RE tasks. Through an extensive series of ablation experiments under multiple controlled settings, we investigate the incremental utility of each modality both in isolation and in combination. Our findings demonstrate that while a bimodal fusion of text and layout achieves the highest F1-score of 0.728, the textual component alone remains the most influential predictor in establishing entity relationships. Furthermore, our study uncovers the surprisingly competitive performance of geometric layout data as a standalone modality, presenting a cost-efficient alternative in scenarios where textual extraction might be hindered. Visual information, though less dominant, exhibits supportive capacity in certain complex document layouts. Beyond empirical validations, we provide a lightweight RE classifier under MORAE, encouraging practical deployment in resource-constrained applications. These insights offer a deeper understanding of modality synergies and promote the informed design of future VrDU systems.
Article
Computer Science and Mathematics
Computer Networks and Communications

Owen Graham,

Sora Davidson

Abstract: This study investigates the creation of domain-specific language models for legal terminology using the Vosk speech recognition toolkit. As the legal field increasingly adopts technology for transcription and documentation, the need for accurate recognition of specialized vocabulary and phrases becomes paramount. Generic speech recognition models often struggle to understand legal terminology due to its unique linguistic characteristics, leading to high error rates and inefficiencies in legal processes. This research addresses these challenges by developing customized language models tailored specifically for legal contexts. The methodology involves the collection of a comprehensive legal corpus, encompassing various sources such as court transcripts, legal documents, and audio recordings of legal proceedings. Through meticulous data preprocessing, including cleaning and annotation, the corpus is prepared for model training. The Vosk toolkit is employed to create and train these domain-specific models, leveraging its capabilities to enhance recognition accuracy for legal terminology. Evaluation metrics such as Word Error Rate (WER) and recognition accuracy are utilized to assess model performance, alongside user feedback from legal professionals who test the system in real-world scenarios. The results demonstrate significant improvements in recognition accuracy compared to generic models, indicating that domain-specific adaptations lead to more effective transcription and documentation processes in legal practice. This research not only highlights the importance of specialized language models in enhancing speech recognition technology but also provides a roadmap for future developments in other specialized fields. Ultimately, this study contributes to the ongoing evolution of legal technology by creating tools that improve accessibility, efficiency, and accuracy in legal communication.
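The Word Error Rate metric used to evaluate these models is a word-level edit distance; a minimal self-contained implementation (the legal-phrase example is illustrative, not from the study's corpus):

```python
def wer(reference, hypothesis):
    """Word Error Rate: word-level Levenshtein distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words (substitutions, insertions, deletions).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# Two substitutions out of five reference words:
print(wer("the writ of habeas corpus", "the rit of habeas corpse"))  # 0.4
```

The example also shows why generic models fail on legal speech: rare terms like "writ" and "habeas corpus" are exactly where recognition errors concentrate.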
Article
Computer Science and Mathematics
Algebra and Number Theory

Diego Muguet de Magalhães

Abstract: We introduce a geometric and spectral reformulation of the Riemann Hypothesis based on the analysis of a complex vector-valued function, the Function of Residual Oscillation (FOR(N)), defined by a regularized spectral sum over the nontrivial zeros of the Riemann zeta function. This function reveals a torsion structure in the complex plane that is minimized under the critical-line condition Re(ρ) = 1/2. By analyzing the directional stability of the associated vectors, we demonstrate that the Riemann Hypothesis is equivalent to the global vanishing of the spectral torsion function τ(N). The approach combines geodesic vector dynamics, coherence cancellation, and asymptotic convergence, providing a new structural perspective on one of the most fundamental problems in mathematics.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Sloane Everett,

Wyne Nasir,

Rowan Cassidy

Abstract: Understanding the intricate interplay between actions and their consequential effects is a cornerstone of human intelligence and decision-making processes. Enabling artificial agents to emulate such capabilities is essential for fostering seamless interaction in dynamic, real-world environments. In response to this demand, we present a novel approach, termed Differential Effect-Aware Reasoner (DEAR), which systematically leverages the structured representations encapsulated within scene-graphs to model the nuanced outcomes of actions articulated in natural language. Unlike prior methods that predominantly rely on monolithic visual features paired with linguistic cues, DEAR capitalizes on observing relational differences across state transitions induced by actions. By employing paired scene-graphs reflecting pre-action and post-action states, our approach enhances the agent's sensitivity to subtle state variations. To empirically validate the effectiveness and robustness of DEAR, we conduct extensive evaluations on the CLEVR\_HYP dataset. The experimental results consistently demonstrate that DEAR surpasses baseline models in terms of reasoning accuracy, data efficiency, and cross-scenario generalization, thus underscoring its potential as a foundational mechanism for future action-effect reasoning systems.
Article
Computer Science and Mathematics
Computer Science

Seid Mehammed Abdu,

Md Nasre Alam

Abstract: The accurate identification of meaningful patterns in high-dimensional and noisy datasets remains a fundamental challenge in intelligent data analysis, particularly within the domain of smart city analytics. Traditional clustering algorithms such as DBSCAN offer robustness to noise and the ability to detect clusters of arbitrary shapes. However, they suffer from critical limitations, including sensitivity to parameter selection and poor performance in handling overlapping or ambiguous data regions. To overcome these issues, this paper presents a novel hybrid clustering framework that synergistically combines fuzzy logic and Particle Swarm Optimization (PSO) with Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The proposed method begins with Z-score normalization for data standardization, followed by the application of PSO to automatically optimize key DBSCAN parameters, namely Eps and MinPts, across a predefined range. A fuzzy extension of DBSCAN is then employed to enable soft clustering, which better accommodates data uncertainty and overlapping class boundaries. Experimental evaluations on urban analytics datasets from Addis Ababa demonstrate that the proposed approach achieves improved clustering quality, as evidenced by enhanced silhouette scores and intra-cluster cohesion, in comparison to traditional DBSCAN and its variants. This work contributes a flexible and intelligent clustering technique well-suited for real-world smart city applications where data ambiguity and parameter sensitivity are prevalent.
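The DBSCAN core that the framework builds on can be sketched in plain Python; the 1-D data, eps, and min_pts below are hypothetical, and in the paper's framework PSO would choose eps and min_pts rather than a human:

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch on 1-D points: returns a cluster id per point, -1 = noise."""
    def neighbors(i):
        return [j for j in range(len(points)) if abs(points[i] - points[j]) <= eps]
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:          # not a core point (may become a border point later)
            labels[i] = -1
            continue
        labels[i] = cluster              # start a new cluster and expand it
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] in (None, -1):
                if labels[j] is None and len(neighbors(j)) >= min_pts:
                    queue.extend(neighbors(j))
                labels[j] = cluster
        cluster += 1
    return labels

data = [1.0, 1.1, 1.2, 5.0, 5.1, 9.9]
print(dbscan(data, eps=0.3, min_pts=2))  # [0, 0, 0, 1, 1, -1]
```

The fuzzy extension the paper proposes would replace the hard cluster ids with membership degrees; the parameter sensitivity visible here (shrink eps and the clusters dissolve into noise) is precisely what the PSO stage optimizes away.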
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Asadbek Yussupov,

Ruslan Isaev

Abstract: Conditional Value-at-Risk (CVaR) is one of the most popular risk measures in finance, used in risk management as a complementary measure to Value-at-Risk (VaR). VaR estimates potential losses within a given confidence level, such as 95% or 99%, but does not account for tail risks. CVaR addresses this gap by calculating the expected losses exceeding the VaR threshold, providing a more comprehensive risk assessment for extreme events. This research explores the application of Denoising Diffusion Probabilistic Models (DDPM) to enhance CVaR calculations. Traditional CVaR methods often fail to capture tail events accurately, whereas DDPMs generate a wider range of market scenarios, improving the estimation of extreme risks. However, these models require significant computational resources and may present interpretability challenges.
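The VaR/CVaR relationship the abstract describes can be made concrete with a historical-simulation estimate (the toy loss sample is illustrative; the paper's contribution is generating richer scenario sets with DDPMs to feed into exactly this kind of calculation):

```python
def var_cvar(losses, alpha=0.95):
    """Historical VaR and CVaR: VaR is the alpha-quantile of the losses,
    CVaR is the mean of the losses at or beyond that quantile."""
    s = sorted(losses)
    idx = int(alpha * len(s))
    var = s[idx]
    tail = s[idx:]
    cvar = sum(tail) / len(tail)
    return var, cvar

losses = list(range(1, 101))   # toy loss sample: 1, 2, ..., 100
var, cvar = var_cvar(losses, alpha=0.95)
print(var, cvar)               # 96 98.0
```

CVaR is always at least VaR, since it averages only the losses beyond the VaR threshold; this is the tail information VaR alone discards.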
Article
Computer Science and Mathematics
Applied Mathematics

Ji-hong Li,

Heng-you Lan,

Si-yuan Lin

Abstract: In this paper, we propose a novel alternating direction method of multipliers based on inertial acceleration techniques for a class of nonconvex optimization problems with a two-block structure. To address the nonconvex subproblem, we introduce a proximal term to reduce the difficulty of solving this subproblem. For the smooth subproblem, we employ a gradient descent method on the augmented Lagrangian function, which significantly reduces the computational complexity. Under the assumptions that the generated sequence is bounded and the auxiliary function satisfies the Kurdyka–Łojasiewicz property, we establish the global convergence of the proposed algorithm. Finally, the effectiveness and superior performance of the proposed algorithm are validated through numerical experiments on signal processing and SCAD problems.
Review
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Rajesh Kumar,

Isabelle Laurent,

David Müller,

Klaus Elli

Abstract: Large Language Models (LLMs) have emerged as a cornerstone of modern artificial intelligence, achieving remarkable capabilities in natural language understanding and generation. As their scale and utility have increased, two critical and complementary trends have defined their evolution: (1) the distributed systems and algorithms enabling efficient training of ultra-large models across massive compute infrastructures, and (2) the integration of multiple modalities—such as vision, audio, and structured data—into unified multimodal large language models (MLLMs). This survey provides a comprehensive examination of the state-of-the-art in both of these dimensions. We begin by exploring the foundations and advances in distributed training, including model parallelism, pipeline parallelism, memory optimization strategies, and the design of sparse and expert models. We assess system-level techniques such as ZeRO, DeepSpeed, and tensor sharding that allow for scalable, memory-efficient training at trillion-parameter scale. Next, we turn to multimodality, surveying architectures and training objectives that extend LLMs to process and generate across diverse input types. We review contrastive learning, cross-attention fusion, and aligned token embeddings as key techniques that enable cross-modal reasoning, with illustrative examples from models like Flamingo, CLIP, and GPT-4V. Beyond current methodologies, we identify and formalize the core technical challenges facing distributed and multimodal LLMs, including memory bottlenecks, communication overhead, alignment in the absence of ground truth, robustness to modality shifts, and evaluation under open-ended tasks.
To guide future research, we outline six key directions: unified memory-augmented architectures, modular and composable systems, self-aligning mechanisms, lifelong and continual learning agents, embodied multimodal cognition, and the emergence of general-purpose foundation agents. Our goal is to synthesize recent progress while articulating a vision for the next generation of foundation models—models that are not only scalable and multimodal but are also capable of reasoning, grounding, and adapting to complex, real-world environments. This survey serves both as a technical reference and a roadmap for researchers and practitioners navigating the future of large-scale, multimodal, and distributed AI systems.
Article
Computer Science and Mathematics
Computer Vision and Graphics

Li'an Wang,

Jian Xu,

Xuan An,

Yujie Ji,

Yuxuan Wu,

Zhaoyuan Ma

Abstract: 3D Gaussian Splatting (3DGS) is a hot research direction in the current fields of computer vision and robotic perception. The initialization of a 3DGS system requires obtaining the initial point cloud of the scene and the camera poses to assist in establishing the Gaussian map. Most research on initialization methods adopts the COLMAP system, which makes the overall time for establishing the Gaussian map relatively long. To address this, in this study, the ORB-SLAM initialization method is first adopted to obtain the initial point cloud and the camera poses, reducing the initialization time tenfold. Second, to improve the reconstruction quality of the 3DGS, we introduce the LGSBA algorithm during the 3DGS training process. Our system also uses ROS to achieve a tight coupling between ORB-SLAM and the 3DGS system. Without degrading 3DGS reconstruction quality, we ultimately achieve a significant advantage over the system combining COLMAP with traditional 3DGS in terms of both optimization time and optimization quality.



Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2025 MDPI (Basel, Switzerland) unless otherwise stated