Computer Science and Mathematics

Article
Computer Science and Mathematics
Computer Science

Keerthivasan Ramasamy Velliangiri

Abstract: Flight operations inside the cockpit and fuselage are moving towards technology-based systems such as tablets and mobile computers. The new technology offers a variety of tools and gadgets for the regular operations performed by the pilot, both in the cockpit and in the fuselage. This paper surveys the gadgets and Electronic Flight Bags (EFB) used by aviation pilots, as well as the Inflight Entertainment Devices (IFE) used in the aviation industry. It also surveys the variety of gadgets used to achieve various goals, along with their existing issues and challenges.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Shuo Xu,

Yexin Tian,

Yuchen Cao,

Zhongyan Wang,

Zijing Wei

Abstract: The proliferation of fake news presents a major challenge for information integrity in the digital age. In this work, we systematically compare the performance of five widely used machine learning and deep learning models—Logistic Regression, Random Forest, Light Gradient Boosting Machine (LightGBM), ALBERT, and Gated Recurrent Units (GRU)—on the WELFake dataset, utilizing only the news article headlines as input. Each model’s mathematical formulation and optimization process are described, and their ability to capture salient features in brief textual data is evaluated. Experimental results show that transformer-based ALBERT outperforms all other models in both precision and recall, demonstrating the value of contextual encoding even with limited input. Ensemble models, particularly Random Forest and LightGBM, also deliver strong performance and offer interpretability benefits for practical deployment. This study provides actionable insights for deploying efficient and accurate fake news detection models when only headline information is available.
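Because the comparison above is reported in terms of precision and recall, a minimal sketch of how those two metrics are computed for a binary fake/real labelling may help. The label encoding (1 = fake, 0 = real), the function name `precision_recall`, and the toy labels are assumptions for illustration and are not taken from the paper.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels: 1 = fake headline, 0 = real headline.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes real headlines flagged as fake, while recall penalizes fake headlines that slip through, which is why a model comparison usually reports both.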
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Aria Taukiri,

Emily Marwood,

Liam Raukawa

Abstract: The fusion of complementary modalities has become a central theme in remote sensing (RS), particularly in leveraging Hyperspectral Imaging (HSI) and Light Detection and Ranging (LiDAR) data for more accurate scene classification. In this paper, we introduce FusionFormer-X, a novel transformer-based architecture that systematically unifies multi-resolution heterogeneous data for RS tasks. FusionFormer-X is specifically designed to address the challenges of modality discrepancy, spatial-spectral alignment, and fine-grained feature representation. First, we embed convolutional tokenization modules to transform raw HSI and LiDAR inputs into semantically rich patch embeddings, preserving spatial locality. Next, we propose a Hierarchical Multi-Scale Multi-Head Self-Attention (H-MSMHSA) mechanism, which performs cross-modal interaction in a coarse-to-fine manner, enabling robust alignment between high-spectral-resolution HSI and lower-resolution spatial LiDAR data. We validate our framework on public RS benchmarks including Trento and MUUFL, demonstrating its superior classification performance over current state-of-the-art multimodal fusion models. These results underscore the potential of FusionFormer-X as a foundational backbone for high-fidelity multimodal remote sensing understanding.
Concept Paper
Computer Science and Mathematics
Software

Chitranshu Chauhan,

Deepak Tyagi,

Aman Baliyan,

Aryan Tyagi

Abstract: The demand for organic products has grown significantly due to increasing consumer awareness of health benefits, environmental sustainability, and ethical sourcing. This review paper examines existing research on consumer behavior, purchasing patterns, and marketing strategies for organic food, aligning with the development of Village 24x7, an e-commerce platform dedicated to organic products. Key factors influencing organic food purchases include health consciousness, trust in organic labels, and digital marketing strategies ([1], [5], [6], [11], [14], [19]). However, challenges such as high prices, limited availability, and consumer skepticism remain significant barriers ([2], [7], [9]). The study explores the role of organic stores in promoting sustainable consumption and how digital platforms like Village 24x7 can bridge the gap between suppliers and consumers ([3], [13], [14]). Research indicates that social media marketing, influencer endorsements, and transparent certifications significantly enhance consumer trust and engagement ([6], [14], [17]). Additionally, pricing strategies play a crucial role, with premium pricing being effective only when consumers perceive clear health and environmental benefits ([4]). The growing interest in organic products among young consumers in developing nations further supports the potential success of online organic stores ([16]).
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Kazi Sakib Hasan

Abstract: This research presents a comprehensive trustworthy machine learning framework for early diabetes detection, addressing critical gaps in reliability, interpretability, and fairness in clinical AI systems. The study integrates causal inference, modern ensemble methods (LightGBM, XGBoost-DART, HistGBM), and TabNet for tabular deep learning to enhance predictive performance while ensuring transparency. A novel Causal-guided Stacking Classifier (CGSC) is introduced, utilizing LightGBM as a meta-learner trained on causally relevant features identified through Causal Forests. The framework emphasizes interpretability through SHAP-based global and local explanations and leverages TabNet’s intrinsic attention mechanism for feature-level insights. Counterfactual reasoning (DiCE) enables personalized risk mitigation strategies by identifying minimal feature changes to alter predictions. To promote fairness, gender is excluded as a direct feature, reducing demographic bias. Experimental results demonstrate robust performance: CGSC achieves the highest recall (0.81), critical for early warning systems, while TabNet attains superior precision (0.79). Uncertainty quantification reveals stable F1-scores (0.73 ± 0.03) across ensemble models. Key causal drivers include general health (ATE = 0.1392) and cardiovascular factors, while counterintuitive findings like alcohol consumption’s negative association (ATE = -0.1875) warrant further investigation. The framework’s emphasis on causal feature selection, model transparency, and actionable explanations aligns with healthcare requirements for trustworthy AI, offering a reproducible solution for diabetes risk stratification with potential clinical applicability. All experiments are fully reproducible, with resources available at the GitHub repository.
Review
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Zheyu Zhang,

Seong-Yoon Shin

Abstract: Two-dimensional human pose estimation (2D HPE) has become a fundamental task in computer vision, driven by growing demands in intelligent surveillance, sports analytics, and healthcare. The rapid advancement of deep learning has led to the development of numerous methods. However, the resulting diversity in research directions and model architectures has made systematic assessment and comparison difficult. This review presents a comprehensive overview of recent advances in 2D HPE, focusing on method classification, technical evolution, and performance evaluation. We classify mainstream approaches by task type (single-person vs. multi-person), output strategy (regression vs. heatmap), and architectural design (top-down vs. bottom-up) and analyze their respective strengths, limitations, and application scenarios. Additionally, we summarize commonly used evaluation metrics and benchmark datasets such as MPII, COCO, LSP, OCHuman, and CrowdPose. A major contribution of this review is the detailed comparison of the top six models on each benchmark, highlighting their network architectures, input resolutions, evaluation results, and key innovations. In light of current challenges, we also outline future research directions, including model compression, occlusion handling, and cross-domain generalization. This review serves as a valuable reference for researchers seeking both foundational insights and practical guidance in 2D human pose estimation.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Abdel Hiram Cital Duarte,

Gilberto Borrego Soto,

Samuel González López

Abstract: Predicting pain in physical rehabilitation remains challenging due to the subjectivity inherent in pain assessment and the high variability among patients. In addition, traditional self-reported scales often introduce bias that complicates objective monitoring. Researchers have explored physiological biomarkers such as heart rate variability (HRV) and photoplethysmography (PPG) for pain assessment, combining these with artificial intelligence to enhance accuracy. Limited datasets and significant inter-individual variability also restrict the practical application of these approaches. This study evaluates the performance of linear regression and random forest models for pain prediction using HRV and oxygenation data. The random forest model was evaluated in several configurations, achieving a classification accuracy of 97.77% for detecting low pain levels. Other configurations yielded overall accuracies of 60.65% and 76.64%, highlighting variations in performance. Notably, the high accuracy in identifying low pain suggests that this approach can reliably detect even mild discomfort at an early stage, which is essential for timely therapeutic interventions. Future work should incorporate advanced models and expanded datasets for improved generalizability.
Article
Computer Science and Mathematics
Robotics

Bojan Nemec,

Mihael Simonic,

Ales Ude

Abstract: In this paper, we propose an active touch sensing algorithm designed for robust hole localization in 3D objects, specifically aimed at assembly tasks such as peg-in-hole operations. Unlike general object detection algorithms, our solution is tailored for precise localization of features like hole openings, using sparse tactile feedback. The method builds on a prior 3D map of the object and employs a series of iterative search algorithms to refine localization by aligning tactile sensing data with the object’s shape. It is specifically designed for objects composed of multiple parallel surfaces located at distinct heights; a common characteristic in many assembly tasks. In addition to the deterministic approach, we introduce a probabilistic version of the algorithm, which effectively compensates for sensor noise and inaccuracies in the 3D map. This probabilistic framework significantly improves the algorithm's resilience in real-world environments, ensuring reliable performance even under imperfect conditions. We validate the method's effectiveness for several assembly tasks, such as inserting a plug into a socket, demonstrating its speed and accuracy. The proposed algorithm outperforms traditional search strategies, offering a robust solution for assembly operations in industrial and domestic applications with limited sensory input.
Article
Computer Science and Mathematics
Information Systems

Shanqi Zhan,

Yujuan Qiu

Abstract: The rapid development of big data analytics has revolutionized data analysis and decision-making processes across industries. This paper explores how to use Apache Spark to analyze the MovieLens 20M dataset and identify the top movies in Minnesota. By integrating robust data preprocessing and collaborative filtering techniques, a novel recommendation system is developed. The results reveal the popular movies in Minnesota, major genres such as drama and comedy, and related tags such as "original" and "finale." Additionally, a detailed tag correlation analysis is conducted to optimize recommendation accuracy. The study further illustrates Spark’s application in large-scale data processing, demonstrating its effectiveness in recommendation systems. These findings bridge the gap between theoretical frameworks and practical applications, providing a replicable approach to address challenges in preprocessing, analysis, and personalized recommendations.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Giannis Thivaios,

Panagiotis Zervas,

Giannis Tzimas

Abstract: This study presents a method for detecting and removing duplicate job postings in large datasets, with emphasis on key attributes such as job title, location, company name, and job description. The approach begins with a preprocessing phase that standardizes text data (normalizing formats, removing special characters, and resolving lexical variations) to ensure consistency and compatibility. For deduplication, we utilize WordLlama, a fast and lightweight NLP toolkit optimized for fuzzy deduplication and similarity detection. Furthermore, we evaluate the performance of various Large Language Models (LLMs) in identifying duplicates, measuring accuracy through precision and recall metrics. The objective is to determine which model best captures semantic similarities in job postings and achieves the highest deduplication accuracy. This comparison offers valuable insights into the effectiveness of LLMs for large-scale, text-based deduplication in the context of job postings.
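The two stages described above (text normalization, then fuzzy matching) can be sketched roughly as follows. This is not the authors' pipeline: plain token-set Jaccard similarity is swapped in for WordLlama's embedding-based similarity, and the field names, the `0.8` threshold, and the sample postings are all hypothetical.

```python
import re

FIELDS = ("title", "company", "location", "description")  # assumed attribute names


def normalize(text):
    """Lowercase, turn runs of non-alphanumeric characters into one space, trim."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def jaccard(a, b):
    """Token-set Jaccard similarity between two normalized strings."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def deduplicate(postings, threshold=0.8):
    """Keep a posting only if it scores below the threshold against every posting already kept."""
    kept, kept_keys = [], []
    for posting in postings:
        key = normalize(" ".join(posting[f] for f in FIELDS))
        if all(jaccard(key, other) < threshold for other in kept_keys):
            kept.append(posting)
            kept_keys.append(key)
    return kept


postings = [
    {"title": "Data Engineer", "company": "Acme Corp.", "location": "Athens",
     "description": "Build ETL pipelines in Python."},
    {"title": "DATA ENGINEER", "company": "Acme Corp", "location": "Athens",
     "description": "Build ETL pipelines in Python!"},
    {"title": "Nurse", "company": "City Hospital", "location": "Patras",
     "description": "Night shifts in the ICU."},
]
unique = deduplicate(postings)
print([p["title"] for p in unique])
```

The pairwise comparison is quadratic in the number of postings, so a large dataset would need blocking or an approximate nearest-neighbour index in front of it.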
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Shanqi Zhan

Abstract: Accurate parking occupancy prediction is essential for reducing traffic congestion and optimizing urban mobility. Traditional monitoring methods are costly and difficult to scale, making machine learning a viable alternative. This study employs XGBoost (eXtreme Gradient Boosting) to predict parking occupancy using OpenStreetMap (OSM) data, with simulated occupancy rates based on proximity to the central business district (CBD). The model achieves a Mean Squared Error (MSE) of 0.0022 and an R² value of 0.8922, demonstrating strong predictive accuracy. Results confirm the significance of spatial factors in parking demand. Future work will integrate real-time data and explore deep learning models to further enhance prediction accuracy.
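For readers unfamiliar with the two reported metrics, the sketch below shows how MSE and R² are conventionally computed from true and predicted occupancy rates. The function names and the four sample values are illustrative assumptions and have no connection to the study's data.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical occupancy rates in [0, 1].
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.65, 0.75]
mse_value = mse(y_true, y_pred)
r2_value = r_squared(y_true, y_pred)
print(f"MSE={mse_value:.4f} R2={r2_value:.4f}")
```

Because MSE depends on the scale of the target, a small MSE on rates bounded in [0, 1] is best read alongside R², which is scale-free.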
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Hao Wang,

Yuxin Gong,

Chang Yu

Abstract: Fraud detection in the financial sector has long challenged researchers, particularly when the dataset is highly imbalanced. Because fraudulent activity is rare, such imbalance is common, and it limits the ability of traditional machine learning models to generalize to minority classes. To address this challenge, we introduce GAN_BERT, a hybrid neural framework that combines Conditional Tabular Generative Adversarial Networks (CTGAN) for synthetic data generation with a Bidirectional Encoder Representations from Transformers (BERT) classifier. Within GAN_BERT, each component targets a different issue: the CTGAN module captures the intrinsic patterns behind fraud records and generates high-quality synthetic samples for training. The data loader module combines training data and synthetic samples in a stratified way, which substantially increases the model's exposure to minority classes. Lastly, the classifier module learns the temporal relationships among fraud transactions and identifies fraudulent activity accurately while maintaining a low false alarm rate. Evaluated on benchmark datasets against other state-of-the-art models, GAN_BERT demonstrates noticeable improvements in precision, recall, and F1-score for the minority class. We propose this neural network architecture as a robust, flexible, and scalable solution for fraud detection tasks, especially on imbalanced datasets. Our findings may also be applicable to other domains facing similar challenges.
Article
Computer Science and Mathematics
Mathematics

Philippe Sainty

Abstract: In this article, the proof of the binary Goldbach conjecture is established (any integer greater than one is the arithmetic mean of two positive primes). To this end, the weak Chen conjecture is proved (any even integer greater than one is the difference of two positive primes) and a "located" algorithm is developed for the construction of two recurrent sequences of primes () and ( ), ( ( ) dependent on ( ) ) such that for each integer n their sum is equal to 2n. To form this, a third sequence of primes () is defined for any integer n by = Sup ( p : p ≤ 2n - 3 ), being the infinite set of positive primes. The Goldbach conjecture has been proved for all even integers 2n between 4 and 4. In the table of terms of Goldbach sequences given in Appendix 12, values of the order of 2n = are reached. An analogous proof by recurrence « finite ascent and descent method » is developed and a majorization of by 0.7 ( 2n ) is justified. In addition, the Lagrange-Lemoine-Levy conjecture and its generalization, called the "Bezout-Goldbach" conjecture, are proven by the same type of algorithm.
Article
Computer Science and Mathematics
Computational Mathematics

Owen Graham,

David Hamilton

Abstract: The integration of machine learning (ML) in healthcare has the potential to revolutionize patient care, optimize clinical workflows, and facilitate personalized medicine. However, the utilization of electronic health records (EHRs) for training ML models raises significant privacy concerns due to the sensitive nature of health data. This paper explores the emerging field of privacy-preserving machine learning (PPML) as a critical approach to safeguarding patient confidentiality while enabling the effective analysis of EHRs. We systematically review various PPML techniques, including differential privacy, homomorphic encryption, and federated learning, assessing their applicability in the context of healthcare data. Differential privacy is examined as a method for adding controlled noise to data outputs, ensuring that the contributions of individual patients cannot be easily inferred. We discuss its implementation challenges, particularly in maintaining the trade-off between data utility and privacy guarantees. Homomorphic encryption, which allows computations to be performed on ciphertexts, is analyzed for its capacity to secure sensitive health information during model training and inference. However, we highlight the computational complexity and resource demands associated with this technique, which may limit its practical application in real-world healthcare settings. Federated learning emerges as a promising paradigm that enables decentralized model training across multiple institutions, allowing EHRs to remain localized and secure. This section delves into the benefits of federated learning in facilitating collaborative research while addressing the challenges of communication overhead and model performance. We also consider hybrid approaches that combine multiple privacy-preserving techniques to enhance security without significantly compromising model accuracy. 
Furthermore, we investigate the ethical and regulatory implications of implementing PPML in healthcare, particularly in light of stringent data protection regulations such as HIPAA and GDPR. The role of patient consent, data governance, and the need for transparent AI systems are discussed to ensure that privacy-preserving measures align with ethical standards and foster patient trust. In conclusion, while privacy-preserving machine learning presents a viable pathway for leveraging EHRs in healthcare analytics, ongoing research is essential to refine these techniques and address their limitations. This paper contributes to the discourse on balancing the benefits of advanced ML methodologies with the imperative of protecting patient privacy, ultimately advocating for a multidisciplinary approach that integrates insights from computer science, healthcare, and ethical governance. As the healthcare landscape evolves, the adoption of robust privacy-preserving frameworks will be pivotal in harnessing the power of machine learning while safeguarding the confidentiality of sensitive health data.
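The description of differential privacy as "adding controlled noise to data outputs" can be made concrete with the Laplace mechanism applied to a counting query. This is a minimal sketch under stated assumptions: the record layout, the `private_count` helper, and the value epsilon = 0.5 are hypothetical, and the paper itself does not prescribe this particular mechanism.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one patient changes
    the count by at most 1), so the Laplace scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # seeded only to make the illustration repeatable
# Hypothetical records: 60 patients, 20 of them flagged as diabetic.
records = [{"age": a, "diabetic": a % 3 == 0} for a in range(30, 90)]
noisy = [
    private_count(records, lambda r: r["diabetic"], epsilon=0.5, rng=rng)
    for _ in range(2000)
]
average = sum(noisy) / len(noisy)
print(f"one noisy release: {noisy[0]:.2f}, mean over 2000 releases: {average:.2f}")
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy but less accurate releases, which is the utility trade-off the abstract refers to. Averaging many releases, as done here purely for illustration, would consume privacy budget in a real deployment.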
Article
Computer Science and Mathematics
Computer Networks and Communications

Owen Graham,

Jim Balford

Abstract: The rapid evolution of large language models (LLMs) has transformed natural language processing, enabling machines to perform complex language understanding, generation, and reasoning tasks with unprecedented fluency and adaptability. This survey presents a comprehensive comparative analysis of three major classes of LLMs: foundation models, instruction-tuned models, and multimodal variants. We first define and contextualize each category—foundation models as the general-purpose pretrained backbones, instruction-tuned models as task-optimized derivatives guided by human or synthetic instructions, and multimodal models as those extending language understanding to vision, audio, and other modalities. The paper examines architectural innovations, training methodologies, benchmark performances, and real-world applications across these model types. Through systematic comparison, we highlight the trade-offs in generality, alignment, efficiency, and modality integration. We further discuss deployment trends, ethical considerations, and emerging challenges, offering insights into the future trajectory of unified, scalable, and human-aligned language models. This survey aims to serve researchers and practitioners by clarifying the landscape and guiding informed decisions in the design and application of LLMs.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Pramod Kumar Saket

Abstract: Climate change threatens rice cultivation in India, a cornerstone of food security for over 1.4 billion people. This study proposes an Explainable Artificial Intelligence (XAI) framework integrating Long Short-Term Memory (LSTM) networks with SHAP (SHapley Additive exPlanations) to predict rice yield and interpret the impact of climate variables (temperature, precipitation, humidity, soil moisture). Using historical data (2000–2020) from the India Meteorological Department (IMD) and Ministry of Agriculture, the model achieves an R² of 0.88, Mean Absolute Error (MAE) of 0.11 tons/ha, and Root Mean Squared Error (RMSE) of 0.15 tons/ha. SHAP analysis identifies temperature (42%) and precipitation (33%) as primary drivers of yield variability. This framework provides transparent, data-driven insights, supporting farmers and policymakers in developing climate-resilient agricultural strategies aligned with India’s sustainability goals.
Article
Computer Science and Mathematics
Computer Science

Astan Serikov

Abstract: This paper explores the impact of flexible academic schedules and part-time employment opportunities on student performance and professional growth, particularly in the field of Information Technology (IT). It presents examples and an analysis of the "Freeway" program, which implements a flexible approach to class attendance and supports students who combine study with work. The methodology includes a literature review, analysis of student survey results, and a case study of the "Freeway" program. The findings indicate that flexible learning and working conditions contribute to improved academic performance, increased motivation, and the development of professional skills among students.
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Owen Graham,

David Hamilton

Abstract: In the increasingly data-driven landscape of healthcare, the application of Federated Learning (FL) has emerged as a transformative paradigm, enabling the collaborative training of machine learning models across decentralized datasets while preserving data privacy. This approach is particularly pertinent for health data, which is often sensitive and subject to stringent regulatory requirements. However, the integration of secure aggregation protocols within Federated AI systems is crucial for ensuring the confidentiality and integrity of anonymized health data during the aggregation process. This paper comprehensively reviews the state of secure aggregation protocols in the context of Federated AI, emphasizing their role in safeguarding patient privacy while allowing for the effective utilization of health data. We categorize existing secure aggregation methods based on their cryptographic techniques, including homomorphic encryption, secure multiparty computation, and differential privacy, analyzing their strengths and limitations in practical applications. Furthermore, we explore the implications of these protocols on data utility, computational efficiency, and scalability in real-world healthcare settings. By synthesizing recent advancements and ongoing challenges in the field, this study underscores the importance of designing robust aggregation protocols that not only enhance security but also facilitate the seamless integration of diverse health data sources. We propose a framework for evaluating the performance of these protocols, taking into account factors such as communication overhead, resilience against attacks, and adaptability to various federated learning architectures. Our findings indicate that while significant progress has been made, there remains a critical need for ongoing research to balance the trade-offs between security, privacy, and model performance. 
This paper aims to contribute to the development of more sophisticated secure aggregation protocols that can effectively support the growing demand for collaborative, AI-driven health analytics without compromising patient confidentiality. Ultimately, we advocate for a multidisciplinary approach that incorporates insights from cryptography, data science, and healthcare policy to advance the secure and ethical use of federated AI in health data research.
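One family of secure aggregation protocols relies on pairwise masks that cancel when the server sums the clients' contributions. The sketch below illustrates only that cancellation idea, assuming integer-encoded updates and a seeded generator in place of real key agreement. The modulus, function names, and sample updates are hypothetical, and dropout handling, authentication, and quantization of real model weights are omitted.

```python
import random

MODULUS = 2 ** 32  # assumed: all arithmetic is done modulo this value


def pairwise_masks(num_clients, length, seed):
    """Build one mask vector per client so that all masks sum to zero mod MODULUS.

    For each pair (i, j) with i < j, a shared random vector is added to client i's
    mask and subtracted from client j's mask. In a real protocol each pair would
    derive this vector from a key agreement; a seeded RNG stands in for it here.
    """
    rng = random.Random(seed)
    masks = [[0] * length for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.randrange(MODULUS) for _ in range(length)]
            for k in range(length):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def mask_update(update, mask):
    """Hide a client's update by adding its mask element-wise."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates):
    """Server-side sum; the pairwise masks cancel, leaving the sum of true updates."""
    length = len(masked_updates[0])
    return [sum(u[k] for u in masked_updates) % MODULUS for k in range(length)]


updates = [[3, 10, 7], [1, 0, 5], [4, 2, 9]]  # hypothetical integer-encoded updates
masks = pairwise_masks(len(updates), 3, seed=42)
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
total = aggregate(masked)
print(total)
```

The server sees only the masked vectors, yet recovers the element-wise sum of the true updates, because every shared vector is added once and subtracted once.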
Article
Computer Science and Mathematics
Artificial Intelligence and Machine Learning

Rogério Figurelli

Abstract: This paper introduces the eXtended Content Protocol (XCP), a symbolic and semantic-first message architecture designed to enable distributed, cross-protocol communication with ontological clarity and runtime adaptability. This perspective builds upon the epistemic framework of Heuristic Physics, which interprets law, structure, and communication as surviving compressions — semantic heuristics that persist under contextual entropy and symbolic drift. XCP extends this principle to protocol design: rather than replacing MQTT, CoAP, or HTTP, it encapsulates them within a minimal semantic envelope that embeds self-describing meaning into each message. The protocol addresses the epistemic gap between syntactic compatibility and semantic interoperability by treating communication as symbolic knowledge exchange. The architecture formalized herein defines three interoperable layers — envelope, declaration module, and adaptive transport bindings — and simulates their behavior in delay-prone and broadcast-intensive environments. The results demonstrate that semantic resilience and interpretability can be embedded directly into the structure of messages without static schemas, prior alignment, or centralized mediation. XCP is proposed not as a software implementation, but as a formal protocol architecture: a compressive symbolic scaffold designed to survive meaning collapse across heterogeneity. It invites rethinking network communication as an epistemic function — not merely data transfer, but semantic continuity under relational instability.
Article
Computer Science and Mathematics
Other

Varun Gowda R,

Sumadhva Krishna H M,

Syed Muzammil Hussiani,

Ramesh K B

Abstract: This paper presents the design and implementation of a long-range smart home automation system using LoRa (Long Range) technology. The system consists of a transmitter and a receiver circuit that use the RYLR896 LoRa module to communicate over a range of 1–2 km, with potential coverage up to 15 km. A push button at the transmitter end triggers signals that are received and acknowledged via an OLED (organic light-emitting diode) display on the receiver side, which also controls an appliance through a relay. Integration with Google Home via the ESP8266 NodeMCU (Node Micro Controller Unit) allows voice-based control, enhancing user accessibility. The use of LoRa ensures reliable, low-power communication suitable for large homes or remote areas, addressing the limitations of conventional Wi-Fi and Bluetooth-based systems. The proposed system demonstrates efficiency, scalability, and real-time responsiveness, making it ideal for modern smart home applications.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2025 MDPI (Basel, Switzerland) unless otherwise stated