Computer Science and Mathematics

Article

Signal Processing

Fiber Bundle Learning: A Topological Framework for Classification Using Homology and Discrete Connections

Arturo Tozzi

Abstract: Many machine-learning tasks involve structured data whose geometry, local feature distributions and global organization interact in ways that are not well captured by existing methods based on vectorization, graph metrics or homological signatures. We introduce Fiber Bundle Learning (FBL), a topological framework that represents each data sample as a discrete fiber bundle and extracts a classification signature combining persistent homology, local feature geometry and gluing structure. FBL builds a base space from the coarse geometry of each object, models local feature patches as fibers and estimates transition maps between neighboring fibers to construct a discrete connection. From this representation, FBL computes a set of invariants: persistent homology of the base, fibers and total space; holonomy obtained by transporting fiber states along cycles; curvature-like quantities measuring transition inconsistency; discrete analogues of characteristic classes. These components are assembled into a fixed-length feature vector that can be used with any standard classifier. We show that FBL yields a signature with three desirable theoretical properties: stability under perturbations of geometry and local features; invariance under isometries and global fiber reparameterizations; robustness to sampling noise. Our synthetic experiments show that FBL distinguishes twisted from untwisted bundles with identical homology, a distinction classical topological methods fail to capture. Additional tests quantify the system’s resistance to noise, its invariance to geometric transformations and the contribution of each signature component. Taken together, our results indicate that representing data through fiber-bundle structure may provide an effective tool for classifying complex, multi-level objects.

Posted: 28 November 2025

https://doi.org/10.20944/preprints202511.2221.v1

Article

Computer Science and Mathematics

Signal Processing

A Non-Parametric Algorithm for Predicting Future Samples in Single- and Multi-Channel Time Series

Ioannis Dologlou

Abstract: A new method to estimate future samples in time series data is presented and compared against the well-known ESPRIT technique. It exploits the null space of the Hankel matrix of the data, allowing the prediction of future samples with better accuracy and confidence. Moreover, a generalization of the algorithm is derived that also applies to multichannel signals. Cases both with and without cross-channel coupling are considered, and different algorithms are presented for each. The method is fully deterministic, with computational complexity comparable to that of ESPRIT. Testing involves 4000 randomly chosen data sets with variable spectral characteristics.

Posted: 27 November 2025

https://doi.org/10.20944/preprints202511.2136.v1

Article

Computer Science and Mathematics

Signal Processing

The Impact of Quantifying Human Locomotor Activity on Examining Sleep-Wake Cycle

Bálint Maczák

Adél Zita Hordós

Gergely Vadai

Abstract: Actigraphy quantifies human locomotor activity by measuring wrist acceleration with wearable devices at relatively high rates and converting it into lower-temporal-resolution activity values; however, the computational implementations of this data compression differ substantially across manufacturers. Building on our previous work, where we examined how dissimilarly the various activity determination methods we generalized can quantify the same movements through correlation analysis, we investigated here how these methods (e.g., digital filtering, data compression) influence nonparametric circadian rhythm analysis and sleep–wake scoring. In addition to our generalized actigraphic framework, we also emulated the use of specific devices commonly employed in such sleep-related studies by applying their methods to raw actigraphic acceleration data we collected to demonstrate, through concrete real-life examples, how methodological choices may shape analytical outcomes. Additionally, we assessed whether nonparametric indicators could be derived directly from acceleration data without compressing them into activity values. Overall, our analysis revealed that all these analytical approaches to the sleep–wake cycle can be substantially affected by the manufacturer-dependent actigraphic methodology, with the observed effects traceable to distinct steps of the signal processing pipeline, underscoring the necessity of cross-manufacturer harmonization from a clinically oriented perspective.

Posted: 14 November 2025

https://doi.org/10.20944/preprints202511.1076.v1

Article

Computer Science and Mathematics

Signal Processing

Crosstalk Suppression in a Multi-Channel, Multi-Speaker System Using Acoustic Vector Sensors

Grzegorz Szwoch

Abstract: Automatic speech recognition in a scenario with multiple speakers in a reverberant space, such as a small courtroom, often requires multiple sensors. This leads to a problem of crosstalk that must be removed before the speech-to-text transcription is performed. The proposed method uses Acoustic Vector Sensors to acquire audio streams. Speaker detection is performed using statistical analysis of the direction of arrival. This information is then used to perform source separation. Next, speakers’ activity in each channel is analyzed, and signal fragments containing direct speech and crosstalk are identified. Crosstalk is then suppressed using a dynamic gain processor, and the resulting audio streams may be passed to a speech recognition system. The algorithm was evaluated using a custom set of speech recordings. An increase in SI-SDR value over the unprocessed signal was achieved: 7.54 dB and 19.53 dB for the algorithm with and without the source separation stage, respectively. The algorithm is intended for application in multi-speaker scenarios requiring speech-to-text transcription, such as court sessions or conferences.
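
The SI-SDR figures quoted in this abstract can be read against the usual definition of the metric. The following is a minimal sketch of that standard definition in plain Python, not the author's evaluation code; the function names and the toy signals in the usage note are illustrative assumptions.

```python
import math


def si_sdr(reference, estimate):
    """Scale-invariant signal-to-distortion ratio (SI-SDR) in dB.

    The estimate is projected onto the reference, so rescaling the
    estimate by any non-zero gain leaves the result unchanged.
    """
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        raise ValueError("reference signal must not be all zeros")
    alpha = dot / ref_energy                      # optimal scaling factor
    target = [alpha * r for r in reference]       # part explained by the reference
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    if residual_energy == 0.0:
        return math.inf                           # perfect estimate
    if target_energy == 0.0:
        return -math.inf                          # estimate orthogonal to reference
    return 10.0 * math.log10(target_energy / residual_energy)


def si_sdr_improvement(reference, unprocessed, processed):
    """Gain in SI-SDR (dB) of the processed signal over the unprocessed one."""
    return si_sdr(reference, processed) - si_sdr(reference, unprocessed)
```

As a hypothetical usage example, with `reference = [1, 0, -1, 0]`, an unprocessed channel `[1, 1, -1, 1]` and a processed channel `[1, 0.1, -1, 0.1]`, `si_sdr_improvement` reports a gain of about 20 dB, because the residual amplitude drops by a factor of ten.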

Posted: 30 September 2025

https://doi.org/10.20944/preprints202509.2532.v1

Article

Computer Science and Mathematics

Signal Processing

A Non-Parametric Algorithm to Estimate Future Samples in Time Series

Ioannis Dologlou

Abstract: A new method to estimate future samples in time series data is presented and compared against the well-known ESPRIT technique. It exploits the null space of the Hankel matrix of the data, allowing the prediction of future samples with better accuracy and confidence. In a more general sense, the notion of null space refers to the set of eigenvectors of the data Hankel matrix that are associated with the smallest eigenvalues. The method is fully deterministic, with computational complexity comparable to that of ESPRIT. Testing involves 4000 randomly chosen data sets with variable spectral characteristics.

Posted: 29 September 2025

https://doi.org/10.20944/preprints202509.2140.v2

Article

Computer Science and Mathematics

Signal Processing

Multi-Modal Weak Signal Analysis for Buried Optical Cable DVS Using Combined Multi-Head Attention Mechanism

Lyu Minhui

Jiang Rongjun

Abstract: Distributed optical fiber sensing technologies, particularly Φ-OTDR, have been extensively applied in vibration monitoring of critical infrastructure, including highways and pipelines. This is attributed to their capabilities of long-distance monitoring, high spatiotemporal coverage, and ease of deployment. Nevertheless, the monitoring data encompasses a mixture of information such as cable structural coupling, initial vibration states, and multi-modal environmental excitations. Consequently, effective separation and extraction of these signals are crucial for practical implementations. Notably, multi-modal weak signal analysis has emerged as a significant technical challenge in this field. Building upon the Φ-OTDR DVS fiber-environment coupled vibration observation model, this research introduces an innovative scenario-oriented analytical framework that integrates a combined multi-head attention mechanism. This advancement enables precise extraction of multi-modal weak signals within complex environments. Empirical validation utilizing measured data from a 30 km optical cable installed along an urban ring road has confirmed the framework’s exceptional performance across various scenarios. These include road surface roughness detection, construction machinery detection and localization, and vehicle trajectory recognition. The study reveals that the attention mechanism effectively concentrates on scenario-relevant signals, thereby substantially enhancing the analytical real-time performance. Overall, the proposed framework offers a versatile and real-time solution for DVS signal processing in intricate scenarios.

Posted: 16 September 2025

https://doi.org/10.20944/preprints202509.1368.v1

Article

Computer Science and Mathematics

Signal Processing

Channel Estimation in UAV-Assisted OFDM Systems by Leveraging LoS and Echo Sensing with Carrier Aggregation

Zhuolei Chen

Wenbin Wu

Renshu Wang

Manshu Liang

Weihao Zhang

Shuning Yao

Wenquan Hu

Chaojin Qing

Abstract: Unmanned aerial vehicle (UAV)-assisted wireless communication systems often employ the carrier aggregation (CA) technique to alleviate the issue of insufficient bandwidth. However, in high-mobility UAV communication scenarios, the dynamic channel characteristics pose significant challenges to channel estimation (CE). Given these challenges, integrated sensing and communication (ISAC), which combines communication and sensing functionalities, has emerged as a promising solution to enhance CE accuracy for UAV systems. Meanwhile, the dominant line-of-sight (LoS) characteristics inherent in UAV scenarios present a valuable opportunity for further exploitation. To this end, this paper proposes a LoS and echo sensing-based CE scheme for CA-enabled UAV-assisted communication systems. Firstly, LoS sensing and echo sensing techniques are employed to acquire sensing-assisted prior information. Subsequently, the obtained prior information is utilized to refine the CE of the primary component carrier (PCC) in CA, thereby improving the accuracy of channel parameter estimation for the PCC. Based on the path-sharing property between PCC and secondary component carriers (SCCs), a three-stage scheme is proposed to reconstruct the channel of SCCs. In Stages I and II, the path-sharing property is exploited to reconstruct the LoS and non-line-of-sight (NLoS) paths of the SCCs in the delay-Doppler (DD) domain, respectively. Finally, an iterative procedure is applied to enhance the initial reconstruction and further recover non-shared transmission paths between PCC and SCCs. Simulation results demonstrate that the proposed method effectively enhances the CE accuracy for both PCC and SCCs. Furthermore, the proposed scheme exhibits robustness against parameter variations.

Posted: 16 September 2025

https://doi.org/10.20944/preprints202509.1275.v1

Article

Computer Science and Mathematics

Signal Processing

Georeferenced UAV Localization in Mountainous Terrain under GNSS-Denied Conditions

Inseop Lee

Chang-Ky Sung

Hyungsub Lee

Seongho Nam

Juhyun Oh

Keunuk Lee

Chansik Park

Abstract: In Global Navigation Satellite System (GNSS)-denied environments, Unmanned Aerial Vehicles (UAVs) relying on Vision-Based Navigation (VBN) in high-altitude, mountainous terrain face severe challenges due to geometric distortions in aerial imagery. This paper proposes a georeferenced localization framework that integrates orthorectified aerial imagery with Scene Matching (SM) to achieve robust positioning. The method employs a camera projection model combined with a Digital Elevation Model (DEM) to orthorectify UAV images, thereby mitigating distortions from central projection and terrain relief. Pre-processing steps (illumination normalization, lens distortion correction, rotational alignment, and resolution adjustment) enhance consistency with reference orthophoto maps, after which template matching is performed using Normalized Cross-Correlation (NCC). Sensor fusion is achieved through an Extended Kalman Filter (EKF) incorporating an Inertial Navigation System (INS), GNSS (when available), a barometric altimeter, and SM outputs, with sub-modules for horizontal, vertical, and altimeter error estimation. The framework was validated through flight tests with an aircraft over 45 km trajectories at altitudes of 2.5 km and 3.5 km in mountainous terrain. Results demonstrate that orthorectification improves image similarity and significantly reduces localization error, yielding a lower 2D RMSE than conventional rectification. The proposed approach enhances VBN by mitigating terrain-induced distortions, providing a practical solution for UAV localization in GNSS-denied scenarios.
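
The template-matching step named in this abstract can be illustrated with a brute-force search. This is a minimal sketch assuming the zero-mean form of Normalized Cross-Correlation and images stored as lists of rows; the abstract does not specify the NCC variant or the search strategy, and the names `ncc` and `match_template` are hypothetical.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2D patches."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0.0 else 0.0   # flat patches carry no information


def match_template(image, template):
    """Slide the template over the image; return ((row, col), score) of the best match."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, (0, 0)      # NCC is bounded below by -1
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

Subtracting the means and dividing by the patch norms makes the score insensitive to a constant brightness offset and a positive gain, which is one reason NCC is often paired with illumination normalization in scene matching.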

Posted: 09 September 2025

https://doi.org/10.20944/preprints202509.0790.v1

Article

Computer Science and Mathematics

Signal Processing

Plane Wave Imaging with Large-Scale 2D Sparse Arrays: A Method for Near-Field Enhancement via Aperture Diversity

Óscar Martínez-Graullera

Jorge Camacho

Jorge Huecas

Guillermo Cosarinsky

Luis Elvira

Montserrat Parrilla

Abstract: In the context of a medical imaging application for preclinical research, specifically cerebrovascular imaging in small animals, this work examines the challenges of using a large-scale 2D ultrasonic array with 32×32 elements (96×96 wavelengths). The application imposes demanding requirements: very near-field operation, high spatial resolution, high frequency, high frame rate, and imaging in a highly attenuating medium. These needs, combined with current technological limitations, such as element size and constraints on the number of elements that can be driven in parallel, pose significant challenges for system design and implementation. To evaluate system performance, we use plane wave imaging as a reference mode due to its ability to meet high acquisition speed requirements. Our analysis highlights limitations in spatial coverage and image quality when operating the full aperture under plane-wave transmission constraints. To overcome these limitations, we propose a sparse aperture strategy. When integrated with advanced signal processing techniques, this approach improves both contrast and resolution while maintaining acquisition speed. This makes it a promising solution for high-performance ultrasonic imaging under the demanding conditions of preclinical research.

Posted: 03 September 2025

https://doi.org/10.20944/preprints202509.0235.v1

Article

Computer Science and Mathematics

Signal Processing

Quantum-Enhanced Analysis and Grading of Vocal Performance

Rohan Agarwal

Abstract: Vocal singing is a profoundly emotional art form possibly predating spoken language, yet evaluating a vocal track remains a subjective and specialized task. Meanwhile, quantum computing shows promise to bring about significant advances in science and art. This study introduces QuantumMelody, a quantum-enhanced algorithm to evaluate vocal performances through objective metrics. QuantumMelody begins by collecting a comprehensive array of classical acoustic and musical features including pitch contours, formant frequencies, Mel-spectrograms, and dynamic ranges. These features are divided into three musically categorized groups, converted into scaled angles based on statistical metrics, and then encoded into specific quantum rotation gates. Each qubit group is entangled internally, followed by intergroup entanglement, thus exploring subtle, non-linear relationships within and across feature sets. The resulting quantum probability distributions and classical features are used to train a neural network, combined with a spectrogram transformer to holistically grade each recording on a 2–5 scale. Key difference metrics like the Jensen-Shannon distance and Euclidean measures of scaled angles are used to enable nuanced comparisons of different recordings. Furthermore, the algorithm uses classical music-based heuristics to provide targeted suggestions to the user for various aspects of vocal technique. On a dataset of 168 labeled 20-second vocal excerpts, QuantumMelody achieves 74.29% agreement with expert graders. The circuits are simulated; we do not claim hardware speedups, and results reflect a modest, single-domain dataset. We position this as an applied audio-signal-processing contribution and a feasibility step toward objective, interpretable feedback in singing assessment.
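
The Jensen-Shannon distance mentioned in this abstract compares two probability distributions, such as the measurement-outcome distributions of two simulated circuits. The following is a minimal sketch of the textbook definition with base-2 logarithms; it is not taken from QuantumMelody, and the helper names are illustrative assumptions.

```python
import math


def normalize(counts):
    """Turn non-negative counts (or weights) into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence, in [0, 1] for base 2."""
    p, q = normalize(p), normalize(q)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]   # mixture distribution
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))        # guard against tiny negative round-off
```

Unlike the KL divergence, this distance is symmetric and stays finite when one distribution assigns zero probability to an outcome, which makes it convenient for comparing sparse outcome histograms.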

Posted: 03 September 2025

https://doi.org/10.20944/preprints202509.0281.v1

Article

Computer Science and Mathematics

Signal Processing

Prediction of Medically Drug-Induced Arrhythmias (Torsades de Pointes, Ventricular Tachycardia, and Ventricular Fibrillation) in Rabbit Model up to One Hour Before Their Onset Using Computational Method Based on Entropy Measure and Machine Learning

Jiří Kroc

Dmitriy Bobir

Abstract: Background: This methodical paper describes a well-documented application of one complexity measure and various machine learning methods to a specific problem in biosignal processing: the prediction of ventricular tachycardia and fibrillation and of Torsades de Pointes arrhythmia. The methodology part provides a concise introduction to all the methods used and is accompanied by a sufficient citation apparatus. Once the presented methodology has been explained, it is easy to apply to many other research areas. Currently, allopathic medicine is facing one of the biggest challenges and transitions in its history. A deeper understanding of human physiology will enable medicine to better understand how the human body functions. At the same time, it will allow the design of novel, so-far-inaccessible, complex, dynamically changing therapies based on this knowledge. We address the following general question: "Are there existing mathematical tools enabling us to predict changes in physiological functions of human bodies at least minutes or even hours before they start to operate?" This general question is studied on a specific, simple model of the rabbit heart subjected to drug insults that lead to drug-induced Torsades de Pointes (TdP) arrhythmia. This class of models can improve our ability to assess the current condition of the heart and even to predict its future condition and disease development within the next minutes and even hours. This can eventually lead to a substantial improvement in out-of-bed cardiology care. Methods: Electrocardiograph (ECG) recordings were acquired, in a different research project, from anesthetized rabbits (ketamine and xylazine) that were subjected to infusion of gradually increasing doses of the arrhythmia-inducing drugs methoxamine and dofetilide.
Subsequently, ECG curves were evaluated using permutation entropy for different lag values, where the lag is the evaluation parameter that defines the distance between neighboring measuring points. The computed entropy curves were processed by machine learning (ML) techniques: Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), k-nearest neighbors (k-NN), Ensemble Learning (EL), and others. The ML methods performed arrhythmia classification on the evaluated segments of the permutation entropy curves. Results: The possibility of predicting drug-induced TdP arrhythmia up to one hour before its onset was confirmed in a small study of 37 rabbits, with specificity and sensitivity reaching 93% (for important statistical features [measurable properties]). It was demonstrated that animals can be divided into two distinct groups: susceptible and resistant to arrhythmia. It was shown that animals can be classified using just five-minute segments prior to and after the application of methoxamine (this drug can be used in human medicine, unlike dofetilide). The drawback of the study is the low number of measured animals. Conclusion: This pilot study demonstrated a relatively high probability that the onset of TdP arrhythmia can be predicted tens of minutes or even hours in advance, with sensitivity and specificity around 93%. These findings must be confirmed in wider animal studies and on human ECGs. Another human study obtained similar results using deep learning methods. The presented arrhythmia-prediction software has great potential in human medicine, because it can be applied in hospital monitors, implantable defibrillators, and wearable electronics guarding the health condition of patients. The small set of tested animals does not allow their subdivision into sufficiently large subgroups (TdP and Normal); the groups are too small and asymmetric.
It is recommended to test the achieved results on different, larger ECG databases of animal models and on large human ECG databases.
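
The permutation entropy with a lag parameter described in this abstract can be sketched as follows. This is a minimal illustration of the standard ordinal-pattern definition (Bandt and Pompe), not the authors' software; the function name, the default order of 3 and the optional normalization are assumptions.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, lag=1, normalize=False):
    """Permutation entropy in bits of a 1D series.

    order: number of points in each ordinal pattern.
    lag:   distance between neighboring points inside a pattern.
    """
    n_windows = len(series) - (order - 1) * lag
    if n_windows <= 0:
        raise ValueError("series too short for this order and lag")
    patterns = Counter()
    for i in range(n_windows):
        window = [series[i + j * lag] for j in range(order)]
        # Ordinal pattern: indices of the window sorted by value (ties keep time order).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1
    entropy = 0.0
    for count in patterns.values():
        p = count / n_windows
        entropy -= p * math.log2(p)
    if normalize:
        entropy /= math.log2(math.factorial(order))   # scale to [0, 1]
    return entropy
```

For example, the series `[1, 5, 2, 6, 3, 7]` alternates up and down at `lag=1` (entropy of about 0.97 bits for `order=2`), but is strictly increasing at `lag=2` (entropy 0), which illustrates why the lag is treated as an evaluation parameter.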

Posted: 02 September 2025

https://doi.org/10.20944/preprints202509.0119.v1

Article

Computer Science and Mathematics

Signal Processing

Integrated Low-Cost Lighting Filters for Color-Accurate Imaging

Sahara R. Smith

Susan Farnand

Abstract: Color accuracy is both important and elusive in cultural heritage imaging. An established method for improving color accuracy is dual-RGB imaging, where RGB images of an object are captured sequentially under two different conditions, then combined. As part of an initiative to increase accessibility to color-accurate imaging, the use of lighting filters with the dual-RGB method is investigated. Gel lighting filters are low-cost and can be directly integrated into an imaging workflow by placing them in front of the existing light sources. This research found that color accuracy can be increased by using lighting filters, but can also be decreased by a poor selection of filter combinations. The identity of the best-performing filters is highly dependent on the light source and can be affected by the pixels selected to represent the color target. Current simulation approaches are insufficient to predict which filters will increase color accuracy. While lighting filters are a promising method for accessible multispectral imaging, the practical implementation is complex and requires further research and adjustments to the method.

Posted: 28 August 2025

https://doi.org/10.20944/preprints202508.2054.v1

Article

Computer Science and Mathematics

Signal Processing

NeuroGraph-TSC: A Neuro-Inspired Graph-Based Temporal-Spatial Classifier for Cognitive State Prediction from EEG

Noor Fatima

Ghulam Nabi

Abstract: Accurate prediction of cognitive states such as psychological stress from electroencephalography (EEG) remains a significant challenge due to the inherently spatiotemporal and nonlinear nature of brain dynamics. To address these complexities, we propose NeuroGraph-TSC, a novel neuro-inspired, graph-based temporal-spatial classifier that incorporates domain-specific neuroscientific priors into a deep learning architecture for improved cognitive state decoding. The model constructs a spatial graph where EEG electrodes are represented as nodes, and inter-node edge weights are determined based on either scalp geometry or empirical functional connectivity, enabling physiologically meaningful spatial feature propagation. Temporal modeling is achieved through recurrent processing that captures both rapid and slow neural fluctuations. To further enhance biological plausibility, we integrate a neural mass model-based regularizer into the loss function, specifically adopting the Jansen-Rit dynamical system to constrain the model toward biophysically informed temporal dynamics. We evaluate NeuroGraph-TSC on the SAM-40 raw EEG stress dataset, achieving high classification performance across low, moderate, and high stress levels. Comprehensive ablation studies and interpretability analyses confirm the individual and collective contributions of the neuroscience-aligned components, validating both the robustness and neurophysiological relevance of the model. NeuroGraph-TSC offers a promising step toward bridging computational neuroscience and deep learning for advancing EEG-based affective computing.

Posted: 20 August 2025

https://doi.org/10.20944/preprints202508.1321.v1

Article

Computer Science and Mathematics

Signal Processing

Design of Improvements to Low-Complexity Scheme for Enhancing Multi-Carrier Communications

Keith Jones

Abstract: The paper describes an enhanced version of a scheme originally designed by the current author to enable one to improve the quality of one’s own wireless communications, over a given frequency (or frequencies), when in the presence of inter‑modulation distortion (IMD). The IMD is that generated by one’s own power amplifier (PA), when operating over an adjacent band of frequencies, and arises as a result of the non‑linear nature of the PA when engaged in the transmission of modulated multi‑carrier (MMC) signals. The IMD appears in the form of inter‑modulation products (IMPs), these occurring at multiple frequencies which may potentially coincide with that of one’s own communication. The new version – which, like the original scheme, efficiently predicts the locations and strengths of the IMPs and, when coincident with the communication frequency, clears the IMPs from that frequency – overcomes the limitations of the original scheme. That is, it enables one to handle more sophisticated signal types whereby both bandwidths and powers of the signal components may now be arbitrarily defined. Also, it extends the scheme’s applicability to multiple zones of distortion, this resulting in the need to handle 2nd‑order and 4th‑order IMD terms as well as the 3rd‑order and 5th-order terms originally addressed.

Posted: 19 August 2025

https://doi.org/10.20944/preprints202508.1366.v1

Article

Computer Science and Mathematics

Signal Processing

Multimodal EEG-Based Classification of Alzheimer's and MCI Using Olfactory Event-Related Potentials and Transformers

Noor Fatima

Ghulam Nabi

Abstract: Neurodegenerative diseases such as Alzheimer’s Disease (AD) and Mild Cognitive Impairment (MCI) are characterized by insidious cognitive decline, often preceded by olfactory dysfunction. Emerging evidence from cognitive neuroscience and olfaction research suggests that odor-evoked brain responses may serve as sensitive biomarkers for early neurodegenerative changes. This study proposes a multimodal framework integrating cognitive event-related potentials (ERPs), olfactory stimulus processing, and machine learning-based disease classification to detect early signs of MCI and AD using electroencephalography (EEG). We utilize a publicly available EEG dataset recorded during olfactory oddball paradigms to investigate differential neural responses to standard versus deviant odors across three cohorts: healthy controls, MCI patients, and individuals with AD. First, electrophysiological signatures such as the P300 and N200 components are analyzed to characterize cognitive processing of olfactory stimuli. Second, time-frequency analyses and source localization methods are employed to delineate latency, amplitude, and cortical activation differences in response to olfactory deviance. Third, engineered EEG features, including ERP peak amplitudes and spectral power in alpha, beta, and gamma bands, are used to train deep learning models, particularly Transformer architectures, for robust multi-class classification. Preliminary findings indicate significant group-level differences in ERP profiles and classification metrics, demonstrating the diagnostic potential of olfactory EEG responses. The proposed approach offers a non-invasive, cost-effective adjunct for early detection of neurodegeneration, advancing the intersection of olfactory neuroscience, cognitive electrophysiology, and clinical neuroinformatics.

Posted: 18 August 2025

https://doi.org/10.20944/preprints202508.1286.v1

Article

Computer Science and Mathematics

Signal Processing

TriNet-MTL: A Multi-Branch Deep Learning Framework for Biometric Identification and Cognitive State Inference from Auditory-Evoked EEG

Noor Fatima

Ghulam Nabi

Abstract: Electroencephalography (EEG) signals, particularly those elicited by auditory stimuli, provide a rich window into both cognitive processing and physiological traits. This dual nature makes auditory-evoked EEG highly promising for diverse applications ranging from biometric authentication to cognitive state inference. However, most existing approaches treat these tasks in isolation and rely on unimodal or task-specific models, which limits their robustness and generalization in real-world, noisy environments. In this work, we introduce TriNet-MTL (Triple-Task Neural Transformer for Multitask Learning). This unified deep learning framework simultaneously addresses three complementary objectives: (i) biometric user identification, (ii) auditory stimulus language classification (native vs. non-native), and (iii) device modality recognition (in-ear vs. bone-conduction). The proposed architecture combines a shared temporal encoder with a Transformer-based sequence representation module, followed by three specialized task heads. This design allows the model to leverage shared representations while still optimizing for the unique characteristics of each task. Training is conducted with a sliding-window strategy and a joint cross-entropy loss function to balance task performance. Extensive experiments demonstrate that TriNet-MTL achieves strong performance across all tasks, including over 91% accuracy in user identification, high precision in language discrimination, and reliable device modality classification. Notably, multitask learning not only improves individual task outcomes but also enhances feature sharing across tasks, reducing redundancy and mitigating interference. Our findings highlight the potential of multitask deep learning as a powerful paradigm for EEG-based analysis, paving the way toward integrated neurotechnology solutions that unify biometric authentication, brain–computer interface (BCI) systems, and cognitive monitoring in a single framework.

Posted: 18 August 2025

https://doi.org/10.20944/preprints202508.1270.v1

Article

Computer Science and Mathematics

Signal Processing

LPGNet: A Lightweight Network with Parallel Attention and Gated Fusion for Multimodal Emotion Recognition

Zhining He

Yang Xiao

Abstract: Emotion recognition in conversations (ERC) aims to predict the emotional state of each utterance by using multiple input types, such as text and audio. While Transformer-based models have shown strong performance in this task, they often face two major issues: high computational cost and heavy dependence on speaker information. These problems reduce their ability to generalize in real-world conversations. To solve these challenges, we propose LPGNet, a Lightweight network with Parallel attention and Gated fusion for multimodal ERC. The main part of LPGNet is the Lightweight Parallel Interaction Attention (LPIA) module. This module replaces traditional stacked Transformer layers with parallel dot-product attention, which can model both within-modality and between-modality relationships more efficiently. To improve emotional feature learning, LPGNet also uses a dual-gated fusion method. This method filters and combines features from different input types in a flexible and dynamic way. In addition, LPGNet removes speaker embeddings completely, which allows the model to work independently of speaker identity. Experiments on the IEMOCAP dataset show that LPGNet reaches over 87% accuracy and F1-score in 4-class emotion classification. It outperforms strong baseline models while using fewer parameters and showing better generalization across speakers.

Posted: 11 August 2025

https://doi.org/10.20944/preprints202508.0678.v1

Article

Computer Science and Mathematics

Signal Processing

A Local Thresholding Algorithm for Image Segmentation by Using Gradient Aided Histogram

Lijie Dong

Kailong Zhang

Mingyue He

Shenxin Zhong

Congjie Ou

Abstract: In image segmentation, local thresholding algorithms may yield more accurate and robust results since they are based on the features of images. Therefore, the common patterns exhibited within the same image category are crucial to improving the quality of segmentation results. In the present paper, a new local thresholding algorithm that uses a gradient-aided histogram is proposed to process images that have apparent texture or periodic structure. It is found that clustering pixels with similar gray-level gradients plays an important role in multi-level image segmentation. Well-known global thresholding algorithms, such as Kapur and Otsu, are adopted for comparison. The results are quantitatively illustrated in terms of PSNR (Peak Signal-to-Noise Ratio) and FSIM (Feature Similarity Index). It is shown that the proposed algorithm can effectively recognize the common features of images that belong to the same category, and maintains stable performance when the number of thresholds increases. Furthermore, the processing time of the present algorithm is competitive with those of other algorithms, which shows its potential for application in real-time scenes.
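
For context, the Otsu method named in this abstract as a global baseline can be sketched in a few lines. This is a minimal single-threshold version over a flat list of integer gray levels; it illustrates only the comparison baseline, not the proposed gradient-aided local algorithm, and the function name and input format are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class and pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight0 = 0            # number of pixels in the lower class
    sum0 = 0               # sum of gray levels in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue       # lower class still empty
        weight1 = total - weight0
        if weight1 == 0:
            break          # upper class empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t
```

Because the threshold is computed from the histogram of the whole image, every pixel is compared against the same value; local methods such as the one proposed in the abstract instead adapt the decision to regional image features.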

Posted: 31 July 2025

https://doi.org/10.20944/preprints202507.2617.v1

Article

Computer Science and Mathematics

Signal Processing

A Solution to the Collatz Conjecture Problem

Baoyuan Duan

Abstract:

We study the Collatz odd sequence by changing the (×3 + 1) ÷ 2^k operation in the Collatz Conjecture to a (×3 + 2^m − 1) ÷ 2^k operation. A looping Collatz odd sequence (if one exists) is expanded within the (×3 + 2^m − 1) ÷ 2^k odd sequence into an ∞-step non-looping sequence. We build a (×3 + 2^m − 1) ÷ 2^k odd tree model and a transform position model for the odds in the tree. By comparing actual and virtual positions, we prove that if a (×3 + 2^m − 1) ÷ 2^k odd sequence cannot converge after ∞ steps of the (×3 + 2^m − 1) ÷ 2^k operation, the sequence must walk out of the right boundary of the tree.
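
The generalized operation in this abstract can be made concrete with a short sketch. It assumes that k is chosen as the largest power of two dividing 3n + 2^m − 1, so that each step lands on the next odd number; the abstract does not state this explicitly, and the function names are illustrative.

```python
def next_odd(n, m=1):
    """Apply (3n + 2**m - 1) / 2**k to an odd n, with k chosen so the result is odd.

    With m = 1 this is the usual Collatz step on odd numbers: (3n + 1) / 2**k.
    """
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    if m < 1:
        raise ValueError("m must be at least 1")
    value = 3 * n + 2 ** m - 1      # odd + odd, so always even
    while value % 2 == 0:
        value //= 2                 # divide out every factor of two
    return value


def odd_sequence(n, m=1, max_steps=1000):
    """Iterate next_odd from n, stopping at the first repeated value or after max_steps."""
    sequence = [n]
    seen = {n}
    for _ in range(max_steps):
        n = next_odd(n, m)
        sequence.append(n)
        if n in seen:               # entered a loop
            break
        seen.add(n)
    return sequence
```

For instance, `odd_sequence(7)` yields `[7, 11, 17, 13, 5, 1, 1]`, the familiar odd Collatz trajectory of 7 ending in the fixed point 1, while `odd_sequence(1, m=2)` yields `[1, 3, 3]`, since 3 is a fixed point of the (×3 + 3) ÷ 2^k operation.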

Posted: 28 July 2025

https://doi.org/10.20944/preprints202301.0541.v20

Article

Computer Science and Mathematics

Signal Processing

Particle Filtering Estimation of Regime Switching Factor Model and Its Application in Statistical Arbitrage Strategy

Yu Mu

Robert J. Frey

Abstract: Statistical factor models are widely applied across various domains of the financial industry, including risk management, portfolio selection, and statistical arbitrage strategies. However, conventional factor models often rely on unrealistic assumptions and fail to account for the fact that financial markets operate under multiple regimes. In this paper, we propose a regime-switching factor model estimated using a particle filtering algorithm, a Monte Carlo-based method well suited to handling nonlinear and non-Gaussian systems. Our empirical results show that incorporating regime-switching dynamics significantly enhances the model’s ability to detect structural breaks and adapt to evolving market conditions. This leads to improved performance and reduced drawdown in equity statistical arbitrage strategies.

Posted: 24 July 2025

https://doi.org/10.20944/preprints202507.2022.v1
