Submitted: 27 May 2025
Posted: 28 May 2025
Abstract
Keywords:
1. Introduction
- Process heterogeneous datasets
- Handle missing or noisy variables
- Adapt to high-dimensional data structures
- Learn from vast historical observation sets to generate robust predictions
- Significant data preprocessing requirements
- Reduced interpretability compared to traditional models
- Performance sensitivity to hyperparameter selection
- Computational complexity in some implementations
- Precisely modeling temporal sequences in financial data
- Capturing complex nonlinear interactions
- Leveraging massive datasets for improved generalization
- LSTM networks are particularly well-suited for financial time series analysis due to their ability to capture long-term dependencies
- CNNs have proven effective for extracting discriminative features from transformed financial ratio matrices or tabular representations
- High computational complexity and resource requirements
- "Black box" nature that reduces interpretability
- Difficulty justifying predictions for critical applications like credit assessment
- We propose a comprehensive framework of the Fuzzy Support Vector Machine (Fuzzy SVM) using a diverse range of membership functions, including geometric, density-based, and entropy-driven approaches, to quantify the uncertainty of individual samples and enhance model robustness in imbalanced data scenarios.
- We extend the Rough Support Vector Machine (Rough SVM) paradigm by integrating multiple weighting strategies that reflect the granularity of lower and upper approximations, enabling the model to better capture data ambiguity and improve classification performance.
- We introduce a novel Shadowed Support Vector Machine (Shadowed SVM) approach that employs a Multi-Metric Fusion mechanism to define shadow regions near the decision boundary. This is achieved through a combination of geometric distances and margin-based metrics, followed by shadowed combination to control the influence of uncertain instances.
- We develop a Quotient Space Support Vector Machine (QS SVM) model that utilizes a Quotient Space Generator per class. This mechanism delineates the input space into localized regions by employing clustering algorithms, including K-Means or DBSCAN, thus facilitating the model’s ability to develop classifiers that are specific to each region and to accommodate the variations inherent in local data distributions.
- We empirically observe that Fuzzy SVM excels in achieving high overall accuracy, while the Shadowed SVM provides superior performance in handling data imbalance. Motivated by these complementary strengths, we propose a novel hybrid model—Fuzzy Shadowed Support Vector Machine (Fuzzy Shadowed SVM)—which combines fuzzy membership weighting with shadowed instance discounting to achieve both high accuracy and class balance.
2. Literature Review
- Oversampling: Artificially augmenting the minority class via random duplication (Random Oversampling) or synthetic generation (e.g., SMOTE [19]), which interpolates new instances from real examples.
- Undersampling: Reducing the majority class via random subset selection, though risking information loss about healthy firms.
- Advanced hybrids (Borderline-SMOTE, ADASYN, SMOTEENN) [28] combine these approaches. Empirical studies demonstrate these techniques significantly improve minority-class metrics like recall, weighted precision, and AUC-ROC (Zhou et al. 2019; Martín-Jiménez 2020).
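As a concrete illustration of these resampling strategies, the sketch below applies SMOTE, ADASYN, and random undersampling using the imbalanced-learn library; the synthetic 95/5 dataset is an illustrative assumption, not one of the datasets studied here.

```python
# Sketch of data-level rebalancing with imbalanced-learn.
# The synthetic 95/5 dataset is an illustrative assumption.
from collections import Counter

from imblearn.over_sampling import ADASYN, SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)           # interpolation
X_ad, y_ad = ADASYN(random_state=0).fit_resample(X, y)          # density-adaptive
X_un, y_un = RandomUnderSampler(random_state=0).fit_resample(X, y)

print(Counter(y), Counter(y_sm), Counter(y_un))
```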
- Random Forests: Bootstrap-aggregated decision trees robust to noise and correlations
- Support Vector Machines (SVM): Margin-maximizing classifiers in transformed spaces
- Decision Trees (CART, C4.5): Interpretable rule-based models
- Boosting algorithms (XGBoost, LightGBM, CatBoost): Ensemble methods combining weak learners
- Cost-sensitive learning: Algorithms like XGBoost permit class weighting (scale_pos_weight) to penalize errors on the minority class more heavily
- Integrated sampling: Techniques like Balanced Random Forest perform per-iteration resampling
- Alternative metrics: F1-score, AUC-PR, G-mean, or Matthews Correlation Coefficient (MCC) better evaluate imbalanced contexts [37].
- Weighted loss functions: Modifying binary cross-entropy with class-frequency weights or adopting focal loss [30] to emphasize hard samples
- Balanced batch training: Curating mini-batches with controlled class proportions [20]
- Temporal data augmentation: For LSTM/GRU models, generating synthetic sequences via dynamic time warping or Gaussian perturbation [18]
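The algorithm-level alternatives above avoid resampling altogether. A minimal sketch of class weighting in a scikit-learn setting follows; the dataset and parameter values are illustrative assumptions.

```python
# Sketch of algorithm-level cost sensitivity (no resampling).
# Dataset and 95/5 class ratio are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

# class_weight='balanced' scales penalties inversely to class frequency.
svm = SVC(kernel='rbf', class_weight='balanced').fit(X, y)
rf = RandomForestClassifier(class_weight='balanced_subsample',
                            random_state=0).fit(X, y)

# XGBoost exposes the same idea through scale_pos_weight
# (the negative-to-positive ratio), e.g.:
#   clf = xgb.XGBClassifier(scale_pos_weight=(y == 0).sum() / (y == 1).sum())
```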
- Data-level rebalancing (SMOTE, ADASYN)
- DL representation power
- Secondary classifiers (e.g., Random Forest, XGBoost) for refined decisions
3. Granular Computing
3.1. Fuzzy Sets
- Membership Function: The core of fuzzy logic, a mapping $\mu_A : X \to [0,1]$ that can take various shapes (triangular, trapezoidal, Gaussian, sigmoidal), chosen according to interpretative or modeling needs.
- Support: The set of elements where $\mu_A(x) > 0$, indicating the domain of influence.
- Core: The set of elements where $\mu_A(x) = 1$, representing full membership.
- Height: The maximum value of $\mu_A$; the set is normalized if the height equals 1.
- Union: $\mu_{A \cup B}(x) = \max\big(\mu_A(x), \mu_B(x)\big)$
- Intersection: $\mu_{A \cap B}(x) = \min\big(\mu_A(x), \mu_B(x)\big)$
- Complement: $\mu_{\bar{A}}(x) = 1 - \mu_A(x)$
“If speed is high and visibility is low, then decelerate sharply”
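A minimal sketch of Zadeh's operators applied to this rule, with illustrative piecewise-linear membership functions (the breakpoints are assumptions, not values from the text):

```python
# Sketch of Zadeh's fuzzy-set operators; membership shapes are illustrative.
import numpy as np

def mu_high_speed(x, lo=80.0, hi=120.0):
    """Increasing piecewise-linear membership for 'speed is high' (km/h)."""
    return float(np.clip((x - lo) / (hi - lo), 0.0, 1.0))

def mu_low_visibility(v, lo=50.0, hi=200.0):
    """Decreasing membership for 'visibility is low' (metres)."""
    return float(np.clip((hi - v) / (hi - lo), 0.0, 1.0))

speed, visibility = 105.0, 90.0
a = mu_high_speed(speed)           # degree to which speed is high
b = mu_low_visibility(visibility)  # degree to which visibility is low

union        = max(a, b)   # mu_{A ∪ B} = max
intersection = min(a, b)   # mu_{A ∩ B} = min -> firing strength of the rule
complement   = 1.0 - a     # mu_{¬A} = 1 - mu_A
print(intersection)        # activation of "speed high AND visibility low"
```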
- Fuzzy control mechanisms (e.g., thermal regulation, self-operating vehicles)
- Multi-criteria decision analysis under conditions of uncertainty (fuzzy Analytic Hierarchy Process, fuzzy Technique for Order of Preference by Similarity to Ideal Solution)
- Medical diagnostics involving indistinct symptoms
- Fuzzy data examination and clustering methodologies (e.g., fuzzy c-means algorithm)
- Risk assessment and behavioral finance considerations
- Expert systems and symbolic artificial intelligence
- Subjectivity in choosing membership functions
- Difficulty in aggregating a large number of fuzzy rules
- Increasing computational complexity in large-scale systems
- Unsuitability for random uncertainties (where probability theory is more appropriate)
- Type-2 Fuzzy Sets (characterized by uncertainty pertaining to the membership function itself)
- Intuitionistic Fuzzy Sets (which incorporate a quantifiable measure of non-membership)
- Rough Sets (which are pertinent within contexts reliant on granularity)
3.2. Rough Sets
- U is the universe: a finite set of objects;
- A is a finite set of attributes.
- Lower approximation $\underline{R}(X)$: the set of objects that certainly belong to X, i.e., those whose equivalence classes are fully contained within X: $\underline{R}(X) = \{x \in U : [x]_R \subseteq X\}$.
- Upper approximation $\overline{R}(X)$: the set of objects that possibly belong to X, i.e., those whose equivalence classes intersect X: $\overline{R}(X) = \{x \in U : [x]_R \cap X \neq \emptyset\}$.
- A reduct is a minimal subset of attributes preserving the classification power of the full set.
- The core is the intersection of all reducts—attributes that are indispensable.
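A small worked sketch of these approximations, assuming a toy universe of firms described by two discretized attributes (all names and values are illustrative):

```python
# Sketch: lower/upper approximations from equivalence classes.
# x ~ y iff their attribute values coincide (indiscernibility relation).
from collections import defaultdict

U = ['f1', 'f2', 'f3', 'f4', 'f5', 'f6']            # toy universe of firms
attrs = {                                            # (liquidity, leverage)
    'f1': ('low', 'high'), 'f2': ('low', 'high'),
    'f3': ('high', 'low'), 'f4': ('high', 'low'),
    'f5': ('low', 'low'),  'f6': ('low', 'high'),
}
X = {'f1', 'f2', 'f5'}                               # target set: bankrupt firms

classes = defaultdict(set)                           # equivalence classes [x]_R
for obj, desc in attrs.items():
    classes[desc].add(obj)

lower = set().union(*(c for c in classes.values() if c <= X))   # certain
upper = set().union(*(c for c in classes.values() if c & X))    # possible
boundary = upper - lower
print(sorted(lower), sorted(upper), sorted(boundary))
# -> ['f5'] ['f1', 'f2', 'f5', 'f6'] ['f1', 'f2', 'f6']
```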
- Feature selection and dimensionality reduction
- Interpretable decision rule generation
- Analysis of incomplete or imprecise data
- Multi-criteria decision analysis
- Bioinformatics, finance (bankruptcy prediction), healthcare (diagnosis)
3.3. Shadowed Sets
- $\mu_A(x) \ge 1 - \alpha$ ⇒ the element clearly belongs to the set;
- $\mu_A(x) \le \alpha$ ⇒ the element clearly does not belong to the set;
- $\alpha < \mu_A(x) < 1 - \alpha$ ⇒ the element lies in a shadowed region, indicating indeterminacy.
- Positive region (membership 1): if $\mu_A(x) \ge 1 - \alpha$, then $S(x) = 1$;
- Negative region (membership 0): if $\mu_A(x) \le \alpha$, then $S(x) = 0$;
- Shadowed region (indeterminate): if $\alpha < \mu_A(x) < 1 - \alpha$, then $S(x)$ is undefined and remains within $[0,1]$.
- Patient A: $\mu \ge 1 - \alpha$ ⇒ classified as ill;
- Patient B: $\mu \le \alpha$ ⇒ classified as healthy;
- Patient C: $\alpha < \mu < 1 - \alpha$ ⇒ classification is indeterminate.
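A minimal sketch of this three-way conversion, with an assumed threshold $\alpha = 0.3$ and NaN marking the shadowed region:

```python
# Sketch: three-way shadowed conversion of fuzzy memberships.
# alpha = 0.3 is an illustrative threshold choice.
import numpy as np

def shadowed(mu, alpha=0.3):
    """Map memberships to 1 (core), 0 (exterior), or NaN (shadow)."""
    mu = np.asarray(mu, dtype=float)
    out = np.full_like(mu, np.nan)        # shadow: indeterminate
    out[mu >= 1.0 - alpha] = 1.0          # clearly belongs
    out[mu <= alpha] = 0.0                # clearly does not belong
    return out

print(shadowed([0.95, 0.10, 0.55]))      # -> [ 1.  0. nan]
```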
3.4. Quotient Space Theory
- X is the original information space,
- R is an equivalence relation on X,
- f is a function defined on the equivalence classes of R.
- Cognitive partiality: human perception is inherently local and approximate,
- Local processing: reasoning is performed within subspaces of the global problem.
- the root node represents the global space,
- lower levels denote finer abstractions,
- child nodes refine the representations of their parents.
- dimensionality reduction,
- reasoning over aggregated representations,
- robustness against uncertain or noisy data.
- Construction of a quotient space: selecting relevant attributes and defining R,
- Reasoning and prediction: operating within a simplified space, and refining representations when uncertainty arises.
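A minimal sketch of such a construction, assuming a one-dimensional leverage attribute discretized into three granules (the relation R and the data are illustrative):

```python
# Sketch: a coarse quotient space via attribute projection.
# R identifies firms with the same discretized leverage level;
# f maps each equivalence class to its majority label.
import numpy as np

rng = np.random.default_rng(0)
leverage = rng.uniform(0, 1, 200)                    # original space X
labels = (leverage + rng.normal(0, 0.15, 200) > 0.7).astype(int)

bins = np.digitize(leverage, [0.33, 0.66])           # [x]_R: low / mid / high
quotient = {g: labels[bins == g].mean().round() for g in np.unique(bins)}
print(quotient)   # f on X/R: coarse-grained prediction per granule
```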
- model bias toward the majority class,
- low recall on the minority class,
- limited generalization capabilities.
- intelligent grouping of data samples,
- localized treatment of the minority class,
- adaptive granularity tailored to rarity.
- Pre-granulation employing QST,
- Structured resampling predicated on granule characteristics,
- SVM training utilizing balanced granules,
- Hierarchical prediction facilitated through a QS Tree framework.
- enhances recall for the minority class,
- mitigates tendencies toward overfitting,
- promotes adaptable and localized decision-making.
4. Granular Support Vector Machines: Proposed Approach
- Gaussian Radial Basis Function (RBF) Kernel: $K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right)$
- Polynomial Kernel: $K(x_i, x_j) = \left(x_i^\top x_j + c\right)^d$
- Their resistance to overfitting
- Their flexibility through kernel selection
- Their effectiveness in high-dimensional spaces
- Fuzzy Support Vector Machine (Fuzzy SVM),
- Rough Support Vector Machine (Rough SVM),
- Shadowed Support Vector Machine (Shadowed SVM),
- Quotient Space Support Vector Machine (QS SVM),
- Fuzzy Shadowed Support Vector Machine (Fuzzy Shadowed SVM).
- Reducing the impact of outliers,
- Emphasizing firms on the brink of bankruptcy,
- Attenuating bias towards the majority class.
- The positive region (certainly bankrupt or non-bankrupt),
- The negative region (certainly not bankrupt or bankrupt),
- The boundary region (uncertain).
- Creates a fuzzy boundary between classes,
- Enhances the detection of critical regions,
- Reduces the influence of weakly informative examples.
- Structures data according to equivalence relations,
- Enables hierarchical classification,
- Enhances robustness against local variations.
- Robust detection of high-risk ambiguous firms,
- Better interpretation of transitional zones.
- Reduction of overfitting on uncertain cases,
- Explicit decision-making in borderline scenarios.
4.1. Fuzzy Support Vector Machine (Fuzzy SVM)
- Reduction of the effect of outliers,
- Emphasis on borderline companies near financial distress,
- Mitigation of bias toward the majority class.
- Center Distance-Based Membership: Evaluates the membership of a sample based on its Euclidean distance to the nearest class center. Samples closer to a class center receive higher membership, and minority-class instances are emphasized by doubling their score.
- Global Sphere-Based Membership: Defines membership from the distance to the global centroid of all samples, normalized by the radius of the enclosing sphere. Points farther from the center receive lower membership; minority samples get amplified values.
- Hyperplane Distance Membership: Calculates membership from the distance to the decision hyperplane of a preliminary linear SVM. Samples closer to the decision boundary receive higher scores, and minority-class points have doubled membership.
- Local Density-Based Membership (k-NN): Uses the average distance to the k nearest neighbors to assess local density. Samples in dense regions (smaller average distances) get higher membership values.
- Local Entropy-Based Membership: Computes local class entropy with a probabilistic k-NN classifier. Samples with high uncertainty (high entropy) receive lower membership values.
- Intra-Class Distance Membership: Measures the distance of a sample to the center of its own class. Points closer to their own class center get higher membership scores.
- RBF-Based Membership: Uses a Gaussian radial basis function of the distance to the global center. Samples near the center receive values close to 1; distant ones decay exponentially.
- RBF-SVM Margin Membership: Derives membership from the confidence margin $|f(x)|$, where $f$ is the decision function of an RBF-kernel SVM. Samples close to the boundary have high membership scores, capturing uncertainty near the decision margin.
- Combined Membership Function: A weighted aggregation of the eight membership functions above, enabling flexible integration of membership strategies with user-defined weights for enhanced generalization and robustness in imbalanced scenarios.
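A minimal sketch of the first membership function driving a weighted SVM, assuming scikit-learn's per-sample weights; the inverse-distance formula and the minority doubling are an illustrative reading of the description above, not the paper's exact equation:

```python
# Sketch of a Fuzzy SVM: center-distance memberships as sample weights.
# Membership formula and dataset are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, weights=[0.94, 0.06], random_state=0)

def center_distance_membership(X, y, minority=1, boost=2.0):
    centers = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    d = np.array([min(np.linalg.norm(x - c) for c in centers.values())
                  for x in X])
    m = 1.0 / (1.0 + d)                   # closer to a center -> higher
    m[y == minority] *= boost             # emphasize minority instances
    return m / m.max()

s = center_distance_membership(X, y)
fuzzy_svm = SVC(kernel='rbf', C=1.0).fit(X, y, sample_weight=s)
```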
4.2. Rough Support Vector Machine (Rough SVM)
- Positive region (certainly bankrupt or non-bankrupt),
- Negative region (certainly not bankrupt or bankrupt),
- Boundary region (uncertain cases).
- Indiscernibility Relation: The foundational element of rough set theory, computed here using an epsilon distance threshold.
- Lower and Upper Approximations:
  - The lower approximation contains objects that definitively belong to a class.
  - The upper approximation contains objects that may possibly belong to the class.
  - The boundary region is defined as the difference between these two approximations.
- Sample Weighting Methods:
  - Rough Set Membership: weights based on approximation-set membership.
  - Rough Set Boundary Distance: weights derived from the distance to the boundary region.
  - Rough Set Quality: weights determined by approximation quality.
  - Rough Set kNN Granularity: weights based on the local granularity of the k nearest neighbors.
  - Rough Set Reduction Importance: weights reflecting attribute importance.
  - Rough Set Cluster Boundary: weights assigned by proximity to cluster boundaries.
  - Rough Set Local Discernibility: weights based on local instance discernibility.
  - Rough Set Combined: a weighted aggregation of all of the above.

The eight weighting strategies are defined as follows:

- Rough Set Membership-Based Weighting: Assigns a weight according to whether an instance belongs to the lower approximation (certain region), the upper approximation (possible region), or neither. Minority-class instances are emphasized by doubling their scores.
- Boundary Distance-Based Weighting: Refines the rough approximations by evaluating the relative position of an instance within the boundary region. A higher rank in the boundary implies greater uncertainty and thus a lower weight.
- Approximation Quality-Based Weighting: Relies on the quality of approximation of each class, computed as the ratio of the size of the lower approximation to that of the upper approximation. Higher quality indicates a clearer class definition.
- kNN-Based Granularity Weighting: Measures the local purity around each sample, defined as the proportion of its k nearest neighbors that share its label. High purity indicates greater certainty.
- Feature Reduction Importance-Based Weighting: Determines the importance of each attribute from its discriminative power, computed as the number of label changes observed when instances are sorted by that attribute. Weights are assigned as a weighted sum of absolute attribute values.
- Cluster Boundary-Based Weighting: Bases weights on the distance of each instance to its closest cluster center (obtained with k-means). Central points receive higher weights; marginal instances near cluster boundaries are down-weighted.
- Local Discernibility-Based Weighting: Weights reflect how many of the k nearest neighbors belong to different classes. Higher discernibility implies the instance lies in a complex region, warranting greater emphasis.
- Combined Rough Set Weighting: Computes a weighted linear combination of the seven strategies above; the combination weights can be tuned to reflect the relative importance of each criterion.
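A minimal sketch of the kNN-based granularity weighting, assuming an illustrative mapping from neighborhood purity to weights (the affine rescaling and the minority doubling are assumptions consistent with the descriptions above):

```python
# Sketch of one Rough SVM weighting: kNN granularity (local label purity).
# Purity of the k-neighborhood approximates lower-approximation membership.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, weights=[0.94, 0.06], random_state=0)

k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
_, idx = nn.kneighbors(X)                     # idx[:, 0] is the point itself
purity = (y[idx[:, 1:]] == y[:, None]).mean(axis=1)

w = 0.5 + 0.5 * purity                        # certain regions weigh more
w[y == 1] *= 2.0                              # minority emphasis
rough_svm = SVC(kernel='rbf').fit(X, y, sample_weight=w)
```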
4.3. Shadowed Support Vector Machine (Shadowed SVM)
- Establishes a fuzzy boundary between classes,
- Enhances the detection of critical zones,
- Reduces the influence of uninformative examples.
- Full membership ($S(x) = 1$)
- Non-membership ($S(x) = 0$)
- Shadowed region ($S(x)$ indeterminate in $[0,1]$)
- Distance to Class Centers: This method calculates the Euclidean distance of each instance to its respective class centroid. The inverse of the distance is normalized and passed to the shadowed conversion, so that points near their class center (prototypical examples) receive higher importance.
- Distance to Global Sphere Center: Distances to the global mean vector are computed and normalized. Instances close to the global center are assumed to be more representative and are therefore favored.
- Distance to Linear SVM Hyperplane: A linear SVM is trained and the absolute value of its decision function is used as a proxy for confidence. These values are normalized and inverted, assigning higher weights to instances closer to the decision boundary.
- K-Nearest Neighbors Density: The average distance to the k nearest neighbors estimates local density. High-density points are considered more informative and are promoted.
- Local Entropy of Class Distribution: A k-NN classifier yields the class-distribution entropy in the neighborhood of each point. Lower entropy indicates higher confidence, which translates into higher weights.
- Intra-Class Compactness: This function assesses each instance's distance to its own class centroid. The inverse of this distance measures intra-class compactness, helping to down-weight class outliers.
- Radial Basis Function Kernel: A Gaussian RBF is centered on the global dataset mean. Points near the center receive higher RBF values and are treated as more central to the learning task.
- RBF-SVM Margin: An RBF-kernel SVM is trained, and its margin is used as a measure of importance. Instances near the margin are prioritized, reflecting their critical role in determining the separating surface.
- Minority Class Boosting Mechanism: After computing the initial weights, an explicit adjustment enhances minority-class representation:
  - if a minority-class instance receives a weight of 0, it is assigned a small positive floor value;
  - if a minority-class instance falls in the shadowed region, it is assigned full weight 1.
  This ensures that no minority-class instance is completely ignored and that instances with ambiguous status are treated as fully informative. This enhancement is crucial in highly skewed scenarios.
- Multi-Metric Fusion via Shadowed Combination: The function shadowed_combined aggregates all eight metrics above using a weighted average $s_i = \sum_j w_j\, s_{ij}$, where $s_{ij}$ is the shadowed membership of instance $i$ under metric $j$ and $w_j$ is the corresponding metric weight.

This Shadowed SVM significantly advances classical SVMs by embedding granular soft reasoning into the training process. Key advantages include:

- Data integrity is preserved; no synthetic samples are generated.
- Minority-class enhancement is performed selectively and contextually.
- The methodology is generalizable to any learning algorithm supporting instance weighting.
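A minimal sketch of this pipeline with two of the eight metrics, assuming equal fusion weights and illustrative values of $\alpha$, $\lambda$, and the minority floor:

```python
# Sketch of the Shadowed SVM pipeline: two metrics, weighted fusion,
# three-way thresholding, minority boosting. alpha, lam, and eps are
# illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=1500, weights=[0.94, 0.06], random_state=0)

def normalize(v):
    return (v - v.min()) / (v.max() - v.min() + 1e-12)

# Metric 1: inverse distance to the instance's own class centroid.
centers = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
own = np.array([centers[c] for c in y])
m1 = normalize(1.0 / (1.0 + np.linalg.norm(X - own, axis=1)))

# Metric 2: proximity to a linear SVM hyperplane (inverted |decision value|).
lin = LinearSVC(dual=False).fit(X, y)
m2 = 1.0 - normalize(np.abs(lin.decision_function(X)))

s = 0.5 * m1 + 0.5 * m2                        # shadowed_combined, w_j = 0.5
alpha, lam, eps = 0.3, 0.5, 0.1
w = np.where(s >= 1 - alpha, 1.0, np.where(s <= alpha, 0.0, lam))
w[(y == 1) & (w == 0.0)] = eps                 # no minority point ignored
w[(y == 1) & (w == lam)] = 1.0                 # ambiguous minority -> full
w = np.maximum(w, 0.01)                        # small floor for stability

shadowed_svm = SVC(kernel='rbf').fit(X, y, sample_weight=w)
```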
4.4. Quotient Space Support Vector Machine (Quotient Space SVM)
- Structures data based on equivalence relations,
- Enables hierarchical classification,
- Enhances robustness against local variations.
- Class-Specific Space Partitioning: The feature space is partitioned by class, with each subspace further divided into local clusters (granules). These clusters serve as prototypes, capturing the local data structure.
- Adaptive Prototype Allocation: Minority classes are assigned more prototypes to compensate for their scarcity. Clustering methods (e.g., K-means for regular structures or DBSCAN for density-adaptive partitioning) generate the prototypes.
- Quotient Space Projection: Each sample is mapped to a new feature space defined by its distances to the prototypes. This space is termed quotient because it abstracts the original structure while preserving discriminative relationships.
- Weighted SVM Training: Minority-class prototypes are assigned higher weights (density_factor), which propagate to their constituent samples. The final classifier is an SVM trained on the quotient space representation rather than the raw data, enabling: Improved linear separability, Enhanced robustness to class imbalance, and Superior generalization performance.
- Granular abstraction: Converts raw features into semantically richer distance-based representations.
- Balancing effect: For the minority class, more granular regions are created to increase representation diversity.
- Dimensionality control: Reduces the complexity by condensing local distributions.
- QuotientSpaceGenerator: Performs class-wise clustering and prototype extraction using KMeans or DBSCAN.
- QuotientSpaceSVM: Applies SVM on the transformed quotient representation with balancing weights.
- HierarchicalQuotientSpaceSVM: Constructs layered quotient transformations before SVM training.
- AdaptiveMetricQuotientSpaceSVM: Introduces Mahalanobis-based adaptive distance metrics.
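A minimal sketch of the QuotientSpaceSVM idea follows; the function name and the prototype counts are illustrative stand-ins, not the paper's implementation.

```python
# Sketch of a Quotient Space SVM: class-wise K-Means granules, distance-to-
# prototype features, minority prototype up-weighting (density_factor is
# this sketch's name for the weight multiplier).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, weights=[0.94, 0.06], random_state=0)

def quotient_prototypes(X, y, n_major=8, n_minor=16):
    """More prototypes for the minority class to diversify its granules."""
    protos = []
    for c, k in [(0, n_major), (1, n_minor)]:
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X[y == c])
        protos.append(km.cluster_centers_)
    return np.vstack(protos)

P = quotient_prototypes(X, y)
Q = np.linalg.norm(X[:, None, :] - P[None, :, :], axis=2)  # quotient features

density_factor = 3.0
w = np.where(y == 1, density_factor, 1.0)
qs_svm = SVC(kernel='rbf').fit(Q, y, sample_weight=w)
```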
4.5. Fuzzy-Shadowed SVM (FS-SVM)
- Reduced overfitting on uncertain cases,
- Explicit decision-making in borderline situations.
- Positive region ($\mu_i \ge 1 - \alpha$): membership set to 1,
- Negative region ($\mu_i \le \alpha$): membership set to 0,
- Shadowed region ($\alpha < \mu_i < 1 - \alpha$): membership remains uncertain in (0,1).
- Enhances minority class contribution via fuzzy memberships,
- Reduces overfitting and misclassification in ambiguous zones through shadowed granulation.
- Fuzzy Membership Calculation: Assign fuzzy memberships to each instance using distance-based, entropy-based, or density-based functions.
- Shadowed Transformation: Apply the three-way thresholding above, mapping each fuzzy membership to 1 (positive region), 0 (negative region), or a retained intermediate value (shadowed region).
- Modified SVM Training: Use transformed fuzzy-shadowed weights in the SVM loss function to penalize misclassifications proportionally to sample certainty.
- Minority Emphasis: The fuzzy component ensures greater influence of rare class examples in decision boundary construction.
- Uncertainty Management: Shadowed sets allow safe treatment of boundary points by avoiding hard decisions for uncertain data.
- Performance Gains: Improved G-mean, Recall, and F1-score, ensuring better trade-off between sensitivity and specificity.
- Adaptability: The shadow threshold $\alpha$ and shadow weight $\lambda$ offer flexibility in managing granularity and uncertainty.
- Preprocessing: Normalize data and compute imbalance ratio.
- Fuzzy Memberships: Use functions based on distance to class center or local density.
- Parameter Selection: Tune $\alpha$, $\lambda$, and the regularization parameter C using cross-validation.
- Evaluation Metrics: Use G-mean, AUC-ROC, Recall, and F1-score rather than accuracy alone.
- Fuzzy Sets: Fuzzy logic assigns each training instance a degree of membership to its class, reflecting the confidence or representativeness of that instance. High membership indicates a central or prototypical instance; low membership reflects ambiguity or atypicality.
- Shadowed Sets: Introduced to model vague regions in uncertain environments, shadowed sets define a shadow region around the decision boundary where class labels are unreliable. In this model, instances in this margin are down-weighted to reduce their impact during training, recognizing their inherent ambiguity.
- Computes fuzzy membership degrees for all training samples using multiple geometric and statistical criteria;
- Identifies shadow regions by evaluating the distance of instances from the SVM decision boundary;
- Adjusts sample weights by combining fuzzy memberships and a shadow mask, reducing the influence of uncertain instances and enhancing minority class detection.
- Center Distance: Membership is inversely proportional to the distance to the class center.
- Sphere Distance: Membership decreases linearly with the distance to the enclosing hypersphere.
- Hyperplane Distance: Membership is proportional to the absolute distance to a preliminary SVM hyperplane.
- k-NN Density and Local Entropy: Measures local structure and class purity via neighborhood statistics.
- Intra-Class Cohesion: Membership is inversely related to within-class dispersion.
- RBF Kernel and SVM Margin: Membership decays exponentially with Euclidean or SVM margin distance.
- $C$: SVM regularization parameter;
- $\gamma$: RBF kernel width;
- $\alpha$: shadow threshold;
- $\lambda$: shadow weight;
- Membership method (e.g., "center_distance", "svm_margin").
- It models instance uncertainty on two levels: class confidence (fuzzy membership) and ambiguity near the decision boundary (shadow set).
- It provides a flexible and extensible framework with multiple interpretable membership functions.
- It introduces region-based instance discounting directly into kernel-based classifiers.
- It maintains interpretability, as the weighting mechanisms are derived from geometric or statistical properties of the data.
- It improves performance on minority class recognition, often reflected in F1-score, G-mean, and AUC-ROC.
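A minimal sketch of the FS-SVM weighting pipeline, assuming a center-distance membership and a margin-based shadow mask; the shadow rule (margin below a fraction of the mean margin) is an illustrative choice, not the paper's exact criterion:

```python
# Sketch of the Fuzzy Shadowed SVM: fuzzy memberships give class confidence,
# a shadow mask discounts points near the SVM boundary. alpha and lam follow
# the notation above; values are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, weights=[0.94, 0.06], random_state=0)

# Level 1 - fuzzy membership: distance to the instance's own class center.
centers = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
d = np.linalg.norm(X - np.array([centers[c] for c in y]), axis=1)
mu = 1.0 / (1.0 + d)
mu[y == 1] *= 2.0                               # minority emphasis
mu /= mu.max()

# Level 2 - shadow mask: ambiguity near a preliminary decision boundary.
pre = SVC(kernel='rbf').fit(X, y)
margin = np.abs(pre.decision_function(X))
alpha, lam = 0.5, 0.4                           # shadow threshold and weight
shadow = margin < alpha * margin.mean()         # points in the shadow region

w = np.where(shadow, lam * mu, mu)              # discount uncertain instances
fs_svm = SVC(kernel='rbf').fit(X, y, sample_weight=w)
```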
4.6. Imbalanced Data Problem
- Fuzzy SVM Formulation: With membership degrees $s_i \in (0, 1]$, the optimization problem becomes
  $$\min_{w, b, \xi} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} s_i \xi_i$$
  subject to $y_i\left(w^\top \phi(x_i) + b\right) \ge 1 - \xi_i$ and $\xi_i \ge 0$, where $s_i$ is computed by one of the membership functions of Section 4.1; low-membership instances thus incur smaller misclassification penalties.
- Rough SVM Formulation: For each class $X_k$, we define the lower approximation $\underline{R}(X_k)$, the upper approximation $\overline{R}(X_k)$, and the boundary region $BN(X_k) = \overline{R}(X_k) \setminus \underline{R}(X_k)$. The objective function takes the same weighted form, with region-dependent weights: instances in the lower approximation receive full weight, while boundary instances are discounted.
- Shadowed SVM Formulation: Each instance carries a three-valued shadowed weight, equal to 1 in the positive region, 0 in the negative region, and an intermediate value in the shadow. The optimization again minimizes $\frac{1}{2}\|w\|^2 + C \sum_i s_i \xi_i$ under the standard margin constraints, with shadowed weights in place of fuzzy memberships.
- QS SVM Formulation: Quotient spaces are defined by triples $(X, R, f)$ at several levels of granularity. The multi-scale objective function aggregates the weighted SVM losses computed on the quotient-space representations, with minority granules up-weighted.
5. Experimental Studies
- The first dataset (data1) is the Bankruptcy Data from the Taiwan Economic Journal for the years 1999–2009, available on Kaggle: https://www.kaggle.com/datasets/fedesoriano/company-bankruptcy-prediction/data. It contains 95 features in addition to the bankruptcy class label, for a total of exactly 6,819 instances.
- The second dataset (data2) is the US Company Bankruptcy Prediction dataset, also sourced from Kaggle: https://www.kaggle.com/datasets/utkarshx27/american-companies-bankruptcy-prediction-dataset. It consists of 78,682 instances and 21 features.
- The third dataset (data3) is the UK Bankruptcy Data, containing 5,000 instances and 70 features.
- A diminished EBIT/Interest ratio, when juxtaposed with an elevated TD/TA, could signify a potential risk to solvency.
- A low Quick Ratio (QA/CL) alongside a high Cash/TA may reveal poor working capital management.
- An excessively high Inv/COGS ratio, even with a strong S/TA, could signal slow inventory turnover.
5.1. Comparison with Other Models
- Fuzzy SVM: A theoretical framework that integrates fuzzy membership values to depict the extent of confidence or reliability associated with each training instance, consequently mitigating the impact of noisy or ambiguous data.
- Rough SVM: Based on rough set theory, this model handles uncertainty by distinguishing between lower and upper approximations of classes, allowing the learning process to focus on certain and uncertain regions of the feature space.
- Shadowed SVM: Extends fuzzy SVM by introducing a shadowed region, which explicitly models the zone of uncertainty between clear membership and non-membership, enhancing robustness in decision boundaries.
- QS SVM: Utilizes quotient space theory to group similar instances into equivalence classes, thereby reducing complexity and capturing hierarchical structures in the data.
- Fuzzy Shadowed SVM: A hybrid model that combines fuzzy logic and shadowed set theory to manage uncertainty more effectively, allowing for refined decision-making under vagueness and imprecision.
- SVM with Different Error Costs: This version of Support Vector Machines (SVM) applies different penalty weights for misclassifying the majority class (0.1) versus the minority class (1.0), aiming to improve balance between the classes.
- SVM-SMOTE: This method pairs SVM with the Synthetic Minority Over-sampling Technique (SMOTE), which creates artificial samples to boost the representation of the minority class.
- SVM-ADASYN: Building on SMOTE, Adaptive Synthetic Sampling (ADASYN) tailors the number of synthetic samples generated based on the local data distribution, focusing more on challenging areas.
- SVM with Undersampling: Here, the majority class size is reduced before training the SVM to help balance the dataset.
- Random Forest: An ensemble of decision trees known for its robustness and strong performance on imbalanced datasets.
- K-Nearest Neighbors (KNN): A simple, proximity-based classifier that can be sensitive to class imbalance, used here as a benchmark.
- Logistic Regression: A widely-used linear classifier serving as a baseline for binary classification tasks.
- Accuracy: Measures the overall proportion of correct predictions. In imbalanced data, however, this metric can be misleading, as it may favor the majority class.
- F1-score: The harmonic mean of Precision and Recall, $F_1 = 2PR/(P + R)$. It is effective when a balance between false positives and false negatives is required.
- AUC-ROC (Area Under the ROC Curve): Evaluates the model's ability to discriminate between classes. Values close to 1 indicate strong discriminative power.
- Precision: The proportion of true positive predictions among all positive predictions. It is crucial in scenarios where false positives are costly.
- Recall (Sensitivity): The proportion of true positive predictions among all actual positives. It is important in cases where missing positive instances (e.g., bankruptcies) must be minimized.
- Specificity: The proportion of true negatives correctly identified. It complements Recall and provides insight into the model's performance on the majority class.
- G-mean: The geometric mean of Recall and Specificity, $\sqrt{\text{Recall} \times \text{Specificity}}$. It reflects the balance between classification accuracy on both classes and is particularly suitable for imbalanced datasets.
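These metrics can be computed directly with scikit-learn; a minimal sketch on illustrative predictions (the arrays below are toy values, not results from this study):

```python
# Sketch: computing the evaluation metrics above with scikit-learn.
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

y_true  = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0])   # toy labels
y_pred  = np.array([0, 0, 0, 0, 0, 1, 1, 0, 1, 0])   # toy predictions
y_score = np.array([.1, .2, .1, .3, .2, .6, .8, .4, .9, .1])  # P(bankrupt)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)
recall = recall_score(y_true, y_pred)
g_mean = np.sqrt(recall * specificity)

print(accuracy_score(y_true, y_pred), f1_score(y_true, y_pred),
      precision_score(y_true, y_pred), recall, specificity,
      roc_auc_score(y_true, y_score), g_mean)
```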
5.1.1. Fuzzy Support Vector Machine (Fuzzy SVM)
| Model | Accuracy | F1-score | AUC-ROC | Precision | Recall | Specificity | G-mean |
|---|---|---|---|---|---|---|---|
| Fuzzy SVM (Centre) | 0.9530 | 0.0408 | 0.7010 | 0.0323 | 0.0556 | 0.9695 | 0.2321 |
| Fuzzy SVM (Sphere) | 0.9550 | 0.0000 | 0.7043 | 0.0000 | 0.0000 | 0.9725 | 0.0000 |
| Fuzzy SVM (Hyperplane) | 0.9550 | 0.0000 | 0.7058 | 0.0000 | 0.0000 | 0.9725 | 0.0000 |
| Fuzzy SVM (knn_density) | 0.9550 | 0.0426 | 0.7025 | 0.0345 | 0.0556 | 0.9715 | 0.2323 |
| Fuzzy SVM (local_entropy) | 0.9470 | 0.0000 | 0.7282 | 0.0000 | 0.0000 | 0.9644 | 0.0000 |
| Fuzzy SVM (intra_class) | 0.9540 | 0.0417 | 0.7133 | 0.0333 | 0.0556 | 0.9705 | 0.2322 |
| Fuzzy SVM (rbf) | 0.9580 | 0.0455 | 0.7531 | 0.0385 | 0.0556 | 0.9745 | 0.2327 |
| Fuzzy SVM (rbf_svm_margin) | 0.9520 | 0.0400 | 0.7037 | 0.0312 | 0.0556 | 0.9684 | 0.2320 |
| Fuzzy SVM (combined) | 0.9620 | 0.1764 | 0.8374 | 0.3154 | 0.1667 | 0.9766 | 0.4034 |
| DEC | 0.9520 | 0.0400 | 0.7021 | 0.0312 | 0.0556 | 0.9684 | 0.2320 |
| SVM-SMOTE | 0.9000 | 0.1525 | 0.8163 | 0.0900 | 0.1001 | 0.9073 | 0.2735 |
| SVM-ADASYN | 0.8280 | 0.1134 | 0.7463 | 0.0625 | 0.0111 | 0.8320 | 0.3130 |
| SVM-Undersampling | 0.7720 | 0.1024 | 0.8280 | 0.0551 | 0.0222 | 0.7729 | 0.3471 |
| Random Forest | 0.9020 | 0.0000 | 0.6211 | 0.0000 | 0.0000 | 0.7000 | 0.0000 |
| KNN | 0.9010 | 0.0000 | 0.6068 | 0.0000 | 0.0000 | 0.7980 | 0.0000 |
| Logistic Regression | 0.9020 | 0.0000 | 0.6626 | 0.0000 | 0.0000 | 0.6001 | 0.0000 |
5.1.2. Rough Support Vector Machine (Rough SVM)
| Model | Accuracy | F1-score | AUC-ROC | Precision | Recall | Specificity | G-mean |
|---|---|---|---|---|---|---|---|
| DEC | 0.9320 | 0.1364 | 0.8374 | 0.1154 | 0.1667 | 0.9766 | 0.4034 |
| SVM-Smote | 0.9000 | 0.1525 | 0.8163 | 0.0900 | 0.1000 | 0.9073 | 0.3735 |
| SVM-ADASYN | 0.8280 | 0.1134 | 0.7462 | 0.0625 | 0.2111 | 0.8320 | 0.3130 |
| SVM-Undersampling | 0.7720 | 0.1024 | 0.8282 | 0.0551 | 0.1222 | 0.7729 | 0.3471 |
| Random Forest | 0.9420 | 0.0000 | 0.7211 | 0.0000 | 0.0000 | 1.0000 | 0.0000 |
| KNN | 0.9410 | 0.0000 | 0.6068 | 0.0000 | 0.0000 | 0.9990 | 0.0000 |
| Logistic Regression | 0.9420 | 0.0000 | 0.6626 | 0.0000 | 0.0000 | 1.0000 | 0.0000 |
| Rough SVM (Membership) | 0.9520 | 0.0400 | 0.7013 | 0.0312 | 0.1556 | 0.9684 | 0.2320 |
| Rough SVM (Boundary) | 0.9520 | 0.0400 | 0.7019 | 0.0312 | 0.1556 | 0.9684 | 0.2320 |
| Rough SVM (Quality) | 0.9520 | 0.0400 | 0.7019 | 0.0312 | 0.1556 | 0.9684 | 0.2320 |
| Rough SVM (KNN Granular) | 0.9520 | 0.0769 | 0.7420 | 0.0588 | 0.1111 | 0.9674 | 0.3279 |
| Rough SVM (Red. Importance) | 0.9510 | 0.0000 | 0.7242 | 0.0000 | 0.1000 | 0.9684 | 0.0000 |
| Rough SVM (Cluster) | 0.9520 | 0.0400 | 0.7079 | 0.0312 | 0.1556 | 0.9684 | 0.2320 |
| Rough SVM (Discernibility) | 0.9550 | 0.0426 | 0.6672 | 0.0345 | 0.1556 | 0.9715 | 0.2323 |
| Rough SVM (Combined) | 0.9610 | 0.1739 | 0.7367 | 0.4000 | 0.3111 | 0.9969 | 0.4328 |
- Accuracy: 0.9610
- F1-score: 0.1739
- Precision: 0.4000
- Recall: 0.3111
- AUC-ROC: 0.7367
- Specificity: 0.9969
- G-mean: 0.4328
| Methodology | Imbalance Handling | G-mean |
|---|---|---|
| Classical Models (RF, KNN, LR) | None | 0.0000 |
| SVM + Resampling (SMOTE, ADASYN) | Data Resampling | 0.31–0.37 |
| Rough SVM (Simple) | Rough Granules | ≈ 0.2320 |
| Rough SVM (KNN Granular / Discern.) | Local Granularization | ≈ 0.33 |
| Rough SVM (Combined) | Hybrid Rough Model | 0.4328 |
5.1.3. Shadowed Support Vector Machine (Shadowed SVM)
| Model | Accuracy | F1-score | AUC-ROC | Precision | Recall | Specificity | G-mean |
|---|---|---|---|---|---|---|---|
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8710 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8710 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8710 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8709 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8711 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8709 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8688 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8688 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Centre (, ) | 0.9699 | 0.2807 | 0.8687 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Sphere | 0.9699 | 0.2807 | 0.8695 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Hyperplane | 0.9545 | 0.2619 | 0.8404 | 0.2750 | 0.2500 | 0.9780 | 0.4945 |
| Shadowed SVM-KNN-Density | 0.9699 | 0.2807 | 0.8708 | 0.6154 | 0.3818 | 0.9962 | 0.4256 |
| Shadowed SVM-Local-Entropy | 0.9699 | 0.3051 | 0.8648 | 0.6000 | 0.2045 | 0.9955 | 0.4512 |
| Shadowed SVM-Intra-Class | 0.9699 | 0.2807 | 0.8728 | 0.6154 | 0.2818 | 0.9962 | 0.4256 |
| Shadowed SVM-RBF | 0.9699 | 0.2807 | 0.8695 | 0.6154 | 0.2818 | 0.9962 | 0.4256 |
| Shadowed SVM-RBF-SVM-Margin | 0.9699 | 0.3051 | 0.8657 | 0.6000 | 0.2045 | 0.9955 | 0.4512 |
| Shadowed SVM-Combined | 0.9699 | 0.3051 | 0.8659 | 0.6000 | 0.4045 | 0.9955 | 0.4512 |
| DEC | 0.9377 | 0.3609 | 0.9185 | 0.2697 | 0.2455 | 0.9508 | 0.3201 |
| SVM-Smote | 0.9304 | 0.3537 | 0.9119 | 0.2524 | 0.2909 | 0.9417 | 0.3459 |
| SVM-ADASYN | 0.8915 | 0.2745 | 0.9078 | 0.1750 | 0.2364 | 0.9000 | 0.3568 |
| SVM-Undersampling | 0.8409 | 0.2644 | 0.9213 | 0.1554 | 0.2864 | 0.8394 | 0.3626 |
| Random Forest | 0.9692 | 0.2759 | 0.9368 | 0.5714 | 0.1818 | 0.8955 | 0.2254 |
| KNN | 0.9507 | 0.2857 | 0.7424 | 0.6667 | 0.1818 | 0.8970 | 0.2258 |
| Logistic Regression | 0.9633 | 0.2188 | 0.8733 | 0.3500 | 0.1591 | 0.8902 | 0.2969 |
5.1.4. Quotient Space Support Vector Machine (Quotient Space SVM)
| Model | Accuracy | F1-score | AUC-ROC | Precision | Recall | Specificity | G-mean |
|---|---|---|---|---|---|---|---|
| QuotientSpaceSVM (k-means) | 0.9000 | 0.1071 | 0.7278 | 0.0583 | 0.6667 | 0.8024 | 0.7314 |
| QuotientSpaceSVM (DBSCAN) | 0.8830 | 0.0996 | 0.7829 | 0.0538 | 0.6667 | 0.7851 | 0.7235 |
| HierarchicalQuotientSpaceSVM | 0.8930 | 0.0881 | 0.7338 | 0.0478 | 0.5556 | 0.7974 | 0.6656 |
| AdaptiveMetricQuotientSpaceSVM | 0.7970 | 0.0978 | 0.6042 | 0.0531 | 0.6111 | 0.8004 | 0.6994 |
| DEC | 0.9320 | 0.1364 | 0.8374 | 0.1154 | 0.1667 | 0.9766 | 0.4034 |
| SVM-Smote | 0.9000 | 0.1525 | 0.8163 | 0.0900 | 0.1000 | 0.9073 | 0.3735 |
| SVM-ADASYN | 0.8280 | 0.1134 | 0.7462 | 0.0625 | 0.2111 | 0.8320 | 0.3130 |
| SVM-Undersampling | 0.7720 | 0.1024 | 0.8282 | 0.0551 | 0.1222 | 0.7729 | 0.3471 |
| Random Forest | 0.9420 | 0.0000 | 0.7211 | 0.0000 | 0.0000 | 1.0000 | 0.0000 |
| KNN | 0.9410 | 0.0000 | 0.6068 | 0.0000 | 0.0000 | 0.9990 | 0.0000 |
Quotient Space SVM Models
Traditional SVM-Based Models
Other Methods
Summary
5.1.5. Fuzzy Shadowed Support Vector Machine (Fuzzy Shadowed SVM)
| Model | Accuracy | F1-score | AUC-ROC | Precision | Recall | Specificity | G-mean |
|---|---|---|---|---|---|---|---|
| Fuzzy Shadowed-center | 0.9142 | 0.3314 | 0.9028 | 0.2214 | 0.6591 | 0.9227 | 0.7798 |
| Fuzzy Shadowed-sphere | 0.9282 | 0.3553 | 0.9168 | 0.2500 | 0.6136 | 0.9386 | 0.7589 |
| Fuzzy Shadowed-hyperplane | 0.9289 | 0.3660 | 0.9187 | 0.2569 | 0.6364 | 0.9386 | 0.7729 |
| Fuzzy Shadowed-combined | 0.9699 | 0.3051 | 0.8663 | 0.6000 | 0.2045 | 0.9955 | 0.8290 |
| DEC | 0.9377 | 0.3609 | 0.9185 | 0.2697 | 0.2455 | 0.9508 | 0.3201 |
| SVM-SMOTE | 0.9304 | 0.3537 | 0.9119 | 0.2524 | 0.2909 | 0.9417 | 0.3459 |
| SVM-ADASYN | 0.8915 | 0.2745 | 0.9078 | 0.1750 | 0.2364 | 0.9000 | 0.3568 |
| SVM-Undersampling | 0.8409 | 0.2644 | 0.9213 | 0.1554 | 0.2864 | 0.8394 | 0.3626 |
| Random Forest | 0.9692 | 0.2759 | 0.9368 | 0.5714 | 0.1818 | 0.8955 | 0.2254 |
| KNN | 0.9507 | 0.2857 | 0.7424 | 0.6667 | 0.1818 | 0.8970 | 0.2258 |
| Logistic Regression | 0.9633 | 0.2188 | 0.8733 | 0.3500 | 0.1591 | 0.8902 | 0.2969 |
6. Conclusion
- Mitigate classification bias in favor of the majority class;
- Preserve sensitivity to the minority class (failing firms);
- Model uncertainty in transitional regions of the feature space;
- Enhance financial interpretability through semantically meaningful granules.
- Large-scale validation across various sectors (banking, insurance, SMEs), comparing GSVM with classical rebalancing techniques (SMOTE, ADASYN);
- Adaptive learning of granularity, through algorithms dynamically adjusting the degree of fuzziness, approximation, shadowed uncertainty, or quotient abstraction;
- Integration with deep learning, via hybrid architectures combining CNN/transformers with granular SVMs;
- Cognitive visualization of uncertainty, with interfaces highlighting ambiguous, transitional, or high-risk zones;
- Advanced mathematical formalization, linking granular cognition and information theory;
- Dynamic optimization of the shadow threshold $\alpha$ and the shadowed membership weight $\lambda$;
- Learning of the weights for the combined strategy via meta-optimization or reinforcement learning;
- Extension to multi-class imbalance problems and non-binary classification;
- Integration of metaheuristics for dynamic hyperparameter optimization;
- Proposal of additional hybridizations, particularly Fuzzy Rough SVM, Fuzzy Quotient Space SVM, Rough Shadowed SVM, Rough Quotient Space SVM, and Shadowed Quotient Space SVM.
References
- Alaminos, David, Agustín del Castillo, and Manuel Fernandez. 2018. Correction: A global model for bankruptcy prediction. PLOS ONE 13(11).
- Barboza, Flavio, Herbert Kimura, and Edward I. Altman. 2017. Machine learning models and bankruptcy prediction. Expert Systems with Applications 83(83), 405–417.
- Borowska, Katarzyna and Jaroslaw Stepaniuk. 2022. Rough-granular approach in imbalanced bankruptcy data analysis. Procedia Computer Science 207, 1832–1841.
- Brenes, Raffael Farch, Arne Johannssen, and Nataliya Chukhrova. 2022. An intelligent bankruptcy prediction model using a multilayer perceptron. Intelligent Systems with Applications 16, 200136–200136.
- Chen, Linlin and Qingjiu Chen. 2020. A novel classification algorithm based on kernelized fuzzy rough sets. International Journal of Machine Learning and Cybernetics 11(11), 2565–2572.
- Chen, Yi and Jifeng Guo. 2023. Lifol: An efficient framework for financial distress prediction in high-dimensional unbalanced scenario. IEEE Transactions on Computational Social Systems, 1–12.
- Chen, Yi, Jifeng Guo, Junqin Huang, and Bin Lin. 2022. A novel method for financial distress prediction based on sparse neural networks. International Journal of Machine Learning and Cybernetics 13(7), 2089–2103.
- Chen, Zhensong, Wei Chen, and Yong Shi. 2020. Ensemble learning with label proportions for bankruptcy prediction. Expert Systems with Applications 146.
- Cho, Soo Hyun and Kyung Shik Shin. 2022. Feature-weighted counterfactual-based explanation for bankruptcy prediction. Expert Systems with Applications 216, 119390–119390.
- Dablain, Damien, Bartosz Krawczyk, and Nitesh V. Chawla. 2022. DeepSMOTE: Fusing deep learning and SMOTE for imbalanced data. IEEE Transactions on Neural Networks and Learning Systems 34(9), 6390–6404.
- Figlioli, Bruno and Fabiano Guasti Lima. 2022. A proposed corporate distress and recovery prediction score based on financial and economic components. Expert Systems with Applications 197, 116726–116726.
- Gholampoor, Hadi and M. Asadi. 2024. Risk analysis of bankruptcy in the U.S. healthcare industries based on financial ratios: A machine learning analysis. Journal of Theoretical and Applied Electronic Commerce Research 19(2), 1303–1320.
- Ibrahim, H., S.A. Anwar, and M.I. Ahmad. 2021. Classification of imbalanced data using support vector machine and rough set theory: A review. In Journal of Physics: Conference Series, Volume 1878, pp. 012054.
- Iparraguirre-Villanueva, Orlando and Michael Cabanillas-Carbonell. 2024. Predicting business bankruptcy: A comparative analysis with machine learning models. Journal of Open Innovation 10(3), 100375–100375.
- Jabeur, Sami Ben and Vanessa Serret. 2023. Bankruptcy prediction using fuzzy convolutional neural networks. Research in International Business and Finance 64, 101844.
- Jang, Youjin, Inbae Jeong, and Yong K. Cho. 2020. Business failure prediction of construction contractors using an LSTM RNN with accounting, construction market, and macroeconomic variables. Journal of Management in Engineering 36(2), 04019039.
- Jimenez-Castano, C., A. Alvarez-Meza, and A. Orozco-Gutierrez. 2020. Enhanced automatic twin support vector machine for imbalanced data classification. Pattern Recognition 107, 107442.
- Le, Tuong. 2022. A comprehensive survey of imbalanced learning methods for bankruptcy prediction. IET Communications 16(5), 433–441.
- Li, Junnan, Qingsheng Zhu, Quanwang Wu, Zhiyong Zhang, Yanlu Gong, Ziqing He, and Fan Zhu. 2021. Smote-nan-de: Addressing the noisy and borderline examples problem in imbalanced classification by natural neighbors and differential evolution. Knowledge-Based Systems 223, 107056.
- Li, Yang, Shi Baofeng, and Dong Yizhe. 2022. A credit risk evaluation model for imbalanced data classification based on class balanced loss modified cross entropy function. Journal of Systems & Management 31(2), 255.
- Liashenko, Olena, Tetyana Kravets, and Yevhenii Kostovetskyi. 2023. Machine learning and data balancing methods for bankruptcy prediction. Ekonomika 102(2), 28–46.
- Lohmann, Christian, Steffen Mallenhoff, and Thorsten Ohliger. 2022. Nonlinear relationships in bankruptcy prediction and their effect on the profitability of bankruptcy prediction models. Journal of Business Economics.
- Lombardo, Gianfranco, Andrea Bertogalli, Sergio Consoli, and Diego Reforgiato Recupero. 2024. Natural language processing and deep learning for bankruptcy prediction: An end-to-end architecture. IEEE Access, 1–1.
- Moslemnejad, Somaye and Javad Hamidzadeh. 2021. Weighted support vector machine using fuzzy rough set theory. Soft Computing 25(13), 8461–8481.
- Nguyen, Hoang Hiep, Jean-Laurent Viviani, and Sami Ben Jabeur. 2023. Bankruptcy prediction using machine learning and shapley additive explanations. Review of Quantitative Finance and Accounting, 1–42.
- Park, Min Sue, Hwijae Son, Chongseok Hyun, and Hyung Ju Hwang. 2021. Explainability of machine learning models for bankruptcy prediction. IEEE Access 9, 124887–124899.
- Perboli, Guido and Ehsan Arabnezhad. 2021. A machine learning-based dss for mid and long-term company crisis prediction. Expert Systems with Applications 174.
- Radovanovic, Jelena and Christian Haas. 2023. The evaluation of bankruptcy prediction models based on socio-economic costs. Expert Systems with Applications 227, 120275.
- Shangguan, Xuekui, Keyu Wei, Qifeng Sun, Yaoyu Zhang, and Ruijun Bai. 2023. Research on the standardization strategy of granular computing. International Journal of Cognitive Computing in Engineering.
- Soui, Makram, Salima Smiti, Mohamed Wiem Mkaouer, and Ridha Ejbali. 2020. Bankruptcy prediction using stacked auto-encoders. Applied Artificial Intelligence 34(1), 80–100.
- Stitson, M.O., J.A.E. Weston, A. Gammerman, V. Vovk, and V. Vapnik. 1996. Theory of support vector machines. University of London 117(827), 188–191.
- Sun, Weixin, Xuantao Zhang, Minghao Li, and Yong Wang. 2023. Interpretable high-stakes decision support system for credit default forecasting. Technological Forecasting and Social Change.
- Tharwat, Alaa and Thomas Gabel. 2020. Parameters optimization of support vector machines for imbalanced data using social ski driver algorithm. Neural Computing and Applications 32(11), 6925–6938.
- Velmurugan, Mythreyi, Chun Ouyang, Catarina Moreira, and Renuka Sindhgatta. 2021. Evaluating stability of post-hoc explanations for business process predictions. In International Conference on Service-Oriented Computing, pp. 49–64.
- Wang, S. and Guotai Chi. 2024. Cost-sensitive stacking ensemble learning for company financial distress prediction. Expert Systems with Applications 255, 124525–124525.
- Wang, Zhao, Cuiqing Jiang, and Huimin Zhao. 2023. Depicting risk profile over time: A novel multiperiod loan default prediction approach. Management Information Systems Quarterly.
- Xia, Shuyin, Xiaoyu Lian, Guoyin Wang, Xinbo Gao, Jiancu Chen, and Xiaoli Peng. 2024a. Gbsvm: An efficient and robust support vector machine framework via granular-ball computing. IEEE Transactions on Neural Networks and Learning Systems, 1–15.
- Xia, Shuyin, Xiaoyu Lian, Guoyin Wang, Xinbo Gao, Jiancu Chen, and Xiaoli Peng. 2024b. Gbsvm: An efficient and robust support vector machine framework via granular-ball computing. IEEE Transactions on Neural Networks and Learning Systems.
- Xue, Zhenxia, Roxin Zhang, Chuandong Qin, and Xiaoqing Zeng. 2020. An adaptive twin support vector regression machine based on rough and fuzzy set theories. Neural Computing and Applications 32(9), 4709–4732.
- Zhang, Xinsheng, Yulong Ma, and Minghu Wang. 2024. An attention-based logistic-cnn-bilstm hybrid neural network for credit risk prediction of listed real estate enterprises. Expert Systems 41(2), e13299.


Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).