Submitted:
08 October 2025
Posted:
09 October 2025
Abstract
Keywords:
MSC: 05C69, 68Q25, 90C27
1. Introduction
- Greedy Algorithms: A simple greedy algorithm selects vertices in order of increasing degree, adding a vertex to the independent set if it has no neighbors in the current set. This achieves an approximation ratio of $O(\Delta)$, where $\Delta$ is the maximum degree. For graphs with high degrees ($\Delta = \Theta(n)$), this yields a poor ratio of $O(n)$. A more sophisticated greedy approach, which repeatedly selects a minimum-degree vertex and deletes it together with its neighbors, achieves an approximation ratio of $(\Delta + 2)/3$, as shown by Halldórsson and Radhakrishnan [2].
- Local Search and Randomized Algorithms: Local search techniques, such as those by Boppana and Halldórsson [3], improve the approximation ratio to $O(n/(\log n)^2)$ by iteratively swapping small subsets of vertices to increase the independent set size. Randomized algorithms, like those based on random vertex selection or the Lovász Local Lemma, can achieve similar ratios with probabilistic guarantees.
- Semidefinite Programming (SDP): Advanced techniques using SDP, such as those by Karger, Motwani, and Sudan [4], achieve approximation ratios of for general graphs. For specific graph classes, such as 3-colorable graphs, better ratios (e.g., ) are possible.
- Hardness of Approximation: The MIS problem is notoriously difficult to approximate. Håstad [5] and others have shown that, assuming $\mathrm{P} \neq \mathrm{NP}$, no polynomial-time algorithm can achieve an approximation ratio better than $n^{1-\epsilon}$ for any $\epsilon > 0$. This inapproximability result underscores the challenge of finding near-optimal solutions.
- Special Graph Classes: For specific graph classes, better approximations exist. For bipartite graphs, the maximum independent set can be computed exactly in polynomial time using maximum matching algorithms (via König’s theorem). For graphs with bounded degree or specific structures (e.g., planar graphs), constant-factor approximations are achievable.
- Iterative Refinement: Initializes a candidate set with all non-isolated vertices and iteratively refines it by constructing a maximum spanning tree of the induced subgraph, computing its maximum independent set (since trees are bipartite) using a matching-based approach, and updating the candidate set until it is independent in G. A greedy extension adds vertices to ensure maximality, producing the iterative-refinement candidate.
- Greedy Minimum-Degree Selection: Sorts vertices by increasing degree and builds an independent set by adding each vertex if it has no neighbors in the current set, producing the min-degree candidate.
- Greedy Maximum-Degree Selection: Sorts vertices by decreasing degree and builds an independent set by adding each vertex if it has no neighbors in the current set, producing the max-degree candidate.
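The two greedy selections can be sketched in a few lines. This is a minimal illustration, not the Furones implementation: the function name `greedy_by_degree` and the adjacency-dictionary input (vertex mapped to its set of neighbors) are assumptions made for this example.

```python
def greedy_by_degree(adj, reverse=False):
    """Scan vertices in a fixed degree order and keep those with no kept neighbor.

    adj: dict mapping each vertex to the set of its neighbors (assumed format).
    reverse=False scans by increasing degree, reverse=True by decreasing degree.
    """
    # Degrees are read once from the input graph; the order is not updated later.
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=reverse)
    chosen = set()
    for v in order:
        # v is added only if none of its neighbors is already in the set.
        if chosen.isdisjoint(adj[v]):
            chosen.add(v)
    return chosen


# Star graph: center 0 joined to leaves 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
min_first = greedy_by_degree(star)                # picks the four leaves
max_first = greedy_by_degree(star, reverse=True)  # picks only the center
```

On the star graph the two orders give sets of very different size, which is one reason a solver might compute both and keep the larger.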
2. Research Data
3. Correctness of the Maximum Independent Set Algorithm
Algorithm Description
- If $V = \emptyset$, the algorithm returns $\emptyset$. The empty set is an independent set, as it contains no vertices to be adjacent.
- If $E = \emptyset$, the algorithm returns $V$. Since there are no edges, no pair of vertices is adjacent, so V is an independent set.
- For each connected component with vertex set $V_i$, it uses the Hopcroft-Karp algorithm to find a maximum matching, then computes a minimum vertex cover C using König’s theorem.
- The independent set is $V_i \setminus C$, the complement of the vertex cover in the component.
- By König’s theorem, in a bipartite graph, the complement of a minimum vertex cover is a maximum independent set. If any two vertices in $V_i \setminus C$ were adjacent, they would form an edge not covered by C, contradicting the vertex cover property.
- The union of these sets across components is independent in G, as components are disconnected.
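The bipartite step can be illustrated with a small self-contained sketch. It is an assumption-laden example rather than the paper's code: it swaps in the simpler augmenting-path (Kuhn) matching instead of Hopcroft-Karp, and the function name and input format (one side of the bipartition plus an adjacency dictionary) are hypothetical. The König construction itself is the standard alternating-path one.

```python
def max_independent_set_bipartite(left, adj):
    """Maximum independent set of a bipartite graph via matching and König's theorem.

    left: the vertices of one side of the bipartition (assumed to be given).
    adj:  dict mapping every vertex (both sides) to the set of its neighbors.
    """
    match = {}  # partner of each matched vertex, stored in both directions

    def augment(u, seen):
        # Search for an augmenting path starting at the left vertex u.
        for w in adj[u]:
            if w in seen:
                continue
            seen.add(w)
            if w not in match or augment(match[w], seen):
                match[u], match[w] = w, u
                return True
        return False

    for u in left:
        augment(u, set())

    # König: alternating search from the unmatched left vertices.
    left = set(left)
    reach_left = {u for u in left if u not in match}
    reach_right = set()
    stack = list(reach_left)
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in reach_right and match.get(u) != w:  # non-matching edge
                reach_right.add(w)
                partner = match.get(w)                      # matching edge back
                if partner is not None and partner not in reach_left:
                    reach_left.add(partner)
                    stack.append(partner)

    cover = (left - reach_left) | reach_right  # minimum vertex cover
    return set(adj) - cover                    # its complement


star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
leaves = max_independent_set_bipartite([0], star)
```

The recursive search is adequate for small examples; a production version would use an iterative or Hopcroft-Karp implementation to avoid deep recursion.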
- (1) Iterative Refinement:
  - Start with $S = V$, where V is the set of non-isolated vertices after preprocessing.
  - While $S$ is not independent in G, replace $S$ with the maximum independent set of a maximum spanning tree of the induced subgraph $G[S]$, computed using iset_bipartite.
  - Stop when $S$ is independent in G, verified by is_independent_set.
  - Greedily extend $S$: for each $u \in V \setminus S$, if $S \cup \{u\}$ is independent, add u. Output the result as the iterative-refinement candidate.
- (2) Greedy Minimum-Degree Selection: Sort vertices by increasing degree and add each vertex v to the min-degree candidate if it has no neighbors in the current set.
- (3) Greedy Maximum-Degree Selection: Sort vertices by decreasing degree and add each vertex v to the max-degree candidate if it has no neighbors in the current set.
- (4) Output: Return the iterative-refinement candidate if it is the largest; otherwise return the larger of the two greedy candidates.
- is_independent_set returns True if and only if no edge $(u, v) \in E$ has both $u, v \in S$, taking $O(m)$ time by checking all edges.
- Thus, the set $S$ returned by the loop is an independent set in G.
- Greedy Extension: For each vertex $u \in V \setminus S$, if $S \cup \{u\}$ is independent (verified by is_independent_set), add u. This ensures no edge exists between u and any vertex in $S$, or between any pair in the updated set. Each addition preserves independence, so the extended set is independent.
- Vertices are sorted by degree, and each vertex v is added to the greedy set if none of its neighbors are in the current set.
- At each step, the check ensures that adding v introduces no edges, as $(u, v) \notin E$ for all $u$ in the current set.
- The resulting set is independent, as each addition preserves the property that no two vertices in the set are adjacent.
- Vertices are sorted by decreasing degree, and each v is added if no neighbors are in the current set.
- Each addition preserves independence, so the resulting set is independent.
- All three are independent in the non-isolated subgraph.
- Isolated vertices have degree 0, so adding them introduces no edges.
- Thus, the final S remains independent.
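The independence check and the greedy extension used in this argument can be sketched as follows. The name is_independent_set comes from the text; its signature here (an edge list plus a candidate set) and the helper `greedy_extend` are assumptions for illustration only.

```python
def is_independent_set(edges, candidate):
    """Return True iff no edge has both endpoints in `candidate` (one pass over edges)."""
    return not any(u in candidate and v in candidate for u, v in edges)


def greedy_extend(vertices, edges, candidate):
    """Add each vertex whose inclusion keeps the set independent."""
    result = set(candidate)
    for u in vertices:
        # Each accepted vertex is checked against the whole current set,
        # so independence is an invariant of the loop.
        if u not in result and is_independent_set(edges, result | {u}):
            result.add(u)
    return result


# Path 0 - 1 - 2 - 3
path_edges = [(0, 1), (1, 2), (2, 3)]
extended = greedy_extend(range(4), path_edges, {1})
```

Because every addition is re-verified against all edges, the extension costs one edge scan per vertex, which matches the per-vertex cost quoted in the runtime analysis.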
- (1) Preprocessing: Remove self-loops and isolated nodes from G, and record the set of isolated nodes. If the graph is empty or edgeless, return the set of isolated nodes.
- (2) Iterative Refinement:
  - (a) Start with $S = V$, where V is the set of non-isolated vertices.
  - (b) While $S$ is not an independent set in G:
    - Construct a maximum spanning tree $T$ of the subgraph $G[S]$.
    - Compute the maximum independent set of $T$ (a tree, thus bipartite) using a matching-based approach, and set $S$ to this set.
  - (c) Stop when $S$ is independent in G.
  - (d) Greedily extend $S$: for each $u \in V \setminus S$, if $S \cup \{u\}$ is independent, add u. Let the result be the iterative-refinement candidate.
- (3) Greedy Selections: Compute the min-degree candidate by sorting vertices by increasing degree and adding each vertex v if it has no neighbors in the current set. Compute the max-degree candidate by sorting vertices by decreasing degree and adding each vertex v if it has no neighbors in the current set.
- (4) Output: Return the iterative-refinement candidate if it is the largest; otherwise return the larger of the two greedy candidates.
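The refinement loop in step (2) can be sketched with the standard library alone. Several simplifications are assumed and differ from the description above: the graph is unweighted, so a BFS spanning forest stands in for the maximum spanning tree; the tree's maximum independent set is found by a leaves-first greedy (exact on forests) rather than by matching; and isolated vertices are picked up by the final extension instead of being handled in preprocessing. All function names are hypothetical.

```python
from collections import deque


def spanning_forest(adj, vertices):
    """BFS spanning forest of the subgraph induced by `vertices`.

    Returns the BFS visiting order and the tree parent of every vertex.
    """
    order, parent = [], {}
    for root in vertices:
        if root in parent:
            continue
        parent[root] = None
        queue = deque([root])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w in vertices and w not in parent:
                    parent[w] = u
                    queue.append(w)
    return order, parent


def forest_max_independent_set(order, parent):
    """Leaves-first greedy: keep a vertex unless one of its children was kept."""
    kept, blocked = set(), set()
    for v in reversed(order):  # reversed BFS order visits children before parents
        if v not in blocked:
            kept.add(v)
            if parent[v] is not None:
                blocked.add(parent[v])
    return kept


def iterative_refinement(adj):
    """adj: dict mapping each vertex to the set of its neighbors."""
    current = {v for v in adj if adj[v]}  # start from the non-isolated vertices
    # Repeat while some edge of G still has both endpoints in the current set.
    while any(w in current for v in current for w in adj[v]):
        order, parent = spanning_forest(adj, current)
        current = forest_max_independent_set(order, parent)
    # Greedy extension to a maximal independent set.
    for u in adj:
        if u not in current and current.isdisjoint(adj[u]):
            current.add(u)
    return current


cycle5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
result = iterative_refinement(cycle5)
```

The loop terminates because any remaining edge forces the forest to contain an edge, so each pass drops at least one vertex from the current set.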
- A clique C of size .
- An independent set I of size , assuming is an integer.
- All edges between C and I.
- Start with , .
- In each iteration, the maximum spanning tree favors dense connections in C, forming a star-like structure centered in C, reducing the set size by approximately half but converging slowly to select a single vertex from C, yielding , .
- Greedy extension adds no further vertices due to full connectivity to I.
- Iterative Approach: May reduce to , , giving a ratio of , e.g., for , which exceeds .
- Min-Greedy Approach: Selects vertices in minimum-degree order. Non-universal vertices in each clique have degree , while u has degree . The algorithm picks one vertex per clique, yielding of size m, so .
- Max-Greedy Approach: Starts with high-degree u, then skips cliques, but may recover one per clique in subsequent steps, yielding size m.
- The algorithm outputs the largest S (size m), with .
- (assuming G has non-isolated vertices).
- If , then , as .
- If , the ratio is often better. In bipartite graphs (), the iterative approach finds an optimal set, giving a ratio of 1. In cycle graphs (), either approach yields , with a ratio near 1. In the counterexample, the greedy approaches ensure , giving a ratio of (e.g., 2 for ). Since , the ratio is at most , and the worst case occurs when , , yielding .
- Graph Copy: Copying the graph takes $O(n + m)$, duplicating vertices and edges in the adjacency list.
- Self-Loop Removal: Identifying and removing self-loops via nx.selfloop_edges takes $O(m)$, checking each edge.
- Isolated Nodes: Identifying isolates (degree 0) takes $O(n)$ by checking each vertex’s degree. Removing them takes $O(n)$.
- Empty Graph Check: Checking if the graph has no nodes or edges takes $O(1)$. Returning the isolates set takes $O(n)$.
- Connected Components: Finding components via BFS or DFS takes $O(n + m)$.
- Per Component: For a component with $n_i$ vertices and $m_i$ edges ($\sum_i n_i = n$, $\sum_i m_i = m$):
  - Subgraph Extraction: Takes $O(n_i + m_i)$.
  - Hopcroft-Karp Matching: Computing a maximum matching takes $O(m_i \sqrt{n_i})$.
  - Vertex Cover: Converting the matching to a minimum vertex cover takes $O(n_i + m_i)$.
  - Set Operations: Computing the complement of the vertex cover and updating the independent set takes $O(n_i)$.
  Total per component: $O(n_i + m_i \sqrt{n_i})$.
- Across Components: Summing, $\sum_i n_i = n$ and $\sum_i m_i \sqrt{n_i} \le m \sqrt{n}$, since $n_i \le n$. Thus, total time is $O(n + m \sqrt{n})$.
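The summation over components can be written out explicitly. The symbols $n_i$ and $m_i$ (vertices and edges of component $i$) and $k$ (number of components) are notation assumed here for illustration; the bound relies only on $n_i \le n$ and on the standard $O(m\sqrt{n})$ running time of Hopcroft-Karp.

```latex
\sum_{i=1}^{k} O\!\left(n_i + m_i + m_i\sqrt{n_i}\right)
\;\le\; O\!\left(\sum_{i=1}^{k} n_i \;+\; \sum_{i=1}^{k} m_i \;+\; \sqrt{n}\,\sum_{i=1}^{k} m_i\right)
\;=\; O\!\left(n + m + m\sqrt{n}\right)
\;=\; O\!\left(n + m\sqrt{n}\right).
```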
- is_independent_set: Checks all edges in $G$ in $O(m)$ time, returning False if any edge has both endpoints in the set.
- Maximum Spanning Tree: Using Kruskal’s algorithm on $G[S]$ with up to n vertices and m edges takes $O(m \log m)$, dominated by edge sorting.
- iset_bipartite on Tree: The spanning tree has at most $n - 1$ edges. Computing its maximum independent set takes $O(n)$, as:
  - Components: $O(n)$ (tree is connected or trivial).
  - BFS-based coloring for bipartite tree: $O(n)$, simpler than Hopcroft-Karp.
  - Vertex cover and set operations: $O(n)$.
- Number of Iterations: In the worst case, the set shrinks by only one vertex per iteration (e.g., a star tree removes one vertex). Starting from $|S| = n$, the loop runs at most $n$ times.
- Total per Iteration: $O(m \log m)$.
- Total Loop: $O(n m \log m)$.
- For each of n vertices, check if $S \cup \{u\}$ is independent using is_independent_set, taking $O(m)$.
- Set operations (union, addition) take $O(1)$ amortized per vertex.
- Total: $O(nm)$.
- Sorting Vertices: Sorting n vertices by degree takes $O(n \log n)$ for each.
- Selection: For each of n vertices, check neighbors (up to m edges total) to ensure independence, taking $O(n + m)$ across all vertices for each.
- Set Operations: Adding vertices to the set takes $O(1)$ amortized, so $O(n)$ total per greedy.
- Total per greedy: $O(n \log n + m)$.
- For the two greedy passes: $O(n \log n + m)$.
- Preprocessing: $O(n + m)$.
- Bipartite check: $O(n + m)$.
- Bipartite case: $O(n + m \sqrt{n})$.
- Non-bipartite case:
  - Iterative refinement: $O(n m \log m)$.
  - Greedy selections: $O(n \log n + m)$.
  - Final selection and isolates: $O(n)$.
- Structural diversity: Covering random graphs (C-series), geometric graphs (MANN), and complex topologies (Keller, brock).
- Known optima: Enabling precise approximation ratio calculations.
- Hardware: 11th Gen Intel® Core™ i7-1165G7 (2.80 GHz), 32GB DDR4 RAM.
- Software: Windows 10 Home, Furones: Approximate Independent Set Solver v0.0.6 [6].
- Methodology:
  - A single run per instance.
  - Solution verification against published clique numbers.
  - Runtime measurement from graph loading to solution output.
- Optimal solutions (where known) via complement graph transformation.
- Theoretical $\sqrt{n}$ approximation bound, where n is the number of vertices of the graph instance.
- Instance-specific hardness parameters (density, regularity).
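The complement graph transformation mentioned above rests on a simple equivalence: a set of vertices is a clique in a graph exactly when it is an independent set in the complement. A minimal sketch, with hypothetical helper names and an adjacency-dictionary format assumed for illustration:

```python
from itertools import combinations


def complement(adj):
    """Complement of a simple graph given as vertex -> set of neighbors."""
    vertices = set(adj)
    return {v: vertices - adj[v] - {v} for v in adj}


def is_clique(adj, nodes):
    """True iff every pair of distinct vertices in `nodes` is adjacent."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


def is_independent(adj, nodes):
    """True iff no pair of distinct vertices in `nodes` is adjacent."""
    return all(v not in adj[u] for u, v in combinations(nodes, 2))


# Triangle 0-1-2 with a pendant vertex 3 attached to 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
g_bar = complement(g)
```

This is why published clique numbers for the DIMACS instances can serve as optimal independent set sizes once each instance is complemented.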
- (1) Runtime (milliseconds): The total computation time required to find a maximal independent set, measured in milliseconds. This metric reflects the algorithm’s efficiency across graphs of varying sizes and densities, as shown in Table 2.
- (2) Approximation Quality: We quantify solution quality through two complementary measures:
  - Approximation Ratio: For instances with known optima, we compute $\rho = |\mathrm{OPT}| / |S|$, where:
    - $|\mathrm{OPT}|$: The optimal independent set size (equivalent to the maximum clique in the complement graph).
    - $|S|$: The solution size found by our algorithm.
    A ratio $\rho = 1$ indicates optimality, while higher values suggest room for improvement. Our results show ratios ranging from 1.000 (optimal) to 1.833 (brock400_4) across DIMACS benchmarks.
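The two reported quantities can be recomputed from the results table. In this sketch the function names are hypothetical, and the vertex counts (200, 256, 171) are inferred from the bound column of the table, which matches the square root of the number of vertices for these instances.

```python
import math


def approximation_ratio(optimal_size, found_size):
    """Ratio of the optimal size to the size found; 1.0 means optimal."""
    return optimal_size / found_size


def sqrt_bound(num_vertices):
    """Square-root bound in the number of vertices, as listed per instance."""
    return math.sqrt(num_vertices)


# (instance, vertices, found size, optimal size), taken from the results table
rows = [
    ("brock200_2", 200, 7, 12),
    ("hamming8-4", 256, 16, 16),
    ("keller4", 171, 8, 11),
]

report = {
    name: (round(approximation_ratio(opt, found), 3), round(sqrt_bound(n), 3))
    for name, n, found, opt in rows
}
```

For these rows the computed ratios stay well below the corresponding square-root bounds, consistent with the table.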
- Runtime Performance: The algorithm demonstrates varying computational efficiency across graph classes:
  - Sub-second performance on small dense graphs (e.g., C125.9 in 30.57 ms, keller4 in 298.87 ms).
  - Minute-scale computations for mid-sized challenging instances (e.g., keller6 in 404,500 ms, p_hat1500-1 in 553,791 ms).
  - Hour-long runs for the largest instances (e.g., C4000.5 in 3,731,507 ms).
  Runtime correlates strongly with both graph size ($n$) and approximation difficulty: instances requiring higher approximation ratios (e.g., Keller graphs with $\rho$ up to 1.553) consistently demand more computation time than similarly-sized graphs with better ratios.
- Solution Quality: The approximation ratio reveals three distinct performance regimes:
  - Optimal solutions ($\rho = 1$) for structured graphs:
    - Hamming graphs (hamming8-4, hamming10-4).
    - MANN graphs (near-optimal with $\rho \le 1.009$).
  - Good approximations ($\rho$ between 1.127 and 1.257 on the instances listed) for:
    - Random graphs (C125.9, C250.9).
    - Sparse instances (p_hat300-3, p_hat700-3).
  - Challenging cases ($\rho > 1.4$ on the instances listed) requiring improvement:
    - Brockington graphs (brock200_2, $\rho = 1.714$).
    - Keller graphs (keller5, $\rho = 1.421$; keller6, $\rho = 1.553$).
- Quality-Efficiency Tradeoff: The algorithm achieves optimal solutions ($\rho = 1$) on Hamming instances and near-optimal ones ($\rho \le 1.009$) on MANN instances while maintaining reasonable runtimes (e.g., hamming8-4 in 658 ms, MANN_a27 in 219 ms). However, the computational cost grows significantly for difficult instances like keller6 (404,500 ms) and C4000.5 (3,731,507 ms), suggesting a clear quality-runtime tradeoff.
- Structural Dependencies: Performance strongly correlates with graph topology:
  - Excellent on regular structures (Hamming, MANN).
  - Competitive on random graphs (C-series with $\rho \le 1.5$).
  - Challenging for irregular dense graphs (Keller, brock with $\rho$ up to 1.833).
- Practical Applications: The demonstrated performance makes this approach particularly suitable for:
  - Circuit design applications (benefiting from optimal Hamming solutions).
  - Scheduling problems (leveraging near-optimal MANN performance).
  - Network analysis where an approximate solution is acceptable.
- Hybrid Approaches: Combining our algorithm with fast heuristics for initial solutions on difficult instances (e.g., brock and Keller graphs) to reduce computation time while maintaining quality guarantees.
- Parallelization: Developing GPU-accelerated versions targeting the most time-consuming components, particularly for large sparse graphs like p_hat1500 series and C4000.5.
- Domain-Specific Optimizations: Creating specialized versions for:
  - Perfect graphs (extending our success with Hamming codes).
  - Geometric graphs (improving on current ratios).
- Extended Benchmarks: Evaluation on additional graph classes:
  - Real-world networks (social, biological).
  - Massive sparse graphs from web analysis.
  - Dynamic graph scenarios.


References
- Karp, R.M. Reducibility among Combinatorial Problems. In Complexity of Computer Computations; Miller, R.E., Thatcher, J.W., Bohlinger, J.D., Eds.; Plenum: New York, USA, 1972; pp. 85–103.
- Halldórsson, M.M.; Radhakrishnan, J. Greed is good: Approximating independent sets in sparse and bounded-degree graphs. Algorithmica 1997, 18, 145–163.
- Boppana, R.; Halldórsson, M.M. Approximating maximum independent sets by excluding subgraphs. BIT Numerical Mathematics 1992, 32, 180–196.
- Karger, D.R.; Motwani, R.; Sudan, M. Approximate graph coloring by semidefinite programming. Journal of the ACM 1998, 45, 246–265.
- Håstad, J. Clique is hard to approximate within $n^{1-\epsilon}$. Acta Mathematica 1999, 182, 105–142.
- Vega, F. Furones: Approximate Independent Set Solver. https://pypi.org/project/furones. Accessed October 8, 2025.
- Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, October 11-13, 1993; Johnson, D.S., Trick, M.A., Eds.; Vol. 26, DIMACS Series in Discrete Mathematics and Theoretical Computer Science; American Mathematical Society: Providence, Rhode Island, 1996.
- Pullan, W.; Hoos, H.H. Dynamic Local Search for the Maximum Clique Problem. Journal of Artificial Intelligence Research 2006, 25, 159–185.
- Batsyn, M.; Goldengorin, B.; Maslov, E.; Pardalos, P.M. Improvements to MCS algorithm for the maximum clique problem. Journal of Combinatorial Optimization 2014, 27, 397–416.
- Fortnow, L. Fifty years of P vs. NP and the possibility of the impossible. Communications of the ACM 2022, 65, 76–85.

| Nr. | Code metadata description | Metadata |
|---|---|---|
| C1 | Current code version | v0.0.6 |
| C2 | Permanent link to code/repository used for this code version | https://github.com/frankvegadelgado/furones |
| C3 | Permanent link to Reproducible Capsule | https://pypi.org/project/furones/ |
| C4 | Legal Code License | MIT License |
| C5 | Code versioning system used | git |
| C6 | Software code languages, tools, and services used | Python |
| C7 | Compilation requirements, operating environments & dependencies | Python ≥ 3.12 |

| Instance | Found Size | Optimal Size | Time (ms) | $\sqrt{n}$ Bound | Approx. Ratio |
|---|---|---|---|---|---|
| brock200_2 | 7 | 12 | 481.34 | 14.142 | 1.714 |
| brock200_4 | 13 | 17 | 409.48 | 14.142 | 1.308 |
| brock400_2 | 18 | 29 | 1744.26 | 20.000 | 1.611 |
| brock400_4 | 18 | 33 | 1679.18 | 20.000 | 1.833 |
| brock800_2 | 15 | 24 | 19270.17 | 28.284 | 1.600 |
| brock800_4 | 15 | 26 | 19384.45 | 28.284 | 1.733 |
| C1000.9 | 51 | 68 | 7727.88 | 31.623 | 1.333 |
| C125.9 | 29 | 34 | 30.57 | 11.180 | 1.172 |
| C2000.5 | 14 | 16 | 579255.58 | 44.721 | 1.143 |
| C2000.9 | 55 | 77 | 60996.07 | 44.721 | 1.400 |
| C250.9 | 35 | 44 | 140.84 | 15.811 | 1.257 |
| C4000.5 | 12 | 18 | 3731506.72 | 63.246 | 1.500 |
| C500.9 | 43 | 57 | 3222.56 | 22.361 | 1.326 |
| DSJC1000.5 | 10 | 15 | 89236.03 | 31.623 | 1.500 |
| DSJC500.5 | 10 | 13 | 9382.76 | 22.361 | 1.300 |
| gen200_p0.9_44 | 32 | ? | 136.17 | 14.142 | N/A |
| gen200_p0.9_55 | 36 | ? | 129.54 | 14.142 | N/A |
| gen400_p0.9_55 | 44 | ? | 713.99 | 20.000 | N/A |
| gen400_p0.9_65 | 37 | ? | 749.65 | 20.000 | N/A |
| gen400_p0.9_75 | 47 | ? | 716.33 | 20.000 | N/A |
| hamming10-4 | 32 | 32 | 11096.22 | 32.000 | 1.000 |
| hamming8-4 | 16 | 16 | 658.39 | 16.000 | 1.000 |
| keller4 | 8 | 11 | 298.87 | 13.077 | 1.375 |
| keller5 | 19 | 27 | 9268.72 | 27.857 | 1.421 |
| keller6 | 38 | 59 | 404499.90 | 57.982 | 1.553 |
| MANN_a27 | 125 | 126 | 218.91 | 19.442 | 1.008 |
| MANN_a45 | 342 | 345 | 997.82 | 32.171 | 1.009 |
| MANN_a81 | 1096 | 1100 | 9196.84 | 57.635 | 1.004 |
| p_hat1500-1 | 8 | 12 | 553791.48 | 38.730 | 1.500 |
| p_hat1500-2 | 54 | 65 | 202755.85 | 38.730 | 1.204 |
| p_hat1500-3 | 75 | 94 | 74414.30 | 38.730 | 1.253 |
| p_hat300-1 | 7 | 8 | 4661.84 | 17.321 | 1.143 |
| p_hat300-2 | 23 | 25 | 1708.14 | 17.321 | 1.087 |
| p_hat300-3 | 30 | 36 | 722.48 | 17.321 | 1.200 |
| p_hat700-1 | 7 | 11 | 47266.02 | 26.458 | 1.571 |
| p_hat700-2 | 38 | 44 | 20940.51 | 26.458 | 1.158 |
| p_hat700-3 | 55 | 62 | 8696.64 | 26.458 | 1.127 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
