Submitted:
28 January 2026
Posted:
28 January 2026
Abstract
Keywords:
MSC: 05C69; 68Q25; 90C27; 68W25
1. Introduction
1.1. Our Contribution and Theoretical Framework
1. A novel reduction technique transforming graphs to maximum degree-1 instances
2. Optimal solvers on the reduced graph structure
3. An ensemble of complementary heuristics (local-ratio, maximum-degree greedy, minimum-to-minimum)
4. Component-wise minimum selection among all candidates
- Sparse graphs (paths, trees, low average degree): Min-to-min and local-ratio heuristics achieve provably optimal covers
- Skewed bipartite graphs ( with ): Reduction-based projection provably selects the smaller partition (optimal)
- Dense regular graphs (cliques, d-regular graphs): Maximum-degree greedy achieves provably optimal or -optimal covers
- Hub-heavy scale-free graphs (high degree variance): Reduction-based methods provably achieve optimal hub concentration
- Reduction methods fail on sparse alternating chains → exactly where Min-to-Min excels
- Greedy fails on layered set-cover-like graphs → exactly where Reduction excels
- Min-to-Min fails on dense uniform graphs → exactly where Greedy excels
- Local-ratio fails on irregular dense non-bipartite graphs → exactly where Reduction/Greedy excel
- If proven complete: Would imply P = NP under SETH, representing a breakthrough in complexity theory
- Current status: Strong performance on 233+ tested instances plus theoretical analysis showing optimality on identified graph classes
- Missing piece: Proof that our graph classification (sparse/dense/bipartite/hub-heavy) exhaustively covers all possible graph structures, or construction of counterexample graphs where all five heuristics simultaneously achieve ratio
1. Compelling empirical evidence across diverse instances
2. Theoretical analysis proving optimality on specific graph classes
3. A formal invitation for the community to either extend the analysis to all graphs or construct counterexamples
1.2. Algorithm Overview
1.3. Experimental Validation Framework
2. Related Work and State-of-the-Art
2.1. Theoretical Approximation Algorithms
2.2. Practical Heuristic Methods
2.3. Fixed-Parameter Tractable Algorithms
2.4. Positioning of Our Work
1. Theoretical Claim: We hypothesize a provable approximation ratio , which would be groundbreaking if validated
2. Empirical Performance: Competitive with state-of-the-art heuristics (TIVC, FastVC) while providing potential theoretical guarantees
3. The Hvala Algorithm: Detailed Description
3.1. Algorithm Structure and Pseudocode
3.1.1. Main Algorithm
| Algorithm 1: Hvala: Main Algorithm |
|
Input: Undirected graph G
Output: Approximate vertex cover S
// Preprocessing and validation
if G is empty or has no edges then
    return ∅
end
Remove self-loops from G
Remove isolated vertices from G
Initialize solution S
// Process each connected component independently
foreach connected component C in G do
    Extract the subgraph induced by C
    // Phase 1: Reduction to maximum degree-1
    ReduceToMaxDegree1()  // Algorithm 2
    // Phase 2: Optimal solutions on reduced graph
    MinWeightedDominatingSet()  // Algorithm 3
    MinWeightedVertexCover()  // Algorithm 4
    // Project solutions back to original graph
    ProjectToOriginal()
    ProjectToOriginal()
    // Phase 3: Ensemble heuristics
    NetworkXLocalRatio()
    MaxDegreeGreedy()  // Algorithm 5
    MinToMinHeuristic()  // Algorithm 6
    // Phase 4: Select best solution for this component
end
return S
|
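The component-wise selection in Algorithm 1 can be illustrated with a minimal sketch. It is not the Hvala implementation: the graph is assumed to be a plain adjacency dictionary, and `matching_cover` and `all_vertices` are hypothetical stand-in heuristics used only to show how the smallest valid candidate is chosen per connected component.

```python
from collections import deque


def connected_components(adj):
    """Yield the vertex set of each connected component of an adjacency dict."""
    seen = set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        yield component


def is_vertex_cover(adj, cover):
    """Return True when every edge has at least one endpoint in cover."""
    return all(u in cover or v in cover for u in adj for v in adj[u])


def matching_cover(adj):
    """Stand-in heuristic: both endpoints of a greedy maximal matching."""
    cover = set()
    for u in adj:
        if u in cover:
            continue
        for v in adj[u]:
            if v not in cover:
                cover.update((u, v))
                break
    return cover


def all_vertices(adj):
    """Stand-in heuristic: the trivial cover containing every vertex."""
    return set(adj)


def ensemble_cover(adj, heuristics):
    """Keep, per component, the smallest valid cover among all candidates."""
    # Preprocessing: drop self-loops, then isolated vertices.
    clean = {u: {v for v in nbrs if v != u} for u, nbrs in adj.items()}
    clean = {u: nbrs for u, nbrs in clean.items() if nbrs}
    solution = set()
    for component in connected_components(clean):
        sub = {u: clean[u] for u in component}
        candidates = [heuristic(sub) for heuristic in heuristics]
        valid = [c for c in candidates if is_vertex_cover(sub, c)]
        solution |= min(valid, key=len)
    return solution


# Example: a star (0 with leaves 1-3), a triangle (4, 5, 6), and an isolated vertex 7.
example = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0},
           4: {5, 6}, 5: {4, 6}, 6: {4, 5}, 7: set()}
cover = ensemble_cover(example, [all_vertices, matching_cover])
```

Because the minimum is taken per component rather than over the whole graph, a heuristic that is poor on one component cannot degrade the result on another.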
3.1.2. Graph Reduction to Maximum Degree 1
| Algorithm 2: ReduceToMaxDegree1: Graph Reduction |
|
Input: Graph G
Output: Reduced graph with maximum degree 1, with weighted nodes
reduced graph ← empty graph
weights ← empty dictionary
foreach vertex u do
    // Get neighbors of u
    // Degree of u
    // Create k auxiliary vertices, one per neighbor
    foreach neighbor of u do
        // Auxiliary vertex notation
        Add edge to the reduced graph
        // Weight inversely proportional to degree
    end
end
Set node attributes in the reduced graph using weights
return the reduced graph
|
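One way to read Algorithm 2 is sketched below, under stated assumptions: each vertex u of degree k is split into k auxiliary copies, one per incident edge, each weighted 1/k, and every original edge {u, v} becomes a single edge between the copy of u reserved for v and the copy of v reserved for u. The tuple naming `(u, v)` for auxiliary vertices and the exact weight 1/k are assumptions for illustration; the pseudocode only states that the weight is inversely proportional to the degree.

```python
from fractions import Fraction


def reduce_to_max_degree_1(adj):
    """Split every vertex into one auxiliary copy per incident edge.

    adj maps each vertex to the set of its neighbors (simple undirected
    graph, no self-loops). Returns (edges, weights): the edges form a
    perfect matching on the auxiliary vertices, so the maximum degree is 1.
    """
    weights = {}
    for u, nbrs in adj.items():
        k = len(nbrs)
        for v in nbrs:
            # (u, v) is the copy of u dedicated to the original edge {u, v}.
            weights[(u, v)] = Fraction(1, k)

    edges, seen = [], set()
    for u, nbrs in adj.items():
        for v in nbrs:
            if (v, u) not in seen:  # emit each original edge only once
                seen.add((u, v))
                edges.append(((u, v), (v, u)))
    return edges, weights


# Example: the path a - b - c.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
reduced_edges, aux_weights = reduce_to_max_degree_1(path)
```

With this weighting, the copies of each original vertex carry total weight 1, so selecting all copies of a vertex costs the same as selecting the vertex itself.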
3.1.3. Optimal Solutions on Degree-1 Graphs
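| Algorithm 3: MinWeightedDominatingSet: Optimal Dominating Set |
|
Input: Graph with maximum degree 1, weight function
Output: Minimum weighted dominating set D
foreach node v do
    if v has not been processed then
        if v has degree 0 then
            // Isolated vertex must dominate itself
            add v to D and mark v as processed
        end
        else if v has degree 1 then
            // Edge: choose minimum weight endpoint
            u ← unique neighbor of v
            if u has not been processed then
                mark u and v as processed
                if v is lighter than u, or the weights are equal and v wins the tie-break, then
                    add v to D
                end
                else
                    add u to D
                end
            end
        end
    end
end
return D
|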
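| Algorithm 4: MinWeightedVertexCover: Optimal Weighted Vertex Cover |
|
Input: Graph with maximum degree 1, weight function
Output: Minimum weighted vertex cover C
foreach node v do
    if v has not been processed and v has degree 1 then
        u ← unique neighbor of v
        if u has not been processed then
            // Choose minimum weight endpoint to cover edge
            mark u and v as processed
            if v is lighter than u, or the weights are equal and v wins the tie-break, then
                add v to C
            end
            else
                add u to C
            end
        end
    end
end
return C
|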
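On a graph of maximum degree 1, every component is either an isolated vertex or a single edge, so both problems decompose edge by edge. The sketch below illustrates this under assumptions: the reduced graph is given as a list of vertex pairs plus a weight dictionary, and ties are broken toward the first endpoint, which may differ from the tie-breaking rule in the actual implementation.

```python
def lighter_endpoint(a, b, weights):
    """Return the endpoint of edge (a, b) with smaller weight (ties go to a)."""
    return a if weights[a] <= weights[b] else b


def min_weight_vertex_cover_deg1(edges, weights):
    """Each isolated edge needs exactly one endpoint; take the lighter one."""
    return {lighter_endpoint(a, b, weights) for a, b in edges}


def min_weight_dominating_set_deg1(vertices, edges, weights):
    """Lighter endpoint of every edge, plus every isolated vertex."""
    dominating = set()
    matched = set()
    for a, b in edges:
        dominating.add(lighter_endpoint(a, b, weights))
        matched.update((a, b))
    # An isolated vertex has no neighbor, so it must dominate itself.
    dominating |= set(vertices) - matched
    return dominating


# Example: two disjoint edges and one isolated vertex.
demo_vertices = ["a", "b", "c", "d", "e"]
demo_edges = [("a", "b"), ("c", "d")]
demo_weights = {"a": 1.0, "b": 0.5, "c": 0.25, "d": 0.25, "e": 1.0}
demo_cover = min_weight_vertex_cover_deg1(demo_edges, demo_weights)
demo_dominating = min_weight_dominating_set_deg1(demo_vertices, demo_edges, demo_weights)
```

The two solutions differ only on isolated vertices, which a vertex cover can ignore but a dominating set must include.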
3.1.4. Complementary Heuristics
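| Algorithm 5: MaxDegreeGreedy: Maximum Degree Greedy Heuristic |
|
Input: Graph G
Output: Vertex cover C
working graph ← copy of G
while the working graph has edges do
    // Select max degree vertex
    v ← vertex of maximum degree in the working graph
    add v to C
    Remove v and all incident edges from the working graph
end
return C
|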
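A minimal sketch of the maximum-degree greedy heuristic follows, assuming the graph is an adjacency dictionary in which every neighbor also appears as a key. Ties between vertices of equal degree are resolved by dictionary order here, which is an arbitrary choice for illustration.

```python
def max_degree_greedy(adj):
    """Repeatedly move a maximum-degree vertex into the cover."""
    work = {u: set(nbrs) for u, nbrs in adj.items()}  # copy; input is untouched
    cover = set()
    while any(work.values()):  # some edge remains
        v = max(work, key=lambda u: len(work[u]))
        cover.add(v)
        for u in work.pop(v):  # remove v and all incident edges
            work[u].discard(v)
    return cover


# Examples: a star with center 0, and the path 1 - 2 - 3 - 4 - 5.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
path5 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
star_cover = max_degree_greedy(star)
path_cover = max_degree_greedy(path5)
```

This straightforward version rescans all vertices in each round; a bucket queue keyed by degree would avoid the repeated scan on large graphs.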
| Algorithm 6: MinToMinHeuristic: Minimum-to-Minimum Heuristic |
|
Input: Graph G
Output: Vertex cover C
working graph ← copy of G
while the working graph has edges do
    // Find vertices with minimum degree
    // Get neighbors of minimum-degree vertices
    if the neighbor set is non-empty then
        // Among neighbors, find one with minimum degree
        v ← neighbor of minimum degree
        add v to C
        Remove v and all incident edges from the working graph
    end
end
return C
|
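The minimum-to-minimum rule can be sketched as follows, assuming an adjacency dictionary and assuming that minimum degree is measured among vertices that still have at least one edge. The tie-break on `str(x)` is a hypothetical choice made only to keep the example deterministic.

```python
def min_to_min(adj):
    """Cover the lowest-degree neighbor of the lowest-degree vertices."""
    # Keep only vertices that still have at least one incident edge.
    work = {u: set(nbrs) for u, nbrs in adj.items() if nbrs}
    cover = set()
    while work:
        min_degree = min(len(nbrs) for nbrs in work.values())
        # Neighbors of all vertices whose degree equals the current minimum.
        candidates = set()
        for nbrs in work.values():
            if len(nbrs) == min_degree:
                candidates |= nbrs
        # Among those neighbors, pick one of minimum degree.
        v = min(candidates, key=lambda x: (len(work[x]), str(x)))
        cover.add(v)
        for u in work.pop(v):  # remove v and its incident edges
            work[u].discard(v)
            if not work[u]:  # u became isolated; drop it
                del work[u]
    return cover


# Examples: the path 1 - 2 - 3 - 4 - 5 and a star with center 0.
mtm_path = min_to_min({1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}})
mtm_star = min_to_min({0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}})
```

On paths and trees the minimum-degree vertices are leaves, so the rule covers a leaf's parent, which is always a safe choice; this matches the behavior the text attributes to the heuristic on sparse graphs.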
3.2. Complexity Analysis
- Component decomposition:
- Reduction to degree-1: (each edge processed once)
- Optimal solving on : (linear in reduced graph size)
- Ensemble heuristics: NetworkX local-ratio and greedy methods contribute
4. Approximation Ratio Analysis: Ensemble Complementarity
4.1. Individual Heuristic Performance on Graph Classes
4.1.1. Sparse Graphs: Optimality via Min-to-Min and Local-Ratio
4.1.2. Skewed Bipartite Graphs: Optimality via Reduction
4.1.3. Dense Regular Graphs: Optimality via Maximum-Degree Greedy
4.1.4. Hub-Heavy Scale-Free Graphs: Optimality via Reduction
4.2. Structural Orthogonality: Why the Ensemble Works
- Reduction: Worst on sparse alternating chains → Min-to-Min optimal
- Greedy: Worst on layered sparse graphs → Reduction/Min-to-Min excel
- Min-to-Min: Worst on dense uniform graphs → Greedy optimal
- Local-Ratio: Worst on irregular dense graphs → Reduction/Greedy excel
4.3. Empirical Performance Across Graph Families
- Sparse graphs (bio-networks, trees): Ratio 1.000–1.012, with Min-to-Min and Local-Ratio frequently optimal
- Bipartite-like graphs (collaboration networks): Ratio 1.001–1.009, with Reduction often optimal
- Dense graphs (FRB instances): Ratio 1.006–1.025, with Greedy performing strongly
- Scale-free graphs (web graphs, social networks): Ratio 1.001–1.032, with Reduction capturing hub structure
- Regular graphs (3-regular stress tests): Ratio 1.069–1.071, demonstrating robustness even on adversarial inputs
4.4. Open Theoretical Challenge
1. Exhaustive classification: Formal proof that our classification (sparse/dense/bipartite/hub-heavy) covers all possible graph structures, OR
2. Counterexample construction: An adversarial graph where all five heuristics simultaneously achieve ratio
5. Experimental Validation: Comprehensive Results
5.1. Experiment 1: DIMACS Benchmark Evaluation
- Processor: 11th Gen Intel Core i7-1165G7 @ 2.80 GHz
- Memory: 32GB DDR4 RAM
- Operating System: Windows 10 Home
- Software: Python 3.12.0, NetworkX 3.4.2, Hvala v0.0.6
| Instance | Optimal | Hvala Size | Time (ms) | Ratio |
|---|---|---|---|---|
| brock200_2 | 188 | 192 | 174.42 | 1.021 |
| brock200_4 | 183 | 187 | 113.10 | 1.022 |
| brock400_2 | 371 | 378 | 473.47 | 1.019 |
| brock400_4 | 367 | 378 | 457.90 | 1.030 |
| brock800_2 | 776 | 782 | 2987.20 | 1.008 |
| brock800_4 | 774 | 783 | 3232.21 | 1.012 |
| C1000.9 | 932 | 939 | 1615.26 | 1.007 |
| C125.9 | 91 | 93 | 17.73 | 1.022 |
| C2000.5 | 1984 | 1988 | 36434.74 | 1.002 |
| C2000.9 | 1923 | 1934 | 9650.50 | 1.006 |
| C250.9 | 206 | 209 | 74.72 | 1.015 |
| C4000.5 | 3982 | 3986 | 170860.61 | 1.001 |
| C500.9 | 443 | 451 | 322.25 | 1.018 |
| DSJC1000.5 | 985 | 988 | 5893.75 | 1.003 |
| DSJC500.5 | 487 | 489 | 1242.71 | 1.004 |
| hamming10-4 | 992 | 992 | 2258.72 | 1.000 |
| hamming8-4 | 240 | 240 | 201.95 | 1.000 |
| keller4 | 160 | 160 | 83.81 | 1.000 |
| keller5 | 749 | 752 | 1617.27 | 1.004 |
| keller6 | 3302 | 3314 | 46779.80 | 1.004 |
| MANN_a27 | 252 | 253 | 58.37 | 1.004 |
| MANN_a45 | 690 | 693 | 389.55 | 1.004 |
| MANN_a81 | 2221 | 2225 | 3750.72 | 1.002 |
| p_hat1500-1 | 1488 | 1490 | 27584.83 | 1.001 |
| p_hat1500-2 | 1435 | 1439 | 19905.04 | 1.003 |
| p_hat1500-3 | 1406 | 1416 | 9649.06 | 1.007 |
| p_hat300-1 | 292 | 293 | 1195.41 | 1.003 |
| p_hat300-2 | 275 | 277 | 495.51 | 1.007 |
| p_hat300-3 | 264 | 267 | 297.01 | 1.011 |
| p_hat700-1 | 689 | 692 | 4874.02 | 1.004 |
| p_hat700-2 | 656 | 657 | 3532.10 | 1.002 |
| p_hat700-3 | 638 | 641 | 1778.29 | 1.005 |
- Total instances tested: 32
- Optimal solutions found: 3 (hamming10-4, hamming8-4, keller4)
- Average approximation ratio: 1.0072
- Best ratio: 1.000 (optimal)
- Worst ratio: 1.030 (brock400_4)
- Instances with ratio : 22 (68.75%)
- Instances with ratio : 28 (87.5%)
- Largest instance solved: C4000.5 (4,000 vertices, optimal cover size 3982) in 170.86 seconds
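The Ratio column above is the Hvala cover size divided by the optimal size. The short sketch below recomputes it for a few rows of the table; the function name `approximation_ratio` is an illustrative choice, and the numbers are copied from the table.

```python
def approximation_ratio(found, optimal):
    """Size of the returned cover divided by the optimal cover size."""
    if optimal <= 0:
        raise ValueError("optimal cover size must be positive")
    return found / optimal


# (optimal, Hvala size) pairs taken from the DIMACS table.
sample = {
    "brock200_2": (188, 192),
    "brock400_4": (367, 378),
    "hamming8-4": (240, 240),
    "C4000.5": (3982, 3986),
}
ratios = {name: round(approximation_ratio(found, optimal), 3)
          for name, (optimal, found) in sample.items()}
```

A ratio of exactly 1 means the returned cover is optimal, as for hamming8-4.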
5.2. Experiment 2: Real-World Large Graphs (The Resistire Experiment)
| Instance | Category | V | E | VC Size | Time | Best Known | Ratio | Notes |
|---|---|---|---|---|---|---|---|---|
| bio-celegans | Bio | 453 | 2,025 | 251 | 104.71ms | ∼248 | ∼1.012 | C. elegans metabolic |
| bio-diseasome | Bio | 516 | 1,188 | 285 | 102.11ms | ∼283 | ∼1.007 | Disease-gene assoc. |
| bio-dmela | Bio | 7,393 | 25,569 | 2,657 | 13.64s | Unknown | – | Drosophila |
| bio-yeast | Bio | 1,458 | 1,948 | 456 | 504.85ms | ∼453 | ∼1.007 | Yeast protein |
| ca-AstroPh | Collab | 17,903 | 196,972 | 11,494 | 151.62s | Unknown | – | Astrophysics |
| ca-CondMat | Collab | 21,363 | 91,286 | 12,484 | 214.57s | Unknown | – | Condensed matter |
| ca-CSphd | Collab | 1,025 | 1,043 | 550 | 294.59ms | ∼548 | ∼1.004 | CS PhD |
| ca-Erdos992 | Collab | 6,100 | 7,515 | 461 | 2.26s | ∼459 | ∼1.004 | Erdős collab |
| ca-GrQc | Collab | 4,158 | 13,422 | 2,210 | 5.80s | Unknown | – | General relativity |
| ca-HepPh | Collab | 11,204 | 117,619 | 6,558 | 49.04s | Unknown | – | High-energy physics |
| ca-netscience | Collab | 379 | 914 | 214 | 61.72ms | ∼212 | ∼1.009 | Network science |
| ia-email-EU | | 32,430 | 54,397 | 820 | 29.73s | Unknown | – | EU research email |
| ia-email-univ | | 1,133 | 5,451 | 605 | 486.58ms | ∼603 | ∼1.003 | University email |
| ia-enron-large | | 33,696 | 180,811 | 12,792 | 391.87s | Unknown | – | Enron large |
| ia-enron-only | | 143 | 623 | 87 | 16.07ms | ∼86 | ∼1.012 | Enron core |
| ia-fb-messages | Social | 1,266 | 6,451 | 580 | 998.20ms | ∼578 | ∼1.003 | Facebook msgs |
| ia-infect-dublin | Social | 410 | 2,765 | 298 | 108.60ms | ∼296 | ∼1.007 | Infection Dublin |
| ia-infect-hyper | Social | 113 | 188 | 92 | 29.43ms | ∼91 | ∼1.011 | Infection hypertext |
| ia-reality | Social | 6,809 | 7,680 | 81 | 657.86ms | Unknown | – | Reality mining |
| ia-wiki-Talk | Wiki | 92,117 | 360,767 | 17,288 | 1868.99s | Unknown | – | Wikipedia talk |
| inf-power | Infra | 4,941 | 6,594 | 2,207 | 7.45s | Unknown | – | US power grid |
| rec-amazon | Rec | 262,111 | 899,792 | 47,891 | 4123.24s | Unknown | – | Amazon products |
| rt-retweet | Retweet | 96 | 117 | 32 | 4.98ms | ∼31 | ∼1.032 | General retweet |
| rt-twitter-copen | Retweet | 761 | 1,029 | 237 | 161.06ms | ∼235 | ∼1.009 | Twitter Copenhagen |
| scc_enron-only | SCC | 143 | 251 | 138 | 183.99ms | ∼137 | ∼1.007 | Enron SCC |
| scc_fb-forum | SCC | 899 | 7,089 | 372 | 2.28s | ∼370 | ∼1.005 | FB forum SCC |
| scc_fb-messages | SCC | 1,266 | 3,125 | 1,072 | 18.12s | Unknown | – | FB messages SCC |
| scc_infect-dublin | SCC | 410 | 1,800 | 9,104 | 5.48s | Unknown | – | Infection Dublin SCC |
| scc_infect-hyper | SCC | 113 | 171 | 110 | 171.80ms | ∼109 | ∼1.009 | Infection hyper SCC |
| scc_retweet | SCC | 96 | 87 | 561 | 2.08s | Unknown | – | Retweet SCC |
| scc_retweet-crawl | SCC | 21,297 | 17,362 | 8,419 | 14.03s | Unknown | – | Retweet crawl SCC |
| scc_rt_alwefaq | SCC | 35 | 34 | 35 | 9.47ms | 35 | 1.000 | Optimal |
| scc_rt_assad | SCC | 16 | 15 | 16 | 1.99ms | 16 | 1.000 | Optimal |
| scc_rt_bahrain | SCC | 37 | 36 | 37 | 5.52ms | 37 | 1.000 | Optimal |
| scc_rt_barackobama | SCC | 29 | 28 | 29 | 6.02ms | 29 | 1.000 | Optimal |
| scc_rt_damascus | SCC | 15 | 14 | 15 | 2.05ms | 15 | 1.000 | Optimal |
| scc_rt_dash | SCC | 15 | 14 | 15 | 2.99ms | 15 | 1.000 | Optimal |
| scc_rt_gmanews | SCC | 46 | 45 | 46 | 25.25ms | 46 | 1.000 | Optimal |
| scc_rt_gop | SCC | 6 | 5 | 6 | 1.00ms | 6 | 1.000 | Optimal |
| scc_rt_http | SCC | 2 | 1 | 2 | 0.98ms | 2 | 1.000 | Optimal |
| scc_rt_israel | SCC | 11 | 10 | 11 | 0.99ms | 11 | 1.000 | Optimal |
| scc_rt_justinbieber | SCC | 26 | 25 | 26 | 10.96ms | 26 | 1.000 | Optimal |
| scc_rt_ksa | SCC | 12 | 11 | 12 | 1.08ms | 12 | 1.000 | Optimal |
| scc_rt_lebanon | SCC | 5 | 4 | 5 | 1.08ms | 5 | 1.000 | Optimal |
| scc_rt_libya | SCC | 12 | 11 | 12 | 2.07ms | 12 | 1.000 | Optimal |
| scc_rt_lolgop | SCC | 103 | 102 | 103 | 182.49ms | 103 | 1.000 | Optimal |
| scc_rt_mittromney | SCC | 42 | 41 | 42 | 5.98ms | 42 | 1.000 | Optimal |
| scc_rt_obama | SCC | 4 | 3 | 4 | 1.08ms | 4 | 1.000 | Optimal |
| scc_rt_occupy | SCC | 22 | 21 | 22 | 3.01ms | 22 | 1.000 | Optimal |
| scc_rt_occupywallstnyc | SCC | 45 | 44 | 45 | 22.50ms | 45 | 1.000 | Optimal |
| scc_rt_oman | SCC | 6 | 5 | 6 | 0.98ms | 6 | 1.000 | Optimal |
| scc_rt_onedirection | SCC | 29 | 28 | 29 | 8.46ms | 29 | 1.000 | Optimal |
| scc_rt_p2 | SCC | 12 | 11 | 12 | 1.01ms | 12 | 1.000 | Optimal |
| scc_rt_qatif | SCC | 5 | 4 | 5 | 1.08ms | 5 | 1.000 | Optimal |
| scc_rt_saudi | SCC | 17 | 16 | 17 | 2.07ms | 17 | 1.000 | Optimal |
| scc_rt_tcot | SCC | 12 | 11 | 12 | 2.00ms | 12 | 1.000 | Optimal |
| scc_rt_tlot | SCC | 6 | 5 | 6 | 1.00ms | 6 | 1.000 | Optimal |
| scc_rt_uae | SCC | 8 | 7 | 8 | 1.38ms | 8 | 1.000 | Optimal |
| scc_rt_voteonedirection | SCC | 4 | 3 | 4 | 0.98ms | 4 | 1.000 | Optimal |
| scc_twitter-copen | SCC | 761 | 662 | 1,328 | 18.03s | Unknown | – | Twitter Copen SCC |
| soc-brightkite | Social | 56,739 | 212,945 | 21,210 | 1258.10s | Unknown | – | Brightkite location |
| soc-dolphins | Social | 62 | 159 | 35 | 5.06ms | ∼34 | ∼1.029 | Dolphin social |
| soc-douban | Social | 154,908 | 327,162 | 8,685 | 1629.90s | Unknown | – | Douban social |
| soc-epinions | Social | 26,588 | 100,120 | 9,774 | 263.38s | Unknown | – | Epinions trust |
| soc-karate | Social | 34 | 78 | 14 | 1.66ms | 14 | 1.000 | Optimal - Karate |
| soc-slashdot | Social | 70,068 | 358,647 | 22,373 | 1805.07s | Unknown | – | Slashdot social |
| soc-wiki-Vote | Social | 889 | 2,914 | 406 | 299.78ms | ∼404 | ∼1.005 | Wikipedia voting |
| socfb-CMU | | 6,621 | 251,214 | 5,054 | 29.27s | Unknown | – | Carnegie Mellon |
| socfb-Duke14 | | 9,885 | 506,437 | 7,776 | 73.58s | Unknown | – | Duke University |
| socfb-MIT | | 6,441 | 251,230 | 4,723 | 28.13s | Unknown | – | MIT |
| socfb-Stanford3 | | 11,586 | 568,309 | 8,626 | 102.50s | Unknown | – | Stanford |
| socfb-UCLA | | 20,453 | 747,604 | 15,434 | 324.98s | Unknown | – | UCLA |
| socfb-UConn | | 17,206 | 636,836 | 13,422 | 228.94s | Unknown | – | UConn |
| socfb-UCSB37 | | 14,917 | 482,215 | 11,429 | 162.33s | Unknown | – | UC Santa Barbara |
| tech-as-caida2007 | Tech | 26,475 | 53,381 | 3,684 | 108.54s | Unknown | – | CAIDA AS 2007 |
| tech-internet-as | Tech | 22,963 | 48,436 | 5,700 | 263.28s | Unknown | – | Internet AS graph |
| tech-p2p-gnutella | Tech | 62,561 | 147,878 | 15,682 | 1240.83s | Unknown | – | Gnutella P2P |
| tech-RL-caida | Tech | 190,914 | 607,610 | 75,680 | 17095.90s | Unknown | – | CAIDA router-level |
| tech-routers-rf | Tech | 2,113 | 6,632 | 795 | 1.25s | ∼793 | ∼1.003 | Router network |
| tech-WHOIS | Tech | 7,476 | 56,943 | 2,287 | 15.46s | Unknown | – | WHOIS network |
| web-BerkStan | Web | 12,776 | 19,500 | 5,390 | 44.16s | Unknown | – | Berkeley-Stanford |
| web-edu | Web | 3,031 | 6,474 | 1,451 | 2.63s | ∼1,449 | ∼1.001 | Educational domain |
| web-google | Web | 1,299 | 2,773 | 498 | 483.96ms | ∼497 | ∼1.002 | Google web graph |
| web-indochina-2004 | Web | 11,358 | 47,606 | 7,300 | 45.95s | Unknown | – | Indochina crawl |
| web-polblogs | Web | 643 | 2,280 | 245 | 140.23ms | ∼243 | ∼1.008 | Political blogs |
| web-sk-2005 | Web | 121,176 | 1,043,877 | 58,190 | 6126.11s | Unknown | – | Slovak web crawl |
| web-spam | Web | 4,767 | 37,375 | 2,315 | 8.31s | Unknown | – | Web spam corpus |
| web-webbase-2001 | Web | 16,062 | 25,593 | 2,652 | 35.39s | Unknown | – | Webbase 2001 |
- Total instances tested: 88
- Optimal solutions found: 28 (31.8%)
- Average approximation ratio (where known): 1.007
- Best ratio: 1.000 (28 instances)
- Worst ratio: 1.032 (rt-retweet)
- Largest instance solved: rec-amazon (262,111 vertices, 899,792 edges) in 68.7 minutes
- Runtime distribution:
  - Sub-second: 38 instances (43.2%)
  - 1-60 seconds: 27 instances (30.7%)
  - 1-10 minutes: 13 instances (14.8%)
  - Over 10 minutes: 10 instances (11.4%)
5.3. Experiment 3: NPBench Hard Instances (The Creo Experiment)
5.3.1. FRB Instances (40 instances)
| Instance | Optimal | Hvala | Time | Ratio |
|---|---|---|---|---|
| frb30-15-1.mis | 420 | 426 | 443.82ms | 1.014 |
| frb30-15-2.mis | 420 | 425 | 506.81ms | 1.012 |
| frb30-15-3.mis | 420 | 426 | 475.87ms | 1.014 |
| frb30-15-4.mis | 420 | 425 | 416.66ms | 1.012 |
| frb30-15-5.mis | 420 | 425 | 445.95ms | 1.012 |
| frb35-17-1.mis | 560 | 566 | 719.36ms | 1.011 |
| frb35-17-2.mis | 560 | 565 | 739.85ms | 1.009 |
| frb35-17-3.mis | 560 | 566 | 774.78ms | 1.011 |
| frb35-17-4.mis | 560 | 566 | 856.32ms | 1.011 |
| frb35-17-5.mis | 560 | 566 | 813.15ms | 1.011 |
| frb40-19-1.mis | 720 | 728 | 1.16s | 1.011 |
| frb40-19-2.mis | 720 | 728 | 1.22s | 1.011 |
| frb40-19-3.mis | 720 | 726 | 1.19s | 1.008 |
| frb40-19-4.mis | 720 | 729 | 1.20s | 1.013 |
| frb40-19-5.mis | 720 | 728 | 1.21s | 1.011 |
| frb45-21-1.mis | 900 | 906 | 1.96s | 1.007 |
| frb45-21-2.mis | 900 | 910 | 1.89s | 1.011 |
| frb45-21-3.mis | 900 | 908 | 1.89s | 1.009 |
| frb45-21-4.mis | 900 | 910 | 1.86s | 1.011 |
| frb45-21-5.mis | 900 | 907 | 1.83s | 1.008 |
| frb50-23-1.mis | 1100 | 1108 | 2.68s | 1.007 |
| frb50-23-2.mis | 1100 | 1109 | 2.72s | 1.008 |
| frb50-23-3.mis | 1100 | 1108 | 2.63s | 1.007 |
| frb50-23-4.mis | 1100 | 1109 | 2.91s | 1.008 |
| frb50-23-5.mis | 1100 | 1111 | 2.92s | 1.010 |
| frb53-24-1.mis | 1219 | 1231 | 4.57s | 1.010 |
| frb53-24-2.mis | 1219 | 1228 | 3.33s | 1.007 |
| frb53-24-3.mis | 1219 | 1229 | 4.82s | 1.008 |
| frb53-24-4.mis | 1219 | 1227 | 3.46s | 1.007 |
| frb53-24-5.mis | 1219 | 1229 | 3.53s | 1.008 |
| frb56-25-1.mis | 1344 | 1355 | 3.88s | 1.008 |
| frb56-25-2.mis | 1344 | 1358 | 4.22s | 1.010 |
| frb56-25-3.mis | 1344 | 1354 | 4.12s | 1.007 |
| frb56-25-4.mis | 1344 | 1352 | 4.11s | 1.006 |
| frb56-25-5.mis | 1344 | 1354 | 3.85s | 1.007 |
| frb59-26-1.mis | 1475 | 1485 | 5.00s | 1.007 |
| frb59-26-2.mis | 1475 | 1486 | 4.86s | 1.007 |
| frb59-26-3.mis | 1475 | 1485 | 5.67s | 1.007 |
| frb59-26-4.mis | 1475 | 1485 | 5.06s | 1.007 |
| frb59-26-5.mis | 1475 | 1486 | 4.80s | 1.007 |
| frb100-40.mis | 3900 | 3922 | 27.78s | 1.006 |
5.3.2. DIMACS Clique Complement Benchmarks (73 instances)
| Instance | Optimal | Hvala | Time | Ratio |
|---|---|---|---|---|
| brock200_1 | 179 | 180 | 127.45ms | 1.006 |
| brock200_2 | 188 | 192 | 238.33ms | 1.021 |
| brock200_3 | 183 | 187 | 176.02ms | 1.022 |
| brock200_4 | 183 | 187 | 142.97ms | 1.022 |
| brock400_1 | 373 | 378 | 539.88ms | 1.013 |
| brock400_2 | 373 | 378 | 581.28ms | 1.013 |
| brock400_3 | 373 | 379 | 560.76ms | 1.016 |
| brock400_4 | 373 | 378 | 508.98ms | 1.013 |
| brock800_1 | 777 | 782 | 3.56s | 1.006 |
| brock800_2 | 777 | 782 | 3.86s | 1.006 |
| brock800_3 | 777 | 783 | 3.79s | 1.008 |
| brock800_4 | 777 | 783 | 3.75s | 1.008 |
| c-fat200-1 | 186 | 188 | 588.37ms | 1.011 |
| c-fat200-2 | 174 | 176 | 380.66ms | 1.011 |
| c-fat200-5 | 140 | 142 | 287.03ms | 1.014 |
| c-fat500-1 | 482 | 486 | 3.35s | 1.008 |
| c-fat500-10 | 372 | 374 | 2.42s | 1.005 |
| c-fat500-2 | 470 | 474 | 3.49s | 1.009 |
| c-fat500-5 | 434 | 436 | 3.09s | 1.005 |
| C125.9 | 91 | 93 | 31.63ms | 1.022 |
| C250.9 | 206 | 209 | 91.34ms | 1.015 |
| C500.9 | 443 | 451 | 330.04ms | 1.018 |
| C1000.9 | 932 | 939 | 1.94s | 1.008 |
| C2000.5 | 1984 | 1988 | 46.18s | 1.002 |
| C2000.9 | 1920 | 1934 | 10.29s | 1.007 |
| C4000.5 | 3978 | 3986 | 216.52s | 1.002 |
| gen200_p0.9_44 | 160 | 164 | 63.44ms | 1.025 |
| gen200_p0.9_55 | 160 | 163 | 40.88ms | 1.019 |
| gen400_p0.9_55 | 352 | 356 | 200.60ms | 1.011 |
| gen400_p0.9_65 | 352 | 356 | 255.87ms | 1.011 |
| gen400_p0.9_75 | 350 | 353 | 229.63ms | 1.009 |
| hamming6-2 | 32 | 32 | 0.00ms | 1.000 |
| hamming6-4 | 60 | 60 | 37.19ms | 1.000 |
| hamming8-2 | 128 | 128 | 37.79ms | 1.000 |
| hamming8-4 | 238 | 240 | 238.51ms | 1.008 |
| hamming10-2 | 512 | 512 | 455.43ms | 1.000 |
| hamming10-4 | 992 | 992 | 2.73s | 1.000 |
| johnson8-2-4 | 24 | 24 | 0.00ms | 1.000 |
| johnson8-4-4 | 56 | 56 | 5.20ms | 1.000 |
| johnson16-2-4 | 112 | 112 | 31.88ms | 1.000 |
| johnson32-2-4 | 480 | 480 | 363.80ms | 1.000 |
| keller4 | 160 | 160 | 95.72ms | 1.000 |
| keller5 | 749 | 752 | 1.87s | 1.004 |
| keller6 | 3303 | 3314 | 56.88s | 1.003 |
| MANN_a9 | 29 | 29 | 8.65ms | 1.000 |
| MANN_a27 | 252 | 253 | 64.22ms | 1.004 |
| MANN_a45 | 690 | 693 | 443.84ms | 1.004 |
| MANN_a81 | 2221 | 2225 | 4.30s | 1.002 |
| p_hat300-1 | 292 | 293 | 1.52s | 1.003 |
| p_hat300-2 | 275 | 277 | 534.66ms | 1.007 |
| p_hat300-3 | 264 | 267 | 298.34ms | 1.011 |
| p_hat500-1 | 491 | 492 | 2.75s | 1.002 |
| p_hat500-2 | 465 | 467 | 1.86s | 1.004 |
| p_hat500-3 | 453 | 454 | 1.04s | 1.002 |
| p_hat700-1 | 689 | 692 | 6.00s | 1.004 |
| p_hat700-2 | 656 | 657 | 4.07s | 1.002 |
| p_hat700-3 | 640 | 641 | 2.15s | 1.002 |
| p_hat1000-1 | 988 | 991 | 15.20s | 1.003 |
| p_hat1000-2 | 956 | 958 | 9.30s | 1.002 |
| p_hat1000-3 | 937 | 939 | 5.06s | 1.002 |
| p_hat1500-1 | 1488 | 1490 | 33.08s | 1.001 |
| p_hat1500-2 | 1437 | 1439 | 22.18s | 1.001 |
| p_hat1500-3 | 1413 | 1416 | 12.09s | 1.002 |
| san200_0.7_1 | 182 | 183 | 143.67ms | 1.005 |
| san200_0.7_2 | 183 | 185 | 125.95ms | 1.011 |
| san200_0.9_1 | 150 | 152 | 63.71ms | 1.013 |
| san200_0.9_2 | 160 | 161 | 63.81ms | 1.006 |
| san200_0.9_3 | 166 | 169 | 47.61ms | 1.018 |
| san400_0.5_1 | 387 | 391 | 988.70ms | 1.010 |
| san400_0.7_1 | 376 | 378 | 683.53ms | 1.005 |
| san400_0.7_2 | 379 | 382 | 649.11ms | 1.008 |
| san400_0.7_3 | 382 | 385 | 635.93ms | 1.008 |
| san400_0.9_1 | 316 | 317 | 255.68ms | 1.003 |
| san1000 | 986 | 990 | 8.70s | 1.004 |
| sanr200_0.7 | 183 | 184 | 196.33ms | 1.005 |
| sanr200_0.9 | 162 | 163 | 64.02ms | 1.006 |
| sanr400_0.5 | 387 | 388 | 994.94ms | 1.003 |
| sanr400_0.7 | 379 | 381 | 697.75ms | 1.005 |
- Total instances tested: 113 (40 FRB + 73 DIMACS)
- Optimal solutions found: 12 instances
- Average approximation ratio: 1.006
- Best ratio: 1.000 (12 optimal instances)
- Worst ratio: 1.025 (gen200_p0.9_44)
- Instances with ratio : 107 (95%)
- FRB average ratio: 1.009
- DIMACS average ratio: 1.007
5.4. Experiment 4: AI-Validated Stress Testing (The Gemini-Vega Validation)
- Graph Construction: Random 3-regular graphs (uniform degree distribution)
- AI Validation: Gemini AI architected testing framework and verified results
- Baseline Comparison: Standard greedy highest-degree-first heuristic
- Theoretical Context: Optimal vertex cover for 3-regular graphs is approximately 0.5446n vertices
| Graph | Vertices | Edges | Hvala Size | Greedy Size | Hvala Ratio |
|---|---|---|---|---|---|
| Power-Law (N=10,000) | 10,000 | – | 4,957 | 5,093 | – |
| 3-Regular (N=5,000) | 5,000 | 7,500 | 2,917 (58.34%) | 3,073 (61.46%) | 1.0712 |
| 3-Regular (N=20,000) | 20,000 | 30,000 | 11,647 (58.24%) | 12,350 (61.75%) | 1.0693 |

Theoretical optimal for 3-regular: ∼0.5446n vertices
1. Improvement Over Greedy: Hvala consistently outperforms greedy, producing covers 2.7-5.7% smaller (5.1% and 5.7% on the 3-regular graphs, 2.7% on the power-law graph)
2. Ratio Stability: Approximation ratio improved slightly from 1.0712 to 1.0693 as graph size quadrupled from 5,000 to 20,000 vertices
3. Theoretical Context: Achieved ratio of 1.069 against theoretical optimum (∼0.5446n)
4. Computational Feasibility: 20,000-vertex graph solved in 162.09 seconds
5. AI Verification: Independent validation through Gemini AI confirms correctness and reproducibility
6. Arguments Supporting the Hypothesis
6.1. Argument 1: Consistency Across Diverse Instance Classes
| Experiment | Instances | Avg. Ratio | Max Ratio |
|---|---|---|---|
| DIMACS Benchmarks | 32 | 1.0072 | 1.030 |
| Real-World Large Graphs | 88 | 1.007 | 1.032 |
| NPBench Hard Instances | 113 | 1.006 | 1.025 |
| AI Stress Tests | 3 | – | 1.071 |
| Combined | 236 | 1.007 | 1.071 |
6.2. Argument 2: Scalability and Improved Performance on Larger Instances
- C4000.5 (4,000 vertices): ratio 1.001
- p_hat1500-1 (1,488 optimal): ratio 1.001
- 20K 3-regular graph: ratio 1.0693 (better than 5K instance at 1.0712)
- rec-amazon (262K vertices): successfully processed
6.3. Argument 3: High Frequency of Provably Optimal Solutions
- DIMACS: 3/32 optimal (9.4%)
- Real-World: 28/88 optimal (31.8%)
- NPBench: 12/113 optimal (10.6%)
6.4. Argument 4: Consistent Improvement Over Greedy Baselines
| Instance | Hvala | Greedy | Improvement | Optimal |
|---|---|---|---|---|
| 3-Regular (5K) | 2,917 | 3,073 | 5.1% | ∼2,723 |
| 3-Regular (20K) | 11,647 | 12,350 | 5.7% | ∼10,892 |
| Power-Law (10K) | 4,957 | 5,093 | 2.7% | Unknown |
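The Improvement and Optimal columns can be reproduced from the raw cover sizes. The sketch below assumes that improvement is measured relative to the greedy cover size and that the optimum for 3-regular graphs is estimated as 0.5446n, as stated in Section 5.4; the function names are illustrative.

```python
def improvement_over_greedy(hvala_size, greedy_size):
    """Relative reduction in cover size compared with the greedy baseline."""
    return (greedy_size - hvala_size) / greedy_size


def estimated_ratio_3_regular(cover_size, n, density=0.5446):
    """Ratio against the estimated optimum density * n for 3-regular graphs."""
    return cover_size / (density * n)


# Cover sizes copied from the table above.
improvements = {
    "3-Regular (5K)": round(100 * improvement_over_greedy(2917, 3073), 1),
    "3-Regular (20K)": round(100 * improvement_over_greedy(11647, 12350), 1),
    "Power-Law (10K)": round(100 * improvement_over_greedy(4957, 5093), 1),
}
ratio_5k = round(estimated_ratio_3_regular(2917, 5000), 4)
ratio_20k = round(estimated_ratio_3_regular(11647, 20000), 4)
```

Because the 3-regular optimum is an estimate rather than a computed value, the resulting ratios should be read as approximate.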
6.5. Argument 5: Theoretical Foundation via Weight-Preserving Reduction
7. Addressing the Dubious Nature of the Hypothesis
7.1. Why This Hypothesis Appears Dubious
1. Direct Implication for P vs NP: Achieving a polynomial-time approximation ratio for vertex cover would prove that P = NP. This follows from known hardness results: Dinur and Safra [5] proved that approximating vertex cover to within factor is NP-hard for any . Therefore, a polynomial-time algorithm with ratio would solve an NP-hard problem in polynomial time, implying P = NP.
2. Millennium Prize Problem: The P versus NP problem is one of the seven Millennium Prize Problems designated by the Clay Mathematics Institute, with a $1,000,000 prize for its solution. Our hypothesis, if proven, would claim this prize by demonstrating P = NP.
3. Contradiction with Decades of Research: The overwhelming consensus in the computer science community is that P ≠ NP. Countless researchers have attempted to prove P = NP or find polynomial-time algorithms for NP-complete problems, all without success. Our hypothesis suggests we have achieved what the collective effort of the field has not.
4. Implications Beyond Vertex Cover: If P = NP, it would revolutionize:
   - Cryptography (most encryption schemes would be breakable)
   - Optimization (all NP-complete problems become tractable)
   - Artificial intelligence (many learning problems become efficiently solvable)
   - Mathematics (automated theorem proving becomes vastly more powerful)
7.2. The Hypothesis Framework as Intellectual Honesty
- Extensive empirical evidence across 233+ diverse instances
- Consistent approximation ratios between 1.000 and 1.071
- Theoretical proofs of optimality on specific graph classes (paths, cliques, star graphs, skewed bipartite graphs)
- Formal analysis of structural complementarity showing orthogonal worst-cases for different heuristics
- Independent validation through AI-assisted stress testing
- No observed counterexamples despite testing on hard instances and adversarially constructed 3-regular graphs
- Rigorous proof that our graph classification (sparse/dense/bipartite/hub-heavy) exhaustively covers ALL possible graph structures
- Worst-case analysis proving for potential adversarial graphs not in any of our identified classes
- Formal proof that no graph exists where all five heuristics simultaneously achieve ratio
- Resolution of whether achieving on all graphs would truly imply P = NP (requires complete proof, not empirical evidence)
1. Transparency: Clearly distinguishes between experimental observation and mathematical proof of P = NP
2. Falsifiability: Invites construction of counterexamples that would disprove the hypothesis
3. Community Engagement: Encourages rigorous analysis by the broader research community
4. Scientific Integrity: Acknowledges that claiming to prove P = NP requires ironclad formal proof, not just empirical evidence
7.3. Potential Explanations for the Empirical Results
- The algorithm achieves provably for all graphs
- P = NP is proven, solving a Millennium Prize Problem
- Represents the most significant breakthrough in computer science history
- Requires complete restructuring of computational complexity theory
- All 233+ tested instances happen to be "easy" for this algorithm
- Adversarial instances approaching exist but weren’t encountered
- Consistent with P ≠ NP and known hardness results
- Our test suite, despite diversity, missed the truly hard instances
- Real-world and standard benchmark graphs have structural properties absent in theoretical worst-case constructions
- Algorithm exploits these properties effectively
- Worst-case ratio could exceed on pathological instances designed to break the algorithm
- Practical usefulness without theoretical breakthrough
8. Conclusion
8.1. Summary of Empirical Evidence
- Consistent Performance: Average ratios of 1.006-1.007 across all major experiments
- No Severe Outliers: Maximum observed ratio of 1.071 on adversarial 3-regular graphs
- Optimal Solutions: 43 provably optimal solutions (18.3% of tested instances)
- Scalability: Successful processing of graphs up to 262,111 vertices
- Robustness: Strong performance across diverse graph families (bipartite, scale-free, regular, random, structured)
- Independent Validation: AI-assisted stress testing confirms reproducibility and correctness
8.2. Theoretical Implications
1. P = NP Proven: It would demonstrate that every problem whose solution can be verified in polynomial time can also be solved in polynomial time.
2. Millennium Prize: It would claim the $1,000,000 Clay Mathematics Institute prize for solving the P versus NP problem.
3. Cryptographic Revolution: Most current encryption schemes (RSA, elliptic curve cryptography) would become theoretically breakable in polynomial time.
4. Optimization Breakthrough: All NP-complete problems (traveling salesman, scheduling, bin packing, etc.) would become efficiently solvable.
5. Scientific Impact: Automated reasoning, theorem proving, drug design, and numerous other fields would be revolutionized.
8.3. Why We Remain Skeptical
1. Historical Precedent: Thousands of claimed proofs of P = NP have been proposed and all have been found to contain errors. The problem has resisted solution for over 50 years.
2. Community Consensus: The overwhelming majority of complexity theorists believe P ≠ NP based on decades of hardness results and failed algorithmic attempts.
3. Empirical Evidence ≠ Proof: No amount of experimental validation, regardless of consistency or scale, constitutes a mathematical proof. One counterexample would disprove the hypothesis.
4. Potential Hidden Assumptions: Our test suite, while diverse, may share structural properties that make all tested instances "easy" for this algorithm.
5. Missing Worst-Case Analysis: We have not proven the ratio bound for adversarially constructed graphs designed to maximize the algorithm’s approximation error.
8.4. Open Questions and Future Work
- 1.
-
Attempt Rigorous Proof or Disproof: Either:
- Prove approximation ratio for all graphs (proving P = NP), or
- Construct counterexample instances achieving ratio (disproving the hypothesis)
- 2.
-
Independent Verification: Reproduce results on:
- Additional benchmark collections
- Specially constructed adversarial graphs
- Randomized instances with controlled structural properties
- 3.
-
Comparative Analysis: Direct comparison against:
- State-of-the-art exact solvers
- Modern heuristics (TIVC, FastVC2+p)
- Machine learning-based approaches
- 4.
- Theoretical Analysis: Investigate:
- Formal properties of the degree-1 reduction
- Error bounds during projection from to G
- Necessary and sufficient conditions for
- Relationship to existing hardness results
- 5.
- Adversarial Construction: Design graphs that:
- Maximize the algorithm’s approximation ratio
- Exploit potential weaknesses in the reduction technique
- Test the limits of the ensemble heuristic selection
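The proof-or-counterexample program listed above can be explored empirically on small graphs, where the exact optimum is computable by brute force. The following is a minimal sketch under stated assumptions, not the Hvala algorithm: `greedy_max_degree_cover` is a stand-in heuristic, and `min_vertex_cover_bruteforce`, `approximation_ratio`, and `worst_ratio_over_random_graphs` are hypothetical helper names introduced only for illustration. Any candidate algorithm with the same `(vertices, edges)` interface can be passed in place of the stand-in.

```python
from itertools import combinations
import random


def is_cover(edges, cover):
    """Return True if every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def min_vertex_cover_bruteforce(vertices, edges):
    """Exact minimum vertex cover by exhaustive search (exponential; tiny graphs only)."""
    vertices = list(vertices)
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_cover(edges, set(subset)):
                return set(subset)
    return set(vertices)  # not reached: the full vertex set always covers


def greedy_max_degree_cover(vertices, edges):
    """Stand-in heuristic (NOT Hvala): repeatedly take a vertex of maximum remaining degree."""
    remaining = set(edges)
    cover = set()
    while remaining:
        degree = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        # sorted() makes tie-breaking deterministic (smallest label wins).
        best = max(sorted(degree), key=degree.get)
        cover.add(best)
        remaining = {e for e in remaining if best not in e}
    return cover


def approximation_ratio(vertices, edges, heuristic):
    """Ratio |heuristic cover| / |optimal cover|; defined as 1.0 for edgeless graphs."""
    if not edges:
        return 1.0
    candidate = heuristic(vertices, edges)
    if not is_cover(edges, candidate):
        raise ValueError("heuristic returned an invalid cover")
    optimum = min_vertex_cover_bruteforce(vertices, edges)
    return len(candidate) / len(optimum)


def worst_ratio_over_random_graphs(heuristic, n=8, p=0.4, trials=200, seed=0):
    """Sample G(n, p) graphs; return the largest ratio seen and its edge list."""
    rng = random.Random(seed)
    vertices = range(n)
    worst, witness = 1.0, []
    for _ in range(trials):
        edges = [(u, v) for u, v in combinations(vertices, 2) if rng.random() < p]
        ratio = approximation_ratio(vertices, edges, heuristic)
        if ratio > worst:
            worst, witness = ratio, edges
    return worst, witness
```

A single instance whose ratio exceeds the conjectured bound would be a counterexample. Finding none in random sampling proves nothing, which is why adversarially constructed graphs are listed as a separate item.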
8.5. Final Remarks
- Extraordinary claims require extraordinary proof: Proving P = NP requires rigorous mathematical proof, not empirical validation
- The burden of proof is immense: We must either provide ironclad formal proof or accept that our hypothesis is likely false
- Skepticism is warranted: Given 50+ years of failed attempts to prove P = NP, the most likely explanation is that we have not found a proof, but rather an algorithm that performs well on our particular test suite
- Value regardless of outcome: Even if the hypothesis is false, the algorithm demonstrates practical value for real-world vertex cover optimization
- Installation: pip install hvala
- Usage: from hvala.algorithm import find_vertex_cover
- Source Code: Available for inspection and verification
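Because the source is offered for inspection and verification, any cover the package produces can also be checked independently of the package. The sketch below assumes only that a result can be converted to a collection of vertices; the call signature and return type of `find_vertex_cover` are not stated here, so it is not invoked. `uncovered_edges` is a hypothetical helper name.

```python
def uncovered_edges(edges, cover):
    """Return the edges with neither endpoint in `cover` (empty list means valid)."""
    cover = set(cover)
    return [(u, v) for u, v in edges if u not in cover and v not in cover]


# Example: the 5-cycle 0-1-2-3-4-0.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(uncovered_edges(cycle, {0, 2, 3}))  # valid cover: prints []
print(uncovered_edges(cycle, {0, 2}))     # invalid: edge (3, 4) is missed
```

Returning the uncovered edges rather than a boolean gives a concrete witness when a claimed cover is invalid.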
References
- Karp, R.M. Reducibility Among Combinatorial Problems. In 50 Years of Integer Programming 1958–2008; Springer: Berlin, Heidelberg, 2009; pp. 219–241.
- Papadimitriou, C.H.; Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity; Courier Corporation: Mineola, New York, 1998.
- Karakostas, G. A Better Approximation Ratio for the Vertex Cover Problem. ACM Transactions on Algorithms 2009, 5, 1–8.
- Karpinski, M.; Zelikovsky, A. Approximating Dense Cases of Covering Problems. In Proceedings of the DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, Rhode Island, 1996; Vol. 26, pp. 147–164.
- Dinur, I.; Safra, S. On the Hardness of Approximating Minimum Vertex Cover. Annals of Mathematics 2005, 162, 439–485.
- Khot, S.; Minzer, D.; Safra, M. On Independent Sets, 2-to-2 Games, and Grassmann Graphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, Montreal, Canada, 2017; pp. 576–589.
- Dinur, I.; Khot, S.; Kindler, G.; Minzer, D.; Safra, M. Towards a proof of the 2-to-1 games conjecture? In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, Los Angeles, California, 2018; pp. 376–389.
- Khot, S.; Minzer, D.; Safra, M. Pseudorandom Sets in Grassmann Graph Have Near-Perfect Expansion. In Proceedings of the 2018 IEEE 59th Annual Symposium on Foundations of Computer Science, Paris, France, 2018; pp. 592–601.
- Khot, S. On the Power of Unique 2-Prover 1-Round Games. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, Montreal, Canada, 2002; pp. 767–775.
- Khot, S.; Regev, O. Vertex Cover Might Be Hard to Approximate to Within 2-ϵ. Journal of Computer and System Sciences 2008, 74, 335–349.
- Vega, F. The Hvala Algorithm. 2025. Available online: https://dev.to/frank_vega_987689489099bf/the-hvala-algorithm-5395 (accessed on 27 July 2025).
- Vega, F. The Resistire Experiment. 2025. Available online: https://dev.to/frank_vega_987689489099bf/the-resistire-experiment-632 (accessed on 15 October 2025).
- Rossi, R.; Ahmed, N. The Network Data Repository with Interactive Graph Analytics and Visualization. In Proceedings of the AAAI Conference on Artificial Intelligence, 2015; 29.
- Cai, S. Large Graphs Collection for Vertex Cover Benchmarking; Network Data Repository collection; Available online: https://lcs.ios.ac.cn/~caisw/graphs.html.
- Vega, F. The Creo Experiment. 2025. Available online: https://dev.to/frank_vega_987689489099bf/the-creo-experiment-2i1b (accessed on 20 December 2025).
- Roars. NP-Complete Benchmark Instances. Available online: https://roars.dev/npbench/.
- Vega, F. The Gemini-Vega Validation. 2025. Available online: https://dev.to/frank_vega_987689489099bf/the-gemini-vega-validation-27i2 (accessed on 21 December 2025).
- Bar-Yehuda, R.; Even, S. A Local-Ratio Theorem for Approximating the Weighted Vertex Cover Problem. Annals of Discrete Mathematics 1985, 25, 27–46.
- Zhang, Y.; Wang, S.; Liu, C.; Zhu, E. TIVC: An Efficient Local Search Algorithm for Minimum Vertex Cover in Large Graphs. Sensors 2023, 23, 7831.
- Cai, S.; Lin, J.; Luo, C. Finding a Small Vertex Cover in Massive Sparse Graphs. Journal of Artificial Intelligence Research 2017, 59, 463–494.
- Luo, C.; Hoos, H.H.; Cai, S.; Lin, Q.; Zhang, H.; Zhang, D. Local search with efficient automatic configuration for minimum vertex cover. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, 2019; pp. 1297–1304.
- Harris, D.G.; Narayanaswamy, N.S. A Faster Algorithm for Vertex Cover Parameterized by Solution Size. In Proceedings of the 41st International Symposium on Theoretical Aspects of Computer Science, Clermont-Ferrand, France, 2024; Vol. 289, pp. 40:1–40:18.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
