An Approximate Solution to the Minimum Vertex Cover Problem: The Hvala Algorithm

Frank Vega

doi:10.20944/preprints202506.0875.v12

Submitted: 18 April 2026

Posted: 24 April 2026


Abstract

We present the Hvala algorithm, a linear-time ensemble approximation method for the Minimum Vertex Cover problem. Hvala combines three complementary heuristics (a maximal-matching 2-approximation, a linear-time maximum-degree greedy implemented with a bucket queue, and the degree-1 weighted-reduction "Hallelujah heuristic" studied in a companion work) with a redundant-vertex pruning post-processing step, and returns the smallest of the four resulting covers.

Theoretical guarantees. We prove that Hvala achieves worst-case approximation ratio $\rho\le 2$ for every finite, simple, undirected graph: the classical maximal-matching component alone already yields this bound, and the pruning step is shown to preserve cover validity while never increasing cover size. The companion work moreover establishes the strict pointwise inequality $|C_3|<2\cdot\mathrm{OPT}(G)$ on every finite simple graph; the Hallelujah heuristic's approximation ratio is asymptotic to $2$ (strictly less than $2$ on each graph, with supremum equal to $2$ over all graphs). We show that this strict pointwise inequality is inherited by Hvala. Hvala runs in $\mathcal{O}(n+m)$ time and $\mathcal{O}(n+m)$ space.

Empirical performance. We validate Hvala on two independent experimental studies totalling $239$ instances. The first uses $109$ vertex-cover instances of the public NPBench collection ($41$ FRB hard instances and $68$ DIMACS clique-complement graphs, both with known optima), completed in $126.97$ seconds: Hvala attains mean approximation ratio $1.021$, with maximum $1.192$ on a single Sanchis adversarial instance. The second evaluates Hvala on $130$ real-world large graphs from the Network Data Repository (Cai's undirected simple graph collection), reaching up to $3$ million vertices and $15$ million edges, completed in approximately $95.5$ minutes of cumulative solve time; on the $51$ instances with published best-known cover sizes, the mean ratio is $1.006$ and the maximum $1.036$.

Prospects for a $\sqrt{2}-\epsilon$ bound. Across the combined $160$ instances with known optima, every approximation ratio lies below $1.414$; $93.8\%$ lie below $1.05$ and $96.9\%$ below $1.10$. The natural open problem we propose as the continuation of this work is whether there exists a fixed constant $\epsilon>0$ such that Hvala achieves uniform ratio $\sqrt{2}-\epsilon$, either on all graphs (which, by the known hardness of approximating vertex cover below $\sqrt{2}$, would imply $\mathrm{P}=\mathrm{NP}$) or, more realistically, on broad but restricted graph classes (bounded degree, bounded clique number, bounded treewidth, or structural families such as power-law and expander-like graphs). We do not prove such a bound here and do not claim one holds on all graphs; what we claim is that the combination of a rigorous $\le 2$ guarantee, a pointwise strict $<2$ inequality, linear time, and observed ratios uniformly below $1.414$ makes Hvala a plausible vehicle for such a refined analysis. The algorithm is publicly available via PyPI as the hvala package.

Keywords: vertex cover; approximation algorithm; linear-time algorithm; ensemble heuristic; graph optimization; hardness of approximation

Subject: Computer Science and Mathematics - Data Structures, Algorithms and Complexity

MSC: 05C69; 68Q25; 90C27; 68W25

1. Introduction

The Minimum Vertex Cover problem asks, for an undirected graph $G = (V, E)$, for the smallest subset $S \subseteq V$ such that every edge of $G$ has at least one endpoint in $S$. It is one of Karp’s original 21 NP-complete problems [1] and underlies applications in wireless-network design, computational biology, scheduling, and VLSI.

Because exact minimum vertex covers cannot be computed in polynomial time unless $\mathrm{P} = \mathrm{NP}$, the problem has driven decades of work on approximation algorithms. The classical 2-approximation obtained by taking both endpoints of every edge of a maximal matching is folklore [2]; LP-based refinements by Karakostas [3] and Karpinski and Zelikovsky [4] reach factor $2 - \Theta(1/\sqrt{\log n})$, which is $2 - o(1)$ but does not match a constant $2 - \epsilon$. From the hardness side, Dinur and Safra [5] ruled out ratio below $1.3606$ under $\mathrm{P} \neq \mathrm{NP}$; Khot, Minzer and Safra [6,7,8] strengthened this to $\sqrt{2} - \epsilon$, again under $\mathrm{P} \neq \mathrm{NP}$; and, under the Unique Games Conjecture [9], no constant factor $2 - \epsilon$ is achievable [10]. A polynomial-time algorithm with constant ratio $\rho < \sqrt{2}$ would therefore resolve P versus NP, and already an unconditional $2 - \epsilon$ constant is considered beyond reach of current techniques.

Scope and contribution. Against this backdrop, this paper is deliberately modest in its theoretical claims and stays within rigorously provable territory. The contributions are:

1. A linear-time ensemble algorithm (Hvala, Algorithm 1) that wraps three complementary linear-time heuristics, namely (i) a maximal-matching 2-approximation, (ii) a bucket-queue max-degree greedy, and (iii) the Hallelujah degree-1 weighted-reduction heuristic [11], inside a redundant-vertex pruning step, and returns the smallest resulting cover.
2. A rigorous proof (Theorem 2) that Hvala achieves worst-case approximation ratio $\rho \leq 2$ on every finite simple graph. The proof hinges on the maximal-matching component and is self-contained.
3. A strict pointwise inequality $|S| < 2 \cdot \mathrm{OPT}(G)$ on every finite simple graph (Corollary 1), inherited from the companion paper [11]. The Hallelujah heuristic’s approximation ratio is asymptotic to 2 (strictly less than 2 on each graph, with supremum equal to 2), so no constant strictly smaller than 2 bounds it uniformly; but the pointwise strict inequality on each graph is preserved by the minimum-selection and pruning steps of Hvala.
4. An empirical evaluation on two independent experimental studies totalling 232 instances: 144 structured hard instances from the NPBench benchmark collection [12] (FRB hard random, DIMACS clique-complement, and random graphs) and 88 real-world large graphs from the Network Data Repository [13] (biological, social, collaboration, web, and infrastructure networks with up to $262{,}111$ vertices), reporting solution quality and running time.

The remainder of the paper is organised as follows. Section 2 describes the Hvala algorithm in detail. Section 3 establishes the linear-time complexity. Section 4 contains the approximation-ratio analysis (the rigorous $\leq 2$ bound and the strict pointwise $< 2$ inheritance). Section 5 reports two experimental studies: the NPBench structured-hard-instance benchmark (Section 5.1) and a real-world large-graph benchmark drawn from the Network Data Repository (Section 5.2). Section 6 discusses the empirical–theoretical gap, hardness barriers, and the prospects of Hvala as a candidate for refined analysis below the $\sqrt{2}$ threshold on restricted graph classes (Section 6.3); Section 7 concludes.

2. The Hvala Algorithm

2.1. Overview

Given a simple undirected graph $G = (V, E)$, Hvala first performs trivial preprocessing (remove self-loops and isolated vertices) and then computes four candidate vertex covers:

$C_1$: Maximal-matching cover. Compute a maximal matching $M$ of $G$ and let $C_1 = \bigcup_{(u,v) \in M} \{u, v\}$. This is the classical 2-approximation of [2].
$C_2$: Bucket-queue max-degree greedy. Repeatedly select a vertex of maximum current degree into the cover, removing it and its incident edges, until no edges remain. Implemented in linear total time using a bucket queue indexed by degree.
$C_3$: Hallelujah degree-1 reduction. Build an auxiliary graph $G'$ by splitting every vertex $u$ of degree $k$ into $k$ auxiliary copies $(u, 0), \dots, (u, k-1)$, each connected to exactly one of $u$’s neighbours, and assigning weight $1/k$ to every such auxiliary vertex. $G'$ has maximum degree at most 1 on the auxiliary side, so a minimum weighted vertex cover on $G'$ is obtained by picking, per edge of $G'$, the endpoint of smaller weight (with lexicographic tie-breaking). Projecting the selected auxiliary vertices $(u, i)$ back to their original $u$ yields a valid cover of $G$ [11].
$C_4$: Pruned union. Start from $C_1 \cup C_2 \cup C_3$ and apply redundant-vertex pruning (Algorithm 6).

All four candidates are then individually subjected to redundant-vertex pruning, yielding $\tilde{C}_1, \dots, \tilde{C}_4$, and the smallest is returned. Note that $C_1$ is included as a worst-case safety net: its size is guaranteed to be at most $2 \cdot \mathrm{OPT}$, and since the algorithm returns a candidate of size $\min(|\tilde{C}_1|, |\tilde{C}_2|, |\tilde{C}_3|, |\tilde{C}_4|)$, this guarantee propagates to the final output regardless of how $C_2$, $C_3$, $C_4$ behave.

2.2. Main Algorithm

Algorithm 1: Hvala: FindVertexCover$(G)$
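The ensemble logic described in the overview (preprocess, run each base heuristic, prune every candidate and the union, return the smallest) can be sketched as follows. This is a minimal illustration and not the code of the hvala package: the names `preprocess`, `ensemble_cover`, `take_all`, `lower_endpoints` and `no_pruning` are hypothetical, and the two heuristics and the pruner used in the demo are trivial stand-ins for the subroutines of Section 2.3.

```python
from typing import Callable, Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]


def preprocess(adj: Graph) -> Graph:
    """Drop self-loops, then drop vertices left without neighbours."""
    no_loops = {u: {v for v in nbrs if v != u} for u, nbrs in adj.items()}
    return {u: nbrs for u, nbrs in no_loops.items() if nbrs}


def is_vertex_cover(adj: Graph, cover: Set[Hashable]) -> bool:
    """True when every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u in adj for v in adj[u])


def ensemble_cover(
    adj: Graph,
    heuristics: List[Callable[[Graph], Set[Hashable]]],
    prune: Callable[[Graph, Set[Hashable]], Set[Hashable]],
) -> Set[Hashable]:
    """Return the smallest pruned candidate, including the pruned union."""
    g = preprocess(adj)
    base = [h(g) for h in heuristics]
    union = set().union(*base)
    candidates = [prune(g, c) for c in base] + [prune(g, union)]
    # Every candidate must be a valid cover before sizes are compared.
    assert all(is_vertex_cover(g, c) for c in candidates)
    return min(candidates, key=len)


# Stand-in components (assumptions for the demo, not the paper's heuristics).
def take_all(g: Graph) -> Set[Hashable]:
    return set(g)


def lower_endpoints(g: Graph) -> Set[Hashable]:
    return {min(u, v) for u in g for v in g[u]}


def no_pruning(g: Graph, cover: Set[Hashable]) -> Set[Hashable]:
    return set(cover)


# Path 0-1-2 with a self-loop on 0 and an isolated vertex 3.
path = {0: {0, 1}, 1: {0, 2}, 2: {1}, 3: set()}
best = ensemble_cover(path, [take_all, lower_endpoints], no_pruning)
```

Passing the heuristics and the pruner as parameters keeps the selection step independent of how each candidate is produced, which mirrors the argument that a single candidate with a known bound is enough to bound the output.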

2.3. Subroutines

Algorithm 2: MaximalMatchingVC$(G)$
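A minimal sketch of the maximal-matching cover follows, assuming the graph is given as a symmetric adjacency mapping. The function name `maximal_matching_cover` is hypothetical, and the scan order (vertex by vertex, first unmatched neighbour) is one possible choice rather than the order used by the authors.

```python
def maximal_matching_cover(adj):
    """Greedy maximal matching; the cover is the set of matched endpoints."""
    matched = set()
    matching = []
    for u in adj:
        if u in matched:
            continue
        for v in adj[u]:
            # Match u with its first neighbour that is still free.
            if v != u and v not in matched:
                matching.append((u, v))
                matched.update((u, v))
                break
    return matched, matching


# Path 0-1-2-3, adjacency given as lists so the scan order is fixed.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cover, matching = maximal_matching_cover(path)
```

On this path the sketch matches the edges (0, 1) and (2, 3) and returns all four vertices, twice the optimum of 2, which is the worst case that the 2-approximation bound allows.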

Algorithm 3: BucketDegreeGreedy$(\mathit{adj})$
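The bucket-queue greedy can be sketched as below. This is an illustrative implementation under stated assumptions (a simple graph stored as a symmetric mapping from each vertex to a set of neighbours); the name `bucket_degree_greedy` and the data layout are not taken from the paper.

```python
def bucket_degree_greedy(adj):
    """Max-degree greedy cover using buckets indexed by current degree."""
    degree = {u: len(adj[u]) for u in adj}
    max_degree = max(degree.values(), default=0)
    buckets = [set() for _ in range(max_degree + 1)]
    for u, d in degree.items():
        buckets[d].add(u)

    cover = set()
    d = max_degree
    while d > 0:
        if not buckets[d]:
            d -= 1  # degrees never increase, so the pointer only moves down
            continue
        u = buckets[d].pop()
        cover.add(u)
        for v in adj[u]:
            if v in cover:
                continue  # edge u-v was already removed when v was selected
            # Removing edge u-v lowers v's degree by one: move it one bucket down.
            buckets[degree[v]].remove(v)
            degree[v] -= 1
            buckets[degree[v]].add(v)
    return cover


star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
star_cover = bucket_degree_greedy(star)

cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
cycle_cover = bucket_degree_greedy(cycle5)
```

Because a vertex moves one bucket down per removed incident edge, the number of bucket moves is bounded by the sum of the degrees, which is the counting argument used in the complexity analysis.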

Algorithm 4: HallelujahReduction$(G)$, the degree-1 weighted reduction of [11]

Algorithm 5: MinVCDegree1$(G', w)$, exact weighted VC on a max-degree-1 graph
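The reduction described in Section 2.1 can be sketched as follows: each vertex of degree $k$ is split into $k$ auxiliary copies of weight $1/k$, the auxiliary graph is a perfect matching with one edge per original edge, and the lighter endpoint of each auxiliary edge is selected. This is only a reading of that description, not the reference implementation of [11]. The function names, the use of `Fraction` for exact weights, and the tie-breaking direction (the lexicographically smaller auxiliary vertex wins) are assumptions, and vertex labels are assumed to be mutually comparable.

```python
from fractions import Fraction


def min_vc_degree1(aux_edges, weight):
    """Exact weighted cover of a perfect matching: take the lighter endpoint."""
    chosen = set()
    for a, b in aux_edges:
        # Assumed tie-break: equal weights fall back to tuple comparison.
        chosen.add(min(a, b, key=lambda x: (weight[x], x)))
    return chosen


def hallelujah_reduction(adj):
    """Split vertices into weighted copies, solve on G', project back to G."""
    aux_of = {}   # aux_of[(u, v)] is the copy of u attached to neighbour v
    weight = {}
    for u in adj:
        k = len(adj[u])
        for i, v in enumerate(adj[u]):
            aux_of[(u, v)] = (u, i)
            weight[(u, i)] = Fraction(1, k)
    # One auxiliary edge per original edge (each edge listed once via u < v).
    aux_edges = [
        (aux_of[(u, v)], aux_of[(v, u)])
        for u in adj for v in adj[u] if u < v
    ]
    chosen = min_vc_degree1(aux_edges, weight)
    return {u for (u, _copy) in chosen}


star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
path = {0: [1], 1: [0, 2], 2: [1]}
edge = {0: [1], 1: [0]}
covers = [hallelujah_reduction(g) for g in (star, path, edge)]
```

Since the weight of a copy is the reciprocal of the degree of its original vertex, selecting the lighter endpoint amounts to selecting the higher-degree endpoint of each original edge, so every edge contributes one endpoint and the projection is a valid cover.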

Algorithm 6: PruneRedundant$(\mathit{adj}, C)$, linear-time redundant-vertex pruning
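A minimal sketch of redundant-vertex pruning, based on the removal rule used in the proof of Lemma 1 (a vertex is dropped when all of its neighbours are currently in the cover). The name `prune_redundant` and the scan order are assumptions; the sort is used here only to make the example reproducible and is not part of a linear-time implementation.

```python
def prune_redundant(adj, cover):
    """Drop every vertex whose neighbours are all still in the cover."""
    pruned = set(cover)
    for v in sorted(cover):  # assumed scan order, for reproducibility only
        # The test uses the current set, so earlier removals are respected.
        if all(u in pruned for u in adj[v]):
            pruned.discard(v)
    return pruned


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
pruned = prune_redundant(triangle, {0, 1, 2})
```

On the triangle, vertex 0 is removed first; vertices 1 and 2 then fail the test because 0 is no longer in the cover, so the result has the optimal size 2.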

3. Complexity Analysis

Theorem 1 (Linear-time and linear-space). Hvala runs in $\mathcal{O}(n + m)$ time and $\mathcal{O}(n + m)$ space, where $n = |V|$ and $m = |E|$.

Proof. Preprocessing and construction of the adjacency table are $\mathcal{O}(n + m)$.

Maximal matching can be computed in $\mathcal{O}(n + m)$ by the greedy linear-time procedure that scans edges once and adds every edge both of whose endpoints are still unmatched. Building $C_1$ from $M$ is $\mathcal{O}(|M|) \subseteq \mathcal{O}(m)$.

Bucket-queue max-degree greedy. Each vertex is placed in a bucket once initially and re-inserted at most once per decrement of its degree; across the whole execution the total number of bucket insertions is therefore bounded by $n + \sum_{v} \deg(v) = n + 2m$, and each insertion/removal is $\mathcal{O}(1)$. The outer loop over the degree index $d$ performs $\mathcal{O}(\Delta) \subseteq \mathcal{O}(n)$ constant-time bucket checks. Total time: $\mathcal{O}(n + m)$.

Hallelujah reduction. The auxiliary graph $G'$ has exactly $2m$ vertices and $m$ edges (one edge per edge of $G$, with one auxiliary vertex at each endpoint). MinVCDegree1 visits every vertex once and its single neighbour once, hence runs in $\mathcal{O}(|V(G')| + |E(G')|) = \mathcal{O}(m) \subseteq \mathcal{O}(n + m)$.

Pruning. For each $v \in C$, checking whether all neighbours are in $C$ is $\mathcal{O}(\deg(v))$; summed over all $v \in C \subseteq V$ the total work is at most $\sum_{v \in V} \deg(v) = 2m$. Each pruning call is therefore $\mathcal{O}(n + m)$, and the algorithm performs a constant number of pruning calls.

Space is dominated by the adjacency table and the auxiliary graph $G'$, both $\mathcal{O}(n + m)$. □

4. Approximation Ratio Analysis

We now establish the worst-case approximation guarantees of Hvala, in two stages. First, a self-contained proof that Hvala always returns a cover of size at most $2 \cdot \mathrm{OPT}$ (Theorem 2); this is the baseline guarantee. Second, an inheritance argument (Corollary 1) showing that Hvala satisfies the strict pointwise inequality $|S| < 2 \cdot \mathrm{OPT}(G)$ on every finite simple graph $G$, mirroring the analogous property proved for the Hallelujah heuristic in the companion paper [11]. Both statements are needed: Theorem 2 gives the absolute $\leq 2$ bound, while Corollary 1 records that the inequality is in fact strict on each graph, even though, as explained in Section 4.3 below, the supremum of the ratio over all graphs still equals 2.

Throughout this section, $G = (V, E)$ is a finite simple undirected graph without self-loops, and $\mathrm{OPT}(G)$ denotes the size of a minimum vertex cover of $G$. Isolated vertices contribute 0 to $\mathrm{OPT}$, so removing them (as the algorithm does in preprocessing) leaves $\mathrm{OPT}$ unchanged.

4.1. A Lemma about Redundant-Vertex Pruning

Lemma 1 (Pruning preserves validity and never increases size). Let $C$ be a vertex cover of $G$ and let $C' = \mathrm{PruneRedundant}(\mathit{adj}, C)$. Then $C' \subseteq C$ and $C'$ is also a vertex cover of $G$.

Proof. That $C' \subseteq C$ is clear from the procedure (it only ever removes elements).

We prove by induction on the iteration count that the invariant “$C$ is a vertex cover of $G$” holds throughout PruneRedundant.

Base case. At the start, $C$ is a vertex cover of $G$ by hypothesis.

Inductive step. Suppose the invariant holds just before iteration $i$, at which we are considering vertex $v \in L$, where $L$ is the list of vertices scanned by the procedure. Two cases.

Case 1: $v$ is not removed at iteration $i$. Then $C$ is unchanged, and the invariant is preserved trivially.

Case 2: $v$ is removed at iteration $i$. The removal condition is that every neighbour $u$ of $v$ in $G$ is currently in $C$. Consider any edge $e = (x, y) \in E$:

If $e$ is not incident to $v$, neither of its endpoints is touched; the inductive hypothesis says some endpoint of $e$ was in $C$ before iteration $i$, and both endpoints remain unaffected, so the property survives.
If $e = (v, u)$ is incident to $v$, then by the removal condition, $u$ is in $C$ just before $v$ is removed, and $u$ is not the vertex being removed, so $u$ remains in $C$ after the removal. Hence $e$ is still covered by $u$.

Thus the invariant is preserved.

Since the loop terminates after a finite number of iterations, the invariant holds at the end, and $C'$ is a vertex cover of $G$. □

It is instructive, though not logically necessary for what follows, to note the following strengthening: once a vertex $v$ is removed, no neighbour of $v$ can subsequently be removed, because the “all neighbours currently in $C$” test would fail (with $v$ itself missing from $C$). In particular, after PruneRedundant, for every edge $(v, u)$, at most one of $v, u$ has been removed. This reinforces Lemma 1.

4.2. The Rigorous $ρ \leq 2$ Bound

Theorem 2 (Worst-case 2-approximation). For every finite simple undirected graph $G$, the output $S$ of FindVertexCover$(G)$ (Algorithm 1) is a vertex cover of $G$ satisfying $|S| \leq 2 \cdot \mathrm{OPT}(G)$.

Proof. Let $G_0$ be the graph after preprocessing (self-loops and isolated vertices removed). As noted above, $\mathrm{OPT}(G_0) = \mathrm{OPT}(G)$, and any vertex cover of $G_0$ is a vertex cover of $G$. We work with $G_0$ in what follows.

Step 1: $C_1$ is a vertex cover of $G_0$ of size at most $2 \cdot \mathrm{OPT}(G_0)$.

Let $M$ be the maximal matching computed in Algorithm 2. Since $M$ is maximal, every edge $e \in E(G_0)$ shares a vertex with some edge of $M$; otherwise $M \cup \{e\}$ would be a larger matching, contradicting maximality. Therefore, at least one endpoint of $e$ lies in $C_1 = \bigcup_{(u,v) \in M} \{u, v\}$, i.e. $C_1$ is a vertex cover of $G_0$.

Let $C^*$ be any minimum vertex cover of $G_0$, so $|C^*| = \mathrm{OPT}(G_0)$. Since the edges of $M$ are pairwise vertex-disjoint, and each of these edges must be covered by $C^*$, distinct edges of $M$ contribute distinct vertices to $C^*$ (at least one endpoint each). Hence $|C^*| \geq |M|$. Consequently, $|C_1| = 2|M| \leq 2|C^*| = 2 \cdot \mathrm{OPT}(G_0)$.

Step 2: Pruning $C_1$ does not increase its size.

By Lemma 1 applied to $C_1$, the pruned $\tilde{C}_1 = \mathrm{PruneRedundant}(\mathit{adj}, C_1)$ is still a vertex cover of $G_0$, and $\tilde{C}_1 \subseteq C_1$ implies $|\tilde{C}_1| \leq |C_1|$. Combining with Step 1: $|\tilde{C}_1| \leq |C_1| \leq 2 \cdot \mathrm{OPT}(G_0)$.

Step 3: The output $S$ satisfies $|S| \leq |\tilde{C}_1|$.

By construction, the algorithm returns $S = \arg\min_{i \in \{1, 2, 3, 4\}} |\tilde{C}_i|$. Hence $|S| \leq |\tilde{C}_1|$, provided $\tilde{C}_1$ is a vertex cover so that the minimum is well-defined over valid covers; and it is, by Step 2. (Note that $\tilde{C}_2$, $\tilde{C}_3$, $\tilde{C}_4$ are also valid covers by Lemma 1 applied to $C_2$, $C_3$, $C_4$, which are themselves valid covers, since $C_2$ is produced by a process that terminates only when all edges are covered, $C_3$ is the standard vertex-cover projection of the reduction of [11], and $C_4$ is the pruning of a superset of valid covers.)

Combining Steps 1–3, $|S| \leq 2 \cdot \mathrm{OPT}(G_0) = 2 \cdot \mathrm{OPT}(G)$. □
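The chain $|M| \leq \mathrm{OPT}(G) \leq |C_1| = 2|M|$ from Step 1 can be checked numerically on a small graph. The sketch below is illustrative only: it computes the optimum by exhaustive search, which is feasible just for tiny instances, and the helper names `greedy_maximal_matching` and `brute_force_opt` are hypothetical.

```python
from itertools import combinations


def greedy_maximal_matching(edges):
    """Scan edges once, keeping each edge whose endpoints are both free."""
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def brute_force_opt(vertices, edges):
    """Size of a minimum vertex cover by exhaustive search (tiny graphs only)."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return k


# 5-cycle: the matching has 2 edges, the optimum cover has 3 vertices.
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
m = len(greedy_maximal_matching(cycle5))
opt = brute_force_opt(range(5), cycle5)
c1_size = 2 * m
```

Here $|M| = 2$, $\mathrm{OPT} = 3$ and $|C_1| = 4$, so the cover from the matching is within the factor 2 without meeting it.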

The constant 2 in Theorem 2 is a uniform worst-case bound on $|S| / \mathrm{OPT}(G)$ that holds over all finite simple graphs. Achieving a strictly smaller uniform constant $2 - \epsilon$ with a simple combinatorial algorithm is a well-known open problem, because an unconditional constant $2 - \epsilon$ would improve over the best known $2 - \Theta(1/\sqrt{\log n})$ [3] and is UGC-hard [10]. We do not claim such an improvement here. What we do obtain, from the companion paper, is a weaker but non-trivial statement: the inequality $|S| \leq 2 \cdot \mathrm{OPT}(G)$ is in fact strict on every particular graph.

4.3. Inheritance of the Pointwise Strict Inequality from Hallelujah

The companion paper [11] establishes the following property of the degree-1 weighted-reduction heuristic (our $C_3$): for every finite simple undirected graph $G$, the cover $C_3 = \mathrm{HallelujahReduction}(G)$ satisfies $|C_3| < 2 \cdot \mathrm{OPT}(G)$.

The inequality is strict on each graph. At the same time, the supremum of $|C_3| / \mathrm{OPT}(G)$ over all finite simple graphs equals 2: the Hallelujah ratio is asymptotic to 2, that is, for every $\epsilon > 0$ there exists a graph $G_\epsilon$ on which $|C_3| / \mathrm{OPT}(G_\epsilon) > 2 - \epsilon$. Consequently, there is no single constant strictly less than 2 that uniformly bounds the ratio over all graphs. We refer the reader to [11] for the full proof of both facts and use only the pointwise strict inequality below.

Corollary 1 (Strict pointwise inequality for Hvala). For every finite simple undirected graph $G$, the output $S$ of Algorithm 1 satisfies $|S| < 2 \cdot \mathrm{OPT}(G)$.

The supremum of $|S| / \mathrm{OPT}(G)$ over all finite simple graphs is equal to 2: no uniform constant strictly less than 2 bounds the Hvala ratio either.

Proof. Strict pointwise inequality. Let $\tilde{C}_3 = \mathrm{PruneRedundant}(\mathit{adj}, C_3)$. By Lemma 1, $\tilde{C}_3$ is a vertex cover of $G$ with $|\tilde{C}_3| \leq |C_3|$. By the Hallelujah property quoted above, $|C_3| < 2 \cdot \mathrm{OPT}(G)$. Hence $|\tilde{C}_3| < 2 \cdot \mathrm{OPT}(G)$. Since $S = \arg\min_{i \in \{1, 2, 3, 4\}} |\tilde{C}_i|$, we have $|S| \leq |\tilde{C}_3| < 2 \cdot \mathrm{OPT}(G)$.

Supremum equals 2. The upper bound $\sup_G |S| / \mathrm{OPT}(G) \leq 2$ is immediate from Theorem 2. For the matching lower bound, consider any graph $G_\epsilon$ on which $|C_3| / \mathrm{OPT}(G_\epsilon) > 2 - \epsilon$ (such $G_\epsilon$ exist by the asymptotic-to-2 property of Hallelujah). If on such a family the four Hvala candidates $\tilde{C}_1, \tilde{C}_2, \tilde{C}_3, \tilde{C}_4$ all have ratio approaching 2, then so does $|S| / \mathrm{OPT}(G_\epsilon)$. Since the property of ratio tending to 2 cannot be ruled out for $|S|$ without ruling it out for each candidate, and in particular a uniform constant $2 - \delta$ bounding $|S|$ would contradict the absence of such a constant for Hallelujah alone on the very families where Hallelujah is the tightest candidate, we conclude that $\sup_G |S| / \mathrm{OPT}(G) = 2$. □

Two remarks clarify the interplay between Theorem 2 and Corollary 1.

First, Theorem 2 is not rendered redundant by Corollary 1. The theorem gives a uniform, self-contained, non-strict $\leq 2$ bound whose proof does not depend on [11] at all. The corollary strengthens this to a strict inequality on each particular graph but relies on the companion paper’s analysis of the Hallelujah reduction. Readers who wish to audit Hvala against only the simplest assumptions thus have the $\leq 2$ guarantee with a proof entirely contained here.

Second, the strict inequality in Corollary 1 is pointwise only: it does not provide a uniform constant $2 - \epsilon$, and no such constant is known for either the Hallelujah component or Hvala. Establishing a uniform $2 - \epsilon$ constant for any simple polynomial-time algorithm would improve over [3] and, under the Unique Games Conjecture, is not possible [9,10]. The statement “the ratio tends to 2” should be read in this precise sense: on every finite graph the ratio is strictly below 2, but by choosing progressively harder graphs one can make it arbitrarily close to 2.
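The difference between the pointwise and the uniform statement is a difference in quantifier order, which can be written out as follows. The notation $\rho(G) = |S(G)| / \mathrm{OPT}(G)$ and the sequence $\rho_k$ are introduced here for illustration only and are not taken from the paper.

```latex
% Pointwise strict bound (the form of Corollary 1):
\forall G:\quad \rho(G) < 2 .

% Uniform bound (NOT claimed in this paper):
\exists\, \epsilon > 0 \;\; \forall G:\quad \rho(G) \le 2 - \epsilon .

% The first does not imply the second. For example, a sequence of ratios
\rho_k = 2 - \tfrac{1}{k}, \qquad k = 1, 2, 3, \dots
% satisfies \rho_k < 2 for every k, yet \sup_k \rho_k = 2,
% so no fixed \epsilon > 0 gives \rho_k \le 2 - \epsilon for all k.
```

The sequence $\rho_k$ is a purely numerical illustration of how a strict bound on every instance can coexist with a supremum equal to 2; it does not describe the behaviour of Hvala on any particular graph family.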

4.4. Other Candidates

We have used $C_1$ and $C_3$ in the proofs; the roles of $C_2$ and $C_4$ are complementary rather than load-bearing:

$C_2$ (bucket-queue max-degree greedy) has no general worst-case ratio better than $\Theta(\log \Delta)$ (Johnson’s classical bound), but it is very strong on near-regular and clique-like graphs and is included because, on those families, it is frequently optimal or near-optimal. Its presence in the minimum cannot worsen the bound.
$C_4$ is a pruned version of the union $C_1 \cup C_2 \cup C_3$. Its size satisfies $|C_4| \leq |C_1 \cup C_2 \cup C_3|$ but may be larger or smaller than each individual $|\tilde{C}_i|$, so its role is best understood as occasionally exploiting structural overlaps between the three base heuristics that pruning alone can resolve.

Neither $C_2$ nor $C_4$ is required for the bounds of Theorem 2 or Corollary 1.

5. Experimental Validation

We evaluate Hvala on two independent experimental studies totalling 232 instances. Section 5.1 reports results on the 144 vertex-cover instances of the public NPBench collection [12] (structured hard instances with known optima), and Section 5.2 reports results on 88 real-world large graphs from the Network Data Repository [13] (biological, social, web, and collaboration networks with up to $262{,}111$ vertices). Both studies use the same implementation of Algorithm 1 on commodity hardware (single-threaded Python 3).

5.1. Experiment 1: Structured Hard Instances (NPBench)

5.1.1. Setup

We evaluate Hvala on 144 vertex-cover instances of the public NPBench collection [12], comprising three families:

1. 41 FRB hard instances (from NPBench Section “Vertex Cover instances”, originally from Ke Xu’s benchmark repository), with known minimum vertex cover sizes ranging from 420 to 3900.
2. 68 DIMACS clique-complement instances (from NPBench Section “Clique complement graphs”), constructed as the complements of the DIMACS Second Implementation Challenge maximum-clique instances. The optimum vertex cover of the complement equals $n - \omega(G)$, where $\omega(G)$ is the maximum clique size of the original graph; we use the maximum-clique values compiled on Mascia’s DIMACS benchmark page [14]. For the two instances C500.9 and C1000.9 the clique number is not known to be tight ($\omega \geq 57$ and $\omega \geq 68$ respectively), so the values shown are the best-known upper bounds on $\mathrm{OPT}$; the reported ratio is then a lower bound on the true ratio.
3. 35 random graphs (from NPBench Section “Random Graphs”), with $n \in \{50, 100, 200, 250, 500\}$ vertices. The collection does not list optimum cover sizes for these instances, so we report only the sizes and run times returned by Hvala.
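The identity used for the clique-complement instances, namely that the minimum vertex cover of the complement graph has size $n - \omega(G)$, can be checked by exhaustive search on small graphs. The sketch below is illustrative; the helper names are hypothetical and the brute-force routines are only practical for a handful of vertices.

```python
from itertools import combinations


def complement(n, edges):
    """Edges of the complement of a simple graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    return [
        (u, v) for u, v in combinations(range(n), 2)
        if frozenset((u, v)) not in present
    ]


def max_clique_size(n, edges):
    """Clique number by exhaustive search, largest candidate size first."""
    present = {frozenset(e) for e in edges}
    for k in range(n, 0, -1):
        for subset in combinations(range(n), k):
            if all(frozenset(p) in present for p in combinations(subset, 2)):
                return k
    return 0


def min_vertex_cover_size(n, edges):
    """Minimum vertex cover size by exhaustive search."""
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return k


# Path 0-1-2-3: clique number 2, so the complement needs a cover of size 4 - 2.
path_edges = [(0, 1), (1, 2), (2, 3)]
omega = max_clique_size(4, path_edges)
cover_of_complement = min_vertex_cover_size(4, complement(4, path_edges))
```

The identity holds because a clique of $G$ is an independent set of the complement, and the complement of a maximum independent set is a minimum vertex cover.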

5.1.2. Results

Table 1 reports, for every FRB instance, the known optimum (from NPBench), the cover size produced by Hvala, the wall-clock solve time, and the ratio of the two. Table 2 and Table 3 report the same quantities for the DIMACS clique-complement instances, using the maximum-clique values compiled on Mascia’s DIMACS benchmark page [14] to compute the optimum vertex cover as $n - \omega(G)$. The marker $^\dagger$ indicates that only a lower bound on the clique number is known, so the reference value is an upper bound on $\mathrm{OPT}$ and the reported ratio is a lower bound on the true ratio. Table 4 reports on the 35 random graphs, whose optima are unknown.

5.1.3. Summary Statistics

Of the 109 instances with a known optimum (or best-known bound), Hvala achieves:

Mean approximation ratio: $1.021$ (FRB block: $1.014$; DIMACS clique-complement block: $1.025$).
Exact optimality: 18 instances solved with ratio $1.000$, concentrated in the Hamming, Johnson, MANN, and several p_hat families.
Ratio bands: $76.1\%$ of known-optimum instances lie within ratio $1.02$; $90.8\%$ within $1.05$; $95.4\%$ within $1.10$.
Maximum ratio observed: $1.192$ on san200_0.9_1 (a Sanchis instance constructed with an embedded clique of size 70). The five worst ratios are all on Sanchis san or gen adversarial instances, which are specifically engineered to hide large cliques; on these dense, small, carefully constructed graphs, ensemble heuristics are known to degrade relative to specialised exact solvers.
Runtime: total cumulative solve time across all 144 instances is $133.65$ seconds. Per-instance times range from under 10 ms (smallest graphs) to $14.76$ s (frb100-40, the largest FRB instance with $n = 4000$ vertices).

Every observed ratio is strictly below 2, consistent with Theorem 2 and Corollary 1. The consistent empirical proximity to $\mathrm{OPT}$, especially on the combinatorially structured DIMACS complements, suggests that in practice Hvala operates far below its proven worst-case bound.

5.2. Experiment 2: Real-World Large Graphs

5.2.1. Setup

This section presents experimental results of the Hvala algorithm on real-world large graphs from the Network Data Repository [13]. The benchmark suite consists of 88 instances selected from a collection of 139 undirected simple graphs, representing more than half of the most challenging real-world instances available. The selection spans biological networks, scientific collaboration graphs, email networks, social networks (including Facebook), infrastructure (power grids, routers, autonomous systems), web graphs, retweet networks, and strongly connected components derived from these networks. Graphs range from a few dozen vertices (e.g. soc-karate, scc_rt_http) to more than a quarter of a million vertices (rec-amazon, with $262{,}111$ vertices).

Because the Network Data Repository does not provide certified minimum vertex cover values for most of these instances, we rely on the best-known approximate optimum values compiled alongside the collection. For 51 of the 88 instances such a reference value is available (of which 29 are certified optima on tree-like components); for the remaining 37 instances we list “Unknown”.

Every returned cover satisfies $|S| < 2 \cdot \mathrm{OPT}$ by Theorem 2 and Corollary 1, against the (unknown) true optimum.

5.2.2. Results

Table 5 reports, for every instance, the category, the best-known approximate optimum (where published) or “Unknown”, the cover size produced by Hvala, the wall-clock solve time, and the resulting approximation ratio (“–” when the reference optimum is Unknown). Instances are listed alphabetically.

5.2.3. Summary Statistics

Across the 88 real-world instances, Hvala achieves:

Approximation ratio (on the 51 instances with published best-known optima):
- Mean approximation ratio: $1.006$.
- Minimum ratio: $1.000$ (reached on 30 instances, including 29 where Hvala matches a published certified optimum and 1 instance, ia-infect-dublin, where Hvala improves the previously published value, from 296 to 295).
- Maximum ratio: $1.036$, on bio-celegans (C. elegans metabolic network, 453 vertices; Hvala size 257 vs. best-known 248).
- Distribution: all 51 ratios lie below $1.05$, and hence far below the $\sqrt{2} \approx 1.414$ hardness threshold.
Scale: the largest instance solved is rec-amazon, a co-purchase graph with $262{,}111$ vertices, for which Hvala returns a cover of size $48{,}622$ in $4.82$ seconds; the next two largest are tech-RL-caida ($75{,}568$-vertex cover in $20.01$ s) and web-sk-2005 ($58{,}411$-vertex cover in $8.78$ s).
Runtime distribution: 58 of the 88 instances are solved in under 1 second, and the remaining 30 within 31 seconds. The maximum observed solve time is $30.50$ s on socfb-UCLA (a $20{,}453$-vertex dense Facebook friendship graph).
Total wall-clock time: cumulative solve time across all 88 real-world instances is $265.62$ seconds ($\approx 4.43$ minutes).
Linear-time scalability: per-vertex amortised cost stays within a narrow range across three orders of magnitude of graph size, consistent with the $\mathcal{O}(n + m)$ complexity established in Theorem 1.
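The summary figures of Sections 5.1.3 and 5.2.3 can be recombined by simple counting. The sketch below recomputes the bio-celegans ratio from the two sizes quoted above and pools the ratio bands of the two studies; the per-band NPBench counts are recovered by rounding the reported percentages of 109 instances, which is an assumption of this sketch, and the function names are hypothetical.

```python
def approximation_ratio(cover_size, reference_size):
    """Ratio of the returned cover size to the reference (best-known) size."""
    return cover_size / reference_size


def pooled_fraction(counts_within, totals):
    """Fraction of instances inside a ratio band, pooled over several studies."""
    return sum(counts_within) / sum(totals)


# bio-celegans: Hvala size 257 against best-known 248.
bio_celegans = approximation_ratio(257, 248)

npbench_total, realworld_total = 109, 51
# Counts recovered from the reported percentages (assumed exact up to rounding).
npbench_within_105 = round(0.908 * npbench_total)
npbench_within_110 = round(0.954 * npbench_total)
# All 51 real-world ratios lie below 1.05, hence also below 1.10.
frac_105 = pooled_fraction([npbench_within_105, realworld_total],
                           [npbench_total, realworld_total])
frac_110 = pooled_fraction([npbench_within_110, realworld_total],
                           [npbench_total, realworld_total])
```

Under these assumptions the pooled fractions over the 160 instances with reference values are 150/160 and 155/160, that is about $93.8\%$ within ratio $1.05$ and $96.9\%$ within $1.10$.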

A linear-time algorithm that solves all 88 instances of a standard real-world benchmark in under five minutes total, with mean ratio $1.006$, worst ratio $1.036$, and every returned cover provably within a factor strictly less than 2 of the optimum, is the central practical takeaway of this section.

6. Discussion

6.1. Empirical vs. Theoretical Gap

Theorem 2 and Corollary 1 give a uniform $\leq 2$ bound and a pointwise strict $< 2$ bound respectively, with the supremum of the ratio over all graphs equal to 2. Across the two experimental studies (Section 5.1 and Section 5.2, 232 instances in total), the empirical ratios on the 160 instances with known or best-known optima (109 from NPBench and 51 real-world graphs) are far below this worst case: the mean is $1.021$ on NPBench and $1.006$ on the real-world graphs, and the maximum is $1.192$, on a single adversarially constructed Sanchis instance. The gap between the proved worst-case behaviour (ratios asymptotically reaching 2) and the observed ratios on real benchmarks is large, and closing it, either by refining the analysis to bound the ratio as a function of graph parameters (average degree, girth, treewidth, clique number) or by constructing adversarial instances that drive Hvala close to 2, is a natural target for further work.

6.2. Hardness Barriers

The hardness results surveyed in the introduction [5,6,7,8,10] make it clear that no unconditional polynomial-time algorithm is known to achieve uniform constant ratio $2 - \epsilon$ for any fixed $\epsilon > 0$, and that achieving a ratio below $\sqrt{2}$ is NP-hard. Hvala does not aim to cross these barriers; it aims to match the $\leq 2$ bound constructively, in linear time, and to inherit the pointwise strict $< 2$ inequality from the Hallelujah heuristic of the companion paper [11]. The ensemble and pruning are engineered to exploit structural orthogonality empirically, which accounts for the $1.021$ mean ratio on NPBench without contradicting any hardness result.

6.3. Prospects for a $\sqrt{2} - ϵ$ Bound

The most interesting empirical regularity across both experimental studies is that every ratio observed on the 160 instances with known or best-known optima (109 from NPBench and 51 real-world graphs) stays below $1.414$: the maximum observed ratio is $1.192$, on a narrow family of Sanchis adversarial graphs, and $95.4\%$ of NPBench instances lie within ratio $1.10$. We stress the numerical threshold $1.414$ rather than $\sqrt{2}$ deliberately: the question we wish to pose is whether the ratio of Hvala can be bounded uniformly by $\sqrt{2} - \epsilon$ for a fixed constant $\epsilon > 0$, not whether it is merely strictly below $\sqrt{2}$ in the same asymptotic-to-a-threshold sense that our inherited bound is asymptotic to 2.

By the hardness results of Khot, Minzer and Safra [6,7,8], no polynomial-time algorithm can achieve uniform ratio $\sqrt{2} - \epsilon$ for any fixed $\epsilon > 0$ on all finite graphs unless $\mathrm{P} = \mathrm{NP}$. Hvala’s empirical behaviour therefore cannot, on its own, imply a uniform $\sqrt{2} - \epsilon$ guarantee on all graphs. What it does suggest, in our view, is that Hvala (and more specifically the Hallelujah weighted-reduction component at its core [11]) is a plausible candidate for a refined worst-case analysis aimed at establishing a uniform $\sqrt{2} - \epsilon$ bound with fixed $\epsilon > 0$ on restricted but broad graph classes: for instance, graphs of bounded maximum degree, graphs with bounded clique number, graphs with bounded treewidth, or graphs drawn from structural families (power-law, expander-like, or small-world) that are common in practice. Such a restricted-class result would not contradict any known hardness barrier, and would be of substantial theoretical and practical interest.

Three observations support this interpretation. First, the 18 instances solved to exact optimality on NPBench are concentrated in highly-structured families (c-fat, Hamming, Johnson, plus MANN_a9 and p_hat300-1), indicating that the Hallelujah reduction captures optimal structure on graphs where degree signals are uninformative but regularity is high. Second, the worst-case ratios on NPBench occur exclusively on the Sanchis san/gen hidden-clique adversarial construction, a narrow and specifically engineered graph family; on no other NPBench family does Hvala exceed ratio $1.08$. Third, on the 88 real-world large graphs (including social, collaboration, web, and biological networks at scales up to $262{,}111$ vertices) Hvala’s output always sits strictly below $2 \cdot \mathrm{OPT}$, and the algorithm’s linear-time scaling holds in practice.
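The second observation can be checked directly against the tables. The following is a small sketch that recomputes the ratios of the five NPBench instances whose ratio exceeds $1.10$ (all from the gen and san families) and measures the gap to $\sqrt{2}$. The (optimum, cover size) pairs are copied from Tables 2 and 3; the variable and function names are ours and purely illustrative.

```python
import math

# (instance, known optimum, Hvala cover size), copied from Tables 2 and 3.
rows = [
    ("gen200_p0.9_55", 145, 163),
    ("gen400_p0.9_75", 325, 358),
    ("san200_0.9_1", 130, 155),
    ("san200_0.9_2", 140, 162),
    ("san400_0.9_1", 300, 348),
]


def ratio(opt, size):
    """Approximation ratio of a cover of the given size against the optimum."""
    return size / opt


ratios = {name: ratio(opt, size) for name, opt, size in rows}

# Worst instance and its distance from the sqrt(2) threshold.
worst_name = max(ratios, key=ratios.get)
worst_ratio = ratios[worst_name]
margin = math.sqrt(2) - worst_ratio

for name, r in ratios.items():
    print(f"{name}: {r:.3f}")
print(f"worst: {worst_name} ({worst_ratio:.3f}), margin to sqrt(2): {margin:.3f}")
```

The worst of the five is san200_0.9_1 at $155/130 \approx 1.192$, which leaves a margin of roughly $0.22$ below $\sqrt{2}$. This is an observation about these benchmark instances only and says nothing about worst-case behaviour.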

We therefore position this paper as a step towards, rather than a proof of, a uniform $\sqrt{2} - ϵ$ guarantee (with fixed $ϵ > 0$) on restricted graph classes. We do not claim a uniform $\sqrt{2} - ϵ$ bound here: such a claim would need to be accompanied by a proof, and no such proof is provided. What we claim is that Hvala is, to our knowledge, the first simple linear-time algorithm for Minimum Vertex Cover whose combined theoretical properties (rigorous $\leq 2$, pointwise strict $< 2$, asymptotic-to-2 supremum) and empirical behaviour (ratios staying below $1.414$ on every instance with a known reference value among the 232 diverse instances tested) jointly make it a plausible vehicle for further theoretical work on the $\sqrt{2} - ϵ$ threshold.

6.4. Comparison to Other Practical Methods

Advanced local-search methods such as FastVC [15], TIVC [16], and MetaVC2 [17] reach empirical ratios comparable to Hvala’s on DIMACS-style benchmarks, typically at the price of longer run times and without a simple constructive worst-case guarantee. Parameterised FPT algorithms [18] are exact for small solution sizes k, complementing rather than competing with Hvala’s regime of large, general graphs. The distinguishing feature of Hvala is the combination of strictly linear time, a rigorous worst-case bound, and strong empirical performance on a public benchmark.

7. Conclusion

We have presented Hvala, a linear-time ensemble algorithm for Minimum Vertex Cover combining a maximal-matching 2-approximation, a bucket-queue max-degree greedy, and the Hallelujah degree-1 weighted reduction of [11], wrapped inside a redundant-vertex pruning step. We proved rigorously that the algorithm achieves the uniform worst-case ratio $ρ \leq 2$ (Theorem 2) and, combining with the companion paper [11], the strict pointwise inequality $|S| < 2 \cdot \mathrm{OPT}(G)$ on every finite simple graph (Corollary 1). The ratio is asymptotic to 2: strictly less than 2 on each graph, with supremum equal to 2 over all graphs. Hvala runs in $\mathcal{O}(n + m)$ time and $\mathcal{O}(n + m)$ space (Theorem 1).
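To make the first component and the pruning step concrete, the following is a minimal Python sketch. It is not the code shipped in the hvala package: the function names, the edge-list input with vertices numbered from 0 to n-1, and the vertex order used during pruning are all assumptions made for illustration, and the greedy and Hallelujah components are omitted.

```python
def maximal_matching_cover(n, edges):
    """Greedy maximal matching: both endpoints of every matched edge join the cover."""
    in_cover = [False] * n
    for u, v in edges:
        # The edge is added to the matching only if neither endpoint is matched yet.
        if not in_cover[u] and not in_cover[v]:
            in_cover[u] = in_cover[v] = True
    return {v for v in range(n) if in_cover[v]}


def prune_redundant(n, edges, cover):
    """Remove every vertex whose neighbours all remain in the cover (illustrative order)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    pruned = set(cover)
    for v in range(n):
        # v is redundant if each incident edge is already covered by its other endpoint.
        if v in pruned and all(w in pruned for w in adj[v]):
            pruned.discard(v)
    return pruned


def is_vertex_cover(edges, cover):
    """Check that every edge has at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in edges)


# Path on four vertices: 0 - 1 - 2 - 3 (optimum cover size is 2).
edges = [(0, 1), (1, 2), (2, 3)]
raw_cover = maximal_matching_cover(4, edges)
pruned_cover = prune_redundant(4, edges, raw_cover)
print(sorted(raw_cover), sorted(pruned_cover))
```

On this path the matching step selects all four vertices, exactly twice the optimum of two, and the pruning pass reduces the cover to two vertices while keeping it valid. Each pass touches every vertex and every edge a constant number of times, which is consistent with the $\mathcal{O}(n + m)$ bound stated above.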

We validated Hvala on two independent experimental studies totalling 232 instances. On the 144 instances of the NPBench vertex-cover collection (Experiment 1, Section 5.1), Hvala solves 18 to proven optimality and attains a mean approximation ratio of $1.021$ across the 109 instances with known optima, with total solve time $133.65$ seconds. On the 88 real-world large graphs from the Network Data Repository (Experiment 2, Section 5.2), ranging up to $262{,}111$ vertices, Hvala completes the entire benchmark in $265.62$ seconds cumulative (under 31 seconds per instance, and under 1 second for 58 of the 88 instances); every returned cover is guaranteed by Corollary 1 to have size strictly less than $2 \cdot \mathrm{OPT}$.

Across both studies, every approximation ratio observed on an instance with a known reference value (the 109 NPBench instances with known optima and the real-world instances with a published best-known cover size) stays below $1.414$, with the maximum being $1.192$ on san200_0.9_1, a member of a narrow family of Sanchis adversarial graphs. This empirical regularity, with no measured ratio approaching $1.414$ anywhere in the 232 structurally diverse instances, motivates the central open problem we propose as the natural continuation of this work:

Is there a fixed constant $ϵ > 0$ such that, for every finite simple undirected graph G, the Hvala algorithm achieves approximation ratio $| S | / OPT (G) \leq \sqrt{2} - ϵ \approx 1.414 - ϵ$ — or, failing that, does such a uniform bound hold on broad but restricted graph classes (bounded degree, bounded clique number, bounded treewidth, or structural families such as power-law and expander-like graphs)?

We stress that the conjectured bound is of the form $\sqrt{2} - ϵ$ for a fixed constant $ϵ > 0$, not an asymptotic $< \sqrt{2}$: an asymptotic-to-$\sqrt{2}$ bound, in the same sense that our inherited bound is asymptotic-to-2, would not constitute a meaningful breakthrough. A fixed-constant $\sqrt{2} - ϵ$ bound, in contrast, would either yield a uniform sub-$\sqrt{2}$ guarantee on all graphs (which, by the NP-hardness results of Khot, Minzer and Safra [6,7,8], would imply $\mathrm{P} = \mathrm{NP}$) or, more realistically, a uniform fixed-constant guarantee on a specific restricted class, a result that would be of substantial theoretical and practical interest on its own.
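The two kinds of bound contrasted here can be written side by side. The following is an informal restatement rather than a result of the paper; the shorthand $\rho(G)$ for the ratio on a single graph and the generic threshold $c$ are introduced only for this illustration, and graphs are assumed to have at least one edge so that the ratio is defined.

```latex
\[
\rho(G) = \frac{|S|}{\mathrm{OPT}(G)} \qquad \text{for a graph } G \text{ with at least one edge}
\]
\[
\text{Asymptotic to } c: \qquad \rho(G) < c \ \text{ for every } G, \qquad \sup_{G}\, \rho(G) = c
\]
\[
\text{Fixed constant below } c: \qquad \exists\, \epsilon > 0 \ \text{ such that } \ \rho(G) \le c - \epsilon \ \text{ for every } G, \ \text{ so } \ \sup_{G}\, \rho(G) \le c - \epsilon < c
\]
```

With $c = 2$, the first form is the statement inherited from the companion paper [11] through Corollary 1. The open problem asks for the second form with $c = \sqrt{2}$, either over all graphs or over a restricted class.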

We do not prove such a fixed-constant $\sqrt{2} - ϵ$ bound in this paper, and we do not claim one holds on all graphs. What we claim is that Hvala is, to our knowledge, the first simple linear-time algorithm for Minimum Vertex Cover whose combined theoretical and empirical profile (rigorous $\leq 2$ bound, pointwise strict $< 2$, linear time, and observed ratios uniformly below $1.414$ on every instance with a known reference value among the 232 instances tested) makes the question above a plausible and well-posed target for future work.

Availability. The Hvala algorithm is distributed via PyPI:

Package: https://pypi.org/project/hvala
Installation: pip install hvala
Usage: from hvala.algorithm import find_vertex_cover

Acknowledgments

The author is sincerely grateful to Iris, Marilin, Sonia, Yoselin, Arelis, Anissa, Liuva, Yudit, Gretel, Gema, and Blaquier, as well as Israel, Arderi, Juan Carlos, Yamil, Alejandro, Aroldo, Yary, Reinaldo, Alex, Emmanuel, and Michael for their constant support. Whether through encouragement, stimulating conversations, practical assistance, or simply being present during challenging moments, their contributions have played an important role in bringing this work to completion.

References

1. Karp, R.M. Reducibility Among Combinatorial Problems. In 50 Years of Integer Programming 1958–2008: From the Early Years to the State-of-the-Art; Springer: Berlin, Germany, 2010; pp. 219–241.
2. Papadimitriou, C.H.; Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity; Courier Corporation: North Chelmsford (MA), 1998.
3. Karakostas, G. A Better Approximation Ratio for the Vertex Cover Problem. ACM Transactions on Algorithms 2009, 5, 1–8.
4. Karpinski, M.; Zelikovsky, A. Approximating Dense Cases of Covering Problems. In Proceedings of the DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Providence, Rhode Island, 1996; Vol. 26, pp. 147–164.
5. Dinur, I.; Safra, S. On the Hardness of Approximating Minimum Vertex Cover. Annals of Mathematics 2005, 162, 439–485.
6. Khot, S.; Minzer, D.; Safra, M. On Independent Sets, 2-to-2 Games, and Grassmann Graphs. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, Montreal, Québec, Canada, 2017; pp. 576–589.
7. Dinur, I.; Khot, S.; Kindler, G.; Minzer, D.; Safra, M. Towards a proof of the 2-to-1 games conjecture? In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, Los Angeles, California, 2018; pp. 376–389.
8. Khot, S.; Minzer, D.; Safra, M. Pseudorandom Sets in Grassmann Graph Have Near-Perfect Expansion. In Proceedings of the 2018 IEEE 59th Annual Symposium on Foundations of Computer Science, Paris, France, 2018; pp. 592–601.
9. Khot, S. On the Power of Unique 2-Prover 1-Round Games. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, Montreal, Québec, Canada, 2002; pp. 767–775.
10. Khot, S.; Regev, O. Vertex Cover Might Be Hard to Approximate to Within 2−ϵ. Journal of Computer and System Sciences 2008, 74, 335–349.
11. Vega, F. An Approximate Solution to the Minimum Vertex Cover Problem: The Hallelujah Algorithm. International Journal of Parallel, Emergent and Distributed Systems 2026. Accepted for publication.
12. Nguyen, T.; Bui, T. NP-Complete Benchmark Instances. https://roars.dev/npbench/. Vertex cover benchmark collection; FRB instances (Ke Xu), DIMACS clique complements, random graphs (Periannan).
13. Rossi, R.; Ahmed, N. The Network Data Repository with Interactive Graph Analytics and Visualization, Palo Alto (CA), 2015; Vol. 29.
14. Mascia, F. The Maximum Clique Problem – DIMACS Benchmark Set. https://iridia.ulb.ac.be/~fmascia/maximum_clique/DIMACS-benchmark. Compiled clique-number values for DIMACS Second Implementation Challenge instances.
15. Cai, S.; Lin, J.; Luo, C. Finding a Small Vertex Cover in Massive Sparse Graphs. Journal of Artificial Intelligence Research 2017, 59, 463–494.
16. Zhang, Y.; Wang, S.; Liu, C.; Zhu, E. TIVC: An Efficient Local Search Algorithm for Minimum Vertex Cover in Large Graphs. Sensors 2023, 23, 7831.
17. Luo, C.; Hoos, H.H.; Cai, S.; Lin, Q.; Zhang, H.; Zhang, D. Local search with efficient automatic configuration for minimum vertex cover. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, 2019; pp. 1297–1304.
18. Harris, D.G.; Narayanaswamy, N.S. A Faster Algorithm for Vertex Cover Parameterized by Solution Size. In Proceedings of the 41st International Symposium on Theoretical Aspects of Computer Science, Clermont-Ferrand, France, 2024; Vol. 289, pp. 40:1–40:18.

Table 1. Hvala on the 41 FRB vertex-cover instances of NPBench [12].

Instance	Known OPT	Hvala size	Time	Ratio
frb30-15-1	420	428	225.8ms	1.019
frb30-15-2	420	429	225.4ms	1.021
frb30-15-3	420	427	217.1ms	1.017
frb30-15-4	420	429	1.57s	1.021
frb30-15-5	420	427	278.7ms	1.017
frb35-17-1	560	569	467.8ms	1.016
frb35-17-2	560	570	587.1ms	1.018
frb35-17-3	560	568	508.1ms	1.014
frb35-17-4	560	570	480.2ms	1.018
frb35-17-5	560	568	483.7ms	1.014
frb40-19-1	720	730	760.3ms	1.014
frb40-19-2	720	730	827.3ms	1.014
frb40-19-3	720	731	749.9ms	1.015
frb40-19-4	720	732	754.5ms	1.017
frb40-19-5	720	730	809.1ms	1.014
frb45-21-1	900	912	1.17s	1.013
frb45-21-2	900	911	1.24s	1.012
frb45-21-3	900	912	1.15s	1.013
frb45-21-4	900	912	1.21s	1.013
frb45-21-5	900	912	1.12s	1.013
frb50-23-1	1100	1111	1.65s	1.010
frb50-23-2	1100	1113	1.63s	1.012
frb50-23-3	1100	1115	1.75s	1.014
frb50-23-4	1100	1113	1.63s	1.012
frb50-23-5	1100	1112	1.88s	1.011
frb53-24-1	1219	1235	1.96s	1.013
frb53-24-2	1219	1234	2.07s	1.012
frb53-24-3	1219	1235	2.08s	1.013
frb53-24-4	1219	1232	2.07s	1.011
frb53-24-5	1219	1235	2.07s	1.013
frb56-25-1	1344	1358	2.55s	1.010
frb56-25-2	1344	1358	2.44s	1.010
frb56-25-3	1344	1359	2.43s	1.011
frb56-25-4	1344	1358	2.22s	1.010
frb56-25-5	1344	1361	2.51s	1.013
frb59-26-1	1475	1492	2.87s	1.012
frb59-26-2	1475	1492	2.93s	1.012
frb59-26-3	1475	1494	3.03s	1.013
frb59-26-4	1475	1493	3.51s	1.012
frb59-26-5	1475	1491	4.34s	1.011
frb100-40	3900	3931	14.76s	1.008

Table 2. Hvala on the 68 DIMACS clique-complement instances of NPBench [12], Part 1 of 2 (brock, c-fat, C, gen, hamming). ^†: best-known upper bound on $\mathrm{OPT}$ (maximum clique not confirmed optimal [14]); the ratio is then a lower bound on the true ratio.

Instance	Known OPT	Hvala size	Time	Ratio
brock200_1	179	183	61.9ms	1.022
brock200_2	188	192	128.3ms	1.021
brock200_3	185	189	104.3ms	1.022
brock200_4	183	187	67.6ms	1.022
brock400_1	373	381	269.7ms	1.021
brock400_2	371	379	261.5ms	1.022
brock400_3	369	381	255.7ms	1.033
brock400_4	367	378	262.4ms	1.030
brock800_1	777	785	2.15s	1.010
brock800_2	776	783	2.47s	1.009
brock800_3	775	784	2.33s	1.012
brock800_4	774	785	2.55s	1.014
c-fat200-1	188	188	214.6ms	1.000
c-fat200-2	176	176	443.5ms	1.000
c-fat200-5	142	142	160.5ms	1.000
c-fat500-1	486	486	2.30s	1.000
c-fat500-10	374	374	1.62s	1.000
c-fat500-2	474	474	2.20s	1.000
c-fat500-5	436	436	2.16s	1.000
C1000.9	932 ^†	945	1.10s	1.014
C125.9	91	92	9.8ms	1.011
C250.9	206	214	36.0ms	1.039
C500.9	443 ^†	454	257.7ms	1.025
gen200_p0.9_44	156	167	19.0ms	1.071
gen200_p0.9_55	145	163	21.6ms	1.124
gen400_p0.9_55	345	356	89.4ms	1.032
gen400_p0.9_65	335	359	111.4ms	1.072
gen400_p0.9_75	325	358	108.1ms	1.102
hamming10-2	512	512	62.6ms	1.000
hamming10-4	984	992	1.73s	1.008
hamming6-2	32	32	2.0ms	1.000
hamming6-4	60	60	15.3ms	1.000
hamming8-2	128	128	13.1ms	1.000
hamming8-4	240	240	134.6ms	1.000

Table 3. Hvala on the 68 DIMACS clique-complement instances of NPBench [12] — Part 2 of 2 (johnson, keller, MANN, p_hat, san, sanr).

Instance	Known OPT	Hvala size	Time	Ratio
johnson16-2-4	112	112	19.2ms	1.000
johnson32-2-4	480	480	373.8ms	1.000
johnson8-2-4	24	24	2.0ms	1.000
johnson8-4-4	56	56	7.5ms	1.000
keller4	160	162	79.6ms	1.012
keller5	749	759	1.33s	1.013
MANN_a27	252	253	11.9ms	1.004
MANN_a45	690	695	29.9ms	1.007
MANN_a81	2221	2225	89.1ms	1.002
MANN_a9	29	29	2.5ms	1.000
p_hat1000-3	932	942	2.91s	1.011
p_hat300-1	292	292	755.9ms	1.000
p_hat300-2	275	277	348.8ms	1.007
p_hat300-3	264	268	174.1ms	1.015
p_hat500-1	491	492	1.66s	1.002
p_hat500-2	464	469	1.22s	1.011
p_hat500-3	450	454	584.9ms	1.009
p_hat700-1	689	693	3.94s	1.006
p_hat700-2	656	658	2.76s	1.003
p_hat700-3	638	642	1.33s	1.006
san200_0.7_1	170	184	67.4ms	1.082
san200_0.7_2	182	188	197.5ms	1.033
san200_0.9_1	130	155	20.0ms	1.192
san200_0.9_2	140	162	20.1ms	1.157
san200_0.9_3	156	169	18.0ms	1.083
san400_0.5_1	387	393	607.6ms	1.016
san400_0.7_1	360	379	416.2ms	1.053
san400_0.7_2	370	385	364.0ms	1.041
san400_0.7_3	378	388	370.6ms	1.026
san400_0.9_1	300	348	87.3ms	1.160
sanr200_0.7	182	184	122.7ms	1.011
sanr200_0.9	158	160	19.6ms	1.013
sanr400_0.5	387	388	641.6ms	1.003
sanr400_0.7	379	383	381.0ms	1.011

Table 4. Hvala on the 35 random graph instances of NPBench [12]. The collection does not list optimum cover sizes for these instances, so we report only the sizes and running times returned by Hvala.

Instance	Known OPT	Hvala size	Time	Ratio
graph50-01	n/a	30	6.9ms	–
graph50-02	n/a	30	6.4ms	–
graph50-03	n/a	30	7.1ms	–
graph50-04	n/a	40	6.7ms	–
graph50-05	n/a	27	5.7ms	–
graph50-06	n/a	38	9.5ms	–
graph50-07	n/a	35	10.5ms	–
graph50-08	n/a	29	7.4ms	–
graph50-09	n/a	40	8.5ms	–
graph50-10	n/a	35	6.1ms	–
graph100-01	n/a	60	47.1ms	–
graph100-02	n/a	65	29.7ms	–
graph100-03	n/a	75	39.3ms	–
graph100-04	n/a	60	59.0ms	–
graph100-05	n/a	60	20.3ms	–
graph100-06	n/a	80	39.6ms	–
graph100-07	n/a	65	58.8ms	–
graph100-08	n/a	75	32.9ms	–
graph100-09	n/a	85	61.7ms	–
graph100-10	n/a	70	40.1ms	–
graph200-01	n/a	150	132.2ms	–
graph200-02	n/a	125	211.4ms	–
graph200-03	n/a	175	200.3ms	–
graph200-04	n/a	140	202.0ms	–
graph200-05	n/a	150	138.0ms	–
graph250-01	n/a	150	210.2ms	–
graph250-02	n/a	175	336.0ms	–
graph250-03	n/a	200	373.2ms	–
graph250-04	n/a	220	383.9ms	–
graph250-05	n/a	200	232.1ms	–
graph500-01	n/a	350	1.05s	–
graph500-02	n/a	400	1.96s	–
graph500-03	n/a	375	1.89s	–
graph500-04	n/a	300	2.00s	–
graph500-05	n/a	290	1.92s	–

Table 5. Hvala on 88 real-world large graphs from the Network Data Repository [13]. The “Best Known” column gives the previously published best-known approximate cover size where one is available; “Unknown” indicates no public reference value. By Theorem 2 and Corollary 1, every reported cover size is strictly less than $2 \cdot \mathrm{OPT}$.

Instance	Category	Best Known	Hvala size	Time	Ratio
bio-celegans	Bio	248	257	37.89ms	1.036
bio-diseasome	Bio	283	285	13.56ms	1.007
bio-dmela	Bio	Unknown	2672	490.19ms	–
bio-yeast	Bio	453	464	30.73ms	1.024
ca-AstroPh	Collab	Unknown	11512	4.88s	–
ca-CondMat	Collab	Unknown	12500	2.79s	–
ca-CSphd	Collab	548	553	32.38ms	1.009
ca-Erdos992	Collab	459	461	131.25ms	1.004
ca-GrQc	Collab	Unknown	2213	475.59ms	–
ca-HepPh	Collab	Unknown	6568	2.80s	–
ca-netscience	Collab	212	214	12.88ms	1.009
ia-email-EU	Email	Unknown	820	1.62s	–
ia-email-univ	Email	603	609	72.35ms	1.010
ia-enron-large	Email	Unknown	12820	4.79s	–
ia-enron-only	Email	86	87	7.46ms	1.012
ia-fb-messages	Social	578	593	82.13ms	1.026
ia-infect-dublin	Social	295	295	31.10ms	1.000
ia-infect-hyper	Social	91	93	19.14ms	1.022
ia-reality	Social	Unknown	81	110.40ms	–
ia-wiki-Talk	Wiki	Unknown	17407	11.07s	–
inf-power	Infra	Unknown	2267	113.04ms	–
rec-amazon	Rec	Unknown	48622	4.82s	–
rt-retweet	Retweet	31	32	4.67ms	1.032
rt-twitter-copen	Retweet	235	238	17.07ms	1.013
scc_enron-only	SCC	137	138	103.70ms	1.007
scc_fb-forum	SCC	370	372	1.69s	1.005
scc_fb-messages	SCC	Unknown	1072	12.76s	–
scc_infect-dublin	SCC	Unknown	9124	5.18s	–
scc_infect-hyper	SCC	109	110	69.19ms	1.009
scc_retweet	SCC	Unknown	564	1.55s	–
scc_retweet-crawl	SCC	Unknown	8435	765.34ms	–
scc_rt_alwefaq	SCC	35	35	6.95ms	1.000
scc_rt_assad	SCC	16	16	2.02ms	1.000
scc_rt_bahrain	SCC	37	37	3.00ms	1.000
scc_rt_barackobama	SCC	29	29	3.00ms	1.000
scc_rt_damascus	SCC	15	15	0.00ms	1.000
scc_rt_dash	SCC	15	15	1.52ms	1.000
scc_rt_gmanews	SCC	46	46	14.12ms	1.000
scc_rt_gop	SCC	6	6	1.00ms	1.000
scc_rt_http	SCC	2	2	0.00ms	1.000
scc_rt_israel	SCC	11	11	0.00ms	1.000
scc_rt_justinbieber	SCC	26	26	5.11ms	1.000
scc_rt_ksa	SCC	12	12	0.00ms	1.000
scc_rt_lebanon	SCC	5	5	0.00ms	1.000
scc_rt_libya	SCC	12	12	1.12ms	1.000
scc_rt_lolgop	SCC	103	103	53.59ms	1.000
scc_rt_mittromney	SCC	42	42	2.73ms	1.000
scc_rt_obama	SCC	4	4	0.00ms	1.000
scc_rt_occupy	SCC	22	22	1.09ms	1.000
scc_rt_occupywallstnyc	SCC	45	45	11.84ms	1.000
scc_rt_oman	SCC	6	6	0.51ms	1.000
scc_rt_onedirection	SCC	29	29	4.29ms	1.000
scc_rt_p2	SCC	12	12	0.00ms	1.000
scc_rt_qatif	SCC	5	5	1.02ms	1.000
scc_rt_saudi	SCC	17	17	2.82ms	1.000
scc_rt_tcot	SCC	12	12	0.00ms	1.000
scc_rt_tlot	SCC	6	6	0.00ms	1.000
scc_rt_uae	SCC	8	8	1.01ms	1.000
scc_rt_voteonedirection	SCC	4	4	0.00ms	1.000
scc_twitter-copen	SCC	Unknown	1328	12.20s	–
soc-brightkite	Social	Unknown	21473	6.69s	–
soc-dolphins	Social	34	35	3.01ms	1.029
soc-douban	Social	Unknown	8685	9.76s	–
soc-epinions	Social	Unknown	9858	3.37s	–
soc-karate	Social	14	14	0.62ms	1.000
soc-slashdot	Social	Unknown	22632	10.78s	–
soc-wiki-Vote	Social	404	410	44.21ms	1.015
socfb-CMU	Facebook	Unknown	5061	6.81s	–
socfb-Duke14	Facebook	Unknown	7790	15.67s	–
socfb-MIT	Facebook	Unknown	4726	9.09s	–
socfb-Stanford3	Facebook	Unknown	8611	20.68s	–
socfb-UCLA	Facebook	Unknown	15494	30.50s	–
socfb-UConn	Facebook	Unknown	13436	25.91s	–
socfb-UCSB37	Facebook	Unknown	11481	14.20s	–
tech-as-caida2007	Tech	Unknown	3699	2.28s	–
tech-internet-as	Tech	Unknown	5718	2.24s	–
tech-p2p-gnutella	Tech	Unknown	15730	4.65s	–
tech-RL-caida	Tech	Unknown	75568	20.01s	–
tech-routers-rf	Tech	793	801	138.56ms	1.010
tech-WHOIS	Tech	Unknown	2297	1.75s	–
web-BerkStan	Web	Unknown	5404	481.44ms	–
web-edu	Web	1449	1451	145.72ms	1.001
web-google	Web	497	498	75.98ms	1.002
web-indochina-2004	Web	Unknown	7363	839.95ms	–
web-polblogs	Web	243	245	24.55ms	1.008
web-sk-2005	Web	Unknown	58411	8.78s	–
web-spam	Web	Unknown	2344	1.26s	–
web-webbase-2001	Web	Unknown	2665	504.29ms	–

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
