Preprint
Article

This version is not peer-reviewed.

SAT in Polynomial Time: A Proof of P = NP

Submitted:

16 January 2025

Posted:

17 January 2025


Abstract
The P versus NP problem is a cornerstone of theoretical computer science, asking whether problems that are easy to check are also easy to solve. "Easy" here means solvable in polynomial time, where the computation time is bounded by a polynomial in the input size. While this problem's origins can be traced to John Nash's 1955 letter, its formalization is credited to Stephen Cook and Leonid Levin. Despite decades of research, a definitive answer remains elusive. Central to this question is the concept of NP-completeness. If even one NP-complete problem, like SAT, could be solved efficiently, it would imply that all NP problems could be solved efficiently, proving P=NP. This research claims that SAT, which is known to be NP-complete, can be solved in polynomial time, establishing the equivalence of P and NP.

1. Introduction

The P versus NP problem is a fundamental question in computer science that asks whether problems whose solutions can be easily checked can also be easily solved [1]. “Easily” here means solvable in polynomial time, where the computation time is bounded by a polynomial in the input size [1,2]. Problems solvable in polynomial time belong to the class P, while NP includes problems whose solutions can be verified efficiently given a suitable “certificate” [1,2]. Alternatively, P and NP can be defined in terms of deterministic and non-deterministic Turing machines with polynomial-time complexity [1,2].
The central question is whether P and NP are the same. Most researchers believe that P is a strict subset of NP, meaning that some problems are inherently harder to solve than to verify. Resolving this problem has profound implications for fields like cryptography and artificial intelligence [3,4]. The P versus NP problem is widely considered one of the most challenging open questions in computer science. Barrier results such as relativization and natural proofs show that broad families of proof techniques cannot settle the question on their own, which underscores its difficulty [5,6]. Similar problems, such as the VP versus VNP problem in algebraic complexity, remain unsolved [7].
The P versus NP problem is often described as a “holy grail” of computer science. A positive resolution could revolutionize our understanding of computation and potentially lead to groundbreaking algorithms for critical problems. The problem is listed among the Millennium Prize Problems. While recent years have seen progress in related areas, such as finding efficient solutions to specific instances of NP-complete problems, the core question of P versus NP remains unanswered [3]. A polynomial-time algorithm for any NP-complete problem would directly imply P equals NP [8]. Our work focuses on presenting such an algorithm for a well-known NP-complete problem.

2. Background and Ancillary Results

NP-complete problems are the Everest of computational challenges. Despite the ease of verifying proposed solutions with a succinct certificate, finding these solutions efficiently remains an elusive goal. A problem is classified as NP-complete if it satisfies two stringent criteria within computational complexity theory:
  • Efficient Verifiability: Solutions can be quickly checked using a concise proof [8].
  • Universal Hardness: Every problem in the class NP can be reduced to this problem without significant computational overhead [8].
The implications of finding an efficient algorithm for a single NP-complete problem are profound. Such a breakthrough would serve as a master key, unlocking efficient solutions for all problems in NP, with transformative consequences for fields like cryptography, artificial intelligence, and planning [3,4].
Illustrative examples of NP-complete problems include:
  • Boolean Satisfiability (SAT) Problem: Given a logical expression in conjunctive normal form, determine if there exists an assignment of truth values to its variables that makes the entire expression true [9].
  • Boolean 3-Satisfiability (3SAT) Problem: Given a Boolean formula in conjunctive normal form with exactly three literals per clause, determine if there exists a truth assignment to its variables that makes the formula evaluate to true [9].
  • Not-All-Equal 3-Satisfiability (NAE-3SAT) Problem: Given a Boolean formula in conjunctive normal form with exactly three literals per clause, decide if there exists a satisfying truth assignment such that each clause has at least one true literal and at least one false literal [9].
The provided examples represent a small subset of the extensively studied NP-complete problems relevant to our current work. The following problem can be solved in polynomial time:
Definition 1. 
Exact Set Packing By 3-Sets (3XSP) Problem
INSTANCE: A universe set $U$, a collection $C = \{S_1, \ldots, S_n\}$ of $n$ 3-element sets with $S_i \subseteq U$, and a positive integer $k \leq n$, where each element of $U$ belongs to exactly two sets in $C$.
QUESTION: Is there a sub-collection of $m$ sets $S'_1, \ldots, S'_m$ of $C$ with $k \leq m$ such that $S'_i \cap S'_j = \emptyset$ for $1 \leq i < j \leq m$?
REMARKS: By employing matching techniques, this problem can be efficiently solved in polynomial time.
Formally, a Boolean formula ϕ is composed of:
  • Boolean variables: $x_1, x_2, \ldots, x_n$;
  • Boolean connectives: Any Boolean function with one or two inputs and one output, such as ∧(AND), ∨(OR), ¬(NOT), ⇒(IMPLICATION), ⇔(IF AND ONLY IF);
  • and parentheses.
A truth assignment for a Boolean formula $\phi$ is a mapping from the variables of $\phi$ to the Boolean values $\{\text{true}, \text{false}\}$. A truth assignment is satisfying if it makes $\phi$ evaluate to true. A Boolean formula is satisfiable if it has at least one satisfying truth assignment. A literal is a Boolean variable or its negation. A Boolean formula is in Conjunctive Normal Form (CNF) if it is a conjunction (AND) of clauses, where each clause is a disjunction (OR) of one or more literals [8]. The SAT problem asks whether a given Boolean formula in CNF is satisfiable [9]. A 3CNF formula is a CNF formula in which each clause contains exactly three distinct literals [8]. For example, the following formula is in 3CNF:
$(x_1 \lor \neg x_2 \lor x_3) \land (\neg x_1 \lor x_3 \lor x_2) \land (\neg x_1 \lor \neg x_3 \lor x_2)$.
The first clause, $(x_1 \lor \neg x_2 \lor x_3)$, contains the three literals $x_1$, $\neg x_2$ and $x_3$. The version of SAT where formulas are in 3CNF is called 3SAT [8]. We formally define the following problem:
Definition 2. 
Monotone Not-All-Equal 3-Satisfiability (NAE-3MSAT) Problem
INSTANCE: A Boolean formula in 3CNF with monotone clauses (meaning the variables are never negated).
QUESTION: Does there exist a truth assignment such that each clause has at least one true variable and at least one false variable?
REMARKS: This problem is complete for NP [10]. Here, the certificate is a valid satisfying truth assignment, meaning it satisfies the NAE-3MSAT condition for each clause.
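The difference between ordinary satisfaction and the not-all-equal condition can be illustrated on the 3CNF example formula given earlier. The sketch below is a brute-force check, exponential in the number of variables and meant only for tiny instances; the signed-integer encoding of literals and all function names are assumptions made for this illustration, not part of the original text.

```python
from itertools import product

# Assumed encoding: +i stands for x_i and -i for its negation.
# This is the 3CNF example formula from the text.
EXAMPLE = [(1, -2, 3), (-1, 3, 2), (-1, -3, 2)]

def literal_value(lit, assignment):
    """Truth value of a literal under an assignment (dict: variable -> bool)."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value

def satisfies(clauses, assignment):
    """Ordinary CNF satisfaction: every clause has at least one true literal."""
    return all(any(literal_value(l, assignment) for l in c) for c in clauses)

def nae_satisfies(clauses, assignment):
    """NAE satisfaction: every clause has both a true and a false literal."""
    return all(
        len({literal_value(l, assignment) for l in c}) == 2 for c in clauses
    )

def all_assignments(clauses):
    """Enumerate every truth assignment over the variables of the formula."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

sat_count = sum(satisfies(EXAMPLE, a) for a in all_assignments(EXAMPLE))
nae_count = sum(nae_satisfies(EXAMPLE, a) for a in all_assignments(EXAMPLE))
print(sat_count, nae_count)
```

Every NAE-satisfying assignment is also an ordinary satisfying assignment, so `nae_count` can never exceed `sat_count`.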
In addition, a 2CNF formula is a Boolean formula in conjunctive normal form with exactly two distinct literals per clause. Now, we introduce the last problem:
Definition 3. 
Maximum Monotone XOR 2-Satisfiability (2MXSAT) Problem
INSTANCE: A Boolean formula in 2CNF with monotone clauses (meaning the variables are never negated) in which the XOR operator $\oplus$ replaces the operator $\lor$ within each clause, and a positive integer $k$.
QUESTION: Does there exist a truth assignment that satisfies at least $k$ clauses?
Since SAT is NP-complete, presenting a polynomial-time algorithm for SAT would establish that P equals NP.

3. Main Result

Even though the following reductions are widely known, we will rely on them as supporting results in our analysis [9].
Theorem 1. 
The problem SAT can be reduced to 3SAT in polynomial time.
Proof. 
We only sketch this reduction, as it is a standard technique in computer science [8]. To reduce a general SAT instance to a 3SAT formula, we can follow these steps:
  • Expand Clauses with Few Literals: To ensure that all clauses contain exactly three literals, we introduce two new variables and expand clauses with at most two literals into clauses with three literals by considering all possible combinations of the new variables, both negated and positive. For instance, consider the two new variables $A$ and $B$. A single-literal clause $(x)$ can be equivalently expressed as:
    $(x \lor A \lor B) \land (x \lor \neg A \lor \neg B) \land (x \lor A \lor \neg B) \land (x \lor \neg A \lor B)$.
    Similarly, a two-literal clause $(x \lor y)$ is equivalent to:
    $(x \lor y \lor A) \land (x \lor y \lor \neg A) \land (x \lor y \lor B) \land (x \lor y \lor \neg B)$.
    Note that the same variables $A$ and $B$ are used in both cases.
  • Identify Long Clauses: Find all clauses with more than three literals.
  • Introduce New Variables: For a clause with $n$ literals (where $n > 3$), introduce $n - 3$ new variables.
  • Create New Clauses: Create a chain of clauses with three literals each, using the original literals and the new variables. Ensure that the satisfiability of the original clause is preserved in this chain of new clauses. To exemplify, consider a clause containing four literals, $(x \lor y \lor z \lor w)$. By introducing a single additional variable, $D$, this clause can be logically represented as the conjunction of the following two clauses:
    $(x \lor y \lor \neg D) \land (D \lor z \lor w)$.
By systematically applying this reduction to each clause, we can transform any SAT instance into an equisatisfiable 3SAT formula in polynomial time. This reduction demonstrates that 3SAT is at least as hard as the general SAT problem; since 3SAT also belongs to NP, it is an NP-complete problem. □
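The steps of this reduction can be sketched in code. The following is a minimal illustration under stated assumptions: literals are encoded as signed integers, input clauses are non-empty with distinct literals, and the names `sat_to_3sat` and `satisfiable` are hypothetical. The chain for long clauses generalizes the four-literal example above, using the same polarity for the new variables. The brute-force `satisfiable` helper is exponential and serves only to compare small instances before and after the transformation.

```python
from itertools import product

def sat_to_3sat(clauses, num_vars):
    """Return (clauses3, total_vars) where every clause has three literals."""
    pad_a, pad_b = num_vars + 1, num_vars + 2  # shared padding variables A, B
    next_var = num_vars + 3
    out = []
    for clause in clauses:
        lits = list(clause)
        n = len(lits)
        if n == 1:
            x = lits[0]
            out += [(x, pad_a, pad_b), (x, -pad_a, -pad_b),
                    (x, pad_a, -pad_b), (x, -pad_a, pad_b)]
        elif n == 2:
            x, y = lits
            out += [(x, y, pad_a), (x, y, -pad_a),
                    (x, y, pad_b), (x, y, -pad_b)]
        elif n == 3:
            out.append(tuple(lits))
        else:
            # n - 3 fresh chain variables D_1, ..., D_{n-3}
            chain = list(range(next_var, next_var + n - 3))
            next_var += n - 3
            out.append((lits[0], lits[1], -chain[0]))
            for j in range(1, n - 3):
                out.append((chain[j - 1], lits[j + 1], -chain[j]))
            out.append((chain[-1], lits[-2], lits[-1]))
    return out, next_var - 1

def satisfiable(clauses, num_vars):
    """Brute-force satisfiability test (exponential; tiny instances only)."""
    for values in product([False, True], repeat=num_vars):
        if all(any(values[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

formula = [(1,), (-1, 2), (1, 2, 3, 4, -5)]
formula3, total_vars = sat_to_3sat(formula, 5)
print(len(formula3), total_vars)
```

The padding variables are allocated once and reused for every short clause, mirroring the remark that the same $A$ and $B$ are used in both cases.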
Theorem 2. 
The problem 3SAT can be reduced to NAE-3SAT in polynomial time.
Proof. 
Any 3SAT formula ϕ can be reduced to an equivalent NAE-3SAT instance. We assume that no clause contains a literal and its negation. Such tautological clauses can be removed. We also remove clauses containing literals whose negations do not appear in the 3SAT instance. This reduction involves the following steps:
  • Variable Introduction:
    • Global Variable: Introduce a new variable $w$ that does not appear in $\phi$.
    • Clause Variables: For each clause $c_i = (x \lor y \lor z)$ in $\phi$, introduce a new variable $a_i$.
  • Clause Construction:
    • Clause Reduction: For each clause $c_i = (x \lor y \lor z)$, construct two NAE-3SAT clauses:
      $(x \lor y \lor a_i) \land (z \lor \neg a_i \lor w)$.
By construction, a satisfying truth assignment for ϕ corresponds to a valid satisfying truth assignment for the NAE-3SAT instance when w is assigned the value false, and vice versa. When w is true, we can obtain a satisfying truth assignment for ϕ by negating all the values in a valid satisfying truth assignment for the NAE-3SAT instance. This reduction demonstrates that NAE-3SAT is NP-complete. □
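As a sanity check on small inputs, the clause construction above can be written out and compared by brute force. This is a sketch under assumptions: literals are signed integers, the preprocessing steps (removing tautological clauses and pure literals) are omitted, and the function names are hypothetical. Both checkers are exponential and intended only for tiny formulas.

```python
from itertools import product

def holds(lit, values):
    """Truth value of a signed-integer literal under a tuple of booleans."""
    return values[abs(lit) - 1] == (lit > 0)

def three_sat_to_nae3sat(clauses, num_vars):
    """Map each clause (x, y, z) to (x, y, a_i) and (z, not a_i, w)."""
    w = num_vars + 1                      # global variable w
    out = []
    for i, (x, y, z) in enumerate(clauses):
        a_i = num_vars + 2 + i            # one fresh variable per clause
        out.append((x, y, a_i))
        out.append((z, -a_i, w))
    return out, num_vars + 1 + len(clauses)

def satisfiable(clauses, num_vars):
    """Ordinary satisfiability by exhaustive search."""
    return any(
        all(any(holds(l, v) for l in c) for c in clauses)
        for v in product([False, True], repeat=num_vars)
    )

def nae_satisfiable(clauses, num_vars):
    """NAE satisfiability: each clause needs a true and a false literal."""
    return any(
        all(len({holds(l, v) for l in c}) == 2 for c in clauses)
        for v in product([False, True], repeat=num_vars)
    )

example = [(1, -2, 3), (-1, 3, 2), (-1, -3, 2)]
nae_example, nae_vars = three_sat_to_nae3sat(example, 3)
print(satisfiable(example, 3), nae_satisfiable(nae_example, nae_vars))
```

On any small input, the two checkers are expected to agree before and after the transformation, which is the property the proof relies on.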
Theorem 3. 
The problem NAE-3SAT can be reduced to NAE-3MSAT in polynomial time.
Proof. 
It is well-known that any Boolean formula ϕ in NAE-3SAT can be reduced to an equivalent NAE-3MSAT instance. This reduction involves the following steps:
  • Variable Introduction:
    • Literal Variables: For each variable $x$ in $\phi$, introduce two variables: $x^{+}$ representing the positive literal $x$ and $x^{-}$ representing the negative literal $\neg x$. Additionally, we introduce three new variables $a_x$, $b_x$, and $c_x$ for each variable $x$ in $\phi$.
  • Clause Construction:
    • Clause Reduction: For each clause $c_i = (x \lor y \lor z)$, construct one NAE-3MSAT clause:
      $(x^{s_x} \lor y^{s_y} \lor z^{s_z})$, where $s_v$ is $+$ if literal $v \in \{x, y, z\}$ is positive and $-$ otherwise.
    • Variable Consistency: For each variable $x$ in $\phi$, construct four NAE-3MSAT clauses:
      $(x^{+} \lor x^{-} \lor a_x)$, $(x^{+} \lor x^{-} \lor b_x)$, $(x^{+} \lor x^{-} \lor c_x)$, and $(a_x \lor b_x \lor c_x)$. These clauses ensure that exactly one of $x^{+}$ and $x^{-}$ is true.
By construction, a valid satisfying truth assignment for ϕ corresponds to a valid satisfying truth assignment for the NAE-3MSAT instance, and vice versa. Thus, this reduction proves that NAE-3MSAT is NP-complete. □
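The construction in this proof can also be exercised on small inputs. The sketch below assumes a signed-integer encoding of literals and a fixed numbering of the five new variables per original variable; both the numbering scheme and the function names are assumptions made for illustration. The brute-force checker is exponential and only suitable for very small formulas.

```python
from itertools import product

def nae3sat_to_monotone(clauses, num_vars):
    """Build a monotone NAE-3SAT instance; all output literals are positive."""
    def pos(v):   # index of x^+ for original variable v
        return 5 * (v - 1) + 1
    def neg(v):   # index of x^- for original variable v
        return 5 * (v - 1) + 2
    out = []
    for clause in clauses:
        # replace each literal by the variable standing for its sign
        out.append(tuple(pos(l) if l > 0 else neg(-l) for l in clause))
    for v in range(1, num_vars + 1):
        a, b, c = 5 * (v - 1) + 3, 5 * (v - 1) + 4, 5 * (v - 1) + 5
        # consistency clauses forcing exactly one of x^+, x^- to be true
        out += [(pos(v), neg(v), a), (pos(v), neg(v), b),
                (pos(v), neg(v), c), (a, b, c)]
    return out, 5 * num_vars

def nae_satisfiable(clauses, num_vars):
    """NAE satisfiability by exhaustive search over all assignments."""
    for values in product([False, True], repeat=num_vars):
        if all(len({values[abs(l) - 1] == (l > 0) for l in c}) == 2
               for c in clauses):
            return True
    return False

example = [(1, -2, 3), (-1, 3, 2), (-1, -3, 2)]
mono_example, mono_vars = nae3sat_to_monotone(example, 3)
print(nae_satisfiable(example, 3), nae_satisfiable(mono_example, mono_vars))
```

The output contains one clause per original clause plus four consistency clauses per variable, so its size is linear in the size of the input.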
These are key findings.
Theorem 4. 
The problem NAE-3MSAT can be reduced to 2MXSAT in polynomial time.
Proof. 
We reduce an instance of NAE-3MSAT to an instance of 2MXSAT, consisting of a Boolean formula $\phi$ and a positive integer $k$. For each clause $c_i = (x \lor y \lor z)$ in the NAE-3MSAT formula, we construct the following expression using new variables $a_i$, $b_i$, and $d_i$:
$F_i = (a_i \oplus b_i) \land (b_i \oplus d_i) \land (a_i \oplus d_i) \land (x \oplus a_i) \land (y \oplus b_i) \land (z \oplus d_i)$.
At most two of the three clauses among $a_i$, $b_i$, and $d_i$ can hold simultaneously, so no truth assignment satisfies more than five clauses of $F_i$. If $c_i$ satisfies the NAE-3MSAT condition (i.e., not all literals have the same truth value), then there exists a truth assignment to $a_i$, $b_i$, and $d_i$ such that exactly five clauses in $F_i$ are satisfied. Conversely, if all literals in $c_i$ have the same truth value, it is impossible to satisfy five clauses in $F_i$. The complete 2MXSAT formula $\phi$ is the conjunction of all $F_i$:
$\phi = \bigwedge_{i=1}^{m} F_i = F_1 \land F_2 \land F_3 \land \ldots \land F_{m-1} \land F_m$,
where $m$ is the number of clauses in the original NAE-3MSAT formula.
Setting $k = 5 \cdot m$, we claim that the NAE-3MSAT formula has a valid satisfying truth assignment if and only if there is a truth assignment that satisfies at least $k$ clauses in $\phi$. This is because each NAE-satisfying clause in the original formula corresponds to five satisfiable clauses in the constructed formula, and vice-versa. Since no assignment satisfies more than five clauses of any $F_i$, satisfying at least $k = 5 \cdot m$ clauses in $\phi$ forces five satisfied clauses in every $F_i$, which guarantees a valid NAE-satisfying truth assignment for the original NAE-3MSAT formula. □
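The behavior of the gadget $F_i$ can be confirmed by exhaustive enumeration. The following sketch, with hypothetical names, computes for each of the eight truth assignments to $x$, $y$, $z$ the largest number of XOR clauses of $F_i$ that some choice of $a_i$, $b_i$, $d_i$ can satisfy.

```python
from itertools import product

def max_satisfied(x, y, z):
    """Best number of satisfied XOR clauses of F_i for fixed x, y, z."""
    best = 0
    for a, b, d in product([False, True], repeat=3):
        # on booleans, != is exclusive or
        xor_clauses = [a != b, b != d, a != d, x != a, y != b, z != d]
        best = max(best, sum(xor_clauses))
    return best

gadget_table = {
    (x, y, z): max_satisfied(x, y, z)
    for x, y, z in product([False, True], repeat=3)
}

for assignment, best in gadget_table.items():
    not_all_equal = len(set(assignment)) == 2
    print(assignment, best, not_all_equal)
```

The table shows a maximum of five when $x$, $y$, $z$ are not all equal and four otherwise, which is the gap the threshold $k = 5 \cdot m$ relies on.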
Theorem 5. 
3XSP admits a polynomial-time algorithm based on matching techniques.
Proof. 
We show that 3-Set Packing, restricted to instances where each element appears in exactly two sets, can be solved in polynomial time by reducing it to 2-Set Packing (equivalent to maximum matching). Given an instance of 3XSP with universe $U = \{u_1, u_2, \ldots, u_m\}$ and a collection $C = \{S_1, S_2, \ldots, S_n\}$ of 3-element sets $S_i \subseteq U$, where each element in $U$ belongs to exactly two sets in $C$, we construct an instance of Set Packing by 2-Sets as follows:
  • Universe Transformation: Create a new universe $U' = \{S_1, S_2, \ldots, S_n\}$. Each element in $U'$ corresponds to a set in the original collection $C$.
  • Collection Construction: For each element $u_k \in U$, create a set $S_{u_k} = \{S_i, S_j\}$ if and only if $S_i \cap S_j = \{u_k\}$. Since each $u_k$ belongs to exactly two sets in $C$, there are precisely two sets $S_i$ and $S_j$ that contain it, making each $S_{u_k}$ a 2-element set. The new collection is $C' = \{S_{u_1}, S_{u_2}, \ldots, S_{u_m}\}$.
  • Equivalence: A set of $k$ mutually disjoint sets in $C$ corresponds to a set of $k$ mutually disjoint sets in $C'$, and vice-versa. If we select $k$ disjoint sets from $C$, then the corresponding elements in $U'$ will not share any sets in $C'$. Conversely, if we select $k$ disjoint sets from $C'$, this corresponds to selecting $k$ disjoint sets from $C$.
  • Solving the Transformed Instance: The problem of finding a maximum number of mutually disjoint sets in $C'$ is an instance of Set Packing by 2-Sets. This problem is equivalent to finding a maximum matching in a graph where vertices represent the elements of $U'$ (the original sets $S_i$) and edges connect vertices whose corresponding sets share an element in the original universe $U$. Maximum matching can be solved in polynomial time using well-known algorithms [11].
  • Solution Verification: If the maximum number of disjoint sets found in $C'$ (and thus in $C$) is at least $k$, the original 3XSP instance has a solution; otherwise, it does not.
Polynomial Time Complexity:
  • Constructing $U'$ and $C'$ takes $O(n^2)$ time in the worst case, as we need to check intersections between pairs of sets in $C$.
  • Finding a maximum matching (equivalent to Set Packing by 2-Sets) can be done in polynomial time [11].
  • The remaining steps (comparing the size of the matching to k) take linear time.
Therefore, the entire process runs in polynomial time, showing that 3-Set Packing with the constraint that each element appears in exactly two sets is in P. □
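The Universe Transformation and Collection Construction steps can be sketched as follows. The sketch covers only the construction of $U'$ and $C'$: it does not run a matching algorithm and makes no statement about the Equivalence step. The function name, the representation of sets by their index in the input list, and the sample instance are assumptions made for illustration.

```python
def build_two_set_instance(collection):
    """Return C' as a dict mapping each element u_k to the pair (i, j) of
    indices of the two 3-sets containing it, kept only when the two sets
    intersect exactly in {u_k}, as stated in the construction."""
    containing = {}
    for index, three_set in enumerate(collection):
        if len(three_set) != 3:
            raise ValueError("every set must have exactly three elements")
        for element in three_set:
            containing.setdefault(element, []).append(index)

    pairs = {}
    for element, owners in containing.items():
        if len(owners) != 2:
            raise ValueError("each element must belong to exactly two sets")
        i, j = owners
        if collection[i] & collection[j] == {element}:
            pairs[element] = (i, j)
    return pairs

# Hypothetical sample instance: six elements, each in exactly two 3-sets.
sample = [
    frozenset({1, 2, 3}),
    frozenset({1, 4, 5}),
    frozenset({2, 4, 6}),
    frozenset({3, 5, 6}),
]
two_sets = build_two_set_instance(sample)
print(two_sets)
```

Grouping the sets by element, as done here, avoids comparing every pair of sets explicitly, although the pairwise check described in the complexity analysis produces the same collection.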
This is a Main Insight.
Theorem 6. 
The problem 2MXSAT can be reduced to 3XSP in polynomial time.
Proof. 
To better visualize this polynomial-time reduction, we will use a graphical representation of the sets involved. To represent a 2MXSAT formula $\phi$ and a positive integer $k$ as a collection of sets $C$ over a universe $U$, we introduce a gadget for each variable $x$ in $\phi$. This gadget consists of $2 \cdot t$ triangles, where $t$ is the number of occurrences of $x$ in $\phi$. Each triangle in the gadget corresponds to a possible truth assignment for the variable $x$. The apexes of the triangles are labeled with $x_i$ or $\neg x_i$ to denote the truth assignment required for clause $c_i$ in $\phi$.
The topology of the gadget ensures consistency. The construction (e.g., Figure 1) guarantees that if any positive vertex is matched with some vertices outside of the gadget then all negative vertices can only be matched by the triangles inside this gadget, and vice versa. Thus, the "availability" of a vertex to be matched by an outside vertex corresponds to the truth assignment. For instance, in Figure 1, these sets would be:
$\{x_i, a_x, b_x\}, \{\neg x_i, b_x, d_x\}, \{x_j, d_x, e_x\}, \{\neg x_j, e_x, f_x\}, \{x_h, f_x, g_x\}, \{\neg x_h, g_x, a_x\}$,
where dashed vertices correspond to the literals of $x$ that are absent from clauses $c_i, c_j, c_h$ in $\phi$.
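The ring structure of this variable gadget can be inspected by enumerating its packings. The sketch below takes the six sets listed above, with plain strings standing in for the vertex labels (an encoding assumed for this illustration), and lists the largest families of pairwise disjoint triangles.

```python
from itertools import combinations

# The six triangles of the gadget, in the order given in the text.
GADGET = [
    {"x_i", "a_x", "b_x"},
    {"not_x_i", "b_x", "d_x"},
    {"x_j", "d_x", "e_x"},
    {"not_x_j", "e_x", "f_x"},
    {"x_h", "f_x", "g_x"},
    {"not_x_h", "g_x", "a_x"},
]

def disjoint_packings(sets, size):
    """All index tuples selecting `size` pairwise disjoint sets."""
    return [
        combo
        for combo in combinations(range(len(sets)), size)
        if all(sets[p].isdisjoint(sets[q]) for p, q in combinations(combo, 2))
    ]

# Largest packing size, found by trying every size from 0 to len(GADGET).
largest = max(r for r in range(len(GADGET) + 1) if disjoint_packings(GADGET, r))
maximum_packings = disjoint_packings(GADGET, largest)
print(largest, maximum_packings)
```

Consecutive triangles share one auxiliary vertex, so the only maximum packings are the three positive-apex triangles or the three negative-apex triangles, which is consistent with the all-or-nothing behavior described for the gadget.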
To represent each clause $c_i = (x \oplus y)$ in $\phi$, we employ a gadget consisting of four sets:
$\{x_i, a_i, b_i\}, \{y_i, b_i, d_i\}, \{\neg y_i, d_i, e_i\}, \{\neg x_i, e_i, a_i\}$,
where dashed vertices correspond to the literals that are absent in $c_i$. This gadget, illustrated in Figure 2, is used to encode each clause within the overall construction. Satisfying clause $c_i$ requires selecting exactly two disjoint sets. This is impossible if all variables in $c_i$ are either all true or all false. Conversely, if $c_i$ is satisfied, it guarantees that exactly two disjoint sets can be chosen.
A truth assignment satisfies at least $k$ clauses of $\phi$ if and only if its corresponding set collection $C$ can be partitioned into at least $3 \cdot m + k$ disjoint sets, where $m$ is the number of clauses in $\phi$. This equivalence follows directly from the construction of $C$, which is designed to faithfully represent the logical structure of $\phi$ over the universe $U$.
The collection of sets C incorporates two types of set:
  • Variable Sets: The construction depicted in Figure 1 enables the selection of exactly one set for each variable occurrence within a clause of $\phi$. Since each clause comprises two distinct variables, there are precisely $2 \cdot m$ such sets.
  • Clause Sets: The final step (Figure 2) guarantees clause satisfaction in $\phi$ by requiring the selection of precisely two sets per clause. This selection of $2 \cdot k$ sets corresponds to a truth assignment that satisfies $k$ clauses of $\phi$. Moreover, any truth assignment allows for the selection of at least one set per clause. Consequently, a total of $m$ sets (one for each of the $m$ clauses) plus $k$ additional sets (from the satisfied clauses) are selected.
Hence, a truth assignment for $\phi$ that satisfies exactly $k$ clauses can be directly mapped to a partition of $C$ into precisely $2 \cdot m$ variable sets and $m + k$ clause sets. Conversely, any such partition of $C$ can be interpreted as a truth assignment for $\phi$ that satisfies exactly $k$ clauses. This one-to-one correspondence rigorously establishes the equivalence between two distinct properties: first, the satisfiability of at least $k$ clauses within the formula $\phi$; and second, the existence of a specific partitioning of the set $C$ into at least $3 \cdot m + k$ disjoint sets. □
This is the Main Theorem.
Theorem 7. 
$\mathrm{SAT} \in \mathrm{P}$.
Proof. 
This follows directly from Theorems 1, 2, 3, 4, 5, and 6. □
This is the definitive result.
Theorem 8. 
$\mathrm{P} = \mathrm{NP}$.
Proof. 
Cook’s Theorem states that every NP problem can be reduced to SAT in polynomial time [9]. Given that SAT is an NP-complete problem, a polynomial-time solution for it, as presented here, would directly imply P equals NP. □

4. Conclusion

A definitive proof that P equals NP would fundamentally reshape our computational landscape. The implications of such a discovery are profound and far-reaching:
  • Algorithmic Revolution.
    The most immediate impact would be a dramatic acceleration of problem-solving capabilities. Complex challenges currently deemed intractable, such as protein folding, logistics optimization, and certain cryptographic problems, could become efficiently solvable [3,4]. This breakthrough would revolutionize fields from medicine to cybersecurity. Moreover, everyday optimization tasks, from scheduling to financial modeling, would benefit from exponentially faster algorithms, leading to improved efficiency and decision-making across industries [3,4].
  • Scientific Advancements.
    Scientific research would undergo a paradigm shift. Complex simulations in fields like physics, chemistry, and biology could be executed at unprecedented speeds, accelerating discoveries in materials science, drug development, and climate modeling [3,4]. The ability to efficiently analyze massive datasets would provide unparalleled insights in social sciences, economics, and healthcare, unlocking hidden patterns and correlations [3,4].
  • Technological Transformation.
    Artificial intelligence would be profoundly impacted. The development of more powerful AI algorithms would be significantly accelerated, leading to breakthroughs in machine learning, natural language processing, and robotics [3,4]. While the cryptographic landscape would face challenges, it would also present opportunities to develop new, provably secure encryption methods [3,4].
  • Economic and Societal Benefits.
    The broader economic and societal implications are equally significant. A surge in innovation across various sectors would be fueled by the ability to efficiently solve complex problems. Resource optimization, from energy to transportation, would become more feasible, contributing to a sustainable future [3,4].
In conclusion, a proof of $\mathrm{P} = \mathrm{NP}$ would usher in a new era of computational power with transformative effects on science, technology, and society. While challenges and uncertainties exist, the potential benefits are immense, making this a compelling area of continued research.

Acknowledgments

The author would like to thank Iris, Marilin, Sonia, Yoselin, and Arelis for their support.

References

  1. Cook, S.A. The P versus NP Problem, Clay Mathematics Institute. https://www.claymath.org/wp-content/uploads/2022/06/pvsnp.pdf, 2022. Accessed December 20, 2024.
  2. Sudan, M. The P vs. NP problem. http://people.csail.mit.edu/madhu/papers/2010/pnp.pdf, 2010. Accessed December 20, 2024.
  3. Fortnow, L. Fifty years of P vs. NP and the possibility of the impossible. Communications of the ACM 2022, 65, 76–85.
  4. Aaronson, S. $\mathrm{P} \stackrel{?}{=} \mathrm{NP}$. Open Problems in Mathematics 2016, pp. 1–122.
  5. Baker, T.; Gill, J.; Solovay, R. Relativizations of the $\mathrm{P} \stackrel{?}{=} \mathrm{NP}$ Question. SIAM Journal on Computing 1975, 4, 431–442.
  6. Razborov, A.A.; Rudich, S. Natural Proofs. Journal of Computer and System Sciences 1997, 55, 24–35.
  7. Wigderson, A. Mathematics and Computation: A Theory Revolutionizing Technology and Science; Princeton University Press, 2019.
  8. Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms, 3rd ed.; The MIT Press, 2009.
  9. Garey, M.R.; Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1st ed.; W. H. Freeman and Company: San Francisco, 1979.
  10. Schaefer, T.J. The complexity of satisfiability problems. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing (STOC ’78), 1978, pp. 216–226.
  11. Greenlaw, R.; Hoover, H.J.; Ruzzo, W.L. Limits to Parallel Computation: P-Completeness Theory; Oxford University Press, USA, 1995.
Figure 1. Variable gadget for the occurrences of $x$ in clauses $c_i, c_j, c_h$ of $\phi$.
Figure 2. Clause gadget corresponding to the clause $c_i = (x \oplus y)$ in $\phi$.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
