NP and Computational Intractability

NP-complete problems is a set of problems that a polynomial-time algorithm for any one of them would imply the existence of a polynomial-time algorithm for all of them.

Polynomial-Time Reductions

To express the notation that a particular problem X is at least at hard as some other problem Y, we assume that there's a "black box" capable of solving X in a single step.

If an instance of problem Y could be solved using a polynomial number of standard computational steps and a polynomial number of calls to the black box that solves X, Y is polynomial-time reducible to X. (Y≤pXY \leq_{p} X)

  • If Y≤pXY \leq_{p} X and X could be solved in polynomial time, then Y could be solved in polynomial time.

  • If Y≤pXY \leq_{p} X and Y couldn't be solved in polynomial time, then X couldn't be solved in polynomial time.

Independent Set and Vertex Cover

Independent Set

In a graph G=(V,E)G = (V, E), a set of nodes S⊆VS \subseteq V is independent if no two nodes in SS are joined by an edge. The problem is to determine if G contains an independent set of size at least kk, given a number kk.

Vertex Cover

In a graph G=(V,E)G = (V, E), a set of nodes S⊆VS \subseteq V is a vertex cover if every edge e∈Ee \in E has at least one end in SS. The problem is to determine if G contains a vertex cover of size at most kk, given a number kk.

Reduction

In a graph G=(V,E)G = (V, E), SS is an independent set if and only if its complement V−SV - S is a vertex cover.

  • Independent set ≤p\leq_{p} vertex cover. If there's a black box to solve vertex cover, the independent set problem of size at least kk is equivalent to the vertex problem of size at most n−kn - k.

  • Vertex cover ≤p\leq_{p} independent set. If there's a black box to solve the independent set, the vertex cover problem of size at most kk is equivalent to the independent set of size at least n−kn - k.

Set Cover and Set Packing

Set Cover

Given a set UU of elements, a collection S1,…,SmS_1, \dots, S_m of subsets of UU, and a number kk, the problem is to determine if there exists a collection of at most kk of these sets whose union is equal to all of UU.

Set Packing

Given a set UU of elements, a collection S1,…,SmS_1, \dots, S_m of subsets of UU, and a number kk, the problem is to determine if there exists a collection of at least kk of these sets with the property that no two of them intersect.

Reduction

  • Vertex cover ≤p\leq_{p} Set cover: Let the edges be the elements of UU, and the sets SnS_n be the edges that incident to the vertex nn. The vertex cover problem is then converted to the set cover problem.

  • Independent set ≤p\leq_{p} Set Packing: Let the edges be the elements of UU, and the set SnS_n be the edges that incident to the vertex nn. The independent set problem is then converted to the set packing problem.

Reductions via "Gadgets": The Satisfiability Problem

The SAT and 3-SAT Problems

Let XX be a set of nn boolean variables x1,…,xnx_1, \dots, x_n, each can take the value 0 or 1. A clause is simply a disjunction of distinct terms t1∨t2∨⋯∨tlt_1 \vee t_2 \vee \dots \vee t_{l}. (ti∈x1,x2,…,xn,x1‾,…,xn‾t_i \in {x_1, x_2, \dots, x_n, \overline{x_1}, \dots, \overline{x_n}})

A truth assignment vv assigns 0 or 1 to each xix_i. An assignment satisfies a clause CC if C=1C = 1. An assignmeent satisfies a collection of clauses C1,…,CkC_1, \dots, C_k if C1∧C2∧⋯∧Ck=1C_1 \wedge C_2 \wedge \dots \wedge C_k = 1.

The SAT problem is to determine if a satisfying truth assignment exists for a set of clauses over a set of variables.

The 3-SAT problem if a satisfying truth assignment exists for a set of clauses, each of length 3, over a set of variables.

Reducing 3-SAT to Independent Set

The 3-SAT problem could be interpreted as choosing one term from each clause without any conflict, then find a truth assignment that causes all of them to evaluate to 1, which satisfies all clauses. Two terms conflict if one is equal to a variable xix_i and the other is equal to its negation xi‾\overline{x_i}.

Let GG be a graph with 3k3k nodes grouped into kk triangles where each represents a clause. If two terms belong to the same clause (triangle) or conflict, there's an edge that connects them. An independent set with size kk in GG is the set of terms without conflict, which is the solution to the 3-SAT problem. Therefore, 3-SAT ≤p\leq_{p} Independent set.

Transitivity of Reductions

If Z≤pY,Y≤pXZ \leq_{p} Y, Y \leq_{p} X, then Z≤pXZ \leq_{p} X.

Therefore, 3-SAT ≤p\leq_{p} Independent set \leq{p} Vertex cover \leq{p} Set cover.

Efficient Certification and the Definition of NP

Problems and Algorithms

Let ss be the string of input to a problem, which has the length ∣s∣|s|. An algorithm AA for a decision problem receives an input ss and returns the value "yes" or "no", and the returned value is denoted by A(s)A(s).

Let p(⋅)p(\cdot) be a polynomial function for the input ss, AA has a polynomial running time if AA terminates in at most O(p(∣s∣))O(p(|s|)) steps. PP is the set of all problems that an algorithm AA with a polynomial running time that solves XX exists.

Efficient Certification

To check a solution, let tt be a certificate string that contians the evidence that ss is a "yes" instance of a problem XX.

BB is an efficient certifier for a problem XX if:

  • BB is a polynomial-time algorithm that takes two input arguments ss and tt.

  • There's a polynomial function pp that for every string ss, s∈Xs \in X if there's a string tt that ∣t∣≤p(∣s∣)|t| \leq p(|s|), and B(s,t)="yes"B(s, t) = "yes".

Examples

  • 3-SAT: the certificate tt is an assignment of truth values to the variables; the certifier BB evaluates the given set of clauses with respect to this assignment.

  • Independent set: the certificate tt is the identity of a set of at least kk vertices; the certifier BB checks that, for these vertices, no edge joins any pair of them.

  • Set cover: the certificate tt is a list of kk sets from the given collection; the certifier BB checks that the union of these sets is equal to the underlying set UU.

P = NP?

  • PP is the set of all problems that an algorithm AA with a polynomial running time that solves XX exists.

  • NPNP is the set of all problems for which there exists an efficient certifier, which indicates that the solution to the problems could be efficiently verified.

P⊆NPP \subseteq NP. Let B=AB = A, ignore tt, and then use AA to directly solve the problem in polynomial time.

The question of whether P=NPP = NP, or whether every problem whose solution can be quickly verified can also be solved quickly, is one of the most famous unsolved problems in computer science.

NP-Complete Problems

If XX is an NP-complete problem, then X∈NPX \in NP and for all Y∈NP,Y≤pXY \in NP, Y \leq_{p} X. If XX is solvable in polynomial time, then P=NPP = NP.

Circuit Satisfiability

The definition of a circuit is a labeled, directed acyclic graph.

  • The inputs (nodes without incoming edges) are labeled either 0 or 1.

  • Every other node is labeled with one of the boolean operators AND, OR, or NOT. Nodes labeled with AND or OR will have two incoming edges, and nodes labeled with NOT will have one incoming edge.

  • There's an output node without outgoing edges, which is the output computed by the circuit.

The circuit satisfiability problem is to determine whether there's an assignment of values to the inputs that cause the output to be 1, given a circuit as input.

Any algorithm that takes a fixed number of bits as input and produces a yes/no answer could be represented by a circuit. (1 -> yes, 0 -> no) If the algorithm takes several polynomial steps, then the circuit has a polynomial size. Algorithms implemented on physical computers could be reduced to the boolean logic gates.

To proof that X≤pX \leq_{p} circuit satisfiability, we know that XX has an efficient certifier B()B(). To determine whether an input (with length nn) s∈Xs \in X, we need to determine whether a tt of length p(n)p(n) exists so that B(s,t)=yesB(s, t) = yes.

The question could be answered with the black box for circuit satisfiability. Suppose there's a circuit with n+p(n)n + p(n) inputs, and the first nn inputs are hard-coded with the value of ss. The remaining inputs will be labeled with variables representing tt. If there's a way to set the input so that the circuit produces an output of 1, then the tt (B(s,t)=yesB(s, t) = yes) exists, and s∈Xs \in X. Therefore, X≤pcircuitsatisfiabilityX \leq_{p} circuit satisfiability.

General Strategy for Proving NP-Complete

If YY is an NP-complete problem, and XX is a problem in NP with the property that Y≤pXY \leq_{p} X, then XX is NP-complete.

For a new problem XX:

  1. Prove that X∈NPX \in NP.

  2. Choose a problem YY that is known to be NP-complete.

  3. Consider an arbitrary instance sYs_Y of problem YY, and show how to construct an instance sXs_X of problem XX that satisfies the following properties:

  4. If sYs_Y is a yes instance of YY, then sXs_X is a yes instance of XX.

  5. If sXs_X is a yes instance of XX, then sYs_Y is a yes instance of YY.

Graph Coloring

Let GG be an undirected graph in which nodes are the regions to be colored and edges are the pairs that are neighbors. The problem is to assign a color to each node of GG. If (u,v)(u, v) is an edge, uu and vv are assigned different colors. The goal is to minimize the number of colors. If GG has a k-coloring, then it's a k-colorable graph. The algorithmic version of the problem is that given a graph GG and a bound kk, does GG have a k-coloring?

The Computational Complexity

The graph GG is 2-colorable if and only if it is bipartite. It could be solved efficiently with breadth-first search.

3-coloring is NP-complete. The problem is NP because the solution could be verified efficiently by checking the number of colors and if there's a pair of nodes that receive the same color.

To prove the NP-completeness, we could determine if 3-SAT could be solved using a black box for 3-coloring.

Let GG be a graph with three special nodes TrueTrue, FalseFalse, and BaseBase, which are joined into a triangle.

To begin, join each pair of nodes vi,vi‾v_i, \overline{v_i} with an edge, and join both these nodes to BaseBase.

  • In any 3-coloring of GG, the nodes viv_i and vi‾\overline{v_i} must get different colors, and must be different from BaseBase.

  • In any 3-coloring of GG, the nodes TrueTrue FalseFalse, and BaseBase must get all tree colors in some permutation. Therefore, one of viv_i and vi‾\overline{v_i} gets the TrueTrue color, and the other gets the FalseFalse color.

For each clause in the 3-SAT instance, attach a six-node subgraph to the nodes in the clause so that at least one of them must have TrueTrue color. Therefore, the 3-SAT instance is satisfiable if and only if GG has 3-coloring.

k-coloring for k>3k > 3 is NP-complete. Take an instance of 3-coloring, add k−3k - 3 ne wnodes, and join them to each other and to every node in G. The resulting graph is k-colorable if and only if the original graph GG is 3-colorable.

The circuit satisfiability problem could be reduced to an equivalent instance of 3-SAT problem, thus 3-SAT, Independent set, Set packing, Vertex cover, and Set cover are NP-complete.

Last updated