# Compression, Complexity, and P=NP?

## Model 1 vs. Model 2 for Compression

A Flaw in Compression Model 1: Source code for decompression algorithm itself might be highly complex.

### Question 1: Comprehensible Compression

Is there a "comprehensible" compression takes as input a target bitstream B, and outputs useful, readable Java code?

### Question 2: Optimal Compression

Is there an optimal compression takes as input a target bitstream B, and outputs the shortest possible Java program that outputs this bitstream?

## Kolmogorov Complexity

The Java-Kolmogorov complexity K\_J (B) is the length of the shortest Java program (in bytes) that generates B.

Fact #1: Kolmogorov Complexity is effectively independent of language. For any bit stream, the Java-Kolmogorov Complexity is no more than a constant factor larger than the Python-Kolmogorov Complexity.

Could write a Python interpreter in Java and then runs the Python program.

It means that most bitstreams are fundamentally incompressible no matter what programming language is used for the compression algorithm.

## Space/Time Bounded Compression

An optimal compression algorithm takes as input a target bitstream B, and outputs the shortest possible Java program that outputs this bitstream.

No “optimal compression” algorithm exists.

### Space Bounded Compression

Takes two inputs, a target bitstream B and a size S, and outputs a Java program of `length ≤ S` that outputs B.

Does not exist, since it could be used to find K\_J (B) by binary searching on S.

### Space/Time Bounded Compression

Takes three inputs, a target bitstream B, a size S, a maximum number of lines of bytecode executed T, and outputs a Java program of `length ≤ S` that outputs B in fewer than T executed lines of bytecode.

For each possible program p of length S or less:

* If p compiles, run program p until either:
* p terminates.
* We output B.
* T lines of bytecode are executed.

  Runtime: `O(T x 2 ^ S)`.

### Efficient Space/Time Bounded Compression

* Need to make a more precise definition of what we mean by “efficient”.
* Closely related to an important puzzle in computer science: Does P = NP?

## P = NP?

An efficient solution to these three problems (3SAT, Independent Set, and Longest Path) would also give an efficient space/time bounded compression algorithm.

Space/time bounded compression reduces to 3SAT, Independent Set, and Longest Path.

### Definition

Two important classes of yes/no problems: P: Efficiently solvable problems. NP: Problems with solutions that are efficiently verifiable.

Examples of problems in P:

* Is this array sorted?
* Does this array have duplicates?

Examples of problems in NP:

* Is there a solution to this 3SAT problem?
* In graph G, does there exist a path from s to t of weight > k?

Any decision problem for which a yes answer can be efficiently verified can be transformed into a 3SAT problem. This transformation is also "efficient" (polynomial time).
