Divide and Conquer
Divide and conquer refers to a class of algorithmic techniques in which one breaks the input into several parts, solves the problem in each part recursively, and then combines the solutions to these subproblems into an overall solution.
The Mergesort Algorithm
Mergesort sorts a given list of numbers by first dividing them into two equal halves, sorting each half separately by recursion, and then combining the results. The input has been reduced to size 2.
Let T(n) denote the worst-case running time on input instance of size n.
The algorithm takes O(n) time to split the inputs into two halves.
The algorithm takes T(n/2) to solve each half.
The algorithm takes O(n) to combine the solutions.
Recurrence relation: T(n)≤2T(n/2)+cn=2T(n/2)+O(n) for n>2, T(2)≤c for some constant c.
Solving Recurrences
Unrolling: Find the running time of the first few levels and identify a patten that can be continued as the recursion expands.
Substituting: Guess for a solution and substitute it into the recurrence relation to check whether it works with an argument by induction on $ n $.
Unrolling the Mergesort Recurrences
Analyzing the first few levels: The third level has four problems each of size n/4, each taking time at most cn/4, for a total of at most cn.
Identifying a pattern: The level i contributes a total of at most 2i(cn/2i) to the total running time.
Summing over all levels: The number of levels is log2n. Therefore, the function T(⋅) is bounded by O(nlogn) when n>1.
Substituting a Solution
Suppose the relation T(n)≤cnlog2n satisfies the recurrence relation.
Base step: T(2)≤cnlog2n=2c.
Inductive step: T(n)≤2c(n/2)log2(n/2)+cn=cn[(log2n)−1]+cn=cnlog2n. Therefore, the function T(⋅) is bounded by O(nlogn) when n>1.
Further Recurrence Relations
The general form of the divide-and-conquer algorithm is creating q subproblems of size n/2 each, and combine the results in O(n) time. For some constant c, T(n)≤qT(n/2)+cn when n>2, and T(2)≤c.
The Case of q > 2 Subproblems
The first few levels: The first level takes at most cn time. The second level takes at most $(q/2)cn time. The third level takes at most (q2/4)cn time.
Identifying a pattern: At level i, the total work performed is (qi/2i)cn.
Summing over all levels: T(n)≤cn∑j=0log2n−1(2q)j≤(r−1c)nlog2q=O(nlog2q).
Therefore, any function T(⋅) satisfying the recurrence relation with q>2 is bounded by O(nlog2q).
The Case of q = 1 Subproblem
The first few levels: The first level takes at most cn time. The second level takes at most $cn/2 time. The third level takes at most cn/4 time.
Identifying a pattern: At level i, the total work performed is cn/2i.
Summing over all levels: T(n)≤cn∑j=0log2n−12j1≤2cn=O(n).
Therefore, any function T(⋅) satisfying the recurrence relation with q=1 is bounded by O(n).
Related Recurrence
Consider a variant form of divide-and-conquer algorithm: Divide the input into two pieces of equal time, solve them by recursion, and combine the resutls in quadratic time. For some constant c, T(n)≤qT(n/2)+cn2 when n>2, and T(2)≤c.
The first few levels: The first level takes at most cn2 time. The second level takes at most $cn2/2 time. The third level takes at most cn2/4 time.
Identifying a pattern: At level i, the total work performed is cn2/2i.
Summing over all levels: T(n)≤cn2∑j=0log2n−12j1≤2cn2=O(n2).
The Closest Pair of Points
Given n points in the plane, find the pair that is closest together.
The Algorithm
Let the set of points be P=p1,…,pn, where pi has coordinats (xi,yi). The Euclidan distance between two points is d(pi,pj). The algorithm holds the assumption that no two points in P have the same x-coordinate or the same y-coordinate.
The Recursion
Let P′⊆P be the input of each recursion. Let the list Px′ be all the points in P′ that have been sorted by increasing x-coordinate, Py′ be all the points in P′ that have been sorted by increasing y-coordinate.
For each level of recursion, Q is defined as the set of points in the first ⌈n/2⌉ positions of Px′ and R is defined as the set of points in the final ⌊n/2⌋ positions of the list Px.
By a single pass through Px′ and Py′, four lists could be produced: Qx. Qy, Rx, Ry. Suppose that the closest pairs of points in Q and R are and , and .
Combine the Solutions
Let α be the minimum of and . Let L be the vertical line that crosses through the rightmost point of $ Q $.
If there exists q∈Q and $ r \in R forwhich d(q, r) < \alpha ,theneachof q and r lieswithinadistance \alpha of L $$.
Let S⊆P′ be the set of points in this narrow band and Sy be the points in S that sorted with their y-coordinates. Thus, there exist q inQ and r∈R for which d(q,r)<α if and only if there exists s,s′∈S for which d(s,s′)<α.
If s,s′∈S for which d(s,s′)<α, then s and s′ are within 15 positions of each other in the sorted list Sy. Therefore, for each point s∈Sy, the algorithm computes its distance to each of the next 15 points in Sy. If there's a pair of points for which d(s,s′)<α, they are the closest pair of points in P′. Otherwise, the closet pair is the minimum of and .
Analyzing the Algorithm
The algorithm correctly outputs a closest pair of points in P. Since the algorithm divides the input into left and right halves, and merges the solutions in linear time, it satisties the basic recurrence and the time complexity is O(nlogn).
Basic step: When ∣P∣≤3, the closest pair could be computed by brute-force.
Induction step: Assume that the remainder of the algorithm correctly determines the closest pair of a given input, then the closest pair in P either has both elements in one of Q or R, or it has one element in each.
Last updated