Prerequisite. A basic understanding of the analysis of algorithms, including the asymptotic analysis of the worst-case complexity and the average-case complexity. A nodding acquaintance with Python is helpful as well.
Quicksort is a divide-and-conquer sorting algorithm with quadratic worst-case running time that nevertheless often outperforms asymptotically optimal comparison-based sorting algorithms in practice. Quicksort divides a list into two sublists
with respect to a fixed pivot and recurses on the sublists until it reaches sublists of size 1. Here is a simple Python implementation of quicksort, with the pivot of a list always chosen to be the last element of the list.
def quick_sort_fixed_pivot(L):
    n = len(L)
    sort_fixed_pivot(L, 0, n-1)
    return L

def sort_fixed_pivot(L, p, r):
    if p < r:
        pivot_index = partition_fixed_pivot(L, p, r)
        sort_fixed_pivot(L, p, pivot_index-1)
        sort_fixed_pivot(L, pivot_index+1, r)

def partition_fixed_pivot(L, p, r):
    pivot = L[r]
    i = p
    for j in range(p, r):
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i+1
    L[i], L[r] = L[r], L[i]
    return i
Each recursive call of sort_fixed_pivot() on a sublist L[p:r+1] fixes the last element L[r] to be the pivot. Via partition_fixed_pivot, the sublist L[p:r+1] is divided into three sublists: the list of elements smaller than or equal to the pivot, the singleton list containing the pivot, and the list of elements larger than the pivot. sort_fixed_pivot then recurses on the first and the third sublists, sorting them.
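To see the partition step in isolation, here is a small self-contained demo; it reproduces partition_fixed_pivot from the listing above on an arbitrary example list of my choosing:

```python
def partition_fixed_pivot(L, p, r):
    # Lomuto-style partition: elements <= pivot move to the front.
    pivot = L[r]
    i = p
    for j in range(p, r):
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i + 1
    L[i], L[r] = L[r], L[i]
    return i

L = [3, 8, 2, 5, 1, 4, 7, 6]
k = partition_fixed_pivot(L, 0, len(L) - 1)
print(L, k)  # -> [3, 2, 5, 1, 4, 6, 7, 8] 5
```

Everything before index k is at most the pivot 6, and everything after it is larger, exactly as described above.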
If L is already sorted, then partition_fixed_pivot(L, p, r) always returns r. Since each call of sort_fixed_pivot() can then only reduce the size of the list to be sorted by 1, it must go through $n$ iterations to sort an already-sorted list of size $n$. Specifically, at the $k$th iteration, partition_fixed_pivot(L, 0, n-k) is executed, and, in effect, sort_fixed_pivot(L, 0, n-k-1) + [L[n-k]] is returned.
It therefore suffices to examine the time complexity of partition_fixed_pivot(L, 0, n-k). Since every instance of the if statement is executed, the runtime of partition_fixed_pivot(L, 0, n-k) is bounded below by $n-k$. It follows that the runtime of quick_sort_fixed_pivot on a sorted list of length $n$ is bounded below by

$$\sum_{k=1}^{n-1} (n-k) = \frac{n(n-1)}{2} = \Omega(n^2).$$
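This quadratic behavior is easy to confirm empirically. The following sketch reproduces the fixed-pivot routines with a global comparison counter added (the counter is my addition, not part of the implementation above):

```python
comparisons = 0

def sort_fixed_pivot(L, p, r):
    if p < r:
        q = partition_fixed_pivot(L, p, r)
        sort_fixed_pivot(L, p, q - 1)
        sort_fixed_pivot(L, q + 1, r)

def partition_fixed_pivot(L, p, r):
    global comparisons
    pivot = L[r]
    i = p
    for j in range(p, r):
        comparisons += 1          # count each execution of the comparison
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i + 1
    L[i], L[r] = L[r], L[i]
    return i

n = 100
L = list(range(n))                # already sorted: the worst case
sort_fixed_pivot(L, 0, n - 1)
print(comparisons)                # -> 4950, which is n*(n-1)/2
```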
2. Randomized Quicksort
Why, then, does the conventional wisdom dictate that quicksort is more efficient than many $O(n \log n)$-time sorting algorithms? To understand the efficiency of the average-case runtime, we introduce an element of randomness to quicksort:
import random

def quick_sort_randomized_pivot(L):
    n = len(L)
    sort_randomized_pivot(L, 0, n-1)
    return L

def sort_randomized_pivot(L, p, r):
    if p < r:
        pivot_index = partition_randomized_pivot(L, p, r)
        sort_randomized_pivot(L, p, pivot_index-1)
        sort_randomized_pivot(L, pivot_index+1, r)

def partition_randomized_pivot(L, p, r):
    pivot_index = random.randint(p, r)
    L[pivot_index], L[r] = L[r], L[pivot_index]
    return partition_fixed_pivot(L, p, r)
Surely, there is a small but definite chance that the random selection of pivots results in always choosing the last element of the list, thus reducing quick_sort_randomized_pivot to quick_sort_fixed_pivot. Therefore, the worst-case time complexity is still $\Theta(n^2)$.
Now, observe that the bottleneck in quick_sort_fixed_pivot is partition_fixed_pivot. In fact, the for loop in partition_fixed_pivot is the single biggest contributor to the time complexity of both quick_sort_fixed_pivot and quick_sort_randomized_pivot. This implies that the number of times the comparison statement if L[j] <= pivot is executed is a good indicator of the time complexity of quicksort.
In light of the above observation, we shall show that the average time complexity of randomized quicksort is $O(n \log n)$. To this end, we fix a list L of $n$ distinct elements and assume that $z_1 < z_2 < \cdots < z_n$ are the elements of L, sorted. With respect to the probability distribution induced by the random choices of pivots, we let $X$ denote the random variable outputting the total number of comparisons if L[j] <= pivot performed in the course of an execution of quick_sort_randomized_pivot(L). For each pair of indices $i < j$, let $X_{ij}$ denote the indicator random variable of the event that $z_i$ is compared with $z_j$, so that

$$\mathbb{E}[X] = \mathbb{E}\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right],$$

where $\mathbb{E}$ denotes the expected value of a random variable. Since $\mathbb{E}$ is linear, we see that

$$\mathbb{E}[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr(z_i \text{ is compared with } z_j),$$

where $\Pr$ denotes the probability of an event.

We let $Z_{ij}$ denote the set $\{z_i, z_{i+1}, \ldots, z_j\}$. Observe that $z_i$ is compared with $z_j$ precisely when $z_i$ or $z_j$ is the first pivot chosen, out of all the elements of $Z_{ij}$, in the course of an execution of quick_sort_randomized_pivot(L): once a pivot strictly between $z_i$ and $z_j$ is chosen, the two elements are separated into different sublists and are never compared. Since each element of $Z_{ij}$ is equally likely to be the first pivot chosen from $Z_{ij}$, we have that

$$\Pr(z_i \text{ is compared with } z_j) = \frac{2}{j - i + 1}.$$

It follows that

$$\mathbb{E}[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1} \le \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = 2(n-1)H_n,$$

where $H_n$ denotes the $n$th harmonic number. Since the harmonic numbers are asymptotically bounded by the logarithmic function, we conclude that $\mathbb{E}[X] = O(n \log n)$.
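The $O(n \log n)$ average can be observed empirically as well. The sketch below reproduces the randomized routines with a comparison counter added (the counter and the fixed seed, used only for reproducibility, are my additions):

```python
import random

random.seed(0)
comparisons = 0

def sort_randomized_pivot(L, p, r):
    if p < r:
        q = partition_randomized_pivot(L, p, r)
        sort_randomized_pivot(L, p, q - 1)
        sort_randomized_pivot(L, q + 1, r)

def partition_randomized_pivot(L, p, r):
    global comparisons
    pivot_index = random.randint(p, r)
    L[pivot_index], L[r] = L[r], L[pivot_index]
    pivot = L[r]
    i = p
    for j in range(p, r):
        comparisons += 1
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i + 1
    L[i], L[r] = L[r], L[i]
    return i

n = 500
L = list(range(n))        # sorted input: the worst case for the fixed pivot
sort_randomized_pivot(L, 0, n - 1)
print(comparisons)        # typically a few thousand, versus n*(n-1)/2 = 124750
```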
While randomized quicksort performs quite well “on average” (and in practice!), it still has a worst-case time complexity of $\Theta(n^2)$. With carefully chosen pivots, however, we can guarantee an $O(n \log n)$ performance in all cases.
3. The Median-of-Medians Selection Algorithm
We have witnessed that a pivot selection strategy that divides a list into sublists of uneven sizes tends to perform badly. An efficient search for the median element of a list would therefore lead to an efficient choice of pivots. The Blum–Floyd–Pratt–Rivest–Tarjan selection algorithm, commonly known as the median-of-medians selection algorithm (and implemented below under the name quickselect), can be used to find a median in $O(n)$ time. In fact, it can be used to find the $k$th smallest element of the list in $O(n)$ time.
The outline of the algorithm is as follows; here, we assume that L consists of $n$ distinct elements.
1. Given an input list of $n$ elements, we divide the list into $\lceil n/5 \rceil$ sublists of 5 elements, with at most one sublist possibly containing fewer than 5 elements.
2. We find the median of each of the sublists by sorting them. Insertion sort is a good choice here, since it is fast for small lists.
3. We take the list of the medians found in Steps 1-2 and apply the selection algorithm recursively to find the median of medians $m$. If there are an even number of medians, we take the lower one.
4. We rearrange L by putting the elements no larger than $m$ in the front, $m$ in the middle, and the elements larger than $m$ in the back.
5. Let $i$ be the index of the pivot in the rearranged list, so that $m$ is the $i$th smallest element of L. If $i = k$, then we have found the $k$th smallest element. If not, we apply the selection process recursively. Specifically, if $i > k$, then we apply the selection algorithm on L[:i] to find the $k$th smallest element. If $i < k$, then we apply the selection algorithm on L[i:] to find the $(k-i)$th smallest element.
In short, the algorithm either selects the correct output or recurses down to a strictly smaller sublist. Since the $k$th smallest element of a small list can be found in $O(1)$ time by sorting, the algorithm always terminates with the correct output.
# Find the kth smallest element of L.
def quickselect(L, k):
    n = len(L)
    # To reduce computational complexity, we compute the answer directly
    # if the length of the list is small enough.
    if n < 10:
        return sorted(L)[k-1]
    # Divide L into sublists of length 5, sort them, and extract the medians.
    medians = []
    for i in range(n // 5):
        L[5*i : 5*i+5] = sorted(L[5*i : 5*i+5])
        medians.append(L[5*i + 2])
    if n % 5 != 0:
        L[5*(n//5):] = sorted(L[5*(n//5):])
        medians.append(L[5*(n//5) + (n % 5)//2])
    # Find recursively the median of medians, taking the lower median
    # when there are an even number of medians.
    mm = quickselect(medians, (len(medians) + 1) // 2)
    # Partition L with mm as the pivot.
    mm_index = L.index(mm)
    L[n-1], L[mm_index] = L[mm_index], L[n-1]
    i = -1
    for j in range(n-1):
        if L[j] <= mm:
            i = i+1
            L[i], L[j] = L[j], L[i]
    i = i+1
    L[i], L[n-1] = L[n-1], L[i]  # After this, L[i] = mm.
    i = i+1
    # Now mm is the ith smallest element of L.
    # Determine whether mm is at the correct spot. If not, recurse.
    if i == k:
        return mm
    elif i > k:
        return quickselect(L[:i], k)
    else:  # i < k
        return quickselect(L[i:], k-i)
4. Proof of Linear-Time Complexity of Quickselect
To analyze the time complexity of the median-of-medians selection algorithm, we note that at least half of the medians found in Step 2 are no smaller than the median of medians $m$. This implies that at least half of the $\lceil n/5 \rceil$ sublists contain at least 3 elements that are no smaller than $m$, except possibly for the one sublist that may not have 5 elements, and the sublist that contains $m$ itself. It follows that the number of elements of L greater than $m$ is at least

$$3\left(\left\lceil \frac{1}{2}\left\lceil \frac{n}{5} \right\rceil \right\rceil - 2\right) \ge \frac{3n}{10} - 6.$$

Similarly, at least $\frac{3n}{10} - 6$ elements are less than $m$, whence, in the worst-case scenario, Step 5 calls the selection algorithm recursively on at most $\frac{7n}{10} + 6$ elements.
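The bound can be sanity-checked numerically. The following sketch computes the median of medians by brute force (sorting, for checking purposes only; all names here are mine) and verifies the rank bounds on a random list of distinct elements:

```python
import random

def median_of_medians_bruteforce(L):
    # Group into fives, take each group's median, then the lower median
    # of the medians -- computed here by sorting, for checking only.
    groups = [sorted(L[i:i+5]) for i in range(0, len(L), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    return sorted(medians)[(len(medians) - 1) // 2]

random.seed(1)
n = 1000
L = random.sample(range(10**6), n)   # n distinct elements
mm = median_of_medians_bruteforce(L)
greater = sum(1 for x in L if x > mm)
less = sum(1 for x in L if x < mm)
print(greater, less)   # both are guaranteed to be at least 3*n/10 - 6 = 294
```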
Let $T(n)$ denote the worst-case running time of the selection algorithm on a list of size $n$. Step 1 takes $O(n)$ time. Step 2 executes insertion sort on $\lceil n/5 \rceil$ lists of at most 5 elements, and so it takes $O(n)$ time. Step 3 takes $T(\lceil n/5 \rceil)$ time. Step 4 takes $O(n)$ time, as it is a simple variation of the partition algorithm from quicksort. Step 5 takes at most $T\left(\frac{7n}{10} + 6\right)$ time, as discussed above. It follows that

$$T(n) \le T\left(\left\lceil \frac{n}{5} \right\rceil\right) + T\left(\frac{7n}{10} + 6\right) + O(n).$$
Let us fix a constant $a$ such that

$$T(n) \le T\left(\left\lceil \frac{n}{5} \right\rceil\right) + T\left(\frac{7n}{10} + 6\right) + an$$

for all $n$.

Our goal is to find a constant $c$ such that $T(n) \le cn$ for all large enough $n$. How large should $c$ be? If $T(m) \le cm$ were true for all $m < n$ for some fixed integer $n$, we would have the estimate

$$T(n) \le c\left\lceil \frac{n}{5} \right\rceil + c\left(\frac{7n}{10} + 6\right) + an \le \frac{cn}{5} + c + \frac{7cn}{10} + 6c + an = \frac{9cn}{10} + 7c + an.$$

In order to have a tight choice of $c$, we expect to have the upper bound

$$\frac{9cn}{10} + 7c + an \le cn,$$

which can hold if and only if

$$7c + an \le \frac{cn}{10}.$$

Rearranging the above inequality, we obtain

$$c \ge 10a\left(\frac{n}{n - 70}\right),$$

provided that $n > 70$. Now, if we assume that $n \ge 140$, then $\frac{n}{n-70} \le 2$, and so the above inequality holds whenever

$$c \ge 20a.$$

It follows that if $c \ge 20a$ and $T(m) \le cm$ for all $m < n$, then

$$T(n) \le cn$$

for all $n \ge 140$.

Let us now pick $c \ge 20a$, so that the above implication holds for every $n \ge 140$. Since $T$ takes only finitely many values below 140, by choosing a larger $c$ if necessary, we can also assume that

$$T(n) \le cn$$

for all $n < 140$. By induction, we conclude that

$$T(n) \le cn$$

for all $n$. It follows that $T(n) = O(n)$.
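The conclusion can also be checked numerically by iterating the recurrence itself. The sketch below takes $a = 1$, charges a base cost of $T(n) = n$ for $n < 140$ (an arbitrary choice of mine for the bounded small cases), and confirms that $T(n)/n$ never exceeds $c = 20$:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # Worst-case recurrence for median-of-medians selection, with a = 1
    # and unit base cost per element below the n >= 140 threshold.
    if n < 140:
        return n
    return T(-(-n // 5)) + T(7 * n // 10 + 6) + n   # -(-n // 5) is ceil(n/5)

ratios = [T(n) / n for n in range(1, 100000, 997)]
print(max(ratios))   # stays at most 20, as the proof predicts
```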
5. Asymptotically Optimal Quicksort
Having established the linear-time complexity of quickselect, we proceed to construct a variant of quicksort with a worst-case time complexity of $O(n \log n)$.
def quick_sort_median_of_medians_pivot(L):
    n = len(L)
    sort_median_of_medians_pivot(L, 0, n-1)
    return L

def sort_median_of_medians_pivot(L, p, r):
    if p < r:
        pivot_index = partition_median_of_medians_pivot(L, p, r)
        sort_median_of_medians_pivot(L, p, pivot_index-1)
        sort_median_of_medians_pivot(L, pivot_index+1, r)

def partition_median_of_medians_pivot(L, p, r):
    # Find the (lower) median of L[p:r+1] with the linear-time
    # selection algorithm from above.
    mm = quickselect(L[p:r+1], (r - p + 2) // 2)
    initial_pivot_index = L[p:r+1].index(mm) + p
    L[r], L[initial_pivot_index] = L[initial_pivot_index], L[r]
    pivot = L[r]
    i = p
    for j in range(p, r):
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i+1
    L[i], L[r] = L[r], L[i]
    return i
In the above implementation, we have sacrificed a little bit of performance for the sake of clarity. partition_median_of_medians_pivot spends $O(r - p)$ time going through L[p:r+1], looking for the exact location of the median of medians. Moreover, quickselect itself spends extra time going through its input, looking for the exact location of the pivot. These overheads could have been prevented by having quickselect return the ordered pair (med, med_index) of the median and its index in the original list. The above implementation is simpler, however, and the overheads do not change the asymptotic runtime of quick_sort_median_of_medians_pivot.
Let us now show that the worst-case runtime complexity of quick_sort_median_of_medians_pivot is $O(n \log n)$. Let $T(n)$ denote the worst-case runtime of quick_sort_median_of_medians_pivot on a list of length $n$. Since the pivot is always the median, we see that

$$T(n) \le 2T\left(\frac{n}{2}\right) + O(n).$$

It now follows that

$$T(n) = O(n \log n),$$

as was to be shown.
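The balanced recursion can be illustrated with one more comparison-counting sketch. Here, purely for demonstration, the median pivot is found by sorting a copy of each sublist (which is of course not linear-time; the counter and names are mine):

```python
comparisons = 0

def sort_median_pivot(L, p, r):
    if p < r:
        q = partition_median_pivot(L, p, r)
        sort_median_pivot(L, p, q - 1)
        sort_median_pivot(L, q + 1, r)

def partition_median_pivot(L, p, r):
    global comparisons
    # For demonstration only: find the (lower) median by sorting a copy.
    mm = sorted(L[p:r+1])[(r - p) // 2]
    mm_index = L[p:r+1].index(mm) + p
    L[r], L[mm_index] = L[mm_index], L[r]
    pivot = L[r]
    i = p
    for j in range(p, r):
        comparisons += 1
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i + 1
    L[i], L[r] = L[r], L[i]
    return i

n = 1024
L = list(range(n))                     # sorted input, the old worst case
sort_median_pivot(L, 0, n - 1)
print(comparisons, n * (n - 1) // 2)   # far fewer than the quadratic count
```

With the pivot always a median, each sublist at most halves, so the recursion has depth about $\log_2 n$ and performs at most $n$ comparisons per level.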
6. Additional Remarks and Further Results
6.1. As mentioned in [Blum–Floyd–Pratt–Rivest–Tarjan 1973], interest in the selection problem dates back to 1883, when Lewis Carroll published “Lawn Tennis Tournaments: the true method of assigning prizes with a proof of the fallacy of the present method”:
Let it not be supposed that, in thus proposing to make these Tournaments a game of pure skill … I am altogether eliminating the element of luck … a thousand accidents might occur to prevent his playing best: the 4th best, 5th best, or even a worst Player, need not despair of winning even the 1st prize. Nor, again, let it be supposed that the present system, which allows an inferior player a chance of the 2nd prize … The proposed form of Tournament, though lasting a shorter time than the present one, has a great many more contests going on at once, and consequently furnishes the spectacle-loving public with a great deal more to look at.
6.2. Let $V(k, n)$ be the minimum number of comparisons required to select the $k$th smallest element in a list of $n$ elements. The relative difficulty of computing percentile levels is measured by

$$F(\alpha) = \limsup_{n \to \infty} \frac{V(\lceil \alpha n \rceil, n)}{n} \qquad (0 < \alpha \le 1).$$

To make sense of the above definition, we let $k = \lceil \alpha n \rceil$ and observe that

$$V(k, n) \le (F(\alpha) + o(1))\, n \quad \text{as } n \to \infty.$$

In other words, $F(\alpha)$ is the “tight” asymptotic upper bound on the minimum number of comparisons required to select the $k$th smallest element in a list of $n$ elements, where $k$ grows proportionally with $n$. It can be thought of as the value of the constant hidden by the $O(n)$ upper bound for the time complexity of the quickselect algorithm. The limit superior rules out the anomalous behaviors at small values of $n$.
[BFPRT 1973] presents the following bounds on $F$:

$$1 + \min(\alpha, 1 - \alpha) \le F(\alpha) \le 5.4305.$$
The upper bound was improved in [Schönhage–Paterson–Pippenger 1976] to 3 in the case of finding the median, and again to 2.95 in [Dor–Zwick 1999]. The current best lower bound is 2, as per [Bent–John 1985]. In the case of finding the median, [Dor–Zwick 2001] establishes the slightly improved lower bound of $2 + \varepsilon$ for a small constant $\varepsilon > 0$.
6.3. A more sophisticated analysis based on the notion of entropy shows that “no comparison-based algorithm can do better” than quicksort [Sedgewick–Bentley 2002]. Wild’s currently-unpublished preprint [Wild 2017] establishes an improved bound on the runtime of quickselect-based quicksort.