# Quickselect and asymptotically optimal quicksort

Prerequisite. A basic understanding of the analysis of algorithms, including the asymptotic analysis of the worst-case complexity and the average-case complexity. A nodding acquaintance with Python is helpful as well.

## 1. Quicksort

Quicksort is a divide-and-conquer sorting algorithm that, despite its quadratic worst-case time complexity, often performs better in practice than many asymptotically optimal comparison-based sorting algorithms. Quicksort divides a list $L$ into two sublists with respect to a fixed pivot $p \in L$ and recurses on the sublists until it reaches sublists of size at most 1. Here is a simple Python implementation of quicksort, with the pivot of a list always chosen to be the last element of the list.

```python
def quick_sort_fixed_pivot(L):
    n = len(L)
    sort_fixed_pivot(L, 0, n - 1)
    return L

def sort_fixed_pivot(L, p, r):
    if p < r:
        pivot_index = partition_fixed_pivot(L, p, r)
        sort_fixed_pivot(L, p, pivot_index - 1)
        sort_fixed_pivot(L, pivot_index + 1, r)

def partition_fixed_pivot(L, p, r):
    pivot = L[r]
    i = p
    for j in range(p, r):
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i + 1
    L[i], L[r] = L[r], L[i]
    return i
```


sort_fixed_pivot() on a sublist L[p:r+1] chooses L[r] to be the pivot, via partition_fixed_pivot(), which divides L[p:r+1] into three sublists: the elements smaller than or equal to the pivot, the singleton containing the pivot, and the elements larger than the pivot. sort_fixed_pivot() then recurses on the first and the third sublists, sorting them.

If L is already sorted, then partition_fixed_pivot(L, p, r) always returns r, so that sort_fixed_pivot splits L[p:r+1] into L[p:r], [L[r]], and []. Since each call can only reduce the size of the list to be sorted by 1, sorting an already-sorted list of size $n$ takes $n$ rounds of recursion. Specifically, at the $k$th round, partition_fixed_pivot(L, 0, n-k) is executed, and the problem is reduced to sorting L[0:n-k].

It therefore suffices to examine the time complexity of partition_fixed_pivot(L, 0, n-k). Since the comparison in the for loop is executed $n-k$ times, the runtime of partition_fixed_pivot(L, 0, n-k) is bounded below by $\Omega(n-k)$. It follows that the runtime of quick_sort_fixed_pivot on a sorted list of length $n$ is bounded below by

$$\sum_{k=1}^{n-1} \Omega(n-k) = \Omega(n^2).$$
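This quadratic behavior is easy to observe directly. The sketch below reimplements the fixed-pivot quicksort above with a comparison counter; the function count_comparisons is introduced here for illustration and is not part of the original code.

```python
def count_comparisons(L):
    """Fixed-pivot quicksort on L, returning the number of times the
    comparison `L[j] <= pivot` is executed."""
    count = 0

    def sort(p, r):
        nonlocal count
        if p < r:
            pivot, i = L[r], p
            for j in range(p, r):
                count += 1                 # one comparison per loop iteration
                if L[j] <= pivot:
                    L[i], L[j] = L[j], L[i]
                    i += 1
            L[i], L[r] = L[r], L[i]
            sort(p, i - 1)
            sort(i + 1, r)

    sort(0, len(L) - 1)
    return count

# On an already-sorted list, each call peels off only the pivot, so the
# total is (n-1) + (n-2) + ... + 1 = n(n-1)/2.
n = 100
assert count_comparisons(list(range(n))) == n * (n - 1) // 2
```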

## 2. Randomized Quicksort

Why, then, does the conventional wisdom dictate that quicksort is more efficient than many $O(n \log n)$-time sorting algorithms? To understand the efficiency of the average-case runtime, we introduce an element of randomness into quicksort:

```python
import random

def quick_sort_randomized_pivot(L):
    n = len(L)
    sort_randomized_pivot(L, 0, n - 1)
    return L

def sort_randomized_pivot(L, p, r):
    if p < r:
        pivot_index = partition_randomized_pivot(L, p, r)
        sort_randomized_pivot(L, p, pivot_index - 1)
        sort_randomized_pivot(L, pivot_index + 1, r)

def partition_randomized_pivot(L, p, r):
    pivot = random.randint(p, r)
    L[pivot], L[r] = L[r], L[pivot]
    return partition_fixed_pivot(L, p, r)
```


To be sure, there is a small but definite chance that the random selection of pivots always results in choosing the last element of the list, thus reducing quick_sort_randomized_pivot to quick_sort_fixed_pivot. Therefore, the worst-case time complexity is still $\Omega(n^2)$.

Now, observe that the bottleneck in quick_sort_fixed_pivot is partition_fixed_pivot. In fact, the for loop in partition_fixed_pivot is the single biggest contributor to the time complexity of both quick_sort_fixed_pivot and quick_sort_randomized_pivot. This implies that the number of times the comparison statement if L[j] <= pivot is executed is a good indicator of the time complexity of quicksort.

In light of the above observation, we shall show that the average time complexity of randomized quicksort is $O(n \log n)$. To this end, we fix a list L of $n$ distinct elements and let

$$a_1 < a_2 < \cdots < a_n$$

be the elements of $L$, sorted. We let $C$ denote the random variable outputting the total number of comparisons if L[j] <= pivot performed in the course of an execution of quick_sort_randomized_pivot(L), the randomness coming from the choices of pivots. For each $1 \leq i < j \leq n$, let $C_{ij}$ denote the indicator random variable

$$C_{ij} = \begin{cases} 1 & \mbox{if } a_i \mbox{ is compared to } a_j, \\ 0 & \mbox{otherwise}, \end{cases}$$

so that

$$\mathbb{E}[C] = \mathbb{E}\left[ \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} C_{ij} \right],$$

where $\mathbb{E}$ denotes the expected value of a random variable. Since $\mathbb{E}$ is linear, we see that

$$\mathbb{E}[C] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \mathbb{E}[C_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \mathbb{P}[a_i \mbox{ is compared to } a_j],$$

where $\mathbb{P}$ denotes the probability of an event.

We let $L_{ij}$ denote the set $\{a_i,a_{i+1},\ldots,a_j\}$. Observe that $\mathbb{P}[a_i \mbox{ is compared to } a_j]$ equals the probability that $a_i$ or $a_j$ is the first pivot chosen, out of all the elements of $L_{ij}$, in the course of an execution of quick_sort_randomized_pivot(L). Since the first pivot chosen from $L_{ij}$ is uniformly distributed over its $j - i + 1$ elements, we have that

$$\mathbb{P}[a_i \mbox{ is compared to } a_j] = \frac{2}{j - i + 1}.$$
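The formula $\mathbb{P}[a_i \mbox{ is compared to } a_j] = \frac{2}{j-i+1}$ can be sanity-checked by simulation: choosing pivots uniformly at random is equivalent to processing the elements in a uniformly random order, and $a_i$ and $a_j$ are compared exactly when one of them comes first among $a_i, \ldots, a_j$. A minimal Monte Carlo sketch under this random-order reformulation (an assumption of the sketch, not code from the article):

```python
import random

random.seed(2023)
n, i, j = 20, 5, 12            # ranks (1-indexed); expect P = 2/8 = 0.25
trials = 20000
hits = 0
for _ in range(trials):
    # Assign each rank a random priority; the rank with the smallest
    # priority inside {i, ..., j} is the first pivot chosen from L_ij.
    priority = [random.random() for _ in range(n + 1)]
    first = min(range(i, j + 1), key=lambda r: priority[r])
    if first in (i, j):
        hits += 1

assert abs(hits / trials - 2 / (j - i + 1)) < 0.02
```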

It follows that

$$\mathbb{E}[C] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j - i + 1} \leq \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = 2(n-1)H_n,$$

where $H_n = \sum_{k=1}^{n} \frac{1}{k}$ is the $n$th harmonic number. Since the harmonic numbers are asymptotically bounded by the logarithmic function, we conclude that

$$\mathbb{E}[C] = O(n \log n).$$
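The bound $\mathbb{E}[C] = O(n \log n)$ can also be checked empirically. The sketch below instruments randomized quicksort with a comparison counter (randomized_quicksort_comparisons is a name introduced here for illustration) and compares the empirical average against $2nH_n$:

```python
import random

random.seed(0)

def randomized_quicksort_comparisons(L):
    """Run randomized quicksort on a copy of L and return the number of
    times the comparison `L[j] <= pivot` is executed."""
    L = list(L)
    count = 0

    def sort(p, r):
        nonlocal count
        if p < r:
            k = random.randint(p, r)
            L[k], L[r] = L[r], L[k]
            pivot, i = L[r], p
            for j in range(p, r):
                count += 1
                if L[j] <= pivot:
                    L[i], L[j] = L[j], L[i]
                    i += 1
            L[i], L[r] = L[r], L[i]
            sort(p, i - 1)
            sort(i + 1, r)

    sort(0, len(L) - 1)
    return count

n = 200
H_n = sum(1 / k for k in range(1, n + 1))
avg = sum(randomized_quicksort_comparisons(range(n)) for _ in range(50)) / 50
assert avg <= 2 * n * H_n    # consistent with E[C] <= 2n·H_n
```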

## 3. Quickselect

While randomized quicksort performs quite well “on average” (and in practice!), it still has a quadratic worst-case time complexity. With carefully chosen pivots, however, we can guarantee an $O(n \log n)$ running time in all cases.

We have witnessed that a pivot selection strategy that divides a list into sublists of very uneven sizes tends to perform badly. An efficient method for finding the median of a list would therefore lead to an efficient choice of pivots. The Blum–Floyd–Pratt–Rivest–Tarjan selection algorithm, commonly known as the median-of-medians algorithm (a deterministic refinement of Hoare's quickselect), can be used to find a median in $O(n)$ time. In fact, it can be used to find the $k$th smallest element of the list in $O(n)$ time.

The outline of the algorithm is as follows; here, we assume that $L$ consists of distinct elements.

1. Given an input list $L$ of $n$ elements, we divide the list into $\lceil \frac{n}{5} \rceil$ sublists of 5 elements each, with at most one sublist containing fewer than 5 elements.
2. We find the median of each of the $\lceil \frac{n}{5} \rceil$ sublists by sorting them. Insertion sort is a good choice here, since it is fast for small lists.
3. We take the list of the medians found in 1-2 and apply the selection algorithm recursively to find the median of medians $m$. If there are an even number of medians, we take the lower one.
4. We rearrange $L$ by putting the terms no larger than $m$ on the front, $m$ in the middle, and terms larger than $m$ in the back.
5. Let $i-1$ be the index of the pivot $m$ in the rearranged list, so that $m$ is the $i$th smallest element of $L$. If $i = k$, then we have found the $k$th smallest element. If not, we apply the selection algorithm recursively. Specifically, if $i > k$, then we apply the selection algorithm on L[:i] to find the $k$th smallest element. If $i < k$, then we apply the selection algorithm on L[i:] to find the $(k-i)$th smallest element.

In short, the algorithm either selects the correct output or recurses down to a smaller sublist. Since the kth smallest element of a small list can be found in $O(1)$ time, the algorithm always terminates with the correct output.

```python
import math

# Find the kth smallest element of L (k is 1-indexed).
def quickselect(L, k):
    n = len(L)

    # To reduce computational complexity, we compute the answer directly
    # if the length of the list is small enough.
    if n < 10:
        return sorted(L)[k - 1]

    # Divide L into sublists of length 5, sort them, and extract the medians.
    # (Sorting a slice such as L[a:b] sorts a copy, not L itself, so we
    # collect the sorted copies explicitly.)
    medians = []
    for i in range(math.floor(n / 5)):
        block = sorted(L[5 * i : 5 * i + 5])
        medians.append(block[2])
    if n % 5 != 0:
        block = sorted(L[5 * math.floor(n / 5):])
        medians.append(block[math.floor((n % 5) / 2)])

    # Find recursively the median of medians; the lower median has
    # 1-indexed rank ceil(len(medians)/2).
    mm = quickselect(medians, math.ceil(len(medians) / 2))

    # Partition L with mm as the pivot.
    mm_index = L.index(mm)
    L[n - 1], L[mm_index] = L[mm_index], L[n - 1]
    i = -1
    for j in range(n - 1):
        if L[j] <= mm:
            i = i + 1
            L[i], L[j] = L[j], L[i]
    i = i + 1
    L[i], L[n - 1] = L[n - 1], L[i]  # After this, L[i] = mm.

    i = i + 1  # Now mm is the ith smallest element of L.

    # Determine whether mm is at the correct spot. If not, recurse.
    if i == k:
        return mm
    elif i > k:
        return quickselect(L[:i], k)
    else:  # i < k
        return quickselect(L[i:], k - i)
```


## 4. Proof of Linear-Time Complexity of Quickselect

To analyze the time complexity of the median-of-medians selection algorithm, we note that at least half of the medians found in Step 2 are no smaller than $m$. This implies that at least half of the $\lceil \frac{n}{5} \rceil$ sublists contain at least 3 elements that are no smaller than $m$, except for the one sublist that may have fewer than 5 elements and the sublist that contains $m$ itself. It follows that the number of elements of $L$ greater than $m$ is at least

$$3\left( \left\lceil \frac{1}{2} \left\lceil \frac{n}{5} \right\rceil \right\rceil - 2 \right) \geq \frac{3n}{10} - 6.$$

Similarly, at least $\frac{3}{10}n - 6$ elements are less than $m$, whence, in the worst-case scenario, Step 5 calls the selection algorithm recursively on at most $\frac{7}{10}n + 6$ elements.
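This guarantee can be checked numerically. The helper below (median_of_medians_pivot, a name introduced here for illustration) computes the pivot of Steps 1-3 by brute-force sorting and verifies that its rank always lands in the guaranteed middle range:

```python
import random

def median_of_medians_pivot(L):
    """Compute the pivot of Steps 1-3 by brute force: lower medians of the
    length-5 blocks, then the lower median of those medians."""
    blocks = [sorted(L[i:i + 5]) for i in range(0, len(L), 5)]
    medians = [b[(len(b) - 1) // 2] for b in blocks]
    return sorted(medians)[(len(medians) - 1) // 2]

random.seed(0)
for n in (25, 100, 1000):
    L = random.sample(range(10 * n), n)
    m = median_of_medians_pivot(L)
    rank = sorted(L).index(m)      # number of elements strictly less than m
    # At least 3n/10 - 6 elements lie on either side of the pivot.
    assert 3 * n / 10 - 6 <= rank <= 7 * n / 10 + 6
```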

Let $T(n)$ denote the worst-case running time of the selection algorithm on a list of size $n$. Step 1 takes $O(n)$ time. Step 2 executes insertion sort on $O(n)$ many small lists, and so it takes $O(n)$ time. Step 3 takes $T(\lceil n/5 \rceil)$ time. Step 4 takes $O(n)$ time, as it is a simple variation of the partition algorithm from quicksort. Step 5 takes at most $T(\frac{7n}{10} + 6)$ time, as discussed above. It follows that

$$T(n) \leq T\left( \left\lceil \frac{n}{5} \right\rceil \right) + T\left( \frac{7n}{10} + 6 \right) + O(n).$$

Let us fix a constant $A$ such that

$$T(n) \leq T\left( \left\lceil \frac{n}{5} \right\rceil \right) + T\left( \frac{7n}{10} + 6 \right) + An$$

for all $n$.

Our goal is to find a constant $C$ such that $T(n) \leq Cn$ for all large enough $n$. How large should $C$ be? If $T(n) \leq Cn$ were true for all $n > N$ for some fixed integer $N$, we would have the estimate

$$T(n) \leq C \left\lceil \frac{n}{5} \right\rceil + C \left( \frac{7n}{10} + 6 \right) + An \leq \frac{9}{10} Cn + 7C + An.$$

In order to have a tight choice of $C$, we expect to have the upper bound

$$\frac{9}{10} Cn + 7C + An \leq Cn,$$

which can hold if and only if

$$7C + An \leq \frac{1}{10} Cn.$$

Rearranging the above inequality, we obtain

$$C(n - 70) \geq 10An.$$

Now, if we assume that $N > 70$, then $n > 70$, and so the above inequality is equivalent to

$$C \geq \frac{10An}{n - 70}.$$

Since $\frac{10An}{n - 70}$ is a decreasing function of $n$ for $n > 70$, it follows that $N > 70$ and

$$C \geq \frac{10AN}{N - 70}$$

implies

$$C \geq \frac{10An}{n - 70}$$

for all $n > N$, and so

$$T(n) \leq Cn$$

for all $n > N$.

Let us now pick $N = 140$ and $C = 20A$, so that $N > 70$ and $C \geq \frac{10AN}{N - 70}$. By choosing a larger $A$ if necessary, we can assume that

$$T(n) \leq T\left( \left\lceil \frac{n}{5} \right\rceil \right) + T\left( \frac{7n}{10} + 6 \right) + An$$

for all $n$, and that

$$T(n) \leq Cn$$

for all $n \leq 140$.

If

$$140 < n \leq 191,$$

then $\left\lceil\frac{n}{5}\right\rceil \leq 140$ and $\frac{7}{10}n + 6 \leq 140$, and so

$$T(n) \leq C \left\lceil \frac{n}{5} \right\rceil + C \left( \frac{7n}{10} + 6 \right) + An \leq \frac{9}{10} Cn + 7C + An.$$

By our choice of $C$, we conclude that

$$T(n) \leq Cn$$

for all $n \leq 191$.

In general, if $T(n) \leq Cn$ for all $n \leq M$, then $T(n) \leq Cn$ for all $n \leq \left\lfloor \frac{10}{7} \times (M - 6) \right\rfloor$. Since

$$\left\lfloor \frac{10}{7} \times (M - 6) \right\rfloor > M$$

for all $M \geq 140$, we conclude that

$$T(n) \leq Cn$$

for all $n \geq 140$. It follows that $T(n) = O(n)$.
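The linear bound can also be verified numerically by iterating the recurrence itself. In the sketch below we take $A = 1$ and use $T(n) = n$ for $n \leq 140$ as the base case (both are assumptions made for this experiment), and check the resulting values against a linear envelope:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # Worst-case recurrence with A = 1 and base case T(n) = n for n <= 140;
    # both choices are assumptions made for this numerical experiment.
    if n <= 140:
        return n
    return T(math.ceil(n / 5)) + T(7 * n // 10 + 6) + n

# The solution stays within a linear envelope.
assert all(T(n) <= 20 * n for n in range(1, 5001))
```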

## 5. Asymptotically Optimal Quicksort

Having established the linear-time complexity of quickselect, we proceed to construct a variant of quicksort with the worst-case time complexity of $O(n \log n)$.

```python
import math

def quick_sort_median_of_medians_pivot(L):
    n = len(L)
    sort_median_of_medians_pivot(L, 0, n - 1)
    return L

def sort_median_of_medians_pivot(L, p, r):
    if p < r:
        pivot_index = partition_median_of_medians_pivot(L, p, r)
        sort_median_of_medians_pivot(L, p, pivot_index - 1)
        sort_median_of_medians_pivot(L, pivot_index + 1, r)

def partition_median_of_medians_pivot(L, p, r):
    # Find the (lower) median of L[p:r+1] with quickselect from Section 3;
    # its 1-indexed rank in a list of length r+1-p is ceil((r+1-p)/2).
    mm = quickselect(L[p:r + 1], math.ceil((r + 1 - p) / 2))
    initial_pivot_index = L[p:r + 1].index(mm) + p
    L[r], L[initial_pivot_index] = L[initial_pivot_index], L[r]
    pivot = L[r]

    i = p
    for j in range(p, r):
        if L[j] <= pivot:
            L[i], L[j] = L[j], L[i]
            i = i + 1
    L[i], L[r] = L[r], L[i]
    return i
```



In the above implementation, we have sacrificed a little bit of performance for the sake of clarity. partition_median_of_medians_pivot spends $O(n)$ time going through L[p:r+1], looking for the exact location of the median of medians. Moreover, the selection routine itself spends extra $O(n)$ time going through the list it is given, looking for the exact location of the median of medians. These overheads could have been prevented by having the selection routine return the ordered pair (med, med_index) of the median and its index in the original list. The above implementation is simpler, however, and the $O(n)$ overhead does not change the asymptotic runtime of partition_median_of_medians_pivot.

Let us now show that the worst-case time complexity of quick_sort_median_of_medians_pivot is $O(n \log n)$. Let $T(n)$ denote the worst-case runtime of quick_sort_median_of_medians_pivot on a list of length $n$. Since the pivot is always the median, and since the median can be found in $O(n)$ time, we see that

$$T(n) \leq 2T\left( \frac{n}{2} \right) + O(n).$$

It now follows, by the master theorem, that

$$T(n) = O(n \log n),$$

as was to be shown.
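As a sanity check on the recurrence, the sketch below iterates $T(n) = 2T(\lfloor n/2 \rfloor) + n$ (taking the $O(n)$ term to be exactly $n$ and $T(1) = 1$, assumptions made for this experiment) and confirms that the solution stays within a constant multiple of $n \log_2 n$:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2 T(n // 2) + n, with the O(n) term taken to be exactly n.
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

# T(n) is bounded by a constant multiple of n log n.
assert all(T(n) <= 2 * n * math.log2(n) for n in range(2, 4096))
```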

## 6. Additional remarks and Further Results

6.1. As mentioned in [Blum–Floyd–Pratt–Rivest–Tarjan 1973], interest in the selection problem dates back to 1883, when Lewis Carroll published “Lawn Tennis Tournaments: the true method of assigning prizes with a proof of the fallacy of the present method”:

> Let it not be supposed that, in thus proposing to make these Tournaments a game of pure skill … I am altogether eliminating the element of luck … a thousand accidents might occur to prevent his playing best: the 4th best, 5th best, or even a worst Player, need not despair of winning even the 1st prize. Nor, again, let it be supposed that the present system, which allows an inferior player a chance of the 2nd prize … The proposed form of Tournament, though lasting a shorter time than the present one, has a great many more contests going on at once, and consequently furnishes the spectacle-loving public with a great deal more to look at.

6.2. Let $f(k, n)$ be the minimum number of comparisons required to select the $k$th smallest element in a list of $n$ elements. The relative difficulty of computing percentile levels is measured by

$$F(\alpha) = \limsup_{n \to \infty} \frac{f(\lfloor \alpha(n-1) \rfloor + 1, n)}{n}.$$

To make sense of the above definition, we let $\alpha = \frac{k}{n-1}$ and observe that

$$F\left( \frac{k}{n-1} \right) = \limsup_{n \to \infty} \frac{f(k+1, n)}{n}.$$

In other words, $F(k/(n-1))$ is the “tight” asymptotic upper bound on the minimum number of comparisons required to select the $(k+1)$th smallest element in a list of $n$ elements, normalized by $n$. It can be thought of as the value of the constant hidden by the $O(n)$ upper bound for the time complexity of the quickselect algorithm. Taking the limit superior rules out the anomalous behaviors at small values of $n$.

[BFPRT 1973] presents the following bounds on $F(\alpha)$:

$$1 + \min(\alpha, 1 - \alpha) \leq F(\alpha) \leq 5.4305.$$

The upper bound was improved in [Schönhage–Paterson–Pippenger 1976] to $3$ in the case of finding the median, and again to $2.95$ in [Dor–Zwick 1999]. The current best lower bound on $F(\alpha)$ is $2$, as per [Bent–John 1985]. In the case of finding the median, [Dor–Zwick 2001] establishes the slightly improved lower bound of $2 + 2^{-80}$.

6.3. More sophisticated analysis based on the notion of entropy shows that “no comparison-based algorithm can do better” than quicksort [Sedgewick–Bentley 2002]. Wild's currently-unpublished preprint [Wild 2017] establishes an improved bound on the quickselect-based quicksort.
