1.1 Quicksort and Binary Search Trees

Let us start the review of data structures with the most commonly used sorting algorithm, quicksort. We will then discover a hidden but deep connection between quicksort and a common data structure, the binary search tree (BST). By the end of this section you will hopefully understand both concepts in a much deeper way.

Quicksort: Functional (out-of-place) Implementation

You might recall from your data structures course rather complicated implementations of quicksort in C/C++ or Java, like this one (actually, the original version is even more complicated, and I could never understand it):

// sort the span a[low..high] of the array a (here a is assumed to be a field of the enclosing class)
public void sort(int low, int high) {
    if (low >= high) return;
    int p = partition(low, high);
    sort(low, p);       // note: with this partition scheme, index p stays on the left side
    sort(p + 1, high);
}
void swap(int i, int j) {
    int temp = a[i]; a[i] = a[j]; a[j] = temp;
}
// Hoare-style partition: move the two pointers towards each other,
// swapping out-of-place elements, and return the final position of j
int partition(int low, int high) {
    int pivot = a[low];
    int i = low - 1, j = high + 1;
    while (i < j) {
        i++; while (a[i] < pivot) i++;
        j--; while (a[j] > pivot) j--;
        if (i < j) swap(i, j);
    }
    return j;
}

But actually we can write quicksort in Python in just a few lines:

def qsort(a):
    if a == []:
        return []
    pivot = a[0]
    left = [x for x in a if x < pivot]
    right = [x for x in a[1:] if x >= pivot]
    return qsort(left) + [pivot] + qsort(right)

Here we just do a simple partition of the array a into two parts using the pivot (here a[0]): left, which contains the elements of a that are smaller than the pivot, and right, which contains those bigger than or equal to the pivot. Then we recursively quicksort both left and right and combine them with the pivot in the middle. Voilà!
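
For example, a quick check in the Python shell:

>>> qsort([4, 2, 6, 3, 5, 7, 1, 9])
[1, 2, 3, 4, 5, 6, 7, 9]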

Remarks:

One easy improvement is the choice of pivot: instead of always using a[0], we can pick a random element and swap it to the front before partitioning (after import random), which makes the worst case discussed below very unlikely on any particular input:

    i = random.randrange(len(a))
    a[0], a[i] = a[i], a[0] # the new a[0] is the pivot
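
Putting those two lines at the top of qsort gives a randomized variant. Here is a minimal sketch (the name qsort_random is mine, not from the text; note that the swap modifies the input list in place):

import random

def qsort_random(a):
    if a == []:
        return []
    i = random.randrange(len(a))   # pick a random position ...
    a[0], a[i] = a[i], a[0]        # ... and swap it to the front: the new a[0] is the pivot
    pivot = a[0]
    left = [x for x in a if x < pivot]
    right = [x for x in a[1:] if x >= pivot]
    return qsort_random(left) + [pivot] + qsort_random(right)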

Simple Analysis of Divide-n-Conquer: Best and Worst Case Complexities

Let us now analyze the time complexity of quicksort (assume the input a has size \(n\)). This is a typical divide-n-conquer (therefore recursive) algorithm, and it has three parts, not two: divide, conquer (the recursive calls), and combine!

Many students think of divide-n-conquer as just “divide and conquer” (as the name suggests), but that is a big misconception: there is always a combine step! Don’t forget the combine step in the analysis!

In analyzing a divide-n-conquer algorithm, let us always start with the non-recursive parts (divide and combine), since they are easier. For divide, the two partition lines each cost \(O(n)\) time, because each list comprehension scans the whole array once. For combine, list concatenation with + in Python builds a brand-new list by copying the elements of both operands, so qsort(left) + [pivot] + qsort(right) takes time proportional to the length of the final result, which is \(O(n)\). So we conclude that divide+combine is \(O(n)\).

Caveat: in the standard in-place implementation, the combine step has no work (\(O(1)\) time), because you’re always operating on the same array (the input to the recursion is a \([low, high]\) span instead of the whole array), so there is no need to “concatenate”. This is another advantage of the standard implementation, but again, this difference does not change the fact that divide+combine is \(O(n)\) time. The standard implementation (Hoare’s scheme) is complicated enough that it is not worth the effort in an introductory course.

The rest of the analysis depends on how balanced the recursion tree is. In the best case, the division is always balanced, i.e., the pivot is always (roughly) the median of the array, which divides the array (roughly) equally. Here is a picture:

(4)  6  2  5  3  7  1
--------------------->       O(n) -+
[2 3 1]    4   [6 5 7]             |  
(2) 3 1        (6) 5 7             |
------>        ------>       O(n) -+-> O(log n) levels
[1] 2 [3]     [5] 6 [7]            |
(1) 2 (3)     (5) 6 (7)            |
-->   -->     -->   -->      O(n) -+
[]1[] []3[]  []5[] []7[]

Since each level takes \(O(n)\) time for partitioning and there are \(O(\log n)\) levels (because each partition roughly halves the array), the total time is \(O(n\log n)\). This is the “recursion tree method”.

Or we can write the recursion:

\[T(n) = 2T(n/2) + O(n)\]

which solves (e.g., by the Master Theorem) to \(T(n)=O(n\log n)\).
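
If you prefer not to invoke the Master Theorem, you can also unroll the recurrence for a few levels (writing the \(O(n)\) term as \(cn\)); this is just the recursion tree again, written algebraically:

\[T(n) = cn + 2T(n/2) = cn + cn + 4T(n/4) = \cdots = \underbrace{cn + \cdots + cn}_{\log_2 n \text{ terms}} + n\,T(1) = O(n\log n)\]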

However, in the worst case, the pivot is always the smallest or largest element in the array (e.g., when the input is already sorted or inversely sorted), in which case one side is empty and the other side is smaller than a by just one element, i.e., it has \(n-1\) elements:

\[T(n) = T(n-1) + O(n)\]

which solves to \(T(n)=O(n)+O(n-1)+ \cdots +O(1) = O(n^2)\).

Here is a picture:

(5) 4 3 2 1
---------->         O(n)
[4 3 2 1] 5 []
(4) 3 2 1
-------->           O(n-1)
[3 2 1] 4 []
(3) 2 1
------>             ...
[2 1] 3 []
(2) 1
---->
[1] 2 []
(1)
-->                 O(1)
[] 1 []

Clearly \(n + (n-1) + \cdots + 1=O(n^2)\).

So we have established the basic time complexities of quicksort: \(O(n\log n)\) in the best case and \(O(n^2)\) in the worst case.

Analyzing the average-case complexity is much more involved, and we will save it for later. But as a preview, think about the following questions:

From ``Buggy’’ Quicksort to BSTs

Many years ago when I was teaching at the University of Pennsylvania, one student (after numerous failed debugging attempts) asked me why her quicksort was not working despite looking so “correct”:

def qsort2(a):
    if a == []:
        return []
    pivot = a[0]
    left = [x for x in a if x < pivot]
    right = [x for x in a[1:] if x >= pivot]
    return [qsort2(left)] + [pivot] + [qsort2(right)]

Initially I was puzzled, because her code was basically verbatim from mine, but then I realized that although the output was weird, it contained an intriguing pattern, e.g.:

>>> qsort2([4,2,6,3,5,7,1,9])
[[[[], 1, []], 2, [[], 3, []]], 4, [[[], 5, []], 6, [[], 7, [[], 9, []]]]]

What is this weird list actually representing?

It actually encodes a binary search tree (BST), with the first pivot (4) being the root!

                      4
                    /   \
                  2       6
                 / \     / \
                1   3   5   7
                             \
                              9

First, the pivot 4 partitions the array into left=[2,3,1] and right=[6,5,7,9]. Then for the left part (all numbers less than 4), the new pivot 2 divides it into left=[1] and right=[3], and so on and so forth. So each quicksort is implicitly building a BST!

This “buggy” version, with the extra pairs of brackets around the two recursive calls, effectively extracted the hidden BST in this format:

[left_tree, root, right_tree]

where root is a number, left_tree is a similarly encoded BST containing the numbers less than root, and right_tree is a similarly encoded BST containing the numbers greater than or equal to root. If you find the nested-list format hard to parse, we can write a simple “pretty-print” function to visualize the tree using indentation:

def pp(tree, dep=0):
    if tree == []:
        return
    left, root, right = tree
    pp(left, dep+1)
    print(" |" * dep, root)
    pp(right, dep+1)

For example, calling pp(qsort2([4,2,6,3,5,7,1,9])) would print this representation of the BST above:

 | | 1
 | 2
 | | 3
 4
 | | 5
 | 6
 | | 7
 | | | 9

This pp function is a standard in-order traversal, which visits the left subtree first, then the node, then the right subtree. But if we swap the order of pp(left, ...) and pp(right, ...), we get a reverse in-order (right-node-left) traversal; call it pp2, since we will use that name again below:
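
def pp2(tree, dep=0):
    if tree == []:
        return
    left, root, right = tree
    pp2(right, dep+1)        # right subtree first ...
    print(" |" * dep, root)  # ... then the node ...
    pp2(left, dep+1)         # ... and the left subtree last

Calling pp2(qsort2([4,2,6,3,5,7,1,9])) prints: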

 | | | 9
 | | 7
 | 6
 | | 5
 4
 | | 3
 | 2
 | | 1

which is a 90-degree counterclockwise rotation of our usual tree above (you just need to turn your head to see it).
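
By the way, if we collect the keys during the in-order traversal instead of printing them, we recover exactly the sorted list that the correct qsort would return. Here is a minimal sketch (the name flatten is mine, not from the text):

def flatten(tree):
    if tree == []:
        return []
    left, root, right = tree
    return flatten(left) + [root] + flatten(right)  # in-order: left, node, right

For example, flatten(qsort2([4,2,6,3,5,7,1,9])) gives back [1, 2, 3, 4, 5, 6, 7, 9]; in this sense, the “buggy” output is only one flattening step away from the sorted array.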

This particular BST is balanced, meaning for each node, the heights of the left and right subtrees differ by at most 1. Note that we can also write a recursive definition: A BST is balanced if both subtrees are balanced, and their heights differ by at most 1. Balanced BSTs are great, because their height is \(O(\log n)\) and therefore searching can be done in \(O(\log n)\) time

\[ T(n) = T(n/2) + 1 = O(\log n)\]

just like binary search (in a sorted array).
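
To make this concrete, here is a minimal sketch of searching (and of the recursive balance check above) on the nested-list encoding [left_tree, root, right_tree]; the names search, height, and balanced are mine, not from the text:

def search(tree, x):
    if tree == []:
        return False               # fell off the tree: x is not there
    left, root, right = tree
    if x == root:
        return True
    elif x < root:
        return search(left, x)     # everything smaller is in the left subtree
    else:
        return search(right, x)    # everything bigger (or equal) is in the right subtree

def height(tree):
    if tree == []:
        return 0
    left, _, right = tree
    return 1 + max(height(left), height(right))

def balanced(tree):
    # the recursive definition above: both subtrees balanced, and their heights differ by at most 1
    if tree == []:
        return True
    left, _, right = tree
    return balanced(left) and balanced(right) and abs(height(left) - height(right)) <= 1

On a balanced tree, each call to search discards (roughly) half of the remaining nodes, which is exactly the \(T(n)=T(n/2)+1\) recurrence above; for example, search(qsort2([4,2,6,3,5,7,1,9]), 8) returns False after comparing with 4, 6, 7, and 9.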

However, not all BSTs are balanced, and they can be extremely unbalanced when the pivot at every level happens to be the smallest or largest element. The most extreme unbalanced BSTs degenerate into linear chains, e.g., when quicksorting already-sorted or inversely-sorted arrays (pp2 is our reverse in-order traversal above):

pp2(qsort2([7,6,5,4,3,2,1]))

 7
 | 6
 | | 5
 | | | 4
 | | | | 3
 | | | | | 2
 | | | | | | 1

Searching in this kind of extremely unbalanced BST takes worst-case \(O(n)\) time, because in each step you can only discard one element (as opposed to half of the remaining elements in the balanced case):

\[ T(n) = T(n-1) + 1 = O(n) \]

Summary

Now that we have seen a deep but hidden connection between quicksort and BSTs, I hope you have a much deeper understanding of both topics. Here is a summary:

                   balanced                extremely unbalanced
quicksort (pivot)  \(O(n \log n)\) time    \(O(n^2)\) time
BST (root)         \(O(\log n)\) height    \(O(n)\) height

Small caveat: searching in a BST has a best-case complexity of \(O(1)\), since you can get lucky (the root is a match). So we often need to be more specific: “searching for an element not in the BST” has best-case \(O(\log n)\) and worst-case \(O(n)\) complexity.

Historical Notes

Quicksort was invented by the legendary British computer scientist and Turing Award winner Tony Hoare (who also invented many other things such as Hoare Logic, and amazingly is still alive as of this writing!). But interestingly, he came up with it while studying machine translation as a visiting student in Moscow in 1959 under the legendary Soviet mathematician Andrey Kolmogorov. Hoare published the algorithm in 1961 after returning to the UK.

The “buggy qsort” was accidentally discovered by my former student Netta Doron in 2006 when she took my CSE 399 Python Programming course at the University of Pennsylvania. This was such a great discovery; I don’t think anyone could have discovered it intentionally.