Quicksort

History

While studying at Moscow State University, Tony Hoare received an offer of employment from the National Physical Laboratory (NPL) to work on a new project for machine translation from Russian to English. However, because dictionaries were stored on magnetic tape, he would have needed to sort the words of a sentence into alphabetical order before translation.

Hoare thought of two methods to solve this problem. The first method would have taken an amount of time proportional to the square of the length of the sentence. The second method would later manifest as quicksort. At that time, he only knew one language, Mercury Autocode. Unfortunately, he was not able to successfully code quicksort using Mercury Autocode.

In 1961, Hoare attended an Algol 60 class in Brighton. Algol 60 allowed for recursion (the ability of a procedure to call itself). During this course, Hoare programmed an ultra-fast sorting algorithm now known as quicksort. His first paper on quicksort was also published in 1961, with another following in 1962.

The Algorithm

Some modifications have been made to quicksort since its development. However, the basic algorithm works as follows:

  1. Choose an element in the list - this element serves as the pivot. Set it aside (e.g. move it to the beginning or end).
  2. Partition the array of elements into two sets - those less than the pivot and those greater than the pivot (note: less than and greater refers to integers, but quicksort can also be used to sort other types of elements such as strings).
  3. Repeat steps 1 and 2 on each of the two resulting partitions until each set has one or less elements.
In his paper, Hoare establishes a method to partition each array using lower and upper pointers. The lower pointer starts at the left and moves right until it finds an element greater than the pivot, while the upper pointer starts at the right and moves left until it finds an element less than the pivot. When each pointer has stopped, the two elements are swapped. The process continues until the two pointers cross each other. Once that has happened, the number pointed to by the lower pointer is swapped with the pivot. The two partitions are then sorted again using quicksort.

Classification

Divide-and-conquer
Quicksort is a divide-and-conquer sorting algorithm - that is, it takes a sorting problem and breaks it down into sub-problems, which are in turn broken down into more sub-problems. This is achieved using recursive programming, where a procedure calls itself.

In-place
Quicksort does not need additional memory to store elements as they are sorted, and only requires additional memory to keep track of the recursion and the pointers.

Not stable
Stability refers to an algorithm's ability to maintain the relative order of elements with equal value. Algorithms that do not do so, such as quicksort, are considered not stable.

Complexity

Big O (O stands for "order of") notation is used to approximate the relationship between the number of elements and resource usage (time or space). Big O is only concerned with what happens for large values of n (e.g. in an n^2 - n algorithm, the n is dropped and the algorithm is classified as O(n^2)).

The average case scenario for quicksort is O(n log n). However, the worst case scenario is O(n^2). Click here for a diagram. Source

Example

For a walk-through example, click here.
For a graphical example, click here.

Ideal Uses

When a data set is already sorted and either the first or last element is selected as the pivot, the worst case scenario of O(n^2) occurs. Therefore, if one needs guaranteed performance, quicksort may not be the best choice. However, the probability of O(n^2) occurring is very rare. The quadric time case usually results from poor pivot choice.

However, due to this limitation, quicksort shouldn't be used in applications that require a guaranteed response time such as in life-critical and mission-critical situations.

Quicksort is ideal for use with large data sets and/or when memory is constrained. Generally, commercial applications use quicksort because it is speedy and does not require much additional memory.

Modifications

Quicksort as introduced by Hoare initially was not perfect. Over the years, many have fidgeted with quicksort to adjust for its weaknesses:

Sorting small partitions
Some have argued that when partitions are small enough, using quicksort is actually less efficient than using another sorting method such as insertion sort. What qualifies as “small enough” has been debated, but a quicksort algorithm could be programmed to switch to another sorting algorithm when partitions reach a certain constant size m.

Choosing a pivot value
The worse case scenario of O(n^2) can happen when a bad pivot is chosen. One method is to always use the same element in a set (e.g. the first, middle, or last element), but there is a risk that the chosen element is the first or last value in a set (this results in a sort that takes O(n^2) time – the worst case scenario). Another possible solution is picking the median from a set, but this adds complexity. A compromise is to choose three elements from the set (e.g. the first, middle and last) and use the median element as the pivot.

James Barbetti analyzes some other issues with quicksort, such as handling duplicated values and building a stable sort, in his essay, "Quicksort is broken. Is it worth fixing?".

Comparisons

For a test between insertion sort, shell sort, heap sort, merge sort and quick sort, click here.

Sources & Further Reading

The text and images above are a compilation from the following sources:
Sorting Algorithms: Quicksort
Sorting (Chapter 8)
Quicksort

For further reading, please visit the resources page.