12.1 Why study Sorting algorithms?

Given a collection, the goal is to rearrange the elements so that they are ordered from smallest to largest. The ordering is defined by a relation $<$ on the keys that satisfies two properties:

$k \nless k$ (Irreflexive)

if $k_1 < k_2$ and $k_2 < k_3$, then $k_1 < k_3$ (Transitive)

12.2 Merge-Sort

Divide and Conquer

The first two algorithms in this chapter use recursion in an algorithmic design pattern called divide-and-conquer. This pattern consists of the following three steps:

  1. Divide: If the input size is smaller than a certain threshold, solve the problem directly using a straightforward method and return the solution so obtained. Otherwise, divide the input data into two or more disjoint subsets.
  2. Conquer: Recursively solve the subproblems associated with the subsets.
  3. Combine: Take the solutions to the subproblems and merge them into a solution to the original problem.

Applying this pattern to the problem of sorting a sequence $S$ with $n$ elements yields the merge-sort algorithm:

  1. Divide: If $S$ has zero or one elements, return $S$ immediately (it is already sorted). Otherwise, remove all the elements from $S$ and put them into two sequences, $S_1$ and $S_2$, each containing about half of the elements of $S$: $S_1$ with the first $\lfloor n/2 \rfloor$ elements and $S_2$ with the remaining $\lceil n/2 \rceil$ elements.
  2. Conquer: Recursively sort the sequences $S_1$ and $S_2$.
  3. Combine: Put the elements back into $S$ by merging the sorted sequences $S_1$ and $S_2$ into a single sorted sequence.

We can visualise an execution of merge-sort by means of a binary tree $T$, called the merge-sort tree. Each node of $T$ represents a recursive invocation of the merge-sort algorithm. We associate with each node $v$ of $T$ the sequence $S$ that is processed by the invocation associated with $v$.

This visualisation in terms of the merge-sort tree helps us analyse the running time of the merge-sort algorithm.


Implementation of Merge-Sort

```python
def merge(S1, S2, S):
    """Merge two sorted Python lists S1 and S2 into properly sized list S."""
    i = j = 0
    while i + j < len(S):
        if j == len(S2) or (i < len(S1) and S1[i] < S2[j]):
            S[i + j] = S1[i]          # copy ith element of S1 as next item of S
            i += 1
        else:
            S[i + j] = S2[j]          # copy jth element of S2 as next item of S
            j += 1

def merge_sort(S):
    """Sort the elements of Python list S using the merge-sort algorithm."""
    n = len(S)
    if n < 2:
        return                        # list is already sorted
    # divide
    mid = n // 2
    S1 = S[0:mid]                     # copy of first half
    S2 = S[mid:n]                     # copy of second half
    # conquer (with recursion)
    merge_sort(S1)                    # sort copy of first half
    merge_sort(S2)                    # sort copy of second half
    # merge results
    merge(S1, S2, S)                  # merge sorted halves back into S
```
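
A quick usage check (the sample values are illustrative only):

```python
data = [85, 24, 63, 45, 17, 31, 96, 50]
merge_sort(data)                      # sorts the list in place
print(data)                           # [17, 24, 31, 45, 50, 63, 85, 96]
```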

A step of the merge process is illustrated in the following figure:

[Figure: one step of merging two sorted sequences $S_1$ and $S_2$ into $S$.]

Running time of Merge-Sort

Let $n_1$ and $n_2$ be the number of elements of $S_1$ and $S_2$, respectively. It is clear that the operations performed inside each pass of the while loop (of function merge) take $O(1)$ time, and each pass copies exactly one element, so merge runs in $O(n_1 + n_2)$ time.

To determine the complexity of the merge-sort algorithm, we account for the time to divide the sequence into two subsequences and the call to merge that combines the two sorted sequences, but we exclude the two recursive calls to merge-sort. This quantity is the "time spent at a node" of the merge-sort tree.

Observations:

Given our definition of "time spent at a node", the running time of merge-sort is equal to the sum of the times spent at the nodes of $T$. Observe that $T$ has exactly $2^i$ nodes at depth $i$, and each such node processes a sequence of size roughly $n/2^i$, so the time spent at it is $O(n/2^i)$.
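
Summing the time spent over all depths of $T$, whose height is $\lceil \log n \rceil$, yields the overall bound:

$$\sum_{i=0}^{\lceil \log n \rceil} 2^{i} \cdot O\!\left(\frac{n}{2^{i}}\right) = \sum_{i=0}^{\lceil \log n \rceil} O(n) = O(n \log n)$$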


Using recurrence equations to determine the time complexity of merge-sort

Let the function $t(n)$ denote the worst-case running time of merge-sort on an input sequence of size $n$. We can characterise $t(n)$ by means of an equation in which $t(n)$ is recursively expressed in terms of itself.

In order to simplify the characterisation of $t(n)$, let us restrict our attention to the case when $n$ is a power of 2. Thus:

$$t(n) = \begin{cases} b & \text{if } n \le 1 \\ 2t(n/2) + cn & \text{otherwise,} \end{cases}$$

for some constants $b, c \in \mathbb{Z}^{+}$.

Observe that the function $t(n)$ appears on both the left- and right-hand sides of the equal sign; this is what makes it a recurrence equation. However, we truly desire a big-O characterisation of $t(n)$, i.e. a closed-form characterisation of $t(n)$.

This can be obtained by repeatedly applying the recurrence equation to itself, assuming $n$ is relatively large. After one such substitution:

$$\begin{aligned} t(n) &= 2\left(2t(n/2^{2}) + c(n/2)\right) + cn \\ &= 2^{2}t(n/2^{2}) + 2cn \end{aligned}$$

If we apply the equation again, we get:

$$t(n) = 2^{3}t(n/2^{3}) + 3cn$$

In general, after the equation has been applied $i$ times, we obtain:

$$t(n) = 2^{i}t(n/2^{i}) + icn$$

The issue remains, then, to determine when to stop this process. To see when to stop, recall that we switch to the closed form $t(n) = b$ when $n \le 1$, which occurs when $n/2^{i} = 1$, i.e. when $i = \log n$ (logarithms are base 2). Thus,

$$\begin{aligned} t(n) &= 2^{\log n}\,t(n/2^{\log n}) + (\log n)cn \\ &= n\,t(1) + (\log n)cn \\ &= nb + cn\log n \end{aligned}$$

Thus, we get an alternative justification of the fact that $t(n)$ is $O(n \log n)$.
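
As a quick numerical check of this closed form, we can evaluate the recurrence directly (a small sketch; the choice $b = c = 1$ is an arbitrary assumption):

```python
import math

def t(n, b=1, c=1):
    """Evaluate the recurrence t(n) = b if n <= 1, else 2*t(n/2) + c*n."""
    if n <= 1:
        return b
    return 2 * t(n // 2, b, c) + c * n

for n in [2, 8, 64, 1024]:
    closed_form = n * 1 + 1 * n * math.log2(n)   # nb + cn log n with b = c = 1
    print(n, t(n), closed_form)                  # the two columns agree exactly
```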

Quick-Sort

High-level description of Quick-Sort:

The main idea is to apply the divide-and-conquer technique, and then combine the sorted subsequences by simple concatenation.

  1. Divide: If $S$ has at least two elements, select a specific element $x$ from $S$, which is called the pivot. As is common practice, choose the pivot $x$ to be the last element in $S$. Remove all elements from $S$ and put them into three sequences:
    • $L$, storing the elements in $S$ less than $x$
    • $E$, storing the elements in $S$ equal to $x$
    • $G$, storing the elements in $S$ greater than $x$
    If the elements of $S$ are distinct, then $E$ holds only one element: the pivot itself.
  2. Conquer: Recursively sort the sequences $L$ and $G$.
  3. Combine: Put the elements back into $S$ in order by first inserting the elements of $L$, then those of $E$, and finally those of $G$ (see the sketch below).
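
A minimal runnable sketch of this $L$/$E$/$G$ scheme using Python lists; for simplicity it returns a new sorted list rather than emptying and refilling $S$ in place:

```python
def quick_sort(S):
    """Return a sorted copy of list S using the L/E/G quick-sort scheme."""
    if len(S) < 2:
        return S                        # zero or one element: already sorted
    x = S[-1]                           # pivot: last element of S
    L = [e for e in S if e < x]         # elements less than the pivot
    E = [e for e in S if e == x]        # elements equal to the pivot
    G = [e for e in S if e > x]         # elements greater than the pivot
    return quick_sort(L) + E + quick_sort(G)   # combine by concatenation

print(quick_sort([85, 24, 63, 45, 17, 31, 96, 50]))
```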


Unlike merge-sort, however, the height of the quick-sort tree is linear in the worst case. This happens, for example, if the sequence consists of $n$ distinct elements and is already sorted: the pivot (the last, and largest, element) leaves $G$ empty and puts the other $n - 1$ elements into $L$ at every level.

Running time of Quick-Sort
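
Continuing the worst-case scenario above: the sequence processed at depth $i$ of the quick-sort tree has $n - i$ elements, so the total worst-case running time is quadratic:

$$\sum_{i=0}^{n-1} O(n - i) = O\!\left(\sum_{k=1}^{n} k\right) = O\!\left(\frac{n(n+1)}{2}\right) = O(n^{2})$$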

Randomised Quick-Sort

Picking pivots at random
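
A minimal sketch of this idea, assuming the pivot is drawn uniformly at random (e.g. via random.choice; the notes do not specify the mechanism) and reusing the $L$/$E$/$G$ scheme sketched above:

```python
import random

def randomized_quick_sort(S):
    """Return a sorted copy of S, choosing the pivot uniformly at random."""
    if len(S) < 2:
        return S
    x = random.choice(S)                # random pivot instead of the last element
    L = [e for e in S if e < x]
    E = [e for e in S if e == x]
    G = [e for e in S if e > x]
    return randomized_quick_sort(L) + E + randomized_quick_sort(G)
```

With a random pivot, the expected running time is $O(n \log n)$ for every input, since no fixed input (such as an already sorted sequence) can consistently force the unbalanced splits seen above.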

Additional Optimisations for Quick-Sort

An algorithm is in-place if it uses only a small amount of memory in addition to that needed for the original input.

Our implementation of quick-sort does not qualify as in-place because we use additional containers $L$, $E$ and $G$ when dividing a sequence $S$ within each recursive call.

Quick-sort of an array-based sequence can be adapted to be in-place, and such an optimisation is used in most deployed implementations.

```python
def inplace_quick_sort(S, a, b):
    """Sort the list from S[a] to S[b] inclusive using the quick-sort algorithm."""
    if a >= b:
        return                                  # range is trivially sorted
    pivot = S[b]                                # last element of range is pivot
    left = a                                    # will scan rightward
    right = b - 1                               # will scan leftward
    while left <= right:
        # scan until reaching value equal or larger than pivot (or right marker)
        while left <= right and S[left] < pivot:
            left += 1
        # scan until reaching value equal or smaller than pivot (or left marker)
        while left <= right and pivot < S[right]:
            right -= 1
        if left <= right:                       # scans did not strictly cross
            S[left], S[right] = S[right], S[left]       # swap values
            left, right = left + 1, right - 1           # shrink range
    # put pivot into its final place (currently marked by left index)
    S[left], S[b] = S[b], S[left]
    # make recursive calls
    inplace_quick_sort(S, a, left - 1)
    inplace_quick_sort(S, left + 1, b)
```
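
A matching usage check (same illustrative data as before):

```python
data = [85, 24, 63, 45, 17, 31, 96, 50]
inplace_quick_sort(data, 0, len(data) - 1)
print(data)                             # [17, 24, 31, 45, 50, 63, 85, 96]
```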

The in-place quick-sort modifies the input sequence using element swapping and does not explicitly create subsequences. Instead, a subsequence is implicitly represented by a range of positions specified by a leftmost index $a$ and a rightmost index $b$.

Although the partitioning step described in this section is in-place, we note that the complete quick-sort algorithm still needs space for a stack proportional to the depth of the recursion tree, which in the worst case is $n - 1$.
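
One standard remedy, sketched below under the assumption that partitioning works exactly as in the code above (the helper names partition and quick_sort_bounded_stack are illustrative), is to recurse only on the smaller side of each partition and loop on the larger side, which caps the recursion depth at $O(\log n)$:

```python
def partition(S, a, b):
    """Partition S[a:b+1] around pivot S[b]; return the pivot's final index."""
    pivot = S[b]
    left, right = a, b - 1
    while left <= right:
        while left <= right and S[left] < pivot:
            left += 1
        while left <= right and pivot < S[right]:
            right -= 1
        if left <= right:
            S[left], S[right] = S[right], S[left]
            left, right = left + 1, right - 1
    S[left], S[b] = S[b], S[left]
    return left

def quick_sort_bounded_stack(S, a, b):
    """In-place quick-sort whose recursion depth is O(log n).

    Recurse on the smaller side of the partition only; handle the
    larger side by updating (a, b) and looping.
    """
    while a < b:
        p = partition(S, a, b)
        if p - a < b - p:                       # left side is smaller
            quick_sort_bounded_stack(S, a, p - 1)
            a = p + 1                           # continue with right side
        else:                                   # right side is smaller (or equal)
            quick_sort_bounded_stack(S, p + 1, b)
            b = p - 1                           # continue with left side
```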