C5 Linear time Sorting

Dated: 13-01-2025

Counting Sort

Imagine 3 arrays:

\(A[1 .. n]\) containing the initial inputs.
\(B[1 .. n]\) containing the sorted outputs.
\(C[1 .. k]\) containing the ranks such that \(C[x]\) is the rank of \(x\) in \(A\) where \(x \in [1 .. k]\).

The \(C[1 .. k]\) is constructed in following way:

The \(C[1 .. k]\) is initialized with \(0\).
We run across \(A\) using index \(j \in [1 .. n]\).
Increment each \(C[A[j]]\).

Now index of each entry in \(C\) represents the value in \(A\) and the entry represents the rank of that value.

Run across \(C\) and do \(C[i] = C[i] + C[i - 1]\).

Structure of \(C\)

Consider an example where

\[C = \{2, 2, 2, 2, 1, 0, 2\}\]

\[\rightarrow\]

\[C = \{2, 4, 6, 8, 9, 9, 11\}\]

The algorithm works as follows: - Run across \(A\) using index \(j \in [1 .. n]\) backwards, let each value be \(x\). - \(C[x]\) is the indexed cell within \(B\) which we want to select. - \(x\) is the value placed inside the selected cell. - Decrement \(C[x]\) after placing \(x\) in \(B[C[x]]\).

counting_sort(array A, array B, int k) {
    for i <- 1 to k
    do C[i] <- 0 // k times
    for j <- 1 to length[A]
    do C[A[j]] <- C[A[j]] + 1 // n times
    // C[i] now contains the number of elements = i
    for i <- 2 to k
    do C[i] <- C[i] + C[i - 1] // k times
    // C[i] now contains the number of elements <= i
    for j <- length[A] downto 1
    do B[A[j]] <- A[j]
        C[A[j]] <- C[A[j]] - 1 // n times
}

Pasted image 20250114073236.png — Initial `A` and `C` `array`

Pasted image 20250114073405.png — A[1] = 7 processed

Pasted image 20250114073449.png — A[2] = 1 processed

Pasted image 20250114073750.png — A[3] = 3 processed

Pasted image 20250114073949.png — A[4] = 1 processed

Pasted image 20250114074023.png — A[5] = 2 processed

Pasted image 20250114074130.png — `C` now contains count of elements of `A`

Pasted image 20250114074254.png — `C` set to `rank` each number of `A`

Pasted image 20250114074400.png — A[11] placed in output of `B`

Pasted image 20250114074452.png — A[10] placed in output `array` `B`

Pasted image 20250114074557.png — A[9] placed in output `array` `B`

Pasted image 20250114074653.png — A[8] placed in output `array` `B`

Pasted image 20250114074734.png — A[7] placed in output `array` `B`

Pasted image 20250114074813.png — A[6] placed in output `array` `B`

Pasted image 20250114074903.png — A[5] placed in output `array` `B`

Pasted image 20250114074955.png — A[4] placed in output `array` `B`

Pasted image 20250114075036.png — A[3] placed in output `array` `B`

Pasted image 20250114075151.png — A[2] placed in output `array` `B`

Pasted image 20250114075235.png — `B` now contains the final sorted data

Counting Sort is not in-place¹ but is stable.¹ Here is an illustration.

Pasted image 20250130194929.png — `4` at right side within \(A\) is also sorted in right side in \(B\).

Pasted image 20250130195057.png — `4` at left side within \(A\) is also sorted in left side in \(B\).

Pasted image 20250130195149.png — `Stability`¹ of counting sort.

Bucket or Bin Sort

BucketSort(array A, int n, int M) {
// Pre-condition: for 1 <= i <= n, 0 <= a[i] < M
// Mark all the bins empty
for i <- 1 to M
do bin[i] <- Empty
for i <- 1 to n
do bin[A[i]] <- A[i]
}

If there are \(n\) keys then placing all the keys in appropriate bins is going to take \(O(n)\) operations.
If there were \(m\) bins then we would also need to figure out if there are elements in each bin or not.
This would take \(O(m)\) operations.
So the whole algorithm's time is:

\[T(n) = c_1n + c_2m\]

If \(m \le n\) then the running time is \(O(n)\).
If \(m >> n\) then the running time is \(O(m)\).

Pasted image 20250130200018.png — Placing keys in bins in sorted order

Pasted image 20250130200107.png — Concatenate the lists

Pasted image 20250130200159.png — Final sorted sequence

Radix Sort

The limitation with counting sort is that it works great if \(k\) is small but if \(k\) is in millions then the rank would also be in millions but radix sort provides a nice work around.

Pasted image 20250130200835.png — Radix sort

RadixSort(array A, int n, int d) {
    for i <- 1 to d
    do stably sort A with respect to ith lowest order digit
}

References

Read more about in-place and stable algorithms. ↩↩↩