C5 Linear time Sorting
Dated: 13-01-2025
Counting Sort
Imagine 3 arrays
:
- \(A[1 .. n]\) containing the initial inputs.
- \(B[1 .. n]\) containing the sorted outputs.
- \(C[1 .. k]\) containing the
ranks
such that \(C[x]\) is therank
of \(x\) in \(A\) where \(x \in [1 .. k]\).
The \(C[1 .. k]\) is constructed in following way:
- The \(C[1 .. k]\) is initialized with \(0\).
- We run across \(A\) using index \(j \in [1 .. n]\).
- Increment each \(C[A[j]]\).
Now index of each entry in \(C\) represents the value in \(A\) and the entry represents the rank
of that value.
- Run across \(C\) and do \(C[i] = C[i] + C[i - 1]\).
Structure of \(C\)
Consider an example where
The algorithm works as follows: - Run across \(A\) using index \(j \in [1 .. n]\) backwards, let each value be \(x\). - \(C[x]\) is the indexed cell within \(B\) which we want to select. - \(x\) is the value placed inside the selected cell. - Decrement \(C[x]\) after placing \(x\) in \(B[C[x]]\).
counting_sort(array A, array B, int k) {
for i <- 1 to k
do C[i] <- 0 // k times
for j <- 1 to length[A]
do C[A[j]] <- C[A[j]] + 1 // n times
// C[i] now contains the number of elements = i
for i <- 2 to k
do C[i] <- C[i] + C[i - 1] // k times
// C[i] now contains the number of elements <= i
for j <- length[A] downto 1
do B[A[j]] <- A[j]
C[A[j]] <- C[A[j]] - 1 // n times
}
Initial A
and C
array
A[1] = 7 processed
A[2] = 1 processed
A[3] = 3 processed
A[4] = 1 processed
A[5] = 2 processed
C
now contains count of elements of A
C
set to rank
each number of A
A[11] placed in output of B
A[10] placed in output array
B
A[9] placed in output array
B
A[8] placed in output array
B
A[7] placed in output array
B
A[6] placed in output array
B
A[5] placed in output array
B
A[4] placed in output array
B
A[3] placed in output array
B
A[2] placed in output array
B
B
now contains the final sorted data
Counting Sort is not in-place
1 but is stable
.1 Here is an illustration.
4
at right side within \(A\) is also sorted in right side in \(B\).
4
at left side within \(A\) is also sorted in left side in \(B\).
Stability
1 of counting sort.
Bucket or Bin Sort
BucketSort(array A, int n, int M) {
// Pre-condition: for 1 <= i <= n, 0 <= a[i] < M
// Mark all the bins empty
for i <- 1 to M
do bin[i] <- Empty
for i <- 1 to n
do bin[A[i]] <- A[i]
}
If there are \(n\) keys then placing all the keys in appropriate bins is going to take \(O(n)\) operations.
If there were \(m\) bins then we would also need to figure out if there are elements in each bin or not.
This would take \(O(m)\) operations.
So the whole algorithm's time is:
If \(m \le n\) then the running time is \(O(n)\).
If \(m >> n\) then the running time is \(O(m)\).
Placing keys in bins in sorted order
Concatenate the lists
Final sorted sequence
Radix Sort
The limitation with counting sort is that it works great if \(k\) is small but if \(k\) is in millions then the rank
would also be in millions but radix sort
provides a nice work around.
Radix sort
RadixSort(array A, int n, int d) {
for i <- 1 to d
do stably sort A with respect to ith lowest order digit
}