Sorting is the most studied problem in computer science. Efficient
sorting underpins databases, file systems, search engines, compression
algorithms, and dynamic programming. Understanding when to use O(n²)
insertion sort vs. O(n log n) merge sort vs. O(n) counting sort — and
what stability and in-place mean — is essential for writing performant
code.
1. Comparison Sort Lower Bound
A comparison sort decides ordering solely by comparing elements (a
< b?). Decision tree argument: • n! possible orderings of n
elements • Binary decision tree has height ≥ ⌈log₂(n!)⌉ • By Stirling:
log₂(n!) ≈ n log₂ n - n log₂ e ≈ Θ(n log n) Therefore: ANY
comparison-based sorting algorithm requires at least Ω(n log n)
comparisons in the worst case. Merge sort and heap sort achieve O(n
log n) worst-case. Quicksort is O(n log n) average, O(n²) worst case.
Definition: stable sort preserves relative order of equal elements.
Example: sorting by last name should keep John Smith before Jane Smith
if they were already in that order.
2. Simple O(n²) Sorts
Insertion Sort
Best suited for: Nearly-sorted data, small arrays (n
< 50), online data (sort as elements arrive). O(n) best case on
sorted input.
functioninsertionSort(arr) {
for (let i = 1; i < arr.length; i++) {
const key = arr[i];
let j = i -1;
while (j >= 0&& arr[j] > key) {
arr[j +1] = arr[j];
j--;
}
arr[j +1] = key;
}
return arr;
}
Bubble Sort
O(n²) worst/average, O(n) best. Simple but rarely used in production —
insertion sort has better constant factors for small n.
Selection Sort
Finds the minimum element and places it at the front. O(n²) always.
Notable for minimizing swaps (only n swaps total) — useful
when writes are expensive (e.g., flash memory).
3. Merge Sort
Divide-and-conquer: split in half, recursively sort each half, merge
two sorted halves.
Recurrence: T(n) = 2T(n/2) + O(n) By Master Theorem (a=2, b=2, f(n)=n,
n^log_b(a) = n^1): Case 2: T(n) = O(n log n) Properties: • O(n log n)
guaranteed — best, average, worst • Stable (equal elements preserve
relative order) • NOT in-place: requires O(n) auxiliary space for
merge buffer • Cache-friendly sequential access patterns Merge
operation: function merge(arr, left, mid, right): copy arr[left..mid]
to L[], arr[mid+1..right] to R[] i=0, j=0, k=left while i <
L.length and j < R.length: if L[i] <= R[j]: arr[k++] = L[i++]
else: arr[k++] = R[j++] copy remaining elements
Bottom-up merge sort avoids recursion overhead by
merging runs of size 1, 2, 4, 8, ... — used by Java's
Arrays.sort and Python's Timsort (which exploits existing
sorted runs for O(n) on nearly-sorted input).
4. Quicksort
Idea: choose a pivot; partition array so all smaller elements are
left, all larger are right; recurse on each side. T(n) = T(k) +
T(n-1-k) + O(n) where k = pivot rank Best/average case (pivot near
median, k ≈ n/2): T(n) = 2T(n/2) + O(n) = O(n log n) Worst case (pivot
is always min or max, e.g. sorted input, bad pivot): T(n) = T(n-1) +
O(n) = O(n²) Mitigation strategies: • Median-of-three: compare first,
middle, last element; use median as pivot • Random pivot: random
permutation before sorting, or random pivot selection → expected O(n
log n) regardless of input • Introsort (C++ std::sort): quicksort +
fallback to heapsort if recursion depth exceeds 2 log n → guaranteed
O(n log n) worst case Partition (Lomuto scheme): pivot = arr[right] i
= left - 1 for j = left to right-1: if arr[j] <= pivot: i++,
swap(arr[i], arr[j]) swap(arr[i+1], arr[right]) return i+1 Quicksort
virtues: • O(1) extra space (in-place) • Excellent cache behavior
(sequential access) • Small constant: ~2× faster than merge sort in
practice for random data
5. Heap Sort
Uses a binary max-heap (parent ≥ children). In an array, parent of i =
⌊(i-1)/2⌋, children: 2i+1, 2i+2. Algorithm: 1. Build max-heap from
array: O(n) (using Floyd's buildHeap, not n inserts) Start from last
non-leaf ⌊n/2⌋-1 down to 0, sift down each node. 2. Repeatedly extract
max (swap root with last, shrink heap, sift down): O(n log n)
Sift-down operation: O(log n) — compare with children, swap if smaller
Properties: • O(n log n) worst case — guaranteed, unlike quicksort •
In-place — O(1) space • NOT stable (long-range swaps disrupt relative
order) • Poor cache behavior (random access pattern across array) • In
practice 2–5× slower than quicksort due to cache misses Build-heap
O(n) proof: ∑_{k=0}^{⌊log n⌋} ⌈n/2^(k+1)⌉ × k ≤ n ∑_{k=0}^∞ k/2^k = 2n
= O(n)
6. Linear-Time Sorts
By avoiding element-to-element comparisons and exploiting structure in
the data, some algorithms sort in O(n) time — beating the Ω(n log n)
comparison lower bound:
Counting Sort — O(n + k)
For integers in range [0, k]. Count occurrences, then compute prefix
sums to determine final positions. O(n + k) time and space. Stable.
Practical when k = O(n).
Radix Sort — O(d·(n + k))
Sort by least-significant digit first (LSD radix sort), using a stable
sort (counting sort) at each of d digit positions. Digits in base b: d
= ⌈log_b(max value)⌉. Choose b = n for O(n) total with b-digit
representation. Used in GPU sorting, distributed systems.
Bucket Sort — O(n) average
For uniformly distributed floats in [0,1). Create n buckets covering
[0,1); scatter each element into its bucket; sort each bucket with
insertion sort; concatenate. O(n) expected if input is uniform. Worst
case O(n²) if all elements fall in one bucket.
7. Comparison Table
Algorithm Best Average Worst Space Stable In-place
─────────────────────────────────────────────────────────────────────
Bubble O(n) O(n²) O(n²) O(1) Yes Yes Selection O(n²) O(n²) O(n²) O(1)
No Yes Insertion O(n) O(n²) O(n²) O(1) Yes Yes Merge O(n lg n) O(n lg
n) O(n lg n) O(n) Yes No Quicksort O(n lg n) O(n lg n) O(n²) O(lg n)No
Yes Heap Sort O(n lg n) O(n lg n) O(n lg n) O(1) No Yes Timsort O(n)
O(n lg n) O(n lg n) O(n) Yes No Counting O(n+k) O(n+k) O(n+k) O(k) Yes
No Radix (LSD) O(dn) O(dn) O(dn) O(n+k) Yes No Bucket O(n) O(n) O(n²)
O(n) Yes No When to use what: n < 50: Insertion sort (simple, fast
for small n) General purpose: Timsort (Python, Java) / Introsort (C++)
Stability needed: Merge sort / Timsort Memory constrained:Heap sort /
In-place merge Integer keys: Radix sort / Counting sort Nearly sorted:
Insertion sort / Timsort