Search lands in PR-5.1 (Pagefind).

Explanation Intermediate

Chapter 1 Updated

Heaps — Fundamentals

Heap property, array-level-order layout, and the mental shortcuts behind every priority-queue problem.

  • Full 14m
  • Revision
  • Flow

What is a Heap, Really?

A heap is a specialized tree-based data structure that satisfies one simple rule — the heap property — at every single node:

VariantParent–child ruleRoot always holds
Min-Heapevery parent both childrenthe smallest element
Max-Heapevery parent both childrenthe largest element

Critically, a heap is not a sorted structure. Siblings have no defined order between them; the only ordering guarantee is between a parent and its children. That weaker invariant is exactly why heaps are fast — we only maintain enough order to find the extremum, nothing more.

Structural property

A binary heap is always a complete binary tree: every level is fully filled except possibly the last, which fills left-to-right. This compactness is what lets us store a heap in a plain array with no pointers at all.

The priority-queue connection

When an interviewer says “priority queue”, they usually mean “heap”. A priority queue is the abstract concept — “give me the highest-priority thing next” — and a binary heap is the concrete, efficient implementation. Java exposes PriorityQueue; Python exposes heapq; the mechanism underneath is the same.

Index Arithmetic — The One Mental Model

Because a heap is a complete binary tree, we can flatten it into an array and navigate between parents and children using only arithmetic — no pointers, no node objects.

Min-heap laid out two ways
               [1]  i=0
              /   \
            [3]    [6]           array (level-order):
           i=1    i=2
           /  \   /  \             idx:  0  1  2  3  4  5  6
         [5] [9] [8] [7]           val:  1  3  6  5  9  8  7
         i=3 i=4 i=5 i=6                  ^root
Formula (0-indexed)Expression
parent(i)(i − 1) // 2
left(i)2i + 1
right(i)2i + 2
is leaf?i ≥ n // 2
height⌊log₂(n)⌋

Verify with the diagram: node at index 1 (value 3) has parent at (1−1)//2 = 0 (value 1). Its children sit at 2·1+1 = 3 (value 5) and 2·1+2 = 4 (value 9). Every parent is its children, so the min-heap property holds.

Heap Operations & Complexities

OperationWhat it doesTimeMechanism
peek() / top()Look at root without removingO(1)Read index 0
push(x)Add while preserving invariantO(log n)Append at end, sift up
pop()Remove and return the rootO(log n)Swap root with last, shrink, sift down
heapify(arr)Turn arbitrary array into a heapO(n)Sift-down from last non-leaf to root (Floyd’s method)
replace(x)Pop root and push new in one shotO(log n)Overwrite root, single sift-down
search for arbitrary valueFind a non-root elementO(n)No sibling ordering — full scan

Sift-up vs sift-down

All heap mutation reduces to two primitives. When a value arrives at the bottom and might be too small (for a min-heap), we sift up — compare with parent, swap if smaller, repeat. When a value arrives at the top and might be too large, we sift down — compare with the smaller child, swap, repeat. Each sift costs at most O(log n) because a complete binary tree has that height.

How to Identify a Heap Problem

The single strongest signal is the letter K paired with a superlative word. If the problem asks for “top K”, “K-th something”, “K closest”, “K most frequent”, or any variation where k is a small, specific count — reach for a heap first.

Keyword radar
Strong signals:   Kth largest   Kth smallest   top K
                  K closest     K most frequent
 
Supporting cues:  smallest   largest   closest   most/least frequent
                  priority   schedule  merge K   median of stream
                  minimum cost to combine

Choosing the right heap — counterintuitive but crucial

The rule most beginners get backwards: to find the K largest, you use a min-heap of size K. It feels wrong. Here’s why it’s right. If you maintain a min-heap of size K as you stream through all n elements, the root is always the smallest of your current top K. When a new element arrives bigger than the root, it belongs in the top K — so you pop the root and push the new one. At the end, the heap contains exactly the K largest, and the root is the K-th largest. Symmetrically, to find the K smallest, use a max-heap of size K.

Problem asks for…UseWhy
K largest / top KMin-heap of size KRoot is smallest of best-so-far; easy to evict
K smallestMax-heap of size KRoot is largest of best-so-far; easy to evict
K closest to targetMax-heap of size K on distanceTreat distance as “smallness”; farthest at root
K most frequentMin-heap of size K on frequencyLow-frequency root gets evicted
Repeatedly combine smallest pairMin-heap (no size cap)Always grab the two smallest in O(log n)
Running medianTwo heaps (max-left, min-right)Balance halves; medians are at both roots

Quick decision flow

Why Heap Beats Sorting

Sorting is O(n log n). A heap-of-size-K approach is O(n log k). When k is much smaller than n, the gap is enormous.

Sort vs heap-of-K
Sorting, then slicing:            Heap-of-K:
  nums.sort()         # O(n log n)  for x in nums:     # n iterations
  answer = nums[-k:]  # last k        heappush(h, x)   # O(log k)
                                      if len(h) > k:
                                          heappop(h)    # O(log k)

Python & TypeScript Tooling

Python’s heapq

Python ships with a min-heap implementation in the standard library. Everything operates on a plain list, in place.

CallEffectComplexity
heapq.heappush(h, x)Add x to heap hO(log n)
heapq.heappop(h)Remove and return the smallestO(log n)
heapq.heapify(h)Turn list into a heap, in placeO(n)
heapq.heappushpop(h, x)Push then pop — one operationO(log n)
heapq.heapreplace(h, x)Pop then push — item must existO(log n)
heapq.nlargest(k, it)K largest from iterableO(n log k)
heapq.nsmallest(k, it)K smallest from iterableO(n log k)

TypeScript — rolling your own

TypeScript/JavaScript has no built-in priority queue. A reusable Heap class with a custom comparator is interview-essential — it handles min-heap, max-heap, or any ordering over tuples. Every pattern in this season reuses this class.

heap.ts (shared across every pattern)
// Comparator returns < 0 if `a` has higher priority than `b`.
class Heap<T> {
  private data: T[] = [];
  constructor(private cmp: (a: T, b: T) => number, init: T[] = []) {
    this.data = [...init];
    for (let i = (this.data.length >> 1) - 1; i >= 0; i--) this.siftDown(i);
  }
  get size(): number { return this.data.length; }
  peek(): T | undefined { return this.data[0]; }
 
  push(v: T): void {
    this.data.push(v);
    this.siftUp(this.data.length - 1);
  }
  pop(): T | undefined {
    if (this.data.length === 0) return undefined;
    const top = this.data[0];
    const last = this.data.pop()!;
    if (this.data.length > 0) { this.data[0] = last; this.siftDown(0); }
    return top;
  }
  private siftUp(i: number): void {
    while (i > 0) {
      const p = (i - 1) >> 1;
      if (this.cmp(this.data[i], this.data[p]) < 0) {
        [this.data[i], this.data[p]] = [this.data[p], this.data[i]];
        i = p;
      } else break;
    }
  }
  private siftDown(i: number): void {
    const n = this.data.length;
    while (true) {
      const l = 2 * i + 1, r = 2 * i + 2;
      let best = i;
      if (l < n && this.cmp(this.data[l], this.data[best]) < 0) best = l;
      if (r < n && this.cmp(this.data[r], this.data[best]) < 0) best = r;
      if (best === i) break;
      [this.data[i], this.data[best]] = [this.data[best], this.data[i]];
      i = best;
    }
  }
}
 
const minHeap = <T extends number | string>() =>
  new Heap<T>((a, b) => a < b ? -1 : a > b ? 1 : 0);
const maxHeap = <T extends number | string>() =>
  new Heap<T>((a, b) => a < b ? 1 : a > b ? -1 : 0);

The Five Canonical Templates

Every heap problem is a recombination of five templates. Learn these cold — everything else is decoration.

① Top-K

Maintain a heap capped at size K, with the opposite polarity of what you want (min-heap for K-largest, max-heap for K-smallest). Time O(n log k), space O(k).

Top-K template
h = []
for x in stream:
    heapq.heappush(h, x)
    if len(h) > k:
        heapq.heappop(h)          # evict the worst candidate
# h now holds the answer; h[0] is the K-th boundary element

② Two Heaps

Split a stream into a smaller half (max-heap) and a larger half (min-heap). Keep sizes within 1. Median + percentile-style problems. Each insert is O(log n); peek is O(1).

Two-heaps template
low, high = [], []                 # max-heap (negated), min-heap
def add(x):
    if not low or x <= -low[0]:
        heapq.heappush(low, -x)
    else:
        heapq.heappush(high, x)
    if len(low) > len(high) + 1:
        heapq.heappush(high, -heapq.heappop(low))
    elif len(high) > len(low):
        heapq.heappush(low, -heapq.heappop(high))

③ K-Way Merge

Merge K sorted sources. Push the head of each into a min-heap. Pop the smallest, advance that source, push its next. Classic for merging lists, matrix rows, or streams.

K-way merge template
h = [(src[0], i, 0) for i, src in enumerate(sources) if src]
heapq.heapify(h)
out = []
while h:
    val, si, idx = heapq.heappop(h)
    out.append(val)
    if idx + 1 < len(sources[si]):
        heapq.heappush(h, (sources[si][idx + 1], si, idx + 1))

④ Greedy-with-Heap

Each step, greedily pick the best (or worst) remaining option. Heap is a priority inbox that keeps updating. Covers Huffman-style combining and many optimization problems.

Greedy-combine template
heapq.heapify(h)
while len(h) >= 2:
    a = heapq.heappop(h)
    b = heapq.heappop(h)
    combined = f(a, b)             # problem-specific
    heapq.heappush(h, combined)

⑤ Scheduling-with-Deadline

Sort events by deadline; as you scan, keep a heap of the K best things taken so far where K is constrained by the current time. Evict the weakest when capacity is exceeded.

Scheduling-with-deadline template
events.sort(key=lambda e: e.deadline)
h = []
for e in events:
    heapq.heappush(h, e.value)
    if len(h) > e.deadline:        # capacity exceeded
        heapq.heappop(h)           # drop least-valuable
answer = sum(h)

Tips, Tricks & Pitfalls

Tricks the pros use

  • Negate for max-heap in Python. -val on push, -heappop() on pop. For tuple keys, negate only the comparison key: (-freq, word).
  • Use tuples for custom priority. heappush(h, (priority, tie_breaker, payload)). Tie-breakers matter — Python will compare the next tuple element on a tie; make sure that’s meaningful or use an incrementing counter.
  • Prefer heappushpop over push+pop. When the heap is at size K and a new element arrives, heappushpop(h, x) is one operation and avoids an unnecessary sift.
  • Build with heapify, not individual pushes. If you have all data upfront, heapify(list) is O(n) — far better than n pushes at O(n log n).
  • Lazy deletion for middle-removal. If you need to “remove” an element that isn’t the root, don’t; mark it stale in a hash-map and skip stale items when they reach the top. Standard trick in sliding-window heap problems.
  • Heaps + hash maps beat heaps alone when you need frequency or index lookups. Hash map answers “how many?” in O(1); heap answers “who’s on top?” in O(log n).

Common pitfalls

  • Wrong heap polarity. “Find K largest” → min-heap (not max-heap). If you catch yourself pushing everything without bounding the heap, you’re probably inverting the logic.
  • Forgetting the size cap. Without if len(h) > k: heappop(h) you’re essentially sorting — you’ve lost the whole point.
  • Tuple comparisons that crash. If two tuples tie on the first element, Python compares the next. If that next element is an unorderable object, you’ll get TypeError: '<' not supported. Always insert a unique tie-breaker.
  • Mutating the heap directly. Don’t del h[3] or h.sort() — you’ve corrupted the invariant. Only use heapq functions.
  • Reaching for a heap when O(n) exists. If K == 1, use min/max. If K == N, just sort. Heap shines in the middle.
  • Counting heap space as O(n) when it’s really O(k). Bounded heaps have bounded space; this matters for memory-constrained problems.

The 16 Pattern Map

Every heap problem in this season maps to one of three archetypes. Use this table as your index.

#PatternLC #HeapTimeArchetype
1Kth Largest Element#215Min-heap (size k)O(n log k)Top-K
2Kth Smallest ElementMax-heap (size k)O(n log k)Top-K
3Sort Nearly Sorted ArrayMin-heap (size k+1)O(n log k)Top-K (sliding)
4K Closest to X#658Max-heap on distanceO(n log k)Top-K
5Top K Frequent Elements#347Counter + min-heapO(n log k)Top-K
6Frequency Sort#451Counter + max-heapO(n log u)Greedy-emit
7K Closest Points to Origin#973Max-heap on x²+y²O(n log k)Top-K
8Connect Ropes (Min Cost)#1167Min-heapO(n log n)Greedy-combine
9Maximum Beauty w/ DeadlinesSort + min-heapO(n log n)Scheduling-w/-deadline
10Merge K Sorted Lists#23Min-heap of headsO(N log k)K-way merge
11Find Median from Data Stream#295Two heapsO(log n) / addTwo heaps
12Task Scheduler#621Max-heap + queueO(T log U)Greedy scheduling
13Meeting Rooms II#253Sort + min-heapO(n log n)Greedy-w/-heap
14Reorganize String#767Max-heap by frequencyO(n log u)Greedy defer-most-recent
15Kth Smallest in Sorted Matrix#378Min-heap (K-way merge)O(k log min(k,n))K-way merge
16Sliding Window Median#480Two heaps + lazy deleteO(n log k)Two heaps

Thirty-Second Pre-Interview Mantra

Comments

Comments are disabled in this environment. Set PUBLIC_GISCUS_REPO, PUBLIC_GISCUS_REPO_ID, and PUBLIC_GISCUS_CATEGORY_ID to enable.