Search lands in PR-5.1 (Pagefind).

How-to Intermediate

Chapter 1 Updated

Top K Frequent Elements

Counter + min-heap of size K on frequency; evict the low-frequency root.

  • Full 6m
  • Revision
  • Flow

Problem Statement

Given an integer array nums and an integer k, return the k most frequent elements. Results may be returned in any order.

Input

An integer array nums and the count k.

Example input
nums = [1, 1, 1, 2, 2, 3]
k    = 2

Output

An array of the k most-frequent values.

Expected output
[1, 2]
 
// frequencies: {1: 3, 2: 2, 3: 1}
// top 2: 1 (freq 3), 2 (freq 2)

Constraints

  • 1 ≤ nums.length ≤ 10⁵
  • k is always valid — at least k distinct values exist.

Intuition

Two stages: (1) count frequencies with a hash-map in O(n), (2) take the top K of those by frequency. The second stage is a textbook Top-K — min-heap of size K on (frequency, value) tuples. When size exceeds K, evict the root — the element with the lowest frequency among the top-K-so-far.

Walkthrough — nums=[1,1,1,2,2,3], k=2
freq:  {1:3, 2:2, 3:1}
 
push (3,1) → heap = [(3,1)]
push (2,2) → heap = [(2,2),(3,1)]    size == k, no evict
push (1,3) → size exceeds k → pop smallest → (1,3) evicted
                 heap = [(2,2),(3,1)]
 
result values = [2, 1]   // any order ok

Solution

top_k_frequent.py
import heapq
from collections import Counter
 
def topKFrequent(nums: list[int], k: int) -> list[int]:
    freq = Counter(nums)                               # O(n)
    min_heap: list[tuple[int, int]] = []               # (frequency, value)
    for val, cnt in freq.items():
        heapq.heappush(min_heap, (cnt, val))
        if len(min_heap) > k:
            heapq.heappop(min_heap)
    return [val for _, val in min_heap]

Complexity

MetricValueReason
TimeO(n log k)Counting is O(n). With u distinct values (u ≤ n), the heap phase is u log k. Total dominated by O(n log k).
SpaceO(n)Frequency map can hold up to n distinct keys. Heap itself is O(k); dominated by the map.

Edge Case Thinking

ScenarioHandling
All distinct valuesFrequency map has n entries each with count 1; any k arbitrary values are valid.
All identicalSingle entry in the map; if k == 1, return that value.
k == distinct countReturn all distinct values.
Tie-breakingUsually any stable choice is accepted. If a deterministic order is required, add the value as a secondary tuple element.
Very skewed distributionBucket sort is especially appealing when n is huge and frequencies are concentrated.

Comments

Comments are disabled in this environment. Set PUBLIC_GISCUS_REPO, PUBLIC_GISCUS_REPO_ID, and PUBLIC_GISCUS_CATEGORY_ID to enable.