A heap is the data structure for “give me the next-best thing repeatedly, while new candidates keep arriving.” Priority queue is the abstract interface; binary heap is the usual concrete implementation.

When to reach for it

  • Top-k. Maintain a min-heap of size k; on each new element, push and pop if size exceeds k. O(n log k) overall, beats sorting when k ≪ n.
  • Streaming median. Two heaps — a max-heap of the lower half, a min-heap of the upper half. Rebalance after each insert. Median is the root(s). O(log n) insert, O(1) query.
  • Merging k sorted streams. Min-heap of (value, stream_id); pop and pull from the corresponding stream. O(n log k).
  • Dijkstra / Prim / A*. Frontier as a min-heap keyed by distance / cost.
  • Scheduled job execution. Heap keyed by next_fire_time; pop, run, reschedule.
  • Event simulation. Same as scheduling — events keyed by simulated time.

Operations and costs

OpCost
peek (root)O(1)
pushO(log n)
popO(log n)
heapify (build from list)O(n) — bottom-up sift-down beats n pushes
kth largestO(n + k log n) via heap, or O(n log k) via size-k heap

heapify being O(n), not O(n log n), is the non-obvious one — and the reason you should heapify once, not push n times to build.

Python specifics

heapq is min-heap only. For max-heap, negate keys. For tuples like (priority, item), ensure items are comparable or wrap with a counter to prevent comparison on item:

heapq.heappush(h, (priority, counter, item))

heapq.nlargest(k, iterable, key=...) is O(n log k) and idiomatic — reach for it before rolling your own.

What heaps don’t do

  • No efficient arbitrary deletion or update. “Decrease-key” is O(n) to find, then O(log n) to fix. The standard workaround in Dijkstra is lazy deletion: push the new entry, ignore stale entries on pop (if visited[node]: continue). Cleaner than a real Fibonacci heap for almost any practical scale.
  • No order beyond the root. The internal layout is partial order, not sorted. Don’t iterate a heap expecting sorted output without popping.
  • Not a search structure. “Is X in the heap” is O(n). If you need that, you also need a side index.