Heap · LC #295

Find Median from Data Stream

Maintain two heaps - a max-heap for the lower half and a min-heap for the upper half - keeping their sizes balanced. The median is either the top of the larger heap or the average of both tops.

hardLC #295AMZ★★★GOO★★META★★MSFT★★AAPL
Mark as done
Confidence

Statement

Problem & Approach

Problem statement, sample I/O, and key reasoning.

Design a data structure that supports adding numbers and finding the median at any time.

Sample I/O

MedianFinder()
addNum(1), addNum(2)
findMedian() → 1.5
addNum(3)
findMedian() → 2.0

Intuition

How to Think About It

Intuition

If we keep the smaller half in a max-heap and the larger half in a min-heap, the median is always at the boundary. The heaps self-organise the split; we just need to maintain the size invariant (differ by at most 1).

Core Concept

lo = max-heap (lower half), hi = min-heap (upper half). addNum: push to lo, move lo.top to hi, rebalance if lo.size < hi.size. findMedian: if sizes equal → average of both tops; else → top of larger heap. O(log n) add, O(1) find.

When to use

Running median of a data stream; sliding window median; any "find middle" on dynamic sorted set.

When NOT to use

When you only need the final median of a static array (just sort); when memory is severely constrained (two heaps store all elements).

Key Insights

What to Know Cold

  • Two-heap invariant: max(lo) <= min(hi) AND |lo.size - hi.size| <= 1.
  • Always push new number to lo first, then move lo.top to hi - maintains ordering invariant.
  • Rebalance: if hi grows larger than lo, pull hi.top back to lo.
  • Python requires negation for max-heap simulation with heapq.
  • This pattern generalises to kth-percentile tracking with adjusted size ratios.

1 example

Worked Examples

Streaming median calculator

Stream: add 1, add 2, query (→1.5), add 3, query (→2.0).

Two heaps with size rebalancing on each insert.

Output: 1.5 then 2.0

Used in real-time analytics dashboards, sliding-window statistics, and online learning algorithms.