Heap · LC #295
Find Median from Data Stream
Maintain two heaps - a max-heap for the lower half and a min-heap for the upper half - keeping their sizes balanced. The median is either the top of the larger heap or the average of both tops.
Statement
Problem & Approach
Problem statement, sample I/O, and key reasoning.
Design a data structure that supports adding numbers and finding the median at any time.
Sample I/O
MedianFinder() addNum(1), addNum(2) findMedian() → 1.5 addNum(3) findMedian() → 2.0
Intuition
How to Think About It
Intuition
If we keep the smaller half in a max-heap and the larger half in a min-heap, the median is always at the boundary. The heaps self-organise the split; we just need to maintain the size invariant (differ by at most 1).
Core Concept
lo = max-heap (lower half), hi = min-heap (upper half). addNum: push to lo, move lo.top to hi, rebalance if lo.size < hi.size. findMedian: if sizes equal → average of both tops; else → top of larger heap. O(log n) add, O(1) find.
When to use
Running median of a data stream; sliding window median; any "find middle" on dynamic sorted set.
When NOT to use
When you only need the final median of a static array (just sort); when memory is severely constrained (two heaps store all elements).
Key Insights
What to Know Cold
- Two-heap invariant: max(lo) <= min(hi) AND |lo.size - hi.size| <= 1.
- Always push new number to lo first, then move lo.top to hi - maintains ordering invariant.
- Rebalance: if hi grows larger than lo, pull hi.top back to lo.
- Python requires negation for max-heap simulation with heapq.
- This pattern generalises to kth-percentile tracking with adjusted size ratios.
1 example
Worked Examples
Streaming median calculator
Stream: add 1, add 2, query (→1.5), add 3, query (→2.0).
Two heaps with size rebalancing on each insert.
Output: 1.5 then 2.0
Used in real-time analytics dashboards, sliding-window statistics, and online learning algorithms.