Arrays & Hashing · LC #128
Longest Consecutive Sequence
Find the length of the longest sequence of consecutive integers in an unsorted array. The O(n) solution uses a hash set and only starts counting from sequence boundaries - a non-obvious insight.
Statement
Problem & Approach
Problem statement, sample I/O, and key reasoning.
Given an unsorted integer array `nums`, return the length of the **longest consecutive elements sequence**. Must run in O(n) time.
Sample I/O
Input: nums = [100,4,200,1,3,2] Output: 4 (sequence: 1,2,3,4) Input: nums = [0,3,7,2,5,8,4,6,0,1] Output: 9
Intuition
How to Think About It
Intuition
If we start counting from every element, we will recount the same sequence multiple times (O(n²)). The key insight is: only start counting from an element n if n−1 is NOT in the set - i.e., n is the left boundary of a sequence. This guarantees each element is visited at most twice across all sequences, yielding O(n) amortised.
Core Concept
Build a hash set from nums in O(n). For each element n, check if n−1 is in the set. If not, n is a sequence start - extend it by checking n+1, n+2, … until a gap. Track the maximum length found. Every element participates in at most one sequence-start scan and one inner-while pass, so total work is O(n).
When to use
Finding the longest run of consecutive integers in an unsorted array. Any problem requiring "group elements into consecutive chains" without sorting. Can generalise to finding all consecutive groups, not just the longest.
When NOT to use
When the array is already sorted - a single linear scan comparing adjacent elements is simpler. When consecutive means "appear consecutively in the array" rather than "are numerically consecutive" - different problem. When duplicate values are meaningful (the set discards them silently).
Key Insights
What to Know Cold
- Sorting achieves O(n log n) - correct but explicitly violates the problem constraint. Interviewers test whether you know the O(n) hash-set approach.
- The "only start at sequence boundaries" invariant is what reduces repeated work from O(n²) to O(n).
- Duplicates are harmlessly de-duplicated by the hash set - important for [0,3,7,2,5,8,4,6,0,1].
- The inner while loop iterates at most n total times across ALL outer iterations combined (amortised O(n)).
- This problem is closely related to Union-Find: each consecutive pair is a union operation; the longest component is the answer.
1 example
Worked Examples
Boundary-Start Sequence Scan
nums = [100,4,200,1,3,2]. Longest consecutive run is 1→2→3→4 = length 4.
Set = {100,4,200,1,3,2}. Sequence starts: 100 (99 not in set, extends to 100, len 1), 200 (len 1), 1 (0 not in set, extends to 1,2,3,4, len 4). Max = 4.
nums = [100, 4, 200, 1, 3, 2]
num_set = set(nums)
longest = 0
for n in num_set:
if n - 1 not in num_set: # sequence boundary
length = 1
while n + length in num_set:
length += 1
longest = max(longest, length)
print(longest) # 4This problem demonstrates amortised O(n) analysis - a technique that separates junior from senior candidates. The "start only at boundaries" insight appears in interval merging and graph connected-component problems.