Big-O & complexity reasoning
Big-O describes how runtime/space grows as input grows. Master the ordering of the common classes and the difference between best/average/worst and amortized cost.
Big-O is a vocabulary for talking about scale: it tells you how an algorithm’s runtime or memory grows as the input grows, ignoring the constants that depend on your machine. The point isn’t to compute exact times — it’s to predict whether something stays fast at 10× or 1000× the data. Think of it as the slope of the line you’d draw if you plotted “work done” against “input size”: Big-O names that slope and ignores everything else.
The growth ladder
Big-O describes the shape of the curve, not its height. We drop constants and lower-order terms because as n gets large only the fastest-growing term matters: 3n² + 50n + 200 is just O(n²). Two algorithms can both be O(n) while one is twice as fast in practice — Big-O deliberately doesn’t see that. What it sees is the order of magnitude, which is what decides whether your code survives growth.
Memorize the ladder from flattest to explosive:
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(2^n) < O(n!)
The practical intuition: feed in 100× more data and an O(n) operation slows roughly 100×, but an O(n^2) one slows roughly 100² = 10,000×. That gap is why nested loops over large inputs quietly become outages, while a single pass barely notices.
To make the gap concrete, here is roughly how many steps each class costs at a few input sizes — the cells where the numbers explode are exactly the algorithms that fall over in production:
| Class | n = 10 | n = 100 | n = 1,000 | Feel |
|---|---|---|---|---|
O(1) | 1 | 1 | 1 | instant, any size |
O(log n) | ~3 | ~7 | ~10 | flat — doubling adds one step |
O(n) | 10 | 100 | 1,000 | scales linearly |
O(n log n) | ~33 | ~664 | ~9,966 | the sort ceiling — fine |
O(n²) | 100 | 10,000 | 1,000,000 | danger zone past ~10k |
O(2ⁿ) | 1,024 | 10³⁰ | uncountable | only tiny n; needs DP/pruning |
Best, average, worst — and amortized
A single Big-O label hides three cases. Best is the lucky input, average is the typical one, and worst is the adversarial one — and the worst case is usually what you design around. Hash-map lookup is average O(1) but worst-case O(n) when every key collides; quicksort is average O(n log n) but worst-case O(n^2) on already-sorted input.
Amortized cost is different and easy to confuse with average. It’s the cost per operation averaged over a whole sequence, where an occasional expensive step is paid down by many cheap ones. The distinction matters: an average case assumes a probability distribution over inputs, while an amortized bound is a guarantee that holds for any sequence of operations — no luck required.
A dynamic array (JS Array, Java ArrayList, C++ vector) appends in O(1) — until it fills up, when it allocates a bigger backing buffer and copies everything over, an O(n) step.
// Conceptually: capacity doubles when full.
// Appends 1..8 into an array that starts with capacity 1:
// push 1 -> resize, copy 0 (cost ~1)
// push 2 -> resize, copy 1 (cost ~1)
// push 3 -> resize, copy 2 (cost ~2)
// push 4 -> cheap
// push 5 -> resize, copy 4 (cost ~4)
// push 6,7,8 -> cheapBecause capacity doubles, resizes get rarer as the array grows. Summing all the copy work across n appends totals about 2n — so the cost per append, averaged over the sequence is O(1). Any single append can still spike to O(n); amortized analysis says that spike is rare enough to ignore in aggregate.
The doubling factor is load-bearing. If you grew the buffer by a fixed amount (say +1 each time) instead of multiplying, you’d resize on every append and the total copy work would be 1 + 2 + … + n ≈ n²/2 — appends would be amortized O(n), and a loop of pushes would be O(n²). Geometric growth is what makes the amortization work.
Time vs space
Every approach has two Big-O numbers, and they often trade against each other. Memoizing a recursive function turns exponential time into linear time — by spending linear space on a cache. Sorting in place is O(1) extra space; merge sort buys stability with O(n) scratch space. When you reason about an algorithm, state both: “O(n) time, O(1) space” is a complete answer; “it’s O(n)” is half of one.
You rarely need formal proofs — you read Big-O off the structure. Loops multiply; sequential blocks add (and the biggest term wins); halving the problem is the log n tell.
// (a) one pass over n -> O(n) time, O(1) space
let sum = 0;
for (const x of arr) sum += x;
// (b) nested loop over the same n -> O(n^2)
for (let i = 0; i < arr.length; i++)
for (let j = 0; j < arr.length; j++)
if (arr[i] + arr[j] === target) return [i, j];
// (c) halving each step -> O(log n)
let lo = 0, hi = arr.length - 1; // arr is sorted
while (lo <= hi) {
const mid = (lo + hi) >> 1;
if (arr[mid] === target) return mid;
if (arr[mid] < target) lo = mid + 1; else hi = mid - 1;
}Two sequential loops are O(n) + O(n) = O(n), not O(n²) — nesting is what multiplies. And building a result array of size n is O(n) space even if the time is O(n); count the allocations, not just the loops.
01 Learning objectives
0 / 4 done02 Curated reading
03 Knowledge check
- 01easy
Order these from slowest-growing to fastest-growing.
- 02easy
Appending to a dynamic array (amortized) is:
- 03easy
Big-O keeps constant factors and lower-order terms.
- 04medium
An O(n²) algorithm takes 1s on 1,000 items. Roughly how long on 10,000 items?
04 Interview questions
browse all ↗What gets asked on this topic — tap a card for how to approach it, the follow-ups, and the trap. Company tags are best-effort & sourced.
-
Why do we drop constants and lower-order terms in Big-O?
Big-O describes asymptotic growth as n grows large, so the dominant term decides how the cost scales; constants and lower-order terms become negligible in that limit.
3n + 5andnboth grow linearly, so both are O(n). The point is to compare how algorithms scale, not to predict exact wall-clock time.What a strong answer coversBig-O measures asymptotic growth as
n → ∞, not real wall-clock time.The fastest-growing term dominates; constants and lower-order terms vanish in the limit.
3n + 5,n, and500nare all O(n) — the same growth class.The goal is comparing how algorithms scale, not benchmarking a specific machine.
Quick self-checkWhich expression is NOT O(n)?
-
Linear: the constant 2 and the +7 drop, leaving O(n).
-
n dominates log n, so this is O(n).
-
Correct — dividing by a constant doesn't change the class; it's still O(n²), which grows faster than O(n).
-
A constant multiple of n is still O(n).
Follow-ups they push on- When do constants actually matter in practice?
- Is an O(n) algorithm always faster than an O(n log n) one?
Red flag Treating O(2n) or O(n + 100) as meaningfully different from O(n), or implying Big-O predicts real runtime rather than scaling.
source: Tech Interview Handbook — Algorithms / Complexity ↗ -
Order these from fastest- to slowest-growing: O(n log n), O(1), O(n!), O(log n), O(n^2), O(n), O(2^n).
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(2^n) < O(n!). The split that matters most in interviews is polynomial (everything up to O(n^2)) versus exponential/factorial (O(2^n), O(n!)), which become intractable for even modest n. Knowing where a candidate algorithm sits on this ladder is usually the first thing an interviewer wants.
Follow-ups they push on- Give a concrete algorithm that lands in each class.
Red flag Putting O(n log n) above O(n^2), or thinking O(2^n) and O(n^2) are close because both 'have an n and a power'.
source: Big-O Cheat Sheet ↗ -
If 100x more data slows an operation ~100x, what's its complexity? What if it slows ~10,000x?
~100x slowdown for 100x data is linear, O(n). A ~10,000x slowdown is 100^2, i.e. O(n^2) — the quadratic term means scaling the input scales the cost by the square. This back-of-envelope reasoning is exactly how you sanity-check whether a measured slowdown matches your assumed complexity.
Red flag Confusing the input multiplier with the time multiplier, or assuming any slowdown larger than linear must be exponential.
source: GeeksforGeeks — Big-O Notation Interview Questions ↗ -
Explain best, average, and worst case. Which one does Big-O usually refer to, and why?
Best/average/worst describe how cost varies across different inputs of the same size. Interviewers usually mean worst-case Big-O because it's the guarantee that holds regardless of input, but average case matters for things like quicksort (avg O(n log n), worst O(n^2)) and hash maps (avg O(1), worst O(n)). Good practice: state worst-case first, then add the expected/average case with the assumption it relies on.
Follow-ups they push on- What assumption makes a hash-map lookup average O(1)?
- Why might average case be the more honest number for quicksort?
Red flag Quoting average case as if it were a worst-case guarantee, especially for hashing or randomized algorithms.
source: Tech Interview Handbook — Algorithms / Complexity ↗ -
What is amortized complexity? Why is appending to a dynamic array amortized O(1) if a resize is O(n)?
Amortized cost is the average cost per operation across a long sequence, even if individual operations vary. A dynamic array doubles capacity on resize, so a resize costs O(n) but only happens after ~n cheap appends; spreading that O(n) over the n appends gives O(1) per append on average. The doubling (geometric growth) is what makes the total work across n appends O(n), not O(n^2).
Follow-ups they push on- What breaks if you grow the array by a fixed +1 instead of doubling?
- Is amortized O(1) the same as worst-case O(1)?
Red flag Calling a single resizing append O(1), or claiming linear (+1) growth still gives amortized O(1) appends.
source: InterviewPlus — Understanding Amortized Time Complexity ↗ -
What's the time and space complexity of a recursive function that recurses on n/2 and does O(1) work per call?
Halving n each call with O(1) work per level gives O(log n) time (about log2(n) levels). Space is O(log n) too because the call stack holds one frame per level until the base case unwinds — a point candidates often miss when they say O(1) space. This is the binary-search recursion shape.
Follow-ups they push on- How does an iterative version change the space complexity?
Red flag Reporting O(1) space for a recursive solution by forgetting the call stack costs O(depth).
source: InterviewPrep — Algorithm Complexity Interview Questions ↗ -
Two nested loops over the same array of size n look O(n^2). When can nested loops still be O(n)?
Nesting doesn't automatically mean O(n^2) — what matters is total iterations. In a sliding-window or two-pointer pass, the inner pointer advances monotonically and never resets, so across the whole run it moves at most n times total: the two loops combined do O(n) work. Always count how many times the inner body actually runs, not how deeply the loops nest.
Follow-ups they push on- What's the complexity of the sliding-window longest-substring solution?
Red flag Mechanically multiplying loop depths instead of bounding the total number of inner iterations.
source: GeeksforGeeks — Big-O Notation Interview Questions ↗ -
You can solve a problem in O(n) time with O(n) extra space, or O(n log n) time with O(1) space. How do you decide?
It's a time/space trade-off driven by constraints: if memory is tight (embedded, huge inputs, streaming) favour the O(1)-space version; if latency dominates and memory is cheap, take the O(n)-time version. The strong-signal answer names the constraints out loud, states the assumption (e.g. input fits in memory), and asks the interviewer about input size and environment rather than guessing.
Follow-ups they push on- When would O(n) extra space be a non-starter even if it's faster?
Red flag Optimizing only time and never mentioning the space cost, or picking one without surfacing the trade-off.
source: InterviewPrep — Algorithm Analysis Interview Questions ↗ -
A loop runs `for (i = 1; i < n; i *= 2)`. What's its time complexity, and why?
It's O(log n). Multiplying
iby 2 each iteration meansitakes the values 1, 2, 4, 8, …, so the loop body runs about log2(n) times beforeireaches n. Any loop where the counter is multiplied or divided by a constant factor (rather than added to) is logarithmic — this is the same shape as binary search.What a strong answer coversCounter multiplied/divided by a constant ⇒ logarithmic, not linear.
1, 2, 4, … n means ~
log2(n)iterations.Contrast with
i += 1(ori += c), which is O(n).Nesting this inside an O(n) loop gives O(n log n).
Quick self-checkTime complexity of `for (i = n; i > 1; i /= 2)`?
-
Wrong — the counter is halved each step, not decremented by one.
-
Correct — halving n repeatedly takes ~log2(n) steps to reach 1.
-
Wrong — there's only one loop, no multiplicative nesting.
-
Wrong — the number of iterations grows with n, just slowly.
Follow-ups they push on- What's the complexity if the inner loop instead did `j *= 3`?
- What if you nest this logarithmic loop inside a `for i in 0..n`?
Red flag Calling it O(n) because it's 'a loop up to n', ignoring that the counter grows geometrically, not by one.
source: GeeksforGeeks — Big-O Notation Interview Questions ↗ -
What does Big-O actually bound — Big-O vs Big-Theta vs Big-Omega — and why do people say O(n) when they mean Θ(n)?
Big-O is an upper bound (grows no faster than), Big-Omega (Ω) is a lower bound (grows no slower than), and Big-Theta (Θ) is a tight bound (both at once). Strictly, an O(n) algorithm is also O(n^2) because O is only an upper bound, so the precise claim is usually Θ(n). In interviews people say 'O(n)' loosely to mean the tight bound; it's fine, but knowing the distinction signals rigor.
What a strong answer coversO = upper bound, Ω = lower bound, Θ = tight (both).
An O(n) algorithm is technically also O(n^2) — O doesn't have to be tight.
Θ(n) is the precise statement people usually intend by 'O(n)'.
Worst-case Big-O is the common interview default unless stated otherwise.
Quick self-checkWhich statement is technically correct for an algorithm that always does exactly n steps?
-
Wrong — Θ(n) is correct, but it IS also O(n²) since O is just an upper bound.
-
Correct — it's bounded above by both n and n², and tightly bounded by n.
-
Wrong — n steps does not grow at least as fast as n², so it's not Ω(n²).
-
Wrong — any larger upper bound like O(n²) also holds.
Follow-ups they push on- Give an algorithm whose best and worst cases differ in Big-Theta.
- Is saying 'quicksort is O(n^2)' wrong?
Red flag Insisting O must be a tight bound, or conflating worst-case with Big-O (they're independent axes).
source: MIT OCW 6.006 — Asymptotic notation ↗ -
A function loops `i` from 0 to n and, inside, loops `j` from 0 to `i`. Is that O(n^2)?
Yes, it's O(n^2) — even though the inner loop doesn't always run n times. The total iterations are 0 + 1 + 2 + … + (n-1) = n(n-1)/2, which is ~n^2/2; dropping the constant gives O(n^2). The lesson: a triangular nested loop is still quadratic, because half of a square is still proportional to n^2.
What a strong answer coversTotal work is the arithmetic series
0+1+…+(n-1) = n(n-1)/2.That's
~n²/2⇒ O(n²) after dropping the constant.'Inner loop shorter each time' does not save an order of magnitude.
Same total as comparing all unique pairs of n items.
Quick self-checkTotal iterations of the inner body across the whole run?
-
Wrong — that would be a single linear pass, not a nested loop.
-
Wrong — no halving happens; both loops are arithmetic.
-
Correct — it's the sum 0+1+…+(n-1) = n(n-1)/2 ≈ n²/2.
-
Wrong — that's the full square; the triangular loop runs about half as many times (same O(n²) class though).
Follow-ups they push on- How many distinct pairs `(i, j)` with `i < j` exist among n items?
- What if the inner loop ran to `i*i` instead of `i`?
Red flag Claiming it's O(n) or 'O(n²/2)' — the triangular shape is still Θ(n²) and constants are dropped.
source: GeeksforGeeks — Big-O Notation Interview Questions ↗ -
An algorithm has two separate phases: one O(n) and one O(n^2). What's the overall complexity, and what if a third phase is O(m)?
Sequential phases add, then you keep the dominant term: O(n) + O(n^2) = O(n^2), because the quadratic term swamps the linear one as n grows. When a phase depends on a different input size
m, you cannot fold it into n — the honest answer is O(n^2 + m), keeping both variables because either could dominate depending on the inputs.What a strong answer coversSequential (non-nested) phases add; nested phases multiply.
After adding, drop dominated terms:
O(n) + O(n²) = O(n²).Independent input sizes stay separate:
O(n² + m), notO(n²).Only collapse
mintonif you can provem ≤ n(or similar).
Follow-ups they push on- When is it wrong to assume m ≈ n in a graph problem (V vs E)?
- If the phases were nested instead of sequential, what changes?
Red flag Silently assuming a second input variable equals n, or multiplying sequential phases that should be added.
source: Big-O Cheat Sheet ↗ -
What's the time complexity of the naive recursive Fibonacci, and why is it so bad?
Naive
fib(n) = fib(n-1) + fib(n-2)is O(2^n) (more precisely O(φ^n)) because each call spawns two more and the recursion tree's size roughly doubles per level, recomputing the same subproblems exponentially many times. Memoizing the results collapses it to O(n) time (eachfib(k)computed once), and an iterative version is O(n) time with O(1) space. This is the canonical 'overlapping subproblems ⇒ use DP' example.What a strong answer coversBranching factor 2 with depth n ⇒ ~
2^ncalls.Same subproblems recomputed repeatedly (overlapping subproblems).
Memoization ⇒ O(n) time; iterative ⇒ O(n) time, O(1) space.
Recursion tree size, not depth, drives the exponential cost.
Quick self-checkWhy is naive recursive Fibonacci exponential while memoized is linear?
-
Wrong — the recurrence is the same; the difference is caching, not branching.
-
Correct — overlapping subproblems are recomputed exponentially without a cache.
-
Wrong — recursion isn't inherently exponential; the redundant recomputation is the cause.
-
Wrong — both versions still add fib(n-1) + fib(n-2).
Follow-ups they push on- How much space does the memoized version use?
- Can you compute Fibonacci faster than O(n)?
Red flag Estimating it as O(n) by counting recursion depth instead of the exponential number of nodes in the call tree.
source: Tech Interview Handbook — Dynamic Programming cheatsheet ↗ -
You sort an array (O(n log n)) and then do a single linear scan. What's the combined complexity, and is the sort 'free' because the scan is O(n)?
Combined it's O(n log n) — the sort dominates the linear scan, so the scan is effectively absorbed, but the sort is certainly not free; it sets the overall complexity. A common mistake is to advertise 'an O(n) two-pointer solution' while quietly sorting first, which makes the real cost O(n log n). Always fold the preprocessing cost into the bound you quote.
What a strong answer coversO(n log n) + O(n) = O(n log n)— the larger term wins.The sort is the bottleneck, not 'free'.
A two-pointer pass after a sort is an O(n log n) solution overall.
Quote the cost including preprocessing, not just the hot loop.
Quick self-checkSort (O(n log n)) followed by a separate O(n) scan — overall?
-
Wrong — ignores the dominant sorting step.
-
Correct — the sort dominates; adding O(n) doesn't raise the class.
-
Wrong — the phases are sequential (added), not nested (multiplied).
-
Wrong — there's at least linear work, far more than logarithmic.
Follow-ups they push on- When would an O(n)-space hash approach beat the sort-then-scan approach?
- If the array were already sorted, what would change?
Red flag Claiming an 'O(n) solution' that secretly sorts the input first — the honest bound is O(n log n).
source: Tech Interview Handbook — Algorithms / Sorting ↗ -
When do constant factors and Big-O 'lie' in practice — i.e. when is a higher-Big-O algorithm actually faster?
Big-O hides constants and cache effects, so for small or medium n a higher-Big-O algorithm with a tiny constant often wins. Classic cases: insertion sort (O(n^2)) beats quicksort on tiny arrays — which is why Timsort/introsort fall back to it; linear scan of a contiguous array can beat a hash map for small n because of cache locality and no hashing overhead; and an O(n log n) algorithm with huge constants can lose to a well-tuned O(n^2) until n is large. The honest senior answer is 'Big-O tells you scaling behavior; profile to know the crossover point for your actual n.'
What a strong answer coversBig-O drops constants and ignores cache locality / memory hierarchy.
Insertion sort beats quicksort for tiny n ⇒ hybrid sorts switch over.
Contiguous linear scan can beat a hash map for small n (locality, no hashing).
There's a crossover n; profile rather than assume the lower class always wins.
Follow-ups they push on- Why does Timsort use insertion sort on small runs?
- How does cache locality favor arrays over linked lists despite equal Big-O?
Red flag Treating Big-O as a real-runtime ranking for all n, ignoring constants, locality, and the crossover point.
source: Tech Interview Handbook — Algorithms / Complexity ↗