Primer

Big-O & Recursion

Master complexity analysis and recursion fundamentals

What Big-O is (and where it came from)

What: Big-O describes the asymptotic upper bound on how an algorithm's cost grows with input size n. It hides constant factors and lower-order terms.

Why: Lets you compare scalability independent of hardware and implementation details.

Origin (one-liner): The notation comes from number theory (often called Bachmann–Landau notation): introduced by Paul Bachmann in the 1890s and popularized by Edmund Landau. The "O" stands for Ordnung, i.e., the order of growth.

The five siblings: O, Θ, Ω, o, ω
  • O(f(n)): ≤ c·f(n) eventually (upper bound).
  • Ω(f(n)): ≥ c·f(n) eventually (lower bound).
  • Θ(f(n)): both O and Ω (tight bound).
  • o(f(n)): grows strictly slower than f(n).
  • ω(f(n)): grows strictly faster than f(n).

"Eventually" = there exists an n₀ after which the inequality holds.

Cost models (what counts as one 'step'?)

Unit-cost RAM model (interview default): arithmetic/compare/assign on fixed-width ints → O(1). Array index/ptr hop → O(1).

Bit complexity (theory): cost depends on operand bit-length (e.g., big integers).

Data-structure assumptions: Hash map ops often assumed O(1) amortized (but can degrade).

Unless a problem says otherwise, assume unit-cost RAM and fixed-width integers.

How to calculate Big-O (recipe)

Step A — Count operations structurally

  • Sequential code: add costs.
  • Loops: translate to a sum; evaluate or bound it.
  • Nesting: multiply the dominant terms.
  • Conditionals: take the worst branch (for worst-case Big-O).
  • Function calls/recursion: write a recurrence, then solve/bound.

Step B — Use dominance rules

  • Drop lower-order terms: n² + n log n + 42 → Θ(n²).
  • Drop constants: 7n log n → Θ(n log n).
  • Know common series (below).

Common series & bounds

  • Arithmetic: ∑ᵢ₌₁ⁿ i = n(n+1)/2 = Θ(n²).
  • Squares: ∑ᵢ₌₁ⁿ i² = Θ(n³); in general ∑ᵢ₌₁ⁿ iᵏ = Θ(nᵏ⁺¹).
  • Geometric: ∑ₖ₌₀ᵐ rᵏ = (rᵐ⁺¹-1)/(r-1)
    • If 0<r<1: sum → constant.
    • If r>1: the largest (last) term dominates, so the sum is Θ(rᵐ); e.g., with r=2 and m=⌊log₂ n⌋ the sum is Θ(n).
  • Harmonic: Hₙ = ∑ 1/i = Θ(log n).
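
A quick numeric sanity check of two of these bounds (a minimal Python sketch, not part of the original text; the choice of n is arbitrary):

import math

n = 1_000_000
triangular = sum(range(1, n + 1))               # arithmetic series ∑ i
print(triangular == n * (n + 1) // 2)           # True: matches the closed form n(n+1)/2
harmonic = sum(1 / i for i in range(1, n + 1))  # harmonic series Hₙ
print(harmonic, math.log(n))                    # Hₙ tracks ln n; they differ by ≈ 0.577 (the Euler–Mascheroni constant)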

Mini-examples

Single loop:
for i in 1..n: cost += 1    →  Θ(n)

Nested loops (triangular):
for i in 1..n:
  for j in 1..i: work
→ ∑ᵢ₌₁ⁿ i = Θ(n²)

"Halving" loop:
i = n
while i > 1: i //= 2        →  number of iterations = ⌊log₂ n⌋ = Θ(log n)

Doubling index in loop:
for (i=1; i<=n; i*=2)       →  Θ(log n)

Solving recurrences (three quick tools)

A. Recursion tree / iteration method: expand a few levels, spot the pattern, sum the per-level costs.

B. Substitution (a.k.a. induction): guess the bound, prove it by induction, and solve for the constants.
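
Worked substitution example for T(n) = 2T(n/2) + n: guess T(n) ≤ c·n·log n. Inductive step: T(n) ≤ 2·c·(n/2)·log(n/2) + n = c·n·log n − c·n + n ≤ c·n·log n for any c ≥ 1, so the guess holds and T(n) = O(n log n).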

C. Master Theorem (divide-and-conquer of the form T(n)=a T(n/b)+f(n))

Compare f(n) against n^(log_b a), the "work from recursion" (the leaf-level cost of the recursion tree).

  1. If f(n) = O(n^(log_b a - ε)) → T(n) = Θ(n^(log_b a)).
  2. If f(n) = Θ(n^(log_b a) log^k n) for some k ≥ 0 → T(n) = Θ(n^(log_b a) log^(k+1) n).
  3. If f(n) = Ω(n^(log_b a + ε)) and regularity holds → T(n) = Θ(f(n)).
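
As a sketch of how the case check works, here is a small Python helper (hypothetical, written for this primer) that classifies T(n) = a·T(n/b) + Θ(nᶜ) by comparing c with log_b a; it ignores extra log factors in f(n) and the case-3 regularity condition, so treat it as illustrative only:

import math

def master_case(a, b, c):
    """Classify T(n) = a*T(n/b) + Theta(n**c) via the basic Master Theorem."""
    crit = math.log(a, b)                                # critical exponent log_b(a)
    if c < crit:
        return f"Case 1: Theta(n^{crit:.3f})"            # leaves dominate
    if c > crit:
        return f"Case 3: Theta(n^{c})"                   # root dominates (assuming regularity)
    return f"Case 2: Theta(n^{crit:.3f} * log n)"        # balanced: multiply by log n

print(master_case(2, 2, 1))   # mergesort → Theta(n log n)
print(master_case(1, 2, 0))   # binary search → Theta(log n)
print(master_case(3, 2, 1))   # → Theta(n^1.585)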

Canonical worked examples

  • Binary search: T(n)=T(n/2)+O(1) → Θ(log n).
  • Mergesort: T(n)=2T(n/2)+Θ(n) → Θ(n log n).
  • Quicksort (worst): T(n)=T(n-1)+Θ(n) → Θ(n²); (avg): Θ(n log n).
  • Grid BFS: work is proportional to explored cells + edges → Θ(V+E) = Θ(n) on an n-cell grid with O(1) neighbors.
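
A minimal grid-BFS sketch in Python (the 0/1 grid, the start cell, and the 4-neighbor model are assumptions made for illustration): each open cell enters the queue at most once and examines O(1) neighbors, which is exactly the Θ(V+E) accounting above.

from collections import deque

def bfs_reachable(grid, start):
    """Count cells reachable from `start`; 1 = open, 0 = blocked."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    reached = 0
    while queue:
        r, c = queue.popleft()
        reached += 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):      # 4-neighbor model → O(1) edges per cell
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return reached

print(bfs_reachable([[1, 1, 0], [0, 1, 1], [1, 1, 1]], (0, 0)))   # 7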

When Master Theorem doesn't fit (e.g., T(n)=T(n/2)+T(n/3)+n), try recursion tree or Akra–Bazzi (advanced).

Space complexity (what exactly are we measuring?)
  • Auxiliary space: extra memory beyond inputs/outputs.
  • Call stack: depth of recursion (e.g., DFS on a path → O(n) stack).
  • Data-structure space: e.g., DP tables, queues (BFS frontier ≈ largest layer).
  • Output sensitivity: some problems scale with the size of the result (e.g., listing all paths).
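
To make the call-stack bullet concrete, a small Python sketch (the linked structure below is an assumption for illustration) contrasting recursive and iterative traversal of a path of length n; the recursive version needs one stack frame per node and hits CPython's default recursion limit (~1000) long before memory runs out:

def length_recursive(node):
    # O(n) auxiliary space: one stack frame per node on a path-shaped structure
    if node is None:
        return 0
    return 1 + length_recursive(node["next"])

def length_iterative(node):
    # same Θ(n) time, but O(1) auxiliary space
    count = 0
    while node is not None:
        count += 1
        node = node["next"]
    return count

head = None
for _ in range(10_000):                   # build a 10,000-node path
    head = {"next": head}

print(length_iterative(head))             # 10000
# length_recursive(head) would raise RecursionError under the default limit
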
Amortized analysis (when worst-case per op is misleading)

Dynamic array push (double capacity on full):

  • Most push = O(1); occasional resize = O(n).
  • Total over m pushes ≤ 2m moves → O(1) amortized.

Stack with occasional multi-pop: each element is pushed once and popped at most once → O(1) amortized per op.

Potential method: define a potential Φ(state) ≥ 0 with Φ(initial state) = 0, and set amortized_cost = actual_cost + ΔΦ; the ΔΦ terms telescope, so the sum of amortized costs upper-bounds the total actual cost.
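
A minimal simulation (Python, modeling the doubling strategy by hand rather than relying on list's built-in growth) that counts element copies over m pushes and checks the ≤ 2m claim:

def simulate_pushes(m):
    """Push m items into an array that doubles capacity when full; return total element copies."""
    capacity, size, copies = 1, 0, 0
    for _ in range(m):
        if size == capacity:          # full: allocate 2x and copy every element over
            copies += size
            capacity *= 2
        size += 1
    return copies

for m in (10, 1_000, 1_000_000):
    c = simulate_pushes(m)
    print(m, c, c <= 2 * m)           # copies never exceed 2m → O(1) amortized per push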

Recursion essentials (how to write & reason)

Base, progress, combine

  • Base case(s) handle smallest inputs.
  • Progress: each call strictly reduces a measure (size, index, set).
  • Combine: merge results from subcalls (sum, max, concatenate…).
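
A tiny recursive example (Python) with the three ingredients labeled, the maximum of a non-empty list:

def list_max(xs):
    if len(xs) == 1:                  # base case: smallest input answered directly
        return xs[0]
    rest_max = list_max(xs[1:])       # progress: the recursive call sees a strictly shorter list
    return max(xs[0], rest_max)       # combine: merge the head with the subresult

print(list_max([3, 1, 4, 1, 5]))      # 5

(The slice xs[1:] copies, so this illustrative version costs Θ(n²) time and Θ(n) stack; passing an index instead avoids the copying.)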

Tail recursion vs iteration

A function is tail-recursive if the recursive call is the last thing it does. Some languages optimize tail calls; for interview safety, convert deep tail recursion to a loop.
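
A minimal illustration in Python (which does not optimize tail calls): a tail-recursive running sum and the loop it mechanically converts to, where the accumulator replaces the call stack:

def sum_to_tail(n, acc=0):
    # tail-recursive: the recursive call is the last thing the function does
    if n == 0:
        return acc
    return sum_to_tail(n - 1, acc + n)

def sum_to_loop(n):
    acc = 0
    while n > 0:                      # same logic, O(1) stack
        acc += n
        n -= 1
    return acc

print(sum_to_tail(100), sum_to_loop(100))   # 5050 5050
# sum_to_tail(100_000) would raise RecursionError; sum_to_loop(100_000) is fine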

Memoization vs tabulation (DP)

  • Memo (top-down): cache f(state) results; great when the state space is sparse or heavily pruned.
  • Tabulation (bottom-up): explicit order of evaluation; often faster and easier to space-optimize (rolling arrays).
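
Fibonacci is the standard side-by-side illustration (Python): top-down with a cache versus bottom-up with a rolling pair (the space-optimized tabulation):

from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n):
    # top-down: each state computed once → O(n) time, O(n) cache plus O(n) stack
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_tab(n):
    # bottom-up with a rolling pair → O(n) time, O(1) extra space
    prev, cur = 0, 1
    for _ in range(n):
        prev, cur = cur, prev + cur
    return prev

print(fib_memo(30), fib_tab(30))   # 832040 832040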

Recursion pitfalls

  • Missing base case or non-shrinking parameter → infinite recursion.
  • Recomputing subproblems → exponential blow-up (fix with memo/DP).
  • Stack overflow with deep recursion (prefer iterative DFS/BFS on tall graphs/trees).

Quick 'complexity budget' table (rules of thumb)

| Input size n  | Safe time target                           |
|---------------|--------------------------------------------|
| ~10⁵ – 2·10⁵  | O(n), O(n log n)                           |
| ~10⁴ – 5·10⁴  | O(n log n), some O(n√n), light O(n²)       |
| ≤ 3·10³       | O(n²) fine; avoid O(n³) unless tiny        |
| ≤ 20–22       | 2ⁿ / n! backtracking feasible with pruning |

Practice: compute these (with answers)

1. for i in 1..n: for j in 1..n: for (k=1; k<n; k*=2) → n × n × log n = Θ(n² log n)

2. T(n)=3T(n/2)+n: n^(log₂ 3) ≈ n^1.585 dominates n → Θ(n^(log₂ 3))

3. i=1; s=0; while i<=n: s+=i; i+=i. Here i doubles each pass → iterations ≈ ⌊log₂ n⌋+1 → Θ(log n)

4. Grid n×n, BFS starting from one corner: each cell is visited at most once and each edge is processed in O(1) → Θ(V+E) = Θ(n²).

What Big-O does not tell you
  • Constants & cache effects (two O(n) algorithms can differ by 10× in practice).
  • Average vs worst case (state which one you analyze).
  • Hidden assumptions (e.g., hash maps O(1) amortized, not guaranteed).
  • Actual wall-clock (map your target complexity to time budgets using constraints).

Ready to practice?

Test your understanding with a quiz or explore Fundamentals patterns