Load Balancing

Core Scale & Availability

Distribute incoming traffic across multiple servers to improve availability and performance

Core Idea

Load balancing distributes network traffic across multiple servers to ensure no single server bears too much load. It improves application responsiveness, increases availability, and enables horizontal scaling.
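
To make the core idea concrete, the following minimal Python sketch spreads requests across a pool in round-robin order; the backend addresses are made-up placeholders, and a real load balancer applies the same selection logic at the network level rather than in application code.

# Round-robin in miniature: rotate requests across a pool of backends
# so that no single server receives all the traffic.
from itertools import cycle

BACKENDS = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]   # placeholder addresses
next_backend = cycle(BACKENDS)

def route(request_id: int) -> str:
    """Assign the next backend in rotation to this request."""
    return f"request {request_id} -> {next(next_backend)}"

for i in range(6):
    print(route(i))   # six requests land evenly: two per backend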

When to Use

Use load balancing when you need to handle high traffic volumes, ensure high availability, enable zero-downtime deployments, or scale horizontally by adding more servers.

Recognition Cues

Indicators that this pattern might be the right solution
  • Application needs to handle thousands of concurrent requests
  • Single server becomes a bottleneck or single point of failure
  • Need to perform rolling updates without downtime
  • Traffic patterns are unpredictable or have significant spikes
  • Geographic distribution of users requires regional routing

Pattern Variants & Approaches

Layer 4 (Transport Layer) Load Balancing

Operates at the TCP/UDP level, making routing decisions based on IP addresses and ports without inspecting packet content

Layer 4 (Transport Layer) Load Balancing Architecture

[Diagram] Client → Load Balancer (TCP/UDP, round robin) → Server 1 / Server 2 / Server 3 → Database

When to Use This Variant

  • Need maximum performance and lowest latency
  • Protocol-agnostic load balancing required
  • Handling non-HTTP protocols (databases, game servers)
  • Simple routing logic based on connection count

Use Case

High-performance scenarios like database connection pooling, game servers, or IoT device connections

Advantages

  • Faster than Layer 7 (no packet inspection)
  • Lower CPU and memory overhead
  • Protocol-agnostic (works with any TCP/UDP traffic)
  • Can handle millions of connections

Implementation Example

# Layer 4 Load Balancer Configuration (HAProxy example)
frontend tcp_front
    bind *:3306                      # accept raw TCP on the MySQL port
    mode tcp                         # Layer 4: no HTTP parsing
    default_backend mysql_servers

backend mysql_servers
    mode tcp
    balance leastconn                # send new connections to the least-busy server
    option tcp-check                 # perform TCP-level health checks
    server mysql1 10.0.1.10:3306 check
    server mysql2 10.0.1.11:3306 check
    server mysql3 10.0.1.12:3306 check

Layer 7 (Application Layer) Load Balancing

Operates at the HTTP/HTTPS level, making intelligent routing decisions based on request content, headers, cookies, and URLs

Layer 7 (Application Layer) Load Balancing Architecture

[Diagram] Client → L7 Load Balancer (HTTPS; routes /v1/* and /v2/*) → API v1 / API v2, with Cache and Database behind them

When to Use This Variant

  • Need content-based routing (URL path, headers, cookies)
  • Require SSL termination at load balancer
  • Want to implement A/B testing or canary deployments
  • Need to route different API versions to different backends

Use Case

Web applications requiring intelligent routing, microservices architectures, API gateways, and content-based traffic distribution

Advantages

  • Content-aware routing decisions
  • Can modify requests/responses (add headers, rewrite URLs)
  • SSL/TLS termination reduces backend load
  • Better for HTTP-specific optimizations

Implementation Example

# Layer 7 Load Balancer Configuration (NGINX example)
upstream api_v1 {
    least_conn;
    server api1.example.com:8080;
    server api2.example.com:8080;
}

upstream api_v2 {
    least_conn;
    server api3.example.com:8080;
    server api4.example.com:8080;
}

server {
    listen 443 ssl;
    server_name api.example.com;

    # TLS terminates at the load balancer (certificate paths are placeholders)
    ssl_certificate     /etc/nginx/certs/api.example.com.crt;
    ssl_certificate_key /etc/nginx/certs/api.example.com.key;

    # Route based on URL path
    location /v1/ {
        proxy_pass http://api_v1;
    }

    location /v2/ {
        proxy_pass http://api_v2;
    }

    # Health check endpoint
    location /health {
        access_log off;
        return 200 "healthy\n";
    }
}

Global Server Load Balancing (GSLB)

Distributes traffic across multiple geographic regions based on user location, server health, and latency

Global Server Load Balancing (GSLB) Architecture

[Diagram] US Client and EU Client → DNS/GSLB (geo-routed DNS queries) → US LB / EU LB → US Servers / EU Servers

When to Use This Variant

  • Users distributed across multiple continents
  • Need disaster recovery across regions
  • Want to minimize latency for global users
  • Require compliance with data residency laws

Use Case

Global applications serving users worldwide, multi-region disaster recovery, and compliance with geographic data regulations

Advantages

  • Reduced latency by routing to nearest region
  • Geographic redundancy and disaster recovery
  • Compliance with data residency requirements
  • Can route based on real-time performance metrics

Implementation Example

# GSLB Configuration (Conceptual)
# DNS-based global load balancing

# Route 53 Geolocation Routing Policy
resource "aws_route53_record" "api_us" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "api.example.com"
  type    = "A"
  
  geolocation_routing_policy {
    continent = "NA"
  }
  
  alias {
    name    = aws_lb.us_east.dns_name
    zone_id = aws_lb.us_east.zone_id
  }
  
  health_check_id = aws_route53_health_check.us.id
}

resource "aws_route53_record" "api_eu" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "api.example.com"
  type    = "A"
  
  geolocation_routing_policy {
    continent = "EU"
  }
  
  alias {
    name    = aws_lb.eu_west.dns_name
    zone_id = aws_lb.eu_west.zone_id
  }
  
  health_check_id = aws_route53_health_check.eu.id
}
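
The Terraform records above declare the routing policy; at query time the decision reduces to "send each client to the closest healthy region." The toy Python sketch below mimics that selection; the region names, latency figures, and health flags are invented for illustration.

# Toy GSLB decision: route a client to the lowest-latency healthy region.
# Regions, latencies, and health status are illustrative values only.
REGIONS = {
    "us-east": {"latency_ms": {"NA": 20, "EU": 90}, "healthy": True},
    "eu-west": {"latency_ms": {"NA": 95, "EU": 15}, "healthy": True},
}

def resolve(client_continent: str) -> str:
    """Return the healthy region with the lowest latency for this client."""
    candidates = [
        (info["latency_ms"][client_continent], name)
        for name, info in REGIONS.items()
        if info["healthy"]
    ]
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates)[1]

print(resolve("NA"))  # -> us-east
print(resolve("EU"))  # -> eu-west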

Tradeoffs

Pros

  • Improved application availability and fault tolerance
  • Horizontal scalability by adding more servers
  • Better resource utilization across server fleet
  • Enables zero-downtime deployments
  • Can provide SSL termination and DDoS protection

Cons

  • Adds complexity to infrastructure
  • Load balancer itself can become a bottleneck
  • Sticky sessions can lead to uneven load distribution
  • Additional cost for load balancer infrastructure
  • Requires careful configuration and monitoring

Common Pitfalls

  • Not implementing health checks, leading to traffic sent to failed servers (a minimal active check is sketched after this list)
  • Using sticky sessions without considering session replication
  • Ignoring SSL/TLS termination placement (at load balancer vs. backend)
  • Not monitoring load balancer itself as a potential bottleneck
  • Misconfiguring timeout values causing cascading failures
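
As a rough illustration of the first pitfall's remedy, the sketch below actively polls a hypothetical /health endpoint on each backend and keeps only responsive backends in the routing pool; the URLs, port, and timeout are assumptions for the example, not values prescribed by the pattern.

# Active health check in miniature: poll each backend's health endpoint and
# expose only the backends that currently answer with HTTP 200.
import urllib.request

BACKENDS = ["http://10.0.1.10:8080", "http://10.0.1.11:8080"]   # placeholders

def is_healthy(base_url: str, timeout_s: float = 2.0) -> bool:
    """Return True if the backend's /health endpoint responds with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...

def healthy_pool() -> list[str]:
    """Backends currently eligible to receive traffic."""
    return [b for b in BACKENDS if is_healthy(b)]

# A real load balancer runs checks like this on a timer (every few seconds)
# and adds or removes backends as their health changes.
print(healthy_pool())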

Design Considerations

  • Choose between Layer 4 (TCP/UDP) and Layer 7 (HTTP/HTTPS) load balancing
  • Implement proper health check mechanisms (active vs. passive)
  • Decide on a load balancing algorithm (round-robin, least connections, IP hash); least connections and IP hash are sketched after this list
  • Plan for SSL/TLS termination strategy
  • Consider session persistence requirements (sticky sessions)
  • Design for load balancer redundancy (avoid single point of failure)
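
To complement the round-robin sketch under Core Idea, here is a rough Python illustration of the other two algorithms named above, least connections and IP hash; the server names, connection counts, and client IP are invented for the example.

# Least connections vs. IP hash (simple session affinity).
import hashlib

SERVERS = ["app1", "app2", "app3"]
active_connections = {"app1": 12, "app2": 3, "app3": 7}   # illustrative counts

def least_connections() -> str:
    """Send new traffic to the server with the fewest active connections."""
    return min(SERVERS, key=lambda s: active_connections[s])

def ip_hash(client_ip: str) -> str:
    """Pin a client to one server by hashing its IP address."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

print(least_connections())        # -> app2 (only 3 active connections)
print(ip_hash("203.0.113.7"))     # the same client IP always maps to the same server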

Real-World Examples

Netflix

Uses AWS ELB and a custom Zuul gateway to route 200+ million requests per day. Scale: petabytes of traffic, millions of concurrent streams.

Uber

Multi-region load balancing with automatic failover for the ride-matching service. Scale: millions of rides per day across 70+ countries.

Cloudflare

Global anycast network with load balancing across 300+ data centers. Scale: handles 20%+ of all internet traffic.

Complexity Analysis

Scalability

Horizontal - Add more backend servers

Implementation Complexity

Medium - Requires proper configuration and monitoring

Cost

Medium - Load balancer costs + increased infrastructure