Saga Pattern

Data & Consistency

Distributed transactions coordinated via local commits and compensating actions

Core Idea

Break long-lived, cross-service transactions into a sequence of local steps. Use compensations to undo effects when steps fail, coordinated by orchestration or choreography.
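
The idea above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `SagaStep` and `run_saga` are hypothetical names introduced here, each step runs in process, and compensations are assumed to succeed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]       # local transaction in one service
    compensate: Callable[[], None]   # business-level undo for that transaction


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # The failed step did not commit, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True
```

If a "reserve" step succeeds and a later "charge" step raises, `run_saga` calls the compensation for "reserve" only and returns `False`. Compensating in reverse order is a common convention so that later effects are undone before the ones they depend on.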

When to Use

When a business process spans multiple services with distinct data stores and you cannot use two-phase commit.

Recognition Cues
Indicators that this pattern might be the right solution
  • Multi-step workflows across services
  • Need to maintain invariants without XA/2PC
  • Partial failures require business-specific rollbacks

Pattern Variants & Approaches

Orchestration Overview
A central orchestrator coordinates local transactions across services and triggers compensations on failure.

Orchestration Overview Architecture

[Diagram: Orchestrator, Service A, Service B; edges labeled Step 1, Step 2, and Fail -> Compensate]

When to Use This Variant

  • Distributed transactions
  • Multi-step workflows
  • Compensation on partial failure

Use Case

Orders/payments/booking flows spanning multiple services.

Advantages

  • Clear control flow
  • Easier to reason about than choreography
  • Explicit compensations

Implementation Example

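# Orchestrator outline: run each local step; compensate completed steps on failure
def orchestrate(svc_a, svc_b, compensate_a):
    svc_a()                # step 1: local transaction in Service A
    try:
        svc_b()            # step 2: local transaction in Service B
    except Exception:
        compensate_a()     # undo step 1; step 2 never committed
        raise

Tradeoffs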

Pros

  • Business-level consistency without XA
  • Resilient to partial failures
  • Clear audit trail of process steps

Cons

  • Complexity in compensation and state
  • Eventual consistency and user-facing delays
  • Operational overhead for orchestration
Common Pitfalls
  • Missing or incorrect compensation logic
  • No idempotency causing duplicate effects
  • Sagas stuck without timeouts/escapes
  • Tight coupling in choreography leading to spaghetti events
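
The idempotency pitfall can be illustrated with a small sketch. It assumes each saga step carries an idempotency key and that retries reuse the same key; `IdempotentHandler` is a hypothetical name, and the in-memory dictionary stands in for a durable store that would normally be updated in the same local transaction as the effect.

```python
from typing import Callable, Dict


class IdempotentHandler:
    """Runs an effect at most once per idempotency key (in-memory sketch)."""

    def __init__(self, effect: Callable[[dict], str]) -> None:
        self._effect = effect
        self._results: Dict[str, str] = {}   # assumption: a durable store in practice

    def handle(self, idempotency_key: str, payload: dict) -> str:
        if idempotency_key in self._results:
            # Duplicate delivery or retry: replay the stored result, no new effect.
            return self._results[idempotency_key]
        result = self._effect(payload)
        self._results[idempotency_key] = result
        return result
```

With this in place, a retried "charge" message with the same key returns the original result instead of charging twice. The same approach applies to compensations, which may also be delivered more than once.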
Design Considerations
  • Choose orchestration vs choreography
  • Persist saga state with correlation IDs
  • Define timeouts and DLQs for stuck steps
  • Use idempotency keys: delivery is at-least-once, so "exactly-once" is an illusion built on deduplication
  • Observability across steps and compensations
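
Persisted saga state and timeouts from the list above might look like the following sketch. `SagaState` and `SagaStore` are hypothetical names, the dictionary stands in for a durable table, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SagaState:
    correlation_id: str     # ties together all messages and logs of one saga
    current_step: str
    updated_at: float       # seconds; time of the last progress
    status: str = "running"


class SagaStore:
    """In-memory stand-in for a durable saga-state table."""

    def __init__(self) -> None:
        self._sagas: Dict[str, SagaState] = {}

    def start(self, first_step: str, now: float) -> SagaState:
        state = SagaState(str(uuid.uuid4()), first_step, now)
        self._sagas[state.correlation_id] = state
        return state

    def advance(self, correlation_id: str, step: str, now: float) -> None:
        state = self._sagas[correlation_id]
        state.current_step = step
        state.updated_at = now

    def find_stuck(self, now: float, timeout_s: float) -> List[SagaState]:
        """Return running sagas that have made no progress within the timeout."""
        return [
            s for s in self._sagas.values()
            if s.status == "running" and now - s.updated_at > timeout_s
        ]
```

A periodic job could call `find_stuck` and then retry the step, start compensation, or route the saga to a dead-letter queue for manual review. Which of these is appropriate depends on the business process.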
Real-World Examples

Uber
Driver onboarding and trip workflows (global multi-step processes)

Travel sites
Booking flights/hotels/cars with compensations (millions of bookings)

eCommerce
Order, payment, inventory reservations (peak seasonal traffic)

Complexity Analysis
Scalability

Orchestrator scales; steps are local

Implementation Complexity

High - Compensation and coordination

Cost

Medium - Infra + engineering effort