When you move from a monolith to distributed services, you lose the safety net of a single database transaction. You can no longer wrap "create order + reserve inventory + charge payment" in one BEGIN...COMMIT. These three patterns -- Saga, Outbox, and CQRS -- exist to handle the distributed coordination problems that inevitably arise.

AI pitfall
AI-generated saga implementations almost always handle the happy path perfectly and the compensation path poorly. The orchestrator catches errors and calls compensating actions, but what happens when the compensating action itself fails? AI rarely generates retry logic or alerting for failed compensation, and that is where production incidents start.

The saga pattern

A saga is a sequence of local transactions where each step either succeeds (triggering the next step) or fails (triggering compensating actions to undo previous steps). It replaces a distributed transaction with a chain of smaller, local ones.

Consider an order flow that spans three services:

1. Order Service    -> Create order (status: pending)
2. Payment Service  -> Charge credit card
3. Inventory Service -> Reserve items

If payment succeeds but inventory reservation fails, you need to refund the payment and cancel the order. A saga defines both the forward steps and the compensating actions.

Orchestration: central coordinator

In orchestration, a single saga orchestrator controls the flow. It sends commands to each service and decides what to do based on their responses.

class OrderSaga {
  async execute(orderData: OrderData) {
    const sagaLog: SagaStep[] = [];

    try {
      // Step 1: Create order
      const order = await orderService.createOrder(orderData);
      sagaLog.push({ service: 'order', action: 'create', id: order.id });

      // Step 2: Process payment
      const payment = await paymentService.charge({
        amount: order.total,
        customerId: order.customerId
      });
      sagaLog.push({ service: 'payment', action: 'charge', id: payment.id });

      // Step 3: Reserve inventory
      await inventoryService.reserve({
        items: order.items,
        orderId: order.id
      });
      sagaLog.push({ service: 'inventory', action: 'reserve', id: order.id });

      // All steps succeeded
      await orderService.updateStatus(order.id, 'confirmed');
      return { success: true, orderId: order.id };

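    } catch (error) {
      // Compensate in reverse order
      await this.compensate(sagaLog);
      // A caught value is not guaranteed to be an Error, so narrow it first
      const message = error instanceof Error ? error.message : String(error);
      return { success: false, error: message };
    }
  }

  private async compensate(sagaLog: SagaStep[]) {
    // Copy before reversing: Array.prototype.reverse() mutates the array in place
    for (const step of [...sagaLog].reverse()) {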
      try {
        switch (step.service) {
          case 'payment':
            await paymentService.refund(step.id);
            break;
          case 'order':
            await orderService.cancel(step.id);
            break;
          case 'inventory':
            await inventoryService.release(step.id);
            break;
        }
      } catch (compensationError) {
        // Log and alert: compensation failed, needs manual intervention
        console.error(`Compensation failed for ${step.service}:`, compensationError);
        await alertOps({ step, error: compensationError });
      }
    }
  }
}

The orchestrator has a clear picture of the entire flow, making it easier to understand, debug, and modify. But it becomes a single point of failure and can become a "god service" that knows too much.
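The compensate method above logs and alerts when a compensating action fails, but it makes only one attempt. A minimal sketch of a retry wrapper with exponential backoff follows. The names compensateWithRetry and RetryOptions are hypothetical, and the sketch assumes each compensating action is safe to run more than once.

```typescript
// Hypothetical helper: retry a compensating action with exponential backoff.
type Compensation = () => Promise<void>;

interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // wait before the second attempt; doubles after each failure
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waiting
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves to true if the action eventually succeeded, false if every attempt failed.
async function compensateWithRetry(
  action: Compensation,
  options: RetryOptions
): Promise<boolean> {
  const sleep = options.sleep ?? defaultSleep;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      await action();
      return true;
    } catch {
      if (attempt === options.maxAttempts) break; // out of attempts
      await sleep(options.baseDelayMs * 2 ** (attempt - 1));
    }
  }
  return false; // the caller should alert and park the step for manual intervention
}
```

Inside compensate, a call such as compensateWithRetry(() => paymentService.refund(step.id), { maxAttempts: 5, baseDelayMs: 200 }) could replace the direct call, with alertOps invoked only when it resolves to false.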

Choreography: decentralized reactions

In choreography, there is no central coordinator. Each service listens for events and reacts by performing its work, then emitting its own event.

// Order Service: create order and emit event
async function placeOrder(orderData: OrderData) {
  const order = await db.orders.create({ ...orderData, status: 'pending' });
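  await eventBus.publish({
    type: 'OrderPlaced',
    // customerId is included because the Payment Service reads it when charging
    data: {
      orderId: order.id,
      customerId: order.customerId,
      items: order.items,
      total: order.total
    }
  });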
}

// Payment Service: listens for OrderPlaced
eventBus.subscribe('OrderPlaced', async (event) => {
  try {
    const payment = await chargeCard(event.data.total, event.data.customerId);
    await eventBus.publish({
      type: 'PaymentCompleted',
      data: { orderId: event.data.orderId, paymentId: payment.id }
    });
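  } catch (error) {
    // A caught value is not guaranteed to be an Error, so narrow it first
    const reason = error instanceof Error ? error.message : String(error);
    await eventBus.publish({
      type: 'PaymentFailed',
      data: { orderId: event.data.orderId, reason }
    });
  }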
});

// Inventory Service: listens for PaymentCompleted
eventBus.subscribe('PaymentCompleted', async (event) => {
  try {
    await reserveItems(event.data.orderId);
    await eventBus.publish({
      type: 'InventoryReserved',
      data: { orderId: event.data.orderId }
    });
  } catch (error) {
    await eventBus.publish({
      type: 'InventoryReservationFailed',
      data: { orderId: event.data.orderId }
    });
  }
});

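// Order Service: also listens for failure events to compensate
eventBus.subscribe('PaymentFailed', async (event) => {
  await db.orders.update(event.data.orderId, { status: 'cancelled' });
});

// Inventory failed after payment succeeded: cancel the order here.
// The Payment Service needs its own subscriber to refund the charge.
eventBus.subscribe('InventoryReservationFailed', async (event) => {
  await db.orders.update(event.data.orderId, { status: 'cancelled' });
});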

Choreography keeps services truly independent -- no service knows about the others. But the flow becomes implicit and hard to trace. Debugging "why did this order fail?" means reading logs across multiple services.
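When inventory reservation fails after a successful charge, two services have to react to the same event: the Payment Service refunds and the Order Service cancels. The sketch below shows that fan-out with a small in-memory bus. InMemoryBus, the state maps, and refundedPayments are hypothetical stand-ins for the lesson's eventBus and each service's database.

```typescript
// Minimal in-memory event bus, standing in for a real message broker.
type SagaEvent = { type: string; data: { orderId: string } };
type Handler = (event: SagaEvent) => Promise<void>;

class InMemoryBus {
  private readonly handlers = new Map<string, Handler[]>();

  subscribe(type: string, handler: Handler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  // Delivers the event to every subscriber of its type, one after another.
  async publish(event: SagaEvent): Promise<void> {
    for (const handler of this.handlers.get(event.type) ?? []) {
      await handler(event);
    }
  }
}

const bus = new InMemoryBus();
const orderStatus = new Map<string, string>();    // Order Service state
const paymentByOrder = new Map<string, string>(); // Payment Service state: orderId -> paymentId
const refundedPayments: string[] = [];            // payment ids that were refunded

// Payment Service: refund the charge that belongs to the failed order.
bus.subscribe('InventoryReservationFailed', async (event) => {
  const paymentId = paymentByOrder.get(event.data.orderId);
  if (paymentId === undefined) return; // nothing charged, or already refunded
  refundedPayments.push(paymentId);
  paymentByOrder.delete(event.data.orderId); // a redelivered event will not refund twice
});

// Order Service: cancel the order.
bus.subscribe('InventoryReservationFailed', async (event) => {
  orderStatus.set(event.data.orderId, 'cancelled');
});
```

Neither handler knows about the other; adding a third reaction (for example, notifying the customer) would only need another subscriber.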

Orchestration vs choreography

Aspect | Orchestration | Choreography
Flow visibility | Centralized, easy to follow | Distributed, hard to trace
Coupling | Orchestrator knows all services | Services only know events
Complexity | Grows in the orchestrator | Grows across all services
Single point of failure | Yes (the orchestrator) | No
Debugging | Read one service's logs | Correlate logs across services
Best for | Complex flows with many steps | Simple flows with 2-3 steps
Adding new steps | Modify orchestrator | Add new subscriber (no changes to existing)
Risk | God service | Spaghetti events

Good to know
Most real systems use a hybrid approach: orchestration for complex flows with many steps (order processing with payments, inventory, shipping) and choreography for simple fan-out scenarios (send a welcome email when a user signs up). Do not force one pattern on everything.

The outbox pattern

Here is a nasty problem: your service needs to save data to its database AND publish an event. If you do them as two separate operations, either can fail independently.

// BROKEN: two separate operations
async function createOrder(orderData: OrderData) {
  // Step 1: save to database
  const order = await db.orders.create(orderData);

  // Step 2: publish event
  await eventBus.publish({ type: 'OrderPlaced', data: order });
  // What if this fails? Order exists but no event was published.
  // Publishing first does not help: the event goes out, then the insert can fail.
}

The Outbox pattern solves this by writing the event to an outbox table in the same database transaction as the business data.

// CORRECT: Outbox pattern
async function createOrder(orderData: OrderData) {
  await db.transaction(async (trx) => {
    // Step 1: save order
    const order = await trx('orders').insert(orderData).returning('*');

    // Step 2: write event to outbox (same transaction!)
    await trx('outbox').insert({
      id: crypto.randomUUID(),
      event_type: 'OrderPlaced',
      payload: JSON.stringify({
        orderId: order[0].id,
        customerId: order[0].customerId,
        total: order[0].total
      }),
      created_at: new Date(),
      published: false
    });
  });
  // Both writes succeed or both fail. Atomicity guaranteed.
}

A separate process (often called a "relay" or "publisher") polls the outbox table and publishes unpublished events to the message broker.

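// Outbox relay: runs on a timer or listens for DB changes
let relayRunning = false;

async function publishOutboxEvents() {
  if (relayRunning) return; // skip this tick if the previous run has not finished
  relayRunning = true;
  try {
    const unpublished = await db('outbox')
      .where('published', false)
      .orderBy('created_at', 'asc')
      .limit(100);

    for (const entry of unpublished) {
      await eventBus.publish({
        id: entry.id, // lets consumers detect duplicates
        type: entry.event_type,
        data: JSON.parse(entry.payload)
      });
      // A crash between publish and this update means the event is sent again next run
      await db('outbox').where('id', entry.id).update({ published: true });
    }
  } catch (error) {
    // Remaining rows stay unpublished and are retried on the next run
    console.error('Outbox relay run failed:', error);
  } finally {
    relayRunning = false;
  }
}

// Run every 5 seconds
setInterval(publishOutboxEvents, 5000);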

The outbox guarantees at-least-once delivery: if the relay crashes mid-publish, it will retry on the next run. This means consumers must be idempotent, that is, processing the same event twice must have the same effect as processing it once (we cover this in lesson 4).

Edge case
The outbox relay can publish the same event multiple times if it crashes between publishing and marking the event as published. This is why the outbox guarantees at-least-once delivery, not exactly-once. Your consumers must be idempotent; there is no way around this in distributed systems.
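One common way to make a consumer idempotent is to record the IDs of events it has already handled and skip repeats. The sketch below assumes each delivered event carries a unique ID (for example, the outbox row's id). DeliveredEvent and IdempotentConsumer are hypothetical names, and the in-memory Set stands in for a durable store.

```typescript
// Hypothetical event shape: assumes the relay forwards a unique id with each event.
interface DeliveredEvent {
  id: string;
  type: string;
  data: { orderId: string; total: number };
}

class IdempotentConsumer {
  // In production this would be a table with a unique constraint on the event id,
  // written in the same transaction as the handler's side effects.
  private readonly processedIds = new Set<string>();
  private readonly handler: (event: DeliveredEvent) => void;

  constructor(handler: (event: DeliveredEvent) => void) {
    this.handler = handler;
  }

  // Returns true if the event was handled, false if it was a duplicate.
  receive(event: DeliveredEvent): boolean {
    if (this.processedIds.has(event.id)) return false;
    this.handler(event); // if this throws, the id is not recorded, so a redelivery is handled
    this.processedIds.add(event.id);
    return true;
  }
}

// Usage: the same event delivered twice is only counted once.
let revenue = 0;
const consumer = new IdempotentConsumer((event) => {
  revenue += event.data.total;
});
const placed: DeliveredEvent = {
  id: 'evt-1',
  type: 'OrderPlaced',
  data: { orderId: 'o1', total: 40 }
};
consumer.receive(placed); // handled
consumer.receive(placed); // duplicate, skipped
```

The ID is recorded only after the handler returns, so a handler failure leaves the event eligible for redelivery rather than silently dropping it.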

CQRS: command query responsibility segregation

CQRS splits your application into two separate models: one optimized for writes (commands) and one optimized for reads (queries).

In a typical CRUD (create, read, update, delete) application, the same database model handles both reading and writing. This works fine until your read and write patterns diverge significantly.

// Without CQRS: same model for reads and writes
// Write: normalized, validated, transactional
// Read: also normalized... but we need joins across 5 tables for one dashboard view

// With CQRS: separate models
// Write model: normalized, enforces business rules
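async function placeOrder(command: PlaceOrderCommand) {
  const order = await writeDb.orders.create({
    customerId: command.customerId,
    items: command.items,
    status: 'pending'
  });
  // Simplified: in production, publish through the outbox rather than directly
  await eventBus.publish({ type: 'OrderPlaced', data: order });
}

// Read model: denormalized, optimized for the dashboard query
// Assumes the event carries customerName, total, and createdAt
eventBus.subscribe('OrderPlaced', async (event) => {
  await readDb.orderSummaries.upsert({
    orderId: event.data.id,
    customerId: event.data.customerId, // the dashboard query filters on this
    customerName: event.data.customerName, // pre-joined
    itemCount: event.data.items.length,
    total: event.data.total,
    status: 'pending',
    createdAt: event.data.createdAt // the dashboard query sorts on this
  });
});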

// Query: fast single-table read, no joins needed
async function getOrderDashboard(customerId: string) {
  return readDb.orderSummaries
    .where('customerId', customerId)
    .orderBy('createdAt', 'desc')
    .limit(50);
}

Aspect | Traditional CRUD | CQRS
Read model | Same as write (normalized) | Separate (denormalized for queries)
Write model | Same as read | Separate (normalized for integrity)
Read performance | Joins on every query | Pre-computed, fast
Consistency | Immediate | Eventually consistent
Complexity | Low | High (two models to maintain)
Best for | Simple apps with balanced reads/writes | Read-heavy apps with complex queries

CQRS pairs naturally with event-driven architecture: writes produce events, and those events update the read model. But it introduces eventual consistency -- the read model lags behind the write model by the time it takes to process events.
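The lag is easy to see in a small in-memory sketch: a query issued right after a command finds nothing until the projector has processed the event. All names here (pendingEvents, placeOrderCommand, projectPendingEvents, getOrderSummary) are hypothetical, and the array and Map stand in for the broker and the read database.

```typescript
interface OrderPlacedEvent {
  orderId: string;
  customerId: string;
  total: number;
}

interface OrderSummary extends OrderPlacedEvent {
  status: string;
}

const pendingEvents: OrderPlacedEvent[] = [];           // stands in for the message broker
const orderSummaries = new Map<string, OrderSummary>(); // stands in for the read database

// Command side: accepts the write and emits an event. It does not touch the read model.
function placeOrderCommand(order: OrderPlacedEvent): void {
  pendingEvents.push(order);
}

// Projector: drains queued events into the read model. Returns how many were applied.
function projectPendingEvents(): number {
  let applied = 0;
  let event = pendingEvents.shift();
  while (event !== undefined) {
    orderSummaries.set(event.orderId, { ...event, status: 'pending' });
    applied++;
    event = pendingEvents.shift();
  }
  return applied;
}

// Query side: reads only from the read model.
function getOrderSummary(orderId: string): OrderSummary | undefined {
  return orderSummaries.get(orderId);
}

// Usage: the summary is missing until the projector has run.
placeOrderCommand({ orderId: 'o1', customerId: 'c1', total: 99 });
const beforeProjection = getOrderSummary('o1'); // undefined: read model is stale
projectPendingEvents();
const afterProjection = getOrderSummary('o1');  // now present
```

A UI that must show a just-placed order typically works around this window, for example by rendering from the command's response instead of querying the read model immediately.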


When to use each pattern

Pattern | Use when | Avoid when
Saga (orchestration) | Complex multi-service workflows with many steps | Simple 2-service interactions
Saga (choreography) | Loose coupling matters, few steps | Many steps where tracing is important
Outbox | You need guaranteed event publishing with DB writes | You are using an event-sourced store (events are already the source of truth)
CQRS | Reads and writes have very different performance needs | Simple CRUD with balanced access patterns

Start simple. Add these patterns when the pain of not having them becomes real.