The moment you distribute state across services communicating via events, you lose the guarantees that a single database gives you for free. No more ACID (atomic, consistent, isolated, durable) transactions spanning your entire system. No more "read your own write" by default.
What eventual consistency actually means
In a monolith, when you update an order status to "shipped", every subsequent query sees "shipped" immediately. In a distributed system, the order service updates its database, publishes an OrderShipped event, and each downstream service processes that event at a different time.
Timeline:
T0: Order service updates status to "shipped"
T1: OrderShipped event published to broker (50ms later)
T2: Notification service processes event (200ms later)
T3: Analytics service processes event (500ms later)
T4: Customer dashboard updated (800ms later)
Between T0 and T4:
- Order service says: "shipped"
- Analytics service says: "processing" (hasn't gotten the event yet)
- Customer sees: "processing" on dashboard (hasn't been updated yet)

After T4, all services agree. But during the window, they disagree. Your application needs to handle this gracefully.
Practical implications
| Scenario | What the user sees | Why |
|---|---|---|
| User places order, immediately checks status | "Processing" instead of "Confirmed" | Read model hasn't been updated yet |
| User cancels, but confirmation email already sent | Gets both "confirmed" and "cancelled" emails | Email service processed before cancellation propagated |
| Admin views dashboard | Numbers slightly behind real-time | Analytics read model is seconds behind |
| Two services check inventory simultaneously | Both see "1 in stock" | Neither has processed the other's reservation event yet |
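The first row has a common mitigation: route the few reads that must be fresh to the write model, and let dashboards tolerate lag. A minimal in-memory sketch, where the two `Map`s stand in for the write model's database and the eventually consistent read model:

```typescript
// Stand-ins for the order service's database (write model) and the
// eventually consistent read model behind the dashboard.
const writeModel = new Map<string, string>(); // orderId -> status
const readModel = new Map<string, string>();  // lags behind writeModel

function placeOrder(orderId: string): void {
  writeModel.set(orderId, "confirmed");
  // In a real system an event would update readModel a moment later;
  // here it simply hasn't arrived yet.
}

// Critical reads (e.g. the post-checkout confirmation page) go to the
// write model; dashboard reads accept the read model's lag.
function getOrderStatus(orderId: string, critical: boolean): string {
  if (critical) return writeModel.get(orderId) ?? "unknown";
  return readModel.get(orderId) ?? "processing";
}
```

The tradeoff is the one noted later in this section: these reads bypass the read model entirely, so keep the critical-read list short.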
Out-of-order events
Events from a single producer usually arrive in order within a single partition. But across services, retries, and network hiccups, ordering can break.
```typescript
// What you expect:
// 1. OrderPlaced (orderId: 123)
// 2. OrderPaid (orderId: 123)
// 3. OrderShipped (orderId: 123)

// What might actually arrive:
// 1. OrderPlaced (orderId: 123)
// 2. OrderShipped (orderId: 123) <-- arrived before OrderPaid!
// 3. OrderPaid (orderId: 123)
```

Solution 1: Sequence numbers
Attach a version or sequence number to each event. Consumers only process events in order.
```typescript
interface VersionedEvent {
  type: string;
  entityId: string;
  version: number; // Monotonically increasing per entity
  data: unknown;
}

async function handleEvent(event: VersionedEvent) {
  const current = await db.entities.findOne({ id: event.entityId });

  // Only process if this is the next expected version
  if (current && event.version <= current.lastProcessedVersion) {
    console.log(`Skipping event v${event.version}, already at v${current.lastProcessedVersion}`);
    return; // Already processed or out of order
  }

  if (current && event.version > current.lastProcessedVersion + 1) {
    // Gap detected: we missed an event. Buffer this one and wait.
    await db.bufferedEvents.insert({
      entityId: event.entityId,
      version: event.version,
      payload: event
    });
    console.log(`Buffering event v${event.version}, waiting for v${current.lastProcessedVersion + 1}`);
    return;
  }

  // Process the event
  await processEvent(event);
  await db.entities.update(event.entityId, { lastProcessedVersion: event.version });

  // Check if any buffered events can now be processed
  await processBufferedEvents(event.entityId, event.version + 1);
}
```

Solution 2: Last-write-wins by timestamp
For simple state replacements, always keep the latest version based on timestamp.
```typescript
async function handleProfileUpdate(event: { userId: string; updatedAt: string; name: string }) {
  const current = await db.profiles.findOne({ userId: event.userId });

  // Only update if this event is newer than what we have
  if (current && new Date(event.updatedAt) <= new Date(current.updatedAt)) {
    console.log('Ignoring older event');
    return;
  }

  await db.profiles.upsert({
    userId: event.userId,
    name: event.name,
    updatedAt: event.updatedAt
  });
}
```

This fails when events represent incremental changes (like "add item to cart").
Solution 3: Partition by entity key
Use the entity ID as the partition key. All events for the same entity go to the same partition, preserving order.
```typescript
// Kafka: use orderId as partition key
await producer.send({
  topic: 'order-events',
  messages: [{
    key: order.id, // All events for this order go to the same partition
    value: JSON.stringify(event)
  }]
});

// Partition guarantees: OrderPlaced -> OrderPaid -> OrderShipped (in order for this orderId)
```

This is the most common approach in Kafka-based systems. It does not solve cross-entity ordering, but that is rarely needed.
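The "same key, same partition" guarantee falls out of how the producer picks a partition: it hashes the key and takes the result modulo the partition count. Kafka's default partitioner uses murmur2; the sketch below substitutes FNV-1a purely to show the mechanics:

```typescript
// FNV-1a hash (a stand-in for Kafka's murmur2) -- deterministic, so
// the same key always produces the same hash.
function fnv1a(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Same key -> same hash -> same partition, which is what preserves
// per-entity ordering. Different keys may still share a partition.
function partitionFor(key: string, numPartitions: number): number {
  return fnv1a(key) % numPartitions;
}
```

One corollary worth knowing: changing the partition count reshuffles which keys land where, which breaks per-entity ordering across the resize. That is a reason to pick partition counts generously up front.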
Idempotent consumers
Messages can be delivered more than once due to broker retries, consumer crashes before acknowledging, or network timeouts. Your consumers must produce the same result whether they process a message once or five times.
```typescript
// NOT idempotent: processes payment every time the event is received
async function handleOrderPlaced(event: OrderPlacedEvent) {
  await paymentService.charge(event.data.customerId, event.data.total);
  // If this event is delivered twice, the customer is charged twice!
}
```

```typescript
// Idempotent: checks before processing
async function handleOrderPlaced(event: OrderPlacedEvent) {
  // Check if we already processed this event
  const alreadyProcessed = await db.processedEvents.findOne({ eventId: event.id });
  if (alreadyProcessed) {
    console.log(`Event ${event.id} already processed, skipping`);
    return;
  }

  // Process within a transaction
  await db.transaction(async (trx) => {
    // Record that we processed this event
    await trx('processed_events').insert({
      event_id: event.id,
      processed_at: new Date()
    });

    // Idempotent payment: use orderId as idempotency key
    await paymentService.charge(event.data.customerId, event.data.total, {
      idempotencyKey: event.data.orderId
    });
  });
}
```

| Idempotency technique | How it works | Best for |
|---|---|---|
| Event ID deduplication | Store processed event IDs, skip duplicates | General purpose |
| Idempotency keys | Pass a unique key to downstream APIs | Payment processing, external calls |
| UPSERT (ON CONFLICT) | Insert or update, same result either way | Database writes |
| Conditional updates | UPDATE ... WHERE version = X | Optimistic concurrency |
| Natural idempotency | Operation is inherently safe to repeat (SET status = 'shipped') | Status changes, flag setting |
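The conditional-update row can be sketched with an in-memory map standing in for the database table: the update only applies when the expected version matches (the equivalent of `UPDATE ... WHERE version = X` affecting one row), so a replayed event is a no-op:

```typescript
interface OrderRow { status: string; version: number }

// Stand-in for the orders table.
const orders = new Map<string, OrderRow>();

// Equivalent of: UPDATE orders SET status = ?, version = version + 1
//                WHERE id = ? AND version = ?
// Returns true if a row matched (rowCount = 1), false otherwise.
function conditionalUpdate(id: string, newStatus: string, expectedVersion: number): boolean {
  const row = orders.get(id);
  if (!row || row.version !== expectedVersion) {
    return false; // stale, out of order, or a replayed event
  }
  orders.set(id, { status: newStatus, version: expectedVersion + 1 });
  return true;
}
```

Replaying the same event fails the `WHERE` clause because the version has already moved on, which is exactly the idempotency you want.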
Compensating transactions
You cannot roll back across service boundaries. Instead, you issue compensating actions that logically reverse previous steps.
```typescript
// Forward action: charge payment
async function chargePayment(orderId: string, amount: number) {
  const payment = await stripe.charges.create({ amount, currency: 'usd' });
  return payment.id;
}

// Compensating action: refund payment
async function refundPayment(paymentId: string) {
  await stripe.refunds.create({ charge: paymentId });
}

// Forward action: reserve inventory
async function reserveInventory(orderId: string, items: Item[]) {
  for (const item of items) {
    await db('inventory')
      .where('product_id', item.productId)
      .decrement('available', item.quantity);
  }
}

// Compensating action: release inventory
async function releaseInventory(orderId: string, items: Item[]) {
  for (const item of items) {
    await db('inventory')
      .where('product_id', item.productId)
      .increment('available', item.quantity);
  }
}
```

Compensating transactions are not true rollbacks. A refund is not the same as "the charge never happened" -- the bank statement shows both.
Consistency challenges and solutions
| Challenge | Problem | Solution | Tradeoff |
|---|---|---|---|
| Stale reads | User sees old data after a write | Read from write model for critical reads, accept lag for dashboards | Bypasses CQRS separation |
| Out-of-order events | Consumer gets events in wrong sequence | Partition by entity key, use version numbers | Added complexity, potential buffering delays |
| Duplicate delivery | Same event processed multiple times | Idempotent consumers with event ID deduplication | Storage cost for tracking processed events |
| Partial failures | Saga step 3 of 5 fails | Compensating transactions for completed steps | Compensation may fail too (needs alerting) |
| Lost events | Event published but never delivered | Outbox pattern, dead letter queues, monitoring | Operational overhead |
| Phantom reads | Service reads data that another service is about to change | Accept eventual consistency, use correlation IDs for tracing | Design around the possibility |
| Clock skew | Timestamps from different services disagree | Use logical clocks (version numbers) instead of wall clocks | More complex event schemas |
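For the clock-skew row, the simplest logical clock is a Lamport clock: each service keeps a counter, increments it on every local event, and on receiving a message sets it to the maximum of its own counter and the sender's, plus one. Ordering by counter then never contradicts causality, regardless of wall-clock skew. A sketch:

```typescript
class LamportClock {
  private time = 0;

  // Local event (e.g. a write): advance the clock.
  tick(): number {
    return ++this.time;
  }

  // Incoming message carries the sender's clock: jump past it.
  receive(remoteTime: number): number {
    this.time = Math.max(this.time, remoteTime) + 1;
    return this.time;
  }
}
```

The per-entity version numbers used earlier in this section are the same idea specialized to a single entity's event stream.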
Practical rules
- Make every consumer idempotent. This is the single most important rule in event-driven systems.
- Partition by entity key for ordering guarantees within an entity.
- Use the Outbox pattern to write to the database and publish an event atomically: the event goes into an outbox table in the same transaction as the business data, and a separate process publishes it.
- Design compensating actions for every forward action in a saga.
- Monitor the lag between write and read models. Alert when it exceeds your SLA.
- Accept eventual consistency. Design your UI for it: show "processing" states, use optimistic updates, avoid showing stale data as current.