
Once you split into services, the biggest question is: how do they talk? In a monolith, it's a function call (nanoseconds, and it effectively never fails). In a distributed system, it's a network call that can fail, be slow, or time out.

Synchronous communication

The caller sends a request and waits for a response.

REST (HTTP/JSON)

The default choice. Simple, well-understood, every language has HTTP libraries:

// Order service calls the user service
async function getUser(userId: string): Promise<User> {
  const response = await fetch(`http://user-service:3000/users/${userId}`);

  if (!response.ok) {
    throw new Error(`User service returned ${response.status}`);
  }

  return response.json();
}
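Because a network call can fail, be slow, or time out, callers usually wrap it with a per-attempt timeout and a bounded number of retries. The following is a minimal sketch under assumptions: `callWithRetry`, `RetryOptions`, and the option values are hypothetical names chosen for illustration, not part of the lesson's code.

```typescript
// An operation that accepts an AbortSignal so it can be cancelled on timeout.
type AsyncOp<T> = (signal: AbortSignal) => Promise<T>;

interface RetryOptions {
  attempts: number;    // total tries, including the first one
  timeoutMs: number;   // per-attempt timeout
  baseDelayMs: number; // first backoff delay; doubles after each failure
}

// Hypothetical helper: retries a failing operation with exponential backoff.
async function callWithRetry<T>(op: AsyncOp<T>, opts: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    const controller = new AbortController();
    // The abort only has an effect if the operation honors the signal.
    const timer = setTimeout(() => controller.abort(), opts.timeoutMs);
    try {
      return await op(controller.signal);
    } catch (err) {
      lastError = err;
      const isLastAttempt = attempt === opts.attempts - 1;
      if (!isLastAttempt) {
        const delay = opts.baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    } finally {
      clearTimeout(timer);
    }
  }
  throw lastError;
}

// Usage sketch (not executed here, since it needs a running user service):
// const response = await callWithRetry(
//   (signal) => fetch(`http://user-service:3000/users/${userId}`, { signal }),
//   { attempts: 3, timeoutMs: 2000, baseDelayMs: 100 },
// );
```

Retrying is only safe for idempotent requests such as this GET. Retrying a request that creates or charges something can apply it twice unless the receiver deduplicates.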

gRPC

gRPC uses Protocol Buffers (binary serialization) and HTTP/2 for transport. It is usually faster than REST for high-throughput internal communication: smaller payloads, multiplexed connections.

// user.proto - define the contract
syntax = "proto3";

service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc ListUsers(ListUsersRequest) returns (stream User);
}

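message GetUserRequest {
  string user_id = 1;
}

// Referenced by ListUsers above. The source does not define it, so it is
// left empty here as a placeholder.
message ListUsersRequest {}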

message User {
  string id = 1;
  string name = 2;
  string email = 3;
}
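To make the "smaller payloads" claim concrete, the sketch below hand-encodes the `User` message in the Protocol Buffers wire format and compares its size with the JSON form. It is an illustration only: real projects use code generated from the `.proto` file, and the helper names (`encodeVarint`, `encodeStringField`, `encodeUser`) and the sample record are assumptions.

```typescript
interface User {
  id: string;
  name: string;
  email: string;
}

// Encode a non-negative integer as a protobuf base-128 varint.
function encodeVarint(value: number): number[] {
  const bytes: number[] = [];
  while (value > 0x7f) {
    bytes.push((value & 0x7f) | 0x80); // low 7 bits, continuation bit set
    value >>>= 7;
  }
  bytes.push(value);
  return bytes;
}

// Encode one string field: tag, byte length, then the UTF-8 bytes.
function encodeStringField(fieldNumber: number, text: string): number[] {
  if (text === "") return []; // proto3 omits default (empty) values on the wire
  const utf8 = new TextEncoder().encode(text);
  const tag = (fieldNumber << 3) | 2; // wire type 2 = length-delimited
  return encodeVarint(tag).concat(encodeVarint(utf8.length), Array.from(utf8));
}

// Field numbers match user.proto: id = 1, name = 2, email = 3.
function encodeUser(user: User): Uint8Array {
  return Uint8Array.from(
    encodeStringField(1, user.id).concat(
      encodeStringField(2, user.name),
      encodeStringField(3, user.email),
    ),
  );
}

const sampleUser: User = { id: "42", name: "Ada", email: "ada@example.com" };
const binarySize = encodeUser(sampleUser).length;
const jsonSize = new TextEncoder().encode(JSON.stringify(sampleUser)).length;
console.log({ binarySize, jsonSize });
```

The saving comes mostly from replacing the repeated field names and punctuation with one-byte tags, so it grows with the number of fields and records rather than with the length of the values.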

When to use each

| Factor | REST | gRPC |
| --- | --- | --- |
| Ease of use | Simple, any HTTP client | Requires protobuf tooling |
| Performance | Good (JSON parsing overhead) | Excellent (binary, multiplexed) |
| Browser support | Native | Requires gRPC-Web proxy |
| Public APIs | Standard choice | Unusual, poor tooling |
| Internal services | Works fine | Better for high-throughput |
| Streaming | Possible but awkward | First-class support |
| Type safety | OpenAPI/Swagger (optional) | Built into protobuf (enforced) |
| Debugging | Easy (curl, Postman) | Harder (need grpcurl) |

Asynchronous communication

The sender publishes a message and moves on. Services don't need to be online simultaneously.

Message queues

A queue holds messages until a consumer processes them. Each message goes to exactly one consumer.

// Order service publishes to a queue
await rabbitmq.publish('order-processing', {
  orderId: order.id,
  userId: order.userId,
  items: order.items,
  total: order.total,
});

// Response to user is immediate - processing happens later
res.json({ orderId: order.id, status: 'processing' });

// Payment service consumes from the queue
rabbitmq.consume('order-processing', async (message) => {
  // message.content is assumed to be a Buffer, so decode it before parsing
  const order = JSON.parse(message.content.toString());
  await processPayment(order);
  message.ack();  // Acknowledge only after processing succeeds, so a crash leads to redelivery
});
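The queue semantics described above (messages wait until a consumer is available, and each one is handled by exactly one consumer) can be sketched with an in-memory stand-in. `InMemoryQueue` and its methods are hypothetical and only model the behavior; a real broker such as RabbitMQ adds persistence, networking, and delivery guarantees.

```typescript
// A handler returns true to acknowledge the message, false to reject it.
type Handler<T> = (message: T) => boolean;

class InMemoryQueue<T> {
  private pending: T[] = [];
  private consumers: Handler<T>[] = [];
  private nextConsumer = 0;

  // The sender returns immediately; no consumer needs to be online.
  publish(message: T): void {
    this.pending.push(message);
  }

  consume(handler: Handler<T>): void {
    this.consumers.push(handler);
  }

  // Hand each pending message to exactly one consumer, round-robin.
  // Rejected messages go back on the queue for a later drain.
  drain(): void {
    if (this.consumers.length === 0) return; // messages keep waiting
    const batch = this.pending;
    this.pending = [];
    for (const message of batch) {
      const handler = this.consumers[this.nextConsumer % this.consumers.length];
      this.nextConsumer++;
      if (!handler(message)) this.pending.push(message);
    }
  }

  get depth(): number {
    return this.pending.length;
  }
}

// Demo: orders are published before any payment worker exists.
const orderQueue = new InMemoryQueue<{ orderId: string }>();
["o-1", "o-2", "o-3", "o-4"].forEach((orderId) => orderQueue.publish({ orderId }));

const workerA: string[] = [];
const workerB: string[] = [];
orderQueue.consume((m) => { workerA.push(m.orderId); return true; });
orderQueue.consume((m) => { workerB.push(m.orderId); return true; });
orderQueue.drain();
console.log({ workerA, workerB, remaining: orderQueue.depth });
```

Adding more consumers to the same queue spreads the work across them, which is why queues suit job processing: throughput scales with the number of workers without the publisher changing.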

Event streaming

Events are published to a topic and fan out to all subscribers (unlike queues where each message goes to one consumer).

// Order service publishes an event
await kafka.produce('order.created', {
  orderId: order.id,
  userId: order.userId,
  total: order.total,
  timestamp: new Date().toISOString(),
});

// Multiple services consume the same event independently:
// - Inventory service: deduct stock
// - Notification service: send confirmation email
// - Analytics service: update dashboard
// - Loyalty service: award points

The order service doesn't know or care how many consumers exist. It publishes the fact; each consumer reacts independently.
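The difference from a queue can be sketched as an append-only log with one read position per consumer group. This is a simplified in-memory model, not Kafka's API: `InMemoryTopic`, `poll`, and `replay` are hypothetical names used for illustration.

```typescript
interface OrderCreated {
  orderId: string;
  total: number;
}

class InMemoryTopic<E> {
  private log: E[] = [];                        // append-only; reading removes nothing
  private offsets = new Map<string, number>();  // next index to read, per consumer group

  produce(event: E): void {
    this.log.push(event);
  }

  // Return the events this group has not seen yet and advance its offset.
  poll(group: string): E[] {
    const from = this.offsets.get(group) ?? 0;
    this.offsets.set(group, this.log.length);
    return this.log.slice(from);
  }

  // Rewind a group so its next poll re-reads the log from the start.
  replay(group: string): void {
    this.offsets.set(group, 0);
  }
}

const orderCreated = new InMemoryTopic<OrderCreated>();
orderCreated.produce({ orderId: "o-1", total: 40 });
orderCreated.produce({ orderId: "o-2", total: 15 });

// Both groups receive every event, independently of each other.
const forInventory = orderCreated.poll("inventory");
const forNotifications = orderCreated.poll("notifications");
console.log(forInventory.length, forNotifications.length);
```

Because events stay in the log, a consumer added later (for example a new analytics service) can start from offset zero and process the full history, which a plain queue cannot offer once messages have been acknowledged.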


Communication patterns compared

| Pattern | Coupling | Latency | Reliability | Complexity | Best for |
| --- | --- | --- | --- | --- | --- |
| REST (sync) | Temporal + spatial | Low (if service is up) | Fails if service is down | Low | Simple request/response |
| gRPC (sync) | Temporal + spatial | Very low | Fails if service is down | Medium | High-throughput internal |
| Message queue (async) | Spatial only | Higher (queued) | Survives service downtime | Medium | Task processing, job queues |
| Event streaming (async) | None (fire and forget) | Higher (eventual) | Survives downtime, replayable | High | Event-driven architectures |

"Temporal coupling" = both services must be running simultaneously. "Spatial coupling" = the caller must know the receiver's address.


Service discovery

Hardcoding URLs works for 3 services but breaks at 30.

DNS-based discovery

// Instead of hardcoding an IP address, address the service by name
const response = await fetch('http://user-service/users/123');
// "user-service" resolves to the current IP via DNS

Kubernetes does this automatically. Every service gets a DNS name.
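As a rough sketch of what those names look like, the helper below builds the fully qualified in-cluster name for a Service. `serviceBaseUrl` is a hypothetical function, and the `cluster.local` suffix is the common default cluster domain, which an operator can change.

```typescript
// Hypothetical helper: builds the in-cluster URL for a Kubernetes Service.
// Assumes the default cluster domain "cluster.local".
function serviceBaseUrl(service: string, namespace = "default", port = 80): string {
  const host = `${service}.${namespace}.svc.cluster.local`;
  return port === 80 ? `http://${host}` : `http://${host}:${port}`;
}

console.log(serviceBaseUrl("user-service"));
console.log(serviceBaseUrl("user-service", "accounts", 3000));
```

Within the same namespace the short name (`user-service`) resolves as well, which is why the earlier examples can use it directly. The longer form is needed when calling a service in another namespace.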

Service mesh

A service mesh (Istio, Linkerd) adds a sidecar proxy next to each service that handles discovery, load balancing, retries, timeouts, and mutual TLS:

┌───────────────────┐      ┌───────────────────┐
│  Order Service    │      │  User Service     │
│  ┌─────────────┐  │      │  ┌─────────────┐  │
│  │  Your Code  │  │      │  │  Your Code  │  │
│  └──────┬──────┘  │      │  └──────┬──────┘  │
│  ┌──────┴──────┐  │      │  ┌──────┴──────┐  │
│  │ Envoy Proxy │──┼──────┼──│ Envoy Proxy │  │
│  └─────────────┘  │      │  └─────────────┘  │
└───────────────────┘      └───────────────────┘

Worth it at 20+ services. For 5 services, DNS discovery and application-level retries are sufficient.


The API gateway pattern

A single entry point for all client requests. Routes traffic, handles cross-cutting concerns, presents a unified API:

Client → API Gateway → /users/*    → User Service
                     → /orders/*   → Order Service
                     → /payments/* → Payment Service

The gateway handles routing, authentication (verify once), rate limiting, protocol translation (REST externally, gRPC internally), and response aggregation.
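Two of those responsibilities, verifying the caller once and routing by path prefix, can be sketched as a pure function. Everything here is illustrative: `handle`, the `Route` table, the upstream host names, and the token check are assumptions, and a production gateway would also cover rate limiting, protocol translation, and aggregation.

```typescript
interface Route {
  prefix: string;
  upstream: string;
}

interface GatewayRequest {
  path: string;
  token?: string;
}

type GatewayResult =
  | { status: 200; forwardTo: string }
  | { status: 401 | 404 };

// Mirrors the routing diagram above; host names are assumed.
const routes: Route[] = [
  { prefix: "/users/", upstream: "http://user-service" },
  { prefix: "/orders/", upstream: "http://order-service" },
  { prefix: "/payments/", upstream: "http://payment-service" },
];

function handle(
  req: GatewayRequest,
  isValidToken: (token: string) => boolean,
): GatewayResult {
  // Authenticate once at the edge so downstream services need not repeat it.
  if (req.token === undefined || !isValidToken(req.token)) {
    return { status: 401 };
  }
  const route = routes.find((r) => req.path.startsWith(r.prefix));
  if (route === undefined) {
    return { status: 404 };
  }
  return { status: 200, forwardTo: route.upstream + req.path };
}

// Stand-in for real token verification (for example, checking a signature).
const demoTokenCheck = (token: string): boolean => token === "valid-token";
console.log(handle({ path: "/orders/17", token: "valid-token" }, demoTokenCheck));
```

Keeping the route table as data rather than code means a new service can be exposed by adding one entry, without touching clients.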


The distributed monolith anti-pattern

Services that are technically separate but functionally inseparable. Signs:

  1. Services must deploy together: changing the order service requires deploying the user service too
  2. Shared database: multiple services read and write to the same tables
  3. Long synchronous call chains: service A calls B calls C calls D, and if D is slow, everything is slow
  4. Shared libraries with business logic: a "common" package that every service depends on, creating lockstep versioning
  5. No independent team ownership: one team owns multiple services

Distributed monolith (anti-pattern):

Order Service ──sync──→ User Service ──sync──→ Auth Service
     │                                              │
     └──────sync──→ Payment Service ──sync──────────┘
                        │
                        └──── All share the same database ────→ 💥

The fix: either merge back into a monolith or properly decouple (own database per service, async communication).


Quick reference

| Decision | Recommended pattern |
| --- | --- |
| Frontend to backend | REST (or GraphQL) |
| Service needs immediate response | REST or gRPC |
| Fire-and-forget processing | Message queue |
| Multiple consumers react to same event | Event streaming |
| < 10 services | DNS discovery |
| > 20 services with security needs | Service mesh |
| Single external API | API gateway |

AI pitfall
AI assistants often recommend gRPC for all service-to-service communication. gRPC adds complexity (protobuf schemas, code generation) that is only worth it for high-throughput internal services. For fewer than 10 services, REST is simpler to build, debug, and monitor.

Good to know
Service meshes solve real problems but add significant operational complexity. If you have fewer than 20 services, simple HTTP clients with retries and timeouts get you most of the way there.

Edge case
Network calls compound latency. A chain where service A calls B, B calls C, and C calls D adds 3 network round trips. A request that took 5ms as a function call now takes 30-50ms. This "latency tax" is invisible in architecture diagrams but dominates real-world performance.
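The arithmetic behind the latency tax is a sum over the hops in the chain. The sketch below uses made-up per-hop numbers (not measurements) and hypothetical helper names, and contrasts a sequential chain with independent calls issued in parallel.

```typescript
// Total time for a chain where each call waits for the next one (A -> B -> C -> D).
function sequentialLatency(hopsMs: number[]): number {
  return hopsMs.reduce((sum, ms) => sum + ms, 0);
}

// Total time when independent calls are issued at once: the slowest one dominates.
function parallelLatency(callsMs: number[]): number {
  return callsMs.length === 0 ? 0 : Math.max(...callsMs);
}

// Illustrative round-trip times in milliseconds for the three hops.
const hops = [12, 15, 10];
console.log(sequentialLatency(hops)); // 37
console.log(parallelLatency(hops));   // 15
```

Parallel fan-out only applies when the calls do not depend on each other's results. A true chain, where each service needs the answer from the next, always pays the full sum.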