Most bad architectures come from an incorrect understanding of the constraints, not from picking the wrong database. Before drawing a single box on a diagram, you must understand the problem.

The 4 critical dimensions

Every system lives inside four dimensions. Miss one, and your architecture collapses under a force you did not account for.

Scale

Questions to ask:

  • How many daily active users (DAU)?
  • How many requests per second (RPS) at peak?
  • How much data stored (GB, TB, PB)?
  • What growth is expected (6 months, 1 year, 5 years)?
  • What is the read-to-write ratio?

Example:

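| System         | DAU  | RPS (peak) | Storage | Read:Write ratio |
|----------------|------|------------|---------|------------------|
| Personal blog  | 100  | 1          | 1 GB    | 100:1            |
| E-commerce     | 10K  | 100        | 100 GB  | 10:1             |
| Social network | 1M   | 10K        | 10 TB   | 5:1              |
| Video platform | 100M | 1M         | 1 PB    | 50:1             |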

Each order of magnitude changes the architecture. At 100 RPS, a single server with PostgreSQL works fine. At 10K RPS, you need load balancing, caching, and read replicas. At 100K RPS, you need sharding (splitting a database across multiple servers by distributing rows based on a key, so each server handles only a portion of the total data), CDNs, and distributed caching.
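
The tiers above can be sketched as a simple lookup. This is only an illustration: the function name `suggest_architecture` is hypothetical, and the cutoffs of 1,000 and 100,000 RPS are assumptions placed between the data points the text gives, not thresholds taken from the lesson.

```python
def suggest_architecture(peak_rps: int) -> list[str]:
    """Map a peak RPS estimate to the components named in the text.

    The cutoffs (1_000 and 100_000) are assumptions chosen to sit between
    the three data points in the text; tune them to your own workload.
    """
    if peak_rps < 1_000:
        return ["single server", "PostgreSQL"]
    if peak_rps < 100_000:
        return ["load balancer", "cache", "read replicas"]
    return ["sharding", "CDN", "distributed cache"]


for rps in (100, 10_000, 100_000):
    print(rps, suggest_architecture(rps))
```

The point is less the exact thresholds than the habit: start from a number, then pick the smallest tier that covers it.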

AI pitfall
When you ask AI to design a system, it tends to default to big-tech scale. Always anchor your prompts with explicit numbers (for example, "500 users in month one, 5,000 by month six"), or you will end up with a massively over-engineered system.

The read-to-write ratio determines your caching strategy. A blog (100:1) benefits enormously from caching. A chat application (1:1) gets almost no benefit from traditional caching.
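
A rough way to see why the ratio matters is to estimate how much traffic still reaches the database once a cache is in place. The sketch below assumes that writes always hit the database and that only reads can be served from cache. The function name `db_load_fraction` and the 90% hit rate are hypothetical inputs, not figures from the lesson.

```python
def db_load_fraction(reads_per_write: float, cache_hit_rate: float) -> float:
    """Fraction of all operations that still reach the database.

    Assumes every write goes to the database and only reads can be
    answered from cache. cache_hit_rate is a value between 0 and 1.
    """
    read_share = reads_per_write / (reads_per_write + 1)
    return 1 - cache_hit_rate * read_share


blog = db_load_fraction(reads_per_write=100, cache_hit_rate=0.9)
chat = db_load_fraction(reads_per_write=1, cache_hit_rate=0.9)
print(f"blog (100:1): {blog:.2f} of operations reach the database")  # 0.11
print(f"chat (1:1):   {chat:.2f} of operations reach the database")  # 0.55
```

Even with the same optimistic hit rate, the 1:1 workload keeps more than half of its load on the database. In practice a chat workload would likely see a lower hit rate as well, since messages are rarely re-read.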

Latency

Latency is the time delay between sending a request and receiving the first byte of the response, usually measured in milliseconds.

Expected response times:

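| Type           | Acceptable latency | Example                | User perception           |
|----------------|--------------------|------------------------|---------------------------|
| Interactive UI | < 100ms            | Button click, hover    | Instantaneous             |
| REST API       | < 500ms            | Page load, form submit | Responsive                |
| Search         | < 2s               | Full-text query        | Noticeable but acceptable |
| Background job | Seconds to minutes | Email send, report     | User does not wait        |
| Batch/Report   | Minutes to hours   | Daily analytics        | Scheduled, not interactive |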

Response time limits (Jakob Nielsen):

  • 0.1s: Limit of instantaneous perception
  • 1s: Limit of uninterrupted thought flow
  • 10s: Limit of user attention

Good to know
Total user-perceived latency includes DNS resolution (50-200ms), TLS handshake (50-100ms), network round trip (10-300ms), server processing, and client rendering. A server that responds in 50ms can still feel slow on a 3G connection.
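
The ranges in this note can be added up to get a rough budget. The sketch below assumes a cold connection (no DNS cache, no connection reuse) and that the steps happen one after another. The names `NETWORK_OVERHEAD_MS` and `perceived_latency_ms` are hypothetical, and client rendering is left as a parameter because the note gives no figure for it.

```python
# (best_case_ms, worst_case_ms) ranges taken from the note above.
NETWORK_OVERHEAD_MS = {
    "dns_resolution": (50, 200),
    "tls_handshake": (50, 100),
    "network_round_trip": (10, 300),
}


def perceived_latency_ms(server_ms: float, render_ms: float = 0.0) -> tuple[float, float]:
    """Return (best_case, worst_case) total latency in milliseconds.

    Assumes a cold connection where every step runs sequentially.
    """
    best = sum(low for low, _ in NETWORK_OVERHEAD_MS.values())
    worst = sum(high for _, high in NETWORK_OVERHEAD_MS.values())
    return best + server_ms + render_ms, worst + server_ms + render_ms


best, worst = perceived_latency_ms(server_ms=50)
print(f"best case: {best:.0f} ms, worst case: {worst:.0f} ms")
# best case: 160 ms, worst case: 650 ms
```

Under these assumptions, a 50ms server already lands above the 100ms "instantaneous" threshold in the best case, before any rendering time is counted.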

Availability

An SLA (Service Level Agreement) is a formal commitment defining the minimum uptime or performance level a service promises to deliver, usually expressed as a percentage like 99.9%. Common targets and the downtime they allow:

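| SLA     | Downtime/year | Downtime/month | Use case                      |
|---------|---------------|----------------|-------------------------------|
| 99%     | 3.65 days     | 7.3 hours      | Prototype, internal tool      |
| 99.9%   | 8.76 hours    | 43.8 minutes   | Most B2B apps                 |
| 99.99%  | 52.6 minutes  | 4.4 minutes    | E-commerce, fintech           |
| 99.999% | 5.26 minutes  | 26.3 seconds   | Hospital systems, air traffic |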
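
The downtime figures follow directly from the percentage. A minimal sketch, assuming a 365-day year and a month of one twelfth of a year (730 hours), which is the convention the table's numbers are consistent with. The function name `allowed_downtime_hours` is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24               # 8,760 hours
HOURS_PER_MONTH = HOURS_PER_YEAR / 12   # 730 hours (assumed month length)


def allowed_downtime_hours(sla_percent: float, period_hours: float) -> float:
    """Hours of downtime an SLA permits over the given period."""
    return (1 - sla_percent / 100) * period_hours


for sla in (99.0, 99.9, 99.99, 99.999):
    per_year = allowed_downtime_hours(sla, HOURS_PER_YEAR)
    per_month = allowed_downtime_hours(sla, HOURS_PER_MONTH)
    print(f"{sla}%: {per_year:.2f} h/year, {per_month * 60:.1f} min/month")
```

Running the same calculation for your own target is a quick sanity check before committing to an SLA in a contract.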

As a rule of thumb, each additional 9 costs roughly 10x more. Going from 99.9% to 99.99% typically means multi-region deployment, automated failover (automatically switching traffic from a failed server or service to a healthy backup), chaos engineering (deliberately injecting failures into a system to discover weaknesses before they cause real incidents), and dedicated SRE teams.

Unless you are in healthcare, finance, or aviation, 99.9% is probably fine. Aiming for 99.99% without the team and budget to support it is a recipe for burnout.

Budget and team

Real-world constraints that architects often forget:

  • What is the monthly infrastructure budget?
  • How many developers on the project?
  • What expertise (DevOps, SRE) is available?
  • What is the time-to-market?
  • What does the team already know? (Choosing unfamiliar tech adds months)

The "team" constraint is the most frequently underestimated. A team of 2 full-stack developers cannot operate a microservices architecture (an application split into small, independently deployed services that communicate over the network, each owning its own data). Running 10 services with independent databases and deployment pipelines requires at least a dedicated DevOps engineer. If you do not have one, choose a monolith (the entire application in a single codebase, deployed as one unit), which is simpler to build and debug than microservices.

Back-of-envelope math

Convert vague requirements into concrete numbers. This is called back-of-envelope estimation.

Example: Estimating storage for a photo-sharing app

Users: 50,000 DAU
Photos per user per day: 2 (average)
Photo size: 3 MB (after compression)
Thumbnails: 100 KB each (3 sizes)

Daily new storage:
  50,000 × 2 × 3 MB = 300 GB/day for originals
  50,000 × 2 × 3 × 100 KB = 30 GB/day for thumbnails
  Total: ~330 GB/day

Monthly: ~10 TB
Yearly: ~120 TB

This estimation takes 2 minutes and immediately tells you that you need object storage (like S3) rather than a database for images, and that storage costs will grow linearly over time even if usage stays flat.
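
The same estimate can be kept as a small script so the inputs are easy to change. This is a sketch using decimal units (1 GB = 1,000 MB = 1,000,000 KB) and a 30-day month, which match the rounding in the worked example. The function name `daily_storage_gb` is hypothetical.

```python
def daily_storage_gb(dau: int, photos_per_user: int, photo_mb: float,
                     thumbs_per_photo: int, thumb_kb: float) -> float:
    """New storage per day in GB, using decimal units."""
    originals_gb = dau * photos_per_user * photo_mb / 1_000
    thumbnails_gb = dau * photos_per_user * thumbs_per_photo * thumb_kb / 1_000_000
    return originals_gb + thumbnails_gb


daily = daily_storage_gb(dau=50_000, photos_per_user=2, photo_mb=3,
                         thumbs_per_photo=3, thumb_kb=100)
print(f"daily:   {daily:.0f} GB")                # daily:   330 GB
print(f"monthly: {daily * 30 / 1_000:.1f} TB")   # monthly: 9.9 TB
print(f"yearly:  {daily * 365 / 1_000:.0f} TB")  # yearly:  120 TB
```

Changing a single input, such as doubling DAU, shows immediately how the yearly figure moves.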

Framework: REQUIREMENTS

For each feature, work through this checklist:

  • Reads: How many reads? Read pattern (random vs. sequential)?
  • Ecritures (French for "writes", which keeps the acronym intact): How many writes? Burst or steady?
  • Query patterns: How is the data queried? Simple lookups or complex joins?
  • Users: Who uses the system? Where are they geographically?
  • Integrations: What external APIs/services?
  • Reliability: What SLA? What data is critical vs. losable?
  • Evolution: Expected growth? How frequently does the schema (the formal definition of the structure your data must follow) change?
  • Money: Budget? Feature ROI?
  • Existing: What existing systems? Migration path?
  • Now: What deadlines? MVP vs v2?
  • Team: What skills are available? Learning curve budget?
  • Security: Sensitive data? Compliance (GDPR, HIPAA, SOC2)?

You do not need to answer every question for every feature. But skipping a dimension entirely is how you get surprised three months into development.
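
One lightweight way to catch a skipped dimension is to record answers per feature and list what is still blank. The sketch below is illustrative only: the helper `skipped_dimensions` and the sample answers are hypothetical, and the dimension names simply mirror the checklist.

```python
REQUIREMENTS = (
    "Reads", "Ecritures", "Query patterns", "Users", "Integrations",
    "Reliability", "Evolution", "Money", "Existing", "Now", "Team", "Security",
)


def skipped_dimensions(answers: dict[str, str]) -> list[str]:
    """Return checklist dimensions that have no answer (missing or blank)."""
    return [dim for dim in REQUIREMENTS if not answers.get(dim, "").strip()]


# Hypothetical answers for one feature.
answers = {
    "Reads": "about 1,000/sec at peak, mostly random",
    "Ecritures": "bursty",
    "Team": "2 full-stack developers",
    "Money": "   ",  # blank answers count as skipped
}
print(skipped_dimensions(answers))
```

An empty result does not mean the answers are good, only that no dimension was ignored outright.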

Practical example

Feature: "Real-time notification system"

Identified constraints:

  • Scale: 100K DAU, 1000 notifications/sec at peak
  • Latency: < 1s between event and notification
  • Availability: 99.9% (not critical, but important)
  • Budget: $500/month max
  • Team: 2 full-stack developers, no dedicated DevOps

Architectural implications:

  • Managed service (Pusher, Ably) rather than self-hosted (Socket.io cluster), because the team cannot operate WebSocket infrastructure
  • WebSockets for real-time delivery, with HTTP long-polling as a fallback for environments that block WebSockets
  • No need for multi-region (too expensive for the budget)
  • Simple monitoring (Uptime Robot or similar) rather than complex distributed tracing (tracking a single request as it travels through multiple services)
  • Notification delivery is best-effort, not guaranteed: if a notification is lost, the user can check in-app

Each constraint narrows the solution space. You do not pick a technology and hope it fits; you define the box, then find the technology that fits inside it.
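
"Define the box, then find what fits" can be sketched as a filter over candidate options. Everything below is hypothetical: the class names, the option list, and especially the monthly cost figures, which are placeholders for illustration and not real vendor prices. Only the $500 budget and the absence of a dedicated DevOps engineer come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraints:
    monthly_budget_usd: int
    has_dedicated_devops: bool


@dataclass(frozen=True)
class Option:
    name: str
    monthly_cost_usd: int          # placeholder figure, not a real quote
    needs_dedicated_devops: bool


def fits(option: Option, constraints: Constraints) -> bool:
    """True if the option stays within budget and team capacity."""
    if option.monthly_cost_usd > constraints.monthly_budget_usd:
        return False
    if option.needs_dedicated_devops and not constraints.has_dedicated_devops:
        return False
    return True


box = Constraints(monthly_budget_usd=500, has_dedicated_devops=False)
candidates = [
    Option("managed real-time service", 300, needs_dedicated_devops=False),
    Option("self-hosted WebSocket cluster", 200, needs_dedicated_devops=True),
    Option("multi-region deployment", 2_000, needs_dedicated_devops=True),
]
viable = [o for o in candidates if fits(o, box)]
print([o.name for o in viable])  # ['managed real-time service']
```

Note that the self-hosted option is cheaper on paper but is ruled out by the team constraint, which is the pattern the example describes.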