Most bad architectures come from an incorrect understanding of the constraints, not from picking the wrong database. Before drawing a single box on a diagram, you must understand the problem.
The 4 critical dimensions
Every system lives inside four dimensions. Miss one, and your architecture collapses under a force you did not account for.
Scale
Questions to ask:
- How many daily active users (DAU)?
- How many requests per second (RPS) at peak?
- How much data stored (GB, TB, PB)?
- What growth is expected (6 months, 1 year, 5 years)?
- What is the read-to-write ratio?
Example:
| System | DAU | RPS (peak) | Storage | Read:Write ratio |
|---|---|---|---|---|
| Personal blog | 100 | 1 | 1 GB | 100:1 |
| E-commerce | 10K | 100 | 100 GB | 10:1 |
| Social network | 1M | 10K | 10 TB | 5:1 |
| Video platform | 100M | 1M | 1 PB | 50:1 |
Each order of magnitude changes the architecture. At 100 RPS, a single server with PostgreSQL works fine. At 10K RPS, you need load balancing, caching, and read replicas. At 100K RPS, you need sharding, CDNs, and distributed caching.
The read-to-write ratio determines your caching strategy. A blog (100:1) benefits enormously from caching. A chat application (1:1) gets almost no benefit from traditional caching.
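A rough way to turn the scale questions into numbers is sketched below. The helper names (`estimate_rps`, `read_fraction`), the requests-per-user figure, and the peak multiplier of 3 are illustrative assumptions, not values from the table; measure them for a real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_rps(dau: int, requests_per_user_per_day: float,
                 peak_factor: float = 3.0) -> tuple[float, float]:
    """Return (average_rps, peak_rps) for a given daily active user count.

    peak_factor is an assumed multiplier for how much busier the peak is
    than the daily average.
    """
    average = dau * requests_per_user_per_day / SECONDS_PER_DAY
    return average, average * peak_factor


def read_fraction(reads: float, writes: float) -> float:
    """Share of traffic that is reads: the ceiling on what a read cache can absorb."""
    return reads / (reads + writes)


if __name__ == "__main__":
    # Assumed workload: 10K DAU, 50 requests per user per day.
    avg, peak = estimate_rps(dau=10_000, requests_per_user_per_day=50)
    print(f"average ~{avg:.1f} RPS, peak ~{peak:.1f} RPS")
    print(f"blog 100:1 -> {read_fraction(100, 1):.0%} of requests are reads")
    print(f"chat 1:1   -> {read_fraction(1, 1):.0%} of requests are reads")
```

The read fraction makes the caching argument concrete: at 100:1 almost every request is a candidate for a cache hit, while at 1:1 half of the traffic is writes that a read cache cannot absorb.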
Latency
Expected response times:
| Type | Acceptable latency | Example | User perception |
|---|---|---|---|
| Interactive UI | < 100ms | Button click, hover | Instantaneous |
| REST API | < 500ms | Page load, form submit | Responsive |
| Search | < 2s | Full-text query | Noticeable but acceptable |
| Background job | Seconds to minutes | Email send, report | User does not wait |
| Batch/Report | Minutes to hours | Daily analytics | Scheduled, not interactive |
Response time limits (Jakob Nielsen):
- 0.1s: Limit of instantaneous perception
- 1s: Limit of uninterrupted thought flow
- 10s: Limit of user attention
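The three thresholds can be applied mechanically to a measured response time. The sketch below is one possible encoding; the function name `perceived_speed` and the bucket labels are made up for illustration.

```python
def perceived_speed(response_seconds: float) -> str:
    """Classify a response time against the 0.1s / 1s / 10s thresholds."""
    if response_seconds <= 0.1:
        return "instantaneous"
    if response_seconds <= 1.0:
        return "thought flow uninterrupted"
    if response_seconds <= 10.0:
        return "attention still held"
    return "attention lost"


if __name__ == "__main__":
    for seconds in (0.05, 0.4, 3.0, 15.0):
        print(f"{seconds:>5}s -> {perceived_speed(seconds)}")
```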
Availability
SLA (Service Level Agreement):
| SLA | Downtime/year | Downtime/month | Use case |
|---|---|---|---|
| 99% | 3.65 days | 7.3 hours | Prototype, internal tool |
| 99.9% | 8.76 hours | 43.8 minutes | Most B2B apps |
| 99.99% | 52.6 minutes | 4.4 minutes | E-commerce, fintech |
| 99.999% | 5.26 minutes | 26.3 seconds | Hospital systems, air traffic |
Each additional 9 costs roughly 10x more. Going from 99.9% to 99.99% means multi-region deployment, automated failover, chaos engineering, and dedicated SRE teams.
Unless you are in healthcare, finance, or aviation, 99.9% is probably fine. Aiming for 99.99% without the team and budget to support it is a recipe for burnout.
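The downtime figures in the SLA table follow directly from the percentage. A minimal sketch, assuming a 365-day year and an average month of one twelfth of a year (the helper name `allowed_downtime` is illustrative):

```python
from datetime import timedelta

YEAR = timedelta(days=365)
MONTH = YEAR / 12  # average month, about 30.4 days


def allowed_downtime(sla_percent: float, period: timedelta) -> timedelta:
    """Maximum downtime within `period` that still meets the SLA."""
    return period * (1 - sla_percent / 100)


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.99, 99.999):
        per_year = allowed_downtime(sla, YEAR)
        per_month = allowed_downtime(sla, MONTH)
        print(f"{sla}%: {per_year.total_seconds() / 3600:.2f} h/year, "
              f"{per_month.total_seconds() / 60:.1f} min/month")
```

For 99.9% this gives about 8.76 hours per year and 43.8 minutes per month, matching the table. Running the same calculation for a proposed target is a quick way to check whether the team could realistically detect and fix an incident inside that window.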
Budget and team
Real-world constraints that architects often forget:
- What is the monthly infrastructure budget?
- How many developers on the project?
- What expertise (DevOps, SRE) is available?
- What is the time-to-market?
- What does the team already know? (Choosing unfamiliar tech adds months)
The "team" constraint is the most frequently underestimated. A team of 2 full-stack developers cannot operate a microservices architecture. Running 10 services with independent databases and deployment pipelines requires at least a dedicated DevOps engineer. If you do not have one, choose a monolith.
Back-of-envelope math
Convert vague requirements into concrete numbers. This is called back-of-envelope estimation.
Example: Estimating storage for a photo-sharing app
Users: 50,000 DAU
Photos per user per day: 2 (average)
Photo size: 3 MB (after compression)
Thumbnails: 100 KB each (3 sizes)
Daily new storage:
50,000 × 2 × 3 MB = 300 GB/day for originals
50,000 × 2 × 3 × 100 KB = 30 GB/day for thumbnails
Total: ~330 GB/day
Monthly: ~10 TB
Yearly: ~120 TB
This estimation takes 2 minutes and immediately tells you that you need object storage (like S3) rather than a database for images, and that storage costs will grow linearly.
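The same photo-storage estimate can be written as a small script, which makes it easy to rerun when an input changes. This is a sketch using decimal units (1 GB = 1,000 MB) and a 30-day month; the function name and dictionary keys are illustrative.

```python
def estimate_photo_storage(dau: int, photos_per_user_per_day: int,
                           photo_mb: float, thumbnail_kb: float,
                           thumbnail_sizes: int) -> dict[str, float]:
    """Back-of-envelope storage growth for a photo-sharing app."""
    photos_per_day = dau * photos_per_user_per_day
    originals_gb = photos_per_day * photo_mb / 1_000              # MB -> GB
    thumbnails_gb = photos_per_day * thumbnail_sizes * thumbnail_kb / 1_000_000  # KB -> GB
    daily_gb = originals_gb + thumbnails_gb
    return {
        "originals_gb_per_day": originals_gb,
        "thumbnails_gb_per_day": thumbnails_gb,
        "total_gb_per_day": daily_gb,
        "tb_per_month": daily_gb * 30 / 1_000,
        "tb_per_year": daily_gb * 365 / 1_000,
    }


if __name__ == "__main__":
    estimate = estimate_photo_storage(
        dau=50_000, photos_per_user_per_day=2,
        photo_mb=3, thumbnail_kb=100, thumbnail_sizes=3,
    )
    for name, value in estimate.items():
        print(f"{name}: {value:,.1f}")
```

With the inputs from the example, the script reproduces the 300 GB and 30 GB daily figures and the roughly 10 TB per month and 120 TB per year totals.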
Framework: REQUIREMENTS
For each feature, work through this checklist:
- Reads: How many reads? Read pattern (random vs. sequential)?
- Ecritures (Writes): How many writes? Burst or steady?
- Query patterns: How is the data queried? Simple lookups or complex joins?
- Users: Who uses the system? Where are they geographically?
- Integrations: What external APIs/services?
- Reliability: What SLA? What data is critical, and what can be lost?
- Evolution: Expected growth? How frequently does the schema change?
- Money: Budget? Feature ROI?
- Existing: What existing systems? Migration path?
- Now: What deadlines? MVP vs v2?
- Team: What skills are available? Learning curve budget?
- Security: Sensitive data? Compliance (GDPR, HIPAA, SOC2)?
You do not need to answer every question for every feature. But skipping a dimension entirely is how you get surprised three months into development.
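One lightweight way to catch a skipped dimension is to record the answers per feature and list the dimensions still empty. The sketch below is only an illustration: the dimension keys and the `unanswered` helper are assumed names, and the sample answers are placeholders.

```python
# The twelve checklist dimensions, in checklist order (keys are illustrative).
REQUIREMENTS_DIMENSIONS = (
    "reads", "writes", "query_patterns", "users", "integrations",
    "reliability", "evolution", "money", "existing", "now", "team", "security",
)


def unanswered(answers: dict[str, str]) -> list[str]:
    """Return the dimensions that have no answer (missing or blank)."""
    return [d for d in REQUIREMENTS_DIMENSIONS
            if not answers.get(d, "").strip()]


if __name__ == "__main__":
    # Placeholder answers for a hypothetical feature.
    answers = {
        "reliability": "99.9%",
        "money": "$500/month max",
        "team": "2 fullstack developers, no dedicated DevOps",
    }
    print("Still open:", ", ".join(unanswered(answers)))
```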
Practical example
Feature: "Real-time notification system"
Identified constraints:
- Scale: 100K DAU, 1000 notifications/sec at peak
- Latency: < 1s between event and notification
- Availability: 99.9% (not critical, but important)
- Budget: $500/month max
- Team: 2 fullstack developers, no dedicated DevOps
Architectural implications:
- Managed service (Pusher, Ably) rather than self-hosted (Socket.io cluster), because the team cannot operate WebSocket infrastructure
- WebSockets for real-time delivery, with HTTP long-polling as a fallback for environments that block WebSockets
- No multi-region deployment (too expensive for the budget)
- Simple monitoring (Uptime Robot or similar) rather than complex distributed tracing
- Notification delivery is best-effort, not guaranteed: if a notification is lost, the user can still check in-app
Each constraint narrows the solution space. You do not pick a technology and hope it fits; you define the box, then find the technology that fits inside it.
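The "define the box first" idea can be sketched as a filter: encode the constraints, then test each candidate against them. Everything below is hypothetical. The `Constraints` and `Option` types, the candidate names, and especially the monthly cost figures are made-up placeholders for illustration, not real pricing for any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraints:
    monthly_budget_usd: int
    has_dedicated_devops: bool


@dataclass(frozen=True)
class Option:
    name: str
    est_monthly_cost_usd: int      # placeholder estimate, not real pricing
    needs_dedicated_devops: bool


def rejection_reasons(option: Option, constraints: Constraints) -> list[str]:
    """Return the constraints an option violates; an empty list means it fits."""
    reasons = []
    if option.est_monthly_cost_usd > constraints.monthly_budget_usd:
        reasons.append("over budget")
    if option.needs_dedicated_devops and not constraints.has_dedicated_devops:
        reasons.append("needs DevOps the team does not have")
    return reasons


# Budget and team constraints from the notification example.
constraints = Constraints(monthly_budget_usd=500, has_dedicated_devops=False)

# Hypothetical candidates with made-up cost figures, for illustration only.
options = [
    Option("managed realtime service", 300, needs_dedicated_devops=False),
    Option("self-hosted WebSocket cluster", 200, needs_dedicated_devops=True),
    Option("multi-region self-hosted cluster", 2_000, needs_dedicated_devops=True),
]

viable = [o.name for o in options if not rejection_reasons(o, constraints)]

if __name__ == "__main__":
    for o in options:
        reasons = rejection_reasons(o, constraints)
        print(f"{o.name}: {'fits' if not reasons else '; '.join(reasons)}")
```

Returning the reasons rather than a bare yes or no keeps the trade-off visible: a cheaper option can still fall outside the box if the team cannot operate it.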