System Design - Data Access Layer

Create a free account to save your progress

Earn XP, track streaks, and sync your dashboard across devices.

Lesson

The data access layer sits at the bottom of your application stack. It is the only code that talks to the database. Everything above it, business logic, controllers, the entire HTTPWhat is http?The protocol browsers and servers use to exchange web pages, API data, and other resources, defining how requests and responses are formatted. layer, interacts with the data access layer through clean method calls, never touching SQLWhat is sql?A language for querying and managing data in relational databases, letting you insert, read, update, and delete rows across tables. directly.

This separation is not just about code organization. It is about making your database a replaceable implementation detail rather than something that seeps into every corner of your codebase.

The repositoryWhat is repository?A project folder tracked by Git that stores your files along with the complete history of every change, inside a hidden .git directory. pattern

The repository pattern gives your business logic a clean interface to work with. Instead of the service calling db.query('SELECT * FROM users WHERE id = INLINE_CODE', [id]), it calls userRepo.findById(id). The SQLWhat is sql?A language for querying and managing data in relational databases, letting you insert, read, update, and delete rows across tables. is hidden behind the repository.

// repositories/productRepository.ts
export class ProductRepository {
  constructor(private db: Database) {}

  async findById(id: string): Promise<Product | null> {
    const row = await this.db.query(
      'SELECT id, name, price, stock, category FROM products WHERE id = CODE_BLOCK',
      [id]
    );
    return row ? this.toDomain(row) : null;
  }

  async findByCategory(category: string): Promise<Product[]> {
    const rows = await this.db.query(
      'SELECT id, name, price, stock, category FROM products WHERE category = CODE_BLOCK ORDER BY name',
      [category]
    );
    return rows.map(this.toDomain);
  }

  async updateStock(id: string, newStock: number): Promise<void> {
    await this.db.query(
      'UPDATE products SET stock = CODE_BLOCK, updated_at = NOW() WHERE id = $2',
      [newStock, id]
    );
  }

  async create(data: CreateProductData): Promise<Product> {
    const row = await this.db.query(
      `INSERT INTO products (name, price, stock, category)
       VALUES (CODE_BLOCK, $2, $3, $4) RETURNING *`,
      [data.name, data.price, data.stock, data.category]
    );
    return this.toDomain(row);
  }

  private toDomain(row: any): Product {
    return {
      id: row.id,
      name: row.name,
      price: parseFloat(row.price),
      stock: row.stock,
      category: row.category,
    };
  }
}

The toDomain method is important. Database rows are raw data, prices might come back as strings, dates as ISO strings, booleans as 0/1. The repository translates these into proper domain objects that the restWhat is rest?An architectural style for web APIs where URLs represent resources (nouns) and HTTP methods (GET, POST, PUT, DELETE) represent actions on those resources. of your application can trust.

Why not just pass `db` to the service?

You could. Many projects do. But the repository gives you:

A single place to change queries: when you optimize a slow query, you change one file
A clear interface to mockWhat is mock?A fake replacement for a real dependency in tests that records how it was called so you can verify interactions. in tests: { findById: jest.fn() } is easier than mocking a database driver
Protection against SQL leaking upward: the service never sees a SQL string

ORMWhat is orm?Object-Relational Mapping - a library that lets you interact with a database using your programming language's objects instead of writing raw SQL. vs query builderWhat is query builder?A library that lets you construct SQL queries by chaining method calls in code, giving more control than an ORM but less boilerplate than raw SQL. vs raw SQLWhat is sql?A language for querying and managing data in relational databases, letting you insert, read, update, and delete rows across tables.

This is one of the most debated choices in backend development. Here is an honest comparison:

Aspect	ORM (Prisma, Drizzle)	Query Builder (Knex)	Raw SQL (pg, mysql2)
Learning curve	Low, reads like English	Medium, SQL-like but with JS API	High, you must know SQL well
Type safety	Excellent (auto-generated types)	Good with TypeScript wrappers	Manual, you type the results yourself
Complex queries	Painful, joins, subqueries, CTEs fight the abstraction	Comfortable, composes well	Natural, SQL was designed for this
Performance control	Limited, ORM decides the query	Good, you see what runs	Full, you write what runs
Migrations	Built in (Prisma Migrate, Drizzle Kit)	Built in (knex migrate)	Manual (write SQL files)
N+1 query risk	High, eager/lazy loading is confusing	Medium, you write explicit joins	Low, you control every query
Switching databases	Easy, change the provider config	Moderate, Knex abstracts dialect	Hard, SQL dialects differ
Best for	CRUD-heavy apps, rapid prototyping	Medium complexity, need more control	Complex reporting, performance-critical

ORM example (Prisma)

// Prisma - declarative, type-safe, high abstraction
const user = await prisma.user.findUnique({
  where: { id: userId },
  include: {
    orders: {
      where: { status: 'completed' },
      orderBy: { createdAt: 'desc' },
      take: 5,
    },
  },
});

Prisma generates the SQL behind the scenes. You get full TypeScript types for user and user.orders automatically. The tradeoff: when you need a complex aggregation or a window functionWhat is window function?An SQL function that performs a calculation across a set of related rows without collapsing them into a single output row., Prisma fights you.

Query builder example (Knex)

// Knex - SQL-like, composable, medium abstraction
const users = await knex('users')
  .select('users.id', 'users.name', knex.raw('COUNT(orders.id) as order_count'))
  .leftJoin('orders', 'users.id', 'orders.user_id')
  .where('orders.status', 'completed')
  .groupBy('users.id')
  .orderBy('order_count', 'desc')
  .limit(10);

Knex lets you compose queries programmatically while staying close to SQL. You can see exactly what query will run. The tradeoff: types are not generated automatically, you need to define them yourself.

Raw SQL example

// Raw SQL - full control, no abstraction
const { rows } = await pool.query(`
  SELECT
    u.id,
    u.name,
    COUNT(o.id) as order_count,
    SUM(o.total) as lifetime_value
  FROM users u
  LEFT JOIN orders o ON u.id = o.user_id AND o.status = 'completed'
  GROUP BY u.id
  HAVING COUNT(o.id) > 0
  ORDER BY lifetime_value DESC
  LIMIT CODE_BLOCK
`, [10]);

Raw SQL gives you complete control. Complex joins, CTEs, window functions, recursive queries, nothing fights you. The tradeoff: no type safety, no migrations, and SQL injectionWhat is sql injection?An attack where user input is inserted directly into a database query, letting the attacker read, modify, or delete data. Parameterized queries prevent it. is one string concatenation away.

AI pitfall

AI-generated Prisma code loves to use include for nested relations, which can trigger the N+1 query problem. If AI generates include: { orders: { include: { items: true } } } for a list endpoint that returns 100 users, you are looking at 200+ database queries instead of one well-written JOIN.

Connection pooling

Every database query needs a connection. Opening a new connection for each request is expensive, it involves TCP handshakes, authenticationWhat is authentication?Verifying who a user is, typically through credentials like a password or token., and memory allocation. Under load, you exhaust the database's connection limit and everything crashes.

A connection poolWhat is connection pool?A set of pre-opened database connections that your app reuses instead of opening and closing a new one for every request. maintains a set of reusable connections. Instead of opening/closing connections, you borrow one from the pool and return it when done.

import { Pool } from 'pg';

// Create pool once at application startup
const pool = new Pool({
  host: process.env.DB_HOST,
  port: 5432,
  database: 'myapp',
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  max: 20,          // Maximum connections in the pool
  idleTimeoutMillis: 30000,  // Close idle connections after 30s
  connectionTimeoutMillis: 2000, // Fail if no connection available in 2s
});

// Use the pool for every query
export async function query(text: string, params?: any[]) {
  const client = await pool.connect();
  try {
    return await client.query(text, params);
  } finally {
    client.release(); // Return connection to pool, do not close it
  }
}

Good to know

Prisma and Knex handle connection pooling internally. If you are using raw pg, you must set up the pool yourself. The max setting depends on your database plan, most managed databases (Neon, Supabase, RDS) cap connections at 20-100. Set your pool to ~80% of that limit to leave room for migrations and admin tools.

TransactionWhat is transaction?A group of database operations that either all succeed together or all fail together, preventing partial updates. management

Transactions ensure that a group of database operations either all succeed or all fail. Without transactions, a crash mid-operation can leave your data in an inconsistent state.

Classic example: transferring money between accounts.

async function transferFunds(
  fromId: string,
  toId: string,
  amount: number
): Promise<void> {
  const client = await pool.connect();

  try {
    await client.query('BEGIN');

    // Debit source account
    const { rows } = await client.query(
      'UPDATE accounts SET balance = balance - CODE_BLOCK WHERE id = $2 AND balance >= CODE_BLOCK RETURNING balance',
      [amount, fromId]
    );
    if (rows.length === 0) {
      throw new Error('Insufficient funds');
    }

    // Credit destination account
    await client.query(
      'UPDATE accounts SET balance = balance + CODE_BLOCK WHERE id = $2',
      [amount, toId]
    );

    // Record the transfer
    await client.query(
      'INSERT INTO transfers (from_id, to_id, amount) VALUES (CODE_BLOCK, $2, $3)',
      [fromId, toId, amount]
    );

    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

If any step fails, the debit, the credit, or the record, the entire transaction rolls back. The source account keeps its money, the destination gets nothing, and no partial transfer is recorded.

Transactions with Prisma

// Prisma handles the BEGIN/COMMIT/ROLLBACK for you
const transfer = await prisma.$transaction(async (tx) => {
  const source = await tx.account.update({
    where: { id: fromId },
    data: { balance: { decrement: amount } },
  });

  if (source.balance < 0) {
    throw new Error('Insufficient funds');
  }

  await tx.account.update({
    where: { id: toId },
    data: { balance: { increment: amount } },
  });

  return tx.transfer.create({
    data: { fromId, toId, amount },
  });
});

When to use transactions

Not every database operation needs a transaction. Use them when:

Multiple tables must stay in sync (orders + order items + inventory)
A failure mid-operation would corrupt data (money transfers, booking systems)
You need read-then-write consistency (check balance, then debit)

Single INSERT or UPDATE operations are already atomic by default, wrapping them in a transaction adds overhead without benefit.

Edge case

Serverless environments (AWS Lambda, Cloudflare Workers) make connection pooling tricky. Each invocation might spawn a new connection, quickly exhausting the database limit. Use an external connection pooler like PgBouncer or Neon's built-in pooler to manage this.

Good to know

Transactions hold database locks. Long-running transactions can block other operations and cause timeouts. Keep transactions as short as possible, do all your computation before starting the transaction, then execute only the database writes inside it.

Done

Complete & Next