
Logs are your app talking to you from production. When something goes wrong at 2am, well-written logs are the difference between a 10-minute fix and a 3-hour investigation. The goal is not to log everything; it is to log the right things in the right format so you can answer questions quickly.

Think of logs like annotations on a map. Too few and you're lost; too many and the map is unreadable. Good logging is about choosing which landmarks actually help you navigate.

Log levels

Choosing the right level

Log levels exist so you can filter signal from noise. Using them consistently across your codebase is one of the highest-leverage habits you can build:

| Level | Severity | Use for |
|-------|----------|---------|
| debug | Lowest | Detailed internal state, disabled in production |
| info | Normal | Successful operations, user actions, startup events |
| warn | Elevated | Recoverable problems, deprecated usage, rate limits |
| error | High | Failures that need attention but app is still running |
| fatal | Critical | App cannot continue, shutting down |

A good rule of thumb: if you'd want to know about it during an incident but it doesn't require immediate action, it's a warn. If it means something broke for a user, it's an error.

import { createLogger, transports, format } from 'winston';

const logger = createLogger({
  level: process.env.NODE_ENV === 'production' ? 'info' : 'debug',
  format: format.combine(
    format.timestamp(),
    format.json()
  ),
  transports: [new transports.Console()],
});

logger.info('Server started', { port: 3000 });
logger.warn('Rate limit approaching', { userId: '123', remaining: 5 });
// `err` is assumed to be the Error caught where the connection attempt failed
logger.error('Database connection failed', { error: err.message });

Set the minimum log level through an environment variable rather than hardcoding it. This lets you enable debug logs in staging without rebuilding and redeploying.
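
One way to act on this is to read the level from a dedicated variable instead of deriving it from NODE_ENV. The sketch below assumes a variable named LOG_LEVEL (a hypothetical name, not something winston reads on its own) and falls back to a default when the value is missing or unrecognized. The helper names are illustrative too.

```typescript
// Severity order from least to most severe, matching the level table.
const LEVELS = ['debug', 'info', 'warn', 'error', 'fatal'] as const;
type Level = (typeof LEVELS)[number];

// Narrow an arbitrary string to one of the known levels.
function isLevel(value: string): value is Level {
  return (LEVELS as readonly string[]).indexOf(value) !== -1;
}

// Read the minimum level from an env-like object.
// LOG_LEVEL is an assumed variable name; pick whatever fits your deployment.
function resolveLogLevel(
  env: Record<string, string | undefined>,
  fallback: Level = 'info',
): Level {
  const raw = (env['LOG_LEVEL'] ?? '').trim().toLowerCase();
  return isLevel(raw) ? raw : fallback;
}

// A message is emitted only if it is at least as severe as the minimum.
function shouldLog(messageLevel: Level, minLevel: Level): boolean {
  return LEVELS.indexOf(messageLevel) >= LEVELS.indexOf(minLevel);
}
```

In a Node app this would be called as `resolveLogLevel(process.env)` and the result passed as the logger's `level` option. Note that winston's default level set has no `fatal`, so that value would need custom levels there.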

Structured logging

Structured logging means writing log entries as machine-readable JSON objects with consistent fields instead of plain text, which makes them searchable by log analysis tools.

Why JSON beats plain strings

Plain string logs look friendly but are a nightmare to query:

// Bad: hard to parse programmatically
console.log(`User 123 failed to login at 2024-01-15 14:32:01`);

// Good: structured, queryable, consistent
logger.warn('Login failed', {
  userId: '123',
  timestamp: new Date().toISOString(),
  reason: 'invalid_password',
  ipAddress: req.ip,
});

With structured logs, your log aggregator can instantly answer questions like "how many login failures came from IP 192.168.1.1 in the last hour?" With plain strings, you would be running regex across gigabytes of text.
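
To make the difference concrete, here is a rough sketch of the kind of query an aggregator runs over structured entries. It is not any particular tool's API: the `LogEntry` shape, the helper names, and the sample lines are all made up for illustration, with field names borrowed from the example above.

```typescript
// Shape assumed for illustration; it mirrors the fields logged above.
interface LogEntry {
  message: string;
  timestamp: string; // ISO 8601
  ipAddress?: string;
  reason?: string;
}

// Parse newline-delimited JSON, skipping lines that are not valid JSON.
function parseLogLines(raw: string): LogEntry[] {
  const entries: LogEntry[] = [];
  for (const line of raw.split('\n')) {
    if (line.trim() === '') continue;
    try {
      entries.push(JSON.parse(line) as LogEntry);
    } catch {
      // Ignore malformed lines instead of failing the whole query.
    }
  }
  return entries;
}

// Count 'Login failed' entries from one IP inside a time window.
function countLoginFailures(
  entries: LogEntry[],
  ip: string,
  since: Date,
  until: Date,
): number {
  return entries.filter((e) => {
    const t = Date.parse(e.timestamp);
    return (
      e.message === 'Login failed' &&
      e.ipAddress === ip &&
      t >= since.getTime() &&
      t <= until.getTime()
    );
  }).length;
}

// Made-up sample data: three valid entries and one plain-text line.
const sample = [
  '{"message":"Login failed","timestamp":"2024-01-15T14:32:01.000Z","ipAddress":"192.168.1.1","reason":"invalid_password"}',
  '{"message":"Login failed","timestamp":"2024-01-15T14:40:00.000Z","ipAddress":"10.0.0.7","reason":"invalid_password"}',
  '{"message":"Login failed","timestamp":"2024-01-15T12:00:00.000Z","ipAddress":"192.168.1.1","reason":"invalid_password"}',
  'not json at all',
].join('\n');

const failures = countLoginFailures(
  parseLogLines(sample),
  '192.168.1.1',
  new Date('2024-01-15T14:00:00.000Z'),
  new Date('2024-01-15T15:00:00.000Z'),
);
console.log(failures);
```

The filter is a handful of exact field comparisons. The plain-text line in the sample is the only one that cannot be queried this way.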

What to include in every log

Build a set of standard fields that appear in every log entry. This makes logs consistent and dramatically easier to filter:

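import type { Request } from 'express';

// Create a request-scoped logger that auto-includes context.
// Assumes Express, with `req.user` populated by your authentication middleware.
function createRequestLogger(req: Request) {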
  return logger.child({
    requestId: req.headers['x-request-id'],
    userId: req.user?.id,
    method: req.method,
    path: req.path,
  });
}

// In your route handler
app.post('/checkout', (req, res) => {
  const log = createRequestLogger(req);

  log.info('Checkout initiated', { cartItems: req.body.items.length });
  // ... process checkout, which produces `newOrder`
  log.info('Checkout completed', { orderId: newOrder.id, total: newOrder.total });
});
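
The idea behind `logger.child` can be illustrated without any library. The following is a simplified sketch, not winston's implementation: `createContextLogger` and the injectable `sink` are hypothetical names, and the point is only that bound context gets merged into every entry.

```typescript
type Fields = Record<string, unknown>;

interface ContextLogger {
  info(message: string, fields?: Fields): void;
  child(extra: Fields): ContextLogger;
}

// `sink` receives each serialized line; it defaults to stdout via console.log.
function createContextLogger(
  context: Fields = {},
  sink: (line: string) => void = (line) => console.log(line),
): ContextLogger {
  return {
    info(message, fields = {}) {
      // Bound context first, then per-call fields, so a call can override context.
      sink(JSON.stringify({ level: 'info', message, ...context, ...fields }));
    },
    child(extra) {
      // A child inherits the parent's context and adds its own.
      return createContextLogger({ ...context, ...extra }, sink);
    },
  };
}

// Collect lines in memory here so the merged output can be inspected.
const lines: string[] = [];
const baseLogger = createContextLogger({ service: 'checkout' }, (line) => {
  lines.push(line);
});
const requestLog = baseLogger.child({ requestId: 'req-42', path: '/checkout' });

requestLog.info('Checkout initiated', { cartItems: 3 });
```

The single entry written by `requestLog` carries `service`, `requestId`, and `path` without the call site mentioning any of them, which is what keeps the standard fields consistent.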

Correlation IDs

Tracing a request through your system

When a user reports a problem, you need to find all the logs related to their specific request: not just the error, but every step leading up to it. Correlation IDs make this possible.

Generate a unique ID at the edge of your system (your load balancer or API gateway) and pass it through every log and service call:

import { randomUUID } from 'crypto';

// Middleware: assign a correlation ID to every request
app.use((req, res, next) => {
  req.correlationId = req.headers['x-correlation-id'] as string || randomUUID();
  res.setHeader('x-correlation-id', req.correlationId);
  next();
});

// Use it in every log
logger.info('Processing payment', {
  correlationId: req.correlationId,
  orderId: order.id,
});

Now you can search your log aggregator for a single correlation ID and see the complete story of what happened, across multiple services and machines.
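
The middleware above covers incoming requests and logs; the same ID also has to ride along on outgoing calls to other services. Here is one possible sketch of that propagation step. The helper names are hypothetical, the header name follows the example above, and the ID generator is passed in so the snippet stays dependency-free (in a Node app it could be `randomUUID`).

```typescript
const CORRELATION_HEADER = 'x-correlation-id';

// Header values may be typed as string or string[] (as in Node's IncomingHttpHeaders).
type HeaderValue = string | string[] | undefined;

// Reuse the caller's ID when present; otherwise mint a new one at the edge.
function resolveCorrelationId(
  headers: Record<string, HeaderValue>,
  generateId: () => string,
): string {
  const incoming = headers[CORRELATION_HEADER];
  // If the value arrives as an array, take its first element.
  const value = Array.isArray(incoming) ? incoming[0] : incoming;
  return value && value.trim() !== '' ? value : generateId();
}

// Copy the ID onto the headers of a request to a downstream service.
function withCorrelationId(
  headers: Record<string, string>,
  correlationId: string,
): Record<string, string> {
  return { ...headers, [CORRELATION_HEADER]: correlationId };
}

// Usage: an ID received from upstream is forwarded unchanged.
const correlationId = resolveCorrelationId(
  { 'x-correlation-id': 'abc-123' },
  () => 'generated-id',
);
const outgoing = withCorrelationId({ accept: 'application/json' }, correlationId);
// `outgoing` can then be passed as the headers of fetch() or another HTTP client.
```

Each downstream service runs the same resolution step, so it logs under the ID it received rather than inventing its own.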


What not to log

Sensitive data

This is a compliance and security requirement, not just a best practice. Logging sensitive data means that data ends up in your log aggregation service, accessible to anyone with log access, and stored potentially forever:

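// Never log these fields. Entries must be lowercase, because keys are
// lowercased before matching ('creditCard' would never match).
const REDACTED_FIELDS = ['password', 'token', 'creditcard', 'ssn', 'secret'];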

function sanitize(obj: Record<string, unknown>): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(obj).map(([key, value]) => [
      key,
      REDACTED_FIELDS.some(f => key.toLowerCase().includes(f)) ? '[REDACTED]' : value,
    ])
  );
}

logger.info('User updated profile', sanitize(req.body));
Even in error logs, never include raw request bodies without sanitizing first. A single log line containing a password can cause a security incident.
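
The `sanitize` function above only inspects top-level keys, so a password nested inside something like `req.body.user` would still slip through. A recursive variant is one way to close that gap. This is a sketch under assumptions: `sanitizeDeep`, `isSensitiveKey`, and `SENSITIVE_KEY_PARTS` are illustrative names, the field list is the one from the example above, and matching is by substring on lowercased keys.

```typescript
// Lowercase fragments; a key is redacted if it contains any of them.
const SENSITIVE_KEY_PARTS = ['password', 'token', 'creditcard', 'ssn', 'secret'];

function isSensitiveKey(key: string): boolean {
  const lowered = key.toLowerCase();
  return SENSITIVE_KEY_PARTS.some((part) => lowered.indexOf(part) !== -1);
}

// Walk objects and arrays, replacing values stored under sensitive keys.
// Assumes plain JSON-like data with no circular references.
function sanitizeDeep(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeDeep(item));
  }
  if (value !== null && typeof value === 'object') {
    const result: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      result[key] = isSensitiveKey(key) ? '[REDACTED]' : sanitizeDeep(inner);
    }
    return result;
  }
  return value; // primitives pass through unchanged
}

// Made-up request body with sensitive values at several depths.
const requestBody = {
  email: 'ada@example.com',
  user: { password: 'hunter2', profile: { creditCard: '4111111111111111' } },
  sessions: [{ accessToken: 'abc', device: 'laptop' }],
};
const safe = sanitizeDeep(requestBody);
```

Substring matching errs toward over-redaction (a key such as `className` contains `ssn`), which is usually the safer failure mode for logs.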

Log aggregation

Centralizing logs from multiple servers

If you're running more than one server, logs are scattered across machines. Log aggregation tools collect them all into one searchable interface:

| Tool | Best for | Pricing model |
|------|----------|---------------|
| Datadog Logs | Full observability suite | Per GB ingested |
| Logtail (Better Stack) | Simple, affordable | Per GB ingested |
| AWS CloudWatch | AWS-native apps | Per GB ingested + storage |
| Grafana Loki | Self-hosted, cost-sensitive | Free (self-hosted) |
| Elasticsearch + Kibana | High-volume, custom setup | Infrastructure cost |

Most of these work by having your app send JSON logs to stdout, and then a log shipper (like Fluent Bit or a platform-native agent) forwards them to the aggregation service.


Quick reference

| Practice | Do this | Avoid this |
|----------|---------|------------|
| Log level | Match severity to level | Using error for everything |
| Format | Structured JSON | Plain string concatenation |
| Sensitive data | Redact before logging | Logging raw request bodies |
| Context | Include request ID, user ID | Bare messages with no context |
| Volume | Log meaningful events | Debug logs in production |
| Aggregation | Centralize with a tool | SSH-ing into servers to read logs |