Shipping Python APIs - Structured logging

Your API works locally. You deploy it. A user reports that something is broken. You SSH into the server and look at the logs. There are none, or worse, there are thousands of print() lines with no timestamps, no levels, and no way to search. This is the reality of most AI-generated Python code, and it is the first thing to fix before anything goes to production.

Logging is not about recording what your code does. It is about making production problems solvable. Good logs answer questions: what happened, when, to whom, and in what order. Bad logs are just noise.

Why `print()` is not logging

Every Python tutorial starts with print(). AI code generators follow the same pattern: they scatter print() statements everywhere because that is what the training data looks like. But print() and logging solve fundamentally different problems.

# What AI generates - print() everywhere
def create_order(user_id: str, items: list):
    print(f"Creating order for user {user_id}")
    print(f"Items: {items}")
    order = process_payment(user_id, items)
    print(f"Order created: {order.id}")
    return order

This looks fine during development. In production, it falls apart.

Capability	`print()`	`logging` module
Log levels (DEBUG, ERROR, etc.)	No	Yes
Timestamps	No (manual only)	Automatic
Output routing (file, service, etc.)	stdout only	Multiple handlers
Structured data (JSON)	No	Yes, with formatters
Filtering by severity	No	Yes, per handler
Disable in production	No (must delete lines)	Yes, set level to WARNING
Correlation IDs	No	Yes, with filters

The core problem: print() writes to stdout with no metadata. When your API handles 500 requests per second, you cannot search print() output for the one request that failed. You need timestamps, levels, request IDs, and structure.

AI pitfall

AI generates print() for debugging feedback, not for production operations. When you ask it to "add logging," it often adds logging.basicConfig() at the top and logging.info() calls that are just print() with extra steps: no structure, no correlation, no handler configuration.

The Python `logging` module

Python's built-in logging module is the foundation. It separates what you log from how and where the logs are processed.

Log levels

Every log message has a severity level. The logger checks whether the message's level meets the minimum threshold before processing it.

Level	Numeric value	When to use
`DEBUG`	10	Internal state details, disabled in production
`INFO`	20	Normal operations: startup, request handled, job completed
`WARNING`	30	Something unexpected but recoverable: retry needed, deprecation used
`ERROR`	40	A failure that affects a user but the app keeps running
`CRITICAL`	50	The app cannot continue, database gone, disk full

import logging

logger = logging.getLogger(__name__)

# In production, set to INFO or WARNING
# In development, set to DEBUG
logger.setLevel(logging.INFO)

# Placeholder values so the calls below run as written
item_id, order_id, user_id = "item_1", "ord_abc123", "usr_456"
error = "card_declined"

logger.debug("Processing item %s", item_id)     # Skipped when level is INFO
logger.info("Order %s created for user %s", order_id, user_id)
logger.warning("Rate limit approaching for user %s", user_id)
logger.error("Payment failed for order %s: %s", order_id, error)
logger.critical("Database connection pool exhausted")

A common mistake: setting everything to ERROR. Then when you need to investigate a problem, you have no trail of what happened before the error. INFO is your bread and butter: it records the normal flow so you can reconstruct what led to a failure.
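To make the threshold behaviour concrete, the sketch below sends the same three messages through a logger at INFO and then at ERROR, collecting what survives. ListHandler, collect, and the logger name "levels_demo" are illustrative names, not part of any library.

```python
import logging


class ListHandler(logging.Handler):
    """Collects messages in memory so the effect of the level is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def collect(level: int) -> list:
    logger = logging.getLogger("levels_demo")
    logger.propagate = False          # keep the demo isolated from the root logger
    handler = ListHandler()
    logger.handlers = [handler]       # fresh handler for each run
    logger.setLevel(level)

    logger.debug("cache miss for key %s", "user:42")
    logger.info("request handled")
    logger.error("payment failed")
    return handler.messages


info_run = collect(logging.INFO)
error_run = collect(logging.ERROR)

print(info_run)   # ['INFO: request handled', 'ERROR: payment failed']
print(error_run)  # ['ERROR: payment failed']
```

At ERROR the failure is recorded but the "request handled" line that preceded it is gone, which is exactly the missing trail described above.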

Loggers, handlers, and formatters

The logging module has a three-part architecture that separates concerns cleanly.

Logger (what to log)
  |
  v
Handler (where to send it)
  |
  v
Formatter (how it looks)

Logger: the object you call .info(), .error() on. Named by module (__name__).
Handler: decides where the log goes. StreamHandler writes to a stream (stderr by default, or stdout if you pass sys.stdout). FileHandler writes to a file. SysLogHandler sends to a syslog daemon, which can forward to a log aggregator. You can attach multiple handlers to one logger.
Formatter: controls the output format. Plain text for development, JSON for production.

import logging
import sys

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Console handler - shows INFO and above
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.INFO)
console_handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
))

# File handler - captures everything including DEBUG
file_handler = logging.FileHandler("app.log")
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(levelname)s] %(name)s (%(filename)s:%(lineno)d): %(message)s"
))

logger.addHandler(console_handler)
logger.addHandler(file_handler)

This setup lets you see concise logs in your terminal while saving detailed logs to a file, without changing any of your log statements.
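The same logger, handler, and formatter wiring can also be expressed declaratively with the standard library's logging.config.dictConfig. The sketch below is one possible translation of the setup above, assuming the same "myapp" logger name and app.log file; the keys "concise", "detailed", "console", and "file" are arbitrary labels chosen for this example.

```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "concise": {
            "format": "%(asctime)s [%(levelname)s] %(name)s: %(message)s",
        },
        "detailed": {
            "format": "%(asctime)s [%(levelname)s] %(name)s "
                      "(%(filename)s:%(lineno)d): %(message)s",
        },
    },
    "handlers": {
        # Console handler: INFO and above
        "console": {
            "class": "logging.StreamHandler",
            "stream": "ext://sys.stdout",
            "level": "INFO",
            "formatter": "concise",
        },
        # File handler: everything, including DEBUG
        "file": {
            "class": "logging.FileHandler",
            "filename": "app.log",
            "level": "DEBUG",
            "formatter": "detailed",
        },
    },
    "loggers": {
        "myapp": {
            "level": "DEBUG",
            "handlers": ["console", "file"],
            "propagate": False,
        },
    },
}

logging.config.dictConfig(LOGGING_CONFIG)

logger = logging.getLogger("myapp")
logger.debug("written to app.log only")
logger.info("written to both the console and app.log")
```

Keeping the configuration in one dictionary makes it easier to load from a file or swap per environment, while the log statements themselves stay untouched.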

Structured JSON logging

Plain text logs work for reading in a terminal. They break down when you need to search across millions of lines in a log aggregator like Datadog, Grafana Loki, or CloudWatch.

The problem with plain text

2024-03-15 14:32:01 [ERROR] myapp: Payment failed for order ord_abc123

To find all payment failures for a specific user, you would need to regex-parse every line. If the log format changes slightly, your queries break.

JSON gives you queryable fields

{"timestamp": "2024-03-15T14:32:01Z", "level": "ERROR", "logger": "myapp", "message": "Payment failed", "order_id": "ord_abc123", "user_id": "usr_456", "error": "card_declined"}

Now you can query level=ERROR AND user_id=usr_456 and get matching entries across millions of log lines, with no regex and no dependence on the exact message wording.
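To show what a JSON formatter does under the hood, here is a minimal standard-library sketch. MinimalJsonFormatter and its EXTRA_KEYS list are hypothetical and deliberately simplistic; it only illustrates the idea of turning a log record into one JSON object per line.

```python
import io
import json
import logging
from datetime import datetime, timezone


class MinimalJsonFormatter(logging.Formatter):
    """Illustrative only: serialises a few record fields plus chosen extras."""

    EXTRA_KEYS = ("order_id", "user_id", "error")  # hypothetical field names

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra= become attributes on the record
        for key in self.EXTRA_KEYS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


stream = io.StringIO()  # in-memory stream so the output can be parsed back
handler = logging.StreamHandler(stream)
handler.setFormatter(MinimalJsonFormatter())

logger = logging.getLogger("json_demo")
logger.propagate = False
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.error(
    "Payment failed",
    extra={"order_id": "ord_abc123", "user_id": "usr_456", "error": "card_declined"},
)

entry = json.loads(stream.getvalue())
print(entry["level"], entry["user_id"], entry["error"])
```

Because every entry parses back into a dictionary, a query such as "level is ERROR and user_id is usr_456" becomes a field comparison rather than a text search.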

Setting up `python-json-logger`

pip install python-json-logger

import logging
from pythonjsonlogger import jsonlogger

logger = logging.getLogger("myapp")
handler = logging.StreamHandler()

formatter = jsonlogger.JsonFormatter(
    "%(asctime)s %(levelname)s %(name)s %(message)s",
    rename_fields={"asctime": "timestamp", "levelname": "level"}
)
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Now logs are JSON
logger.info("Order created", extra={
    "order_id": "ord_abc123",
    "user_id": "usr_456",
    "total": 59.99,
    "items_count": 3
})

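Output (the timestamp uses the logging module's default date format unless you pass datefmt to the formatter):

{"timestamp": "2024-03-15 14:32:01,123", "level": "INFO", "name": "myapp", "message": "Order created", "order_id": "ord_abc123", "user_id": "usr_456", "total": 59.99, "items_count": 3}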

The extra dictionary is where structured logging gets its power. Every field you add becomes a searchable, filterable dimension in your log aggregator.
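One detail worth knowing when choosing field names: the standard logging module rejects extra keys that collide with built-in LogRecord attributes such as name or message. The sketch below demonstrates the failure and one way around it; the logger name and the plan_name field are made up for illustration.

```python
import logging

logger = logging.getLogger("extra_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.NullHandler())  # discard output; only the behaviour matters

# Fine: none of these keys exist on a LogRecord
logger.info("Order created", extra={"order_id": "ord_abc123", "items_count": 3})

# Raises KeyError: "name" is already a LogRecord attribute (the logger's name)
try:
    logger.info("Order created", extra={"name": "Premium plan"})
    collided = False
except KeyError as exc:
    collided = True
    print(f"rejected: {exc}")

# Safer: use domain-specific names that cannot collide
logger.info("Order created", extra={"plan_name": "Premium plan"})
```

Prefixing or qualifying field names (plan_name, order_id, user_id) avoids the collision and also keeps fields unambiguous in the aggregator.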

Correlation IDs for request tracing

When your FastAPI app handles hundreds of concurrent requests, the logs from different requests interleave. Without a way to group them, reading logs is like listening to 50 phone conversations at once.

A correlation ID is a unique identifier assigned to every incoming request. Every log entry for that request includes the same ID.

import uuid
from contextvars import ContextVar
from fastapi import FastAPI, Request

request_id_var: ContextVar[str] = ContextVar("request_id", default="")

app = FastAPI()

@app.middleware("http")
async def add_correlation_id(request: Request, call_next):
    # Use the client's ID if provided, otherwise generate one
    request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
    request_id_var.set(request_id)

    response = await call_next(request)
    response.headers["X-Request-ID"] = request_id
    return response

Then add a logging filter that injects the correlation ID into every log entry automatically:

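import logging

class CorrelationIdFilter(logging.Filter):
    def filter(self, record):
        # Copy the current request's ID onto the record so formatters can emit it
        record.request_id = request_id_var.get()
        return True

# Attach to the handler configured earlier rather than to a logger: a
# logger-level filter is skipped for records propagated from child loggers.
handler.addFilter(CorrelationIdFilter())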

Now every log line from a single request shares the same request_id. When a user reports a bug, they send you the request ID from the response header, and you search your logs for that single string. Every step of the request appears in order.
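The mechanism can be checked without a web server. The sketch below simulates two overlapping requests with asyncio tasks and records which request ID each log line received. handle_request, ListHandler, and the IDs "req-a" and "req-b" are illustrative; the ContextVar and filter mirror the ones above.

```python
import asyncio
import logging
from contextvars import ContextVar

request_id_var: ContextVar[str] = ContextVar("request_id", default="")


class CorrelationIdFilter(logging.Filter):
    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


class ListHandler(logging.Handler):
    """Stores (request_id, message) pairs so the result can be inspected."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append((record.request_id, record.getMessage()))


handler = ListHandler()
handler.addFilter(CorrelationIdFilter())

logger = logging.getLogger("correlation_demo")
logger.propagate = False
logger.addHandler(handler)
logger.setLevel(logging.INFO)


async def handle_request(request_id: str):
    # Each asyncio task runs in its own copy of the context,
    # so this set() is invisible to the other request.
    request_id_var.set(request_id)
    logger.info("validating cart")
    await asyncio.sleep(0)  # yield so the two requests interleave
    logger.info("charging card")


async def main():
    await asyncio.gather(handle_request("req-a"), handle_request("req-b"))


asyncio.run(main())

for request_id, message in handler.lines:
    print(request_id, message)
```

Even though the log lines from the two requests interleave, filtering on one request ID recovers that request's steps in order, and no function had to accept the ID as a parameter.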

AI pitfall

AI often implements correlation IDs by passing the ID as a function parameter through every layer of your code. This works but creates coupling everywhere. The contextvars approach shown above is invisible to your business logic: the ID is set once at the middleware level and automatically included in every log.

Logging in FastAPI: putting it together

Here is a production-ready logging setup for a FastAPI application:

import logging
import sys
import os
from pythonjsonlogger import jsonlogger

def setup_logging():
    log_level = os.environ.get("LOG_LEVEL", "INFO").upper()

    # Root logger configuration
    root_logger = logging.getLogger()
    root_logger.setLevel(log_level)

    # JSON formatter for production
    json_handler = logging.StreamHandler(sys.stdout)
    json_formatter = jsonlogger.JsonFormatter(
        "%(asctime)s %(levelname)s %(name)s %(message)s",
        rename_fields={"asctime": "timestamp", "levelname": "level"}
    )
    json_handler.setFormatter(json_formatter)
    root_logger.addHandler(json_handler)

    # Silence noisy third-party loggers
    logging.getLogger("uvicorn.access").setLevel(logging.WARNING)
    logging.getLogger("sqlalchemy.engine").setLevel(logging.WARNING)

Silencing noisy loggers is essential. Libraries like SQLAlchemy and uvicorn log heavily at INFO level. If you do not raise their minimum level, your own application logs get buried under thousands of framework messages.
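Raising the level on a library's top-level logger also affects its child loggers, because a logger with no level of its own inherits the nearest ancestor's. The sketch below uses a made-up library name, "noisy_lib", to show the effect with only the standard library.

```python
import io
import logging

stream = io.StringIO()  # in-memory stream so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(handler)

# Raise the threshold for the (hypothetical) library's whole logger subtree
logging.getLogger("noisy_lib").setLevel(logging.WARNING)

logging.getLogger("noisy_lib.pool").info("connection checked out")    # dropped
logging.getLogger("noisy_lib.pool").warning("pool almost exhausted")  # kept
logging.getLogger("myapp.orders").info("order created")               # kept

output = stream.getvalue()
print(output)
```

The library's routine INFO chatter is filtered out while its warnings and your own application's INFO logs still reach the handler.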

Quick reference

Concept	What to do	Why
Replace `print()`	Use `logging.getLogger(__name__)`	Levels, formatting, routing
Log level in production	`INFO` or `WARNING`	`DEBUG` floods logs with noise
Log format	JSON (`python-json-logger`)	Searchable in aggregators
Correlation IDs	`contextvars` + middleware	Trace one request across all logs
Third-party loggers	Set to `WARNING`	Prevent log noise from libraries
Log level configuration	Environment variable (`LOG_LEVEL`)	Change verbosity without a code change
