There is a predictable gap between what AI generates and what you actually deploy. AI gives you a Dockerfile that starts your application. Production requires a Dockerfile that starts your application securely, reports its own health, handles shutdown gracefully, and does not leak secrets. This lesson is the bridge between the two.
Non-root users
By default, every process inside a Docker container runs as root. If an attacker exploits a vulnerability in your API (a path traversal, a dependency with a remote code execution bug), they have root access inside the container. Container isolation is strong, but a process running as root is far better placed to use a kernel-level escape exploit than an unprivileged one.
The fix is three lines:
```dockerfile
# After installing dependencies, before COPY . .
RUN adduser --disabled-password --no-create-home appuser
USER appuser
COPY --chown=appuser:appuser . .
```

adduser creates a user with no password and no home directory. USER appuser switches all subsequent instructions (and the running container) to that user. --chown=appuser:appuser ensures the copied files are owned by appuser, not root.
What breaks when you switch to non-root
Some operations require root, and you need to do them before the USER instruction:
| Operation | Why it needs root | Solution |
|---|---|---|
| apt-get install | System package installation | Run before USER appuser |
| Writing to system directories | Permission denied | Write to /app instead |
| Binding to port 80 | Privileged port | Use port 8000, map with Docker |
| pip install (globally) | Writes to /usr/local | Use a venv, install before USER |
The pattern: install system packages and dependencies as root, create the user, switch to the user, then copy application code.
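One way to notice a Dockerfile that silently runs as root is a guard at application startup. The sketch below is a hypothetical helper (ensure_not_root is not part of any library) and assumes a POSIX system, where os.geteuid() returns 0 for root.

```python
import os


def ensure_not_root(euid=None):
    """Raise if the process runs as root (effective UID 0).

    euid can be passed in for testing; by default it is read from the OS.
    Assumes a POSIX platform, where os.geteuid() is available.
    """
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        raise RuntimeError(
            "Refusing to start as root; add a USER instruction to the Dockerfile"
        )
    return euid
```

Calling ensure_not_root() at the top of the entry module makes a missing USER instruction fail loudly when the container starts, instead of going unnoticed.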
AI often puts USER appuser before RUN pip install, which fails because appuser cannot write to /usr/local. Or it forgets --chown on the COPY instruction, so the files stay owned by root and the app cannot write to them. Always check the order: install as root, then switch.

Secrets handling
This is the most dangerous mistake AI makes with Dockerfiles.
```dockerfile
# NEVER DO THIS
ENV DATABASE_URL=postgresql://admin:s3cret@db.prod.example.com/myapp
ENV API_KEY=sk-live-abc123
```

ENV instructions are baked into the image. Anyone who runs docker inspect or docker history on the image can read them. If you push the image to a registry, even a private one, the secrets are stored in plain text in the image configuration, and anyone who can pull the image can read them.
```dockerfile
# Also never do this
ARG DATABASE_URL
RUN echo $DATABASE_URL > /app/config
```

ARG values appear in docker history too. They are build-time values, not secrets.
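A quick way to catch both mistakes in review is to scan the Dockerfile for ENV and ARG names that look like credentials. The following is a minimal sketch: the function name and the SECRET_HINTS list are assumptions made for illustration, the match is a name-based heuristic, and only the first variable on each ENV line is inspected.

```python
import re

# Name fragments that usually indicate a credential (a heuristic, not exhaustive).
SECRET_HINTS = ("SECRET", "PASSWORD", "TOKEN", "API_KEY", "DATABASE_URL")

# Matches "ENV NAME=value", "ENV NAME value", "ARG NAME" and "ARG NAME=value".
INSTRUCTION = re.compile(r"^\s*(ENV|ARG)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def find_secret_instructions(dockerfile_text):
    """Return (line_number, instruction, name) for ENV/ARG lines that look like secrets."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = INSTRUCTION.match(line)
        if match is None:
            continue  # comments and other instructions are ignored
        instruction, name = match.group(1).upper(), match.group(2)
        if any(hint in name.upper() for hint in SECRET_HINTS):
            findings.append((number, instruction, name))
    return findings
```

An empty result does not prove the image is clean (a secret can hide behind any variable name), so treat this as a first filter rather than a guarantee.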
How to pass secrets safely
Secrets should never exist in the image. They should be injected at runtime.
```yaml
# compose.yaml - secrets as environment variables at runtime
services:
  app:
    build: .
    environment:
      - DATABASE_URL=${DATABASE_URL}  # read from host or .env file
    env_file:
      - .env.production  # or from a file
```

```bash
# Command line - secrets as runtime env vars
docker run -e DATABASE_URL="postgresql://..." myapp
```

In production orchestrators (Kubernetes, ECS, Fly.io), secrets are injected through the platform's secret manager. The container never sees them until it starts running.
| Method | Safe? | Why |
|---|---|---|
| ENV in Dockerfile | No | Baked into image layers |
| ARG in Dockerfile | No | Visible in docker history |
| Runtime -e flag | Yes | Only in the running container |
| env_file in Compose | Yes | Not part of the image |
| Platform secret manager | Yes | Encrypted, access-controlled |
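On the application side, runtime injection means reading the secret from the environment when the process starts. Here is a minimal sketch, assuming the variable is named DATABASE_URL as in the examples above; require_env and MissingSecretError are hypothetical names, not part of any framework.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required runtime secret is not set."""


def require_env(name, environ=None):
    """Return the value of a required environment variable, or fail fast.

    environ defaults to os.environ; a plain dict can be passed for testing.
    """
    if environ is None:
        environ = os.environ
    value = environ.get(name)
    if not value:
        # Report only the variable name, never a value.
        raise MissingSecretError(f"Required environment variable {name} is not set")
    return value
```

Calling require_env("DATABASE_URL") once at startup makes a container that was launched without its secrets exit immediately with a clear message, rather than failing on the first database query.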
Health checks
A health check tells Docker (and orchestrators) how to determine if your application is actually working, not just running.
```dockerfile
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```

This runs curl against your health endpoint every 30 seconds. If it fails 3 times in a row, Docker marks the container as unhealthy. Docker Swarm uses this signal to replace unhealthy containers. Kubernetes ignores the Dockerfile HEALTHCHECK and relies on its own liveness and readiness probes, which can call the same /health endpoint.
The health endpoint
Your API needs a /health endpoint. A minimal one checks that the server responds. A better one checks that the database connection is alive:
```python
# FastAPI health endpoint
from fastapi import HTTPException


@app.get("/health")
async def health():
    try:
        # db is your application's async database handle
        await db.execute("SELECT 1")
        return {"status": "healthy"}
    except Exception:
        raise HTTPException(status_code=503, detail="Database unavailable")
```

The health check parameters control behavior:
| Parameter | Default | Purpose |
|---|---|---|
| --interval | 30s | Time between checks |
| --timeout | 30s | Max time to wait for the check command |
| --start-period | 0s | Grace period for startup (checks run but failures don't count) |
| --retries | 3 | Consecutive failures before marking unhealthy |
--start-period is important for applications that take a few seconds to start. Without it, the health check might mark your container as unhealthy before it has finished loading.
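To see how these parameters interact, it can help to model the status transitions. The function below is a simplified, hypothetical model and not Docker's implementation. It assumes the first check runs one interval after the container starts, that failures inside the start period are ignored until the first success, and that any success resets the failure count.

```python
def health_status(results, interval=30.0, start_period=0.0, retries=3):
    """Simulate the reported status after a sequence of check results.

    results is a list of booleans (True = check passed), one per check.
    Times are in seconds. This is a teaching model only.
    """
    status = "starting"
    failures = 0
    started = False  # becomes True after the first successful check
    for index, passed in enumerate(results, start=1):
        elapsed = index * interval
        if passed:
            failures = 0
            started = True
            status = "healthy"
            continue
        in_grace_period = not started and elapsed <= start_period
        if in_grace_period:
            continue  # failure during startup is not counted
        failures += 1
        if failures >= retries:
            status = "unhealthy"
    return status
```

For example, with interval=5 and the default start period of 0, three failed checks in a row give "unhealthy", while the same three failures with start_period=10 leave the container in "starting" because only the third failure is counted.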
> AI-generated health checks often call wget (not installed on slim images) or curl (also not always available). On python:3.12-slim, neither is installed by default. You can install curl in your Dockerfile, or use Python:
>
> HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

Graceful shutdown
When Docker stops a container, it sends SIGTERM. Your application should catch this signal and shut down cleanly: close database connections, finish in-flight requests, flush logs.
This is why the JSON array CMD format matters:
```dockerfile
# Good - uvicorn receives SIGTERM directly
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

# Bad - SIGTERM goes to /bin/sh, not uvicorn
CMD uvicorn main:app --host 0.0.0.0 --port 8000
```

The shell form wraps your command in /bin/sh -c "...". When Docker sends SIGTERM, the shell receives it, but most shells do not forward the signal to child processes. After a 10-second timeout (configurable with --stop-timeout), Docker sends SIGKILL, which forcefully terminates everything. Your application never gets a chance to shut down cleanly.
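uvicorn installs its own SIGTERM handling, so a FastAPI app gets this for free once the signal reaches it. For a plain Python process, such as a background worker, catching the signal could look like the sketch below. ShutdownFlag, install, and run_worker are hypothetical names, and the work and cleanup callables stand in for whatever your process actually does.

```python
import signal


class ShutdownFlag:
    """Remembers that a shutdown signal was received."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Keep the handler tiny: record the request and return.
        self.requested = True


def install(flag):
    """Route SIGTERM to the flag. Call from the main thread at startup."""
    signal.signal(signal.SIGTERM, flag.handle)


def run_worker(flag, do_one_unit_of_work, cleanup):
    """Process work until shutdown is requested, then clean up and return."""
    while not flag.requested:
        do_one_unit_of_work()
    cleanup()  # e.g. close database connections, flush logs
```

The handler only sets a flag, so the unit of work that is in progress finishes before cleanup runs. Each unit needs to complete well within the stop timeout, otherwise SIGKILL still arrives first.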
Production readiness checklist
Here is the checklist to evaluate any Dockerfile before it goes to production. Use this instead of asking AI to "make it production-ready": AI will add some items but miss others.
| Category | Check | Done? |
|---|---|---|
| Base image | Pinned to specific version (python:3.12-slim) | |
| Security | Non-root user (adduser + USER) | |
| Security | No secrets in ENV or ARG | |
| Security | .dockerignore excludes .env, .git, tests | |
| Performance | Multi-stage build (no build tools in final image) | |
| Performance | Layer caching (requirements.txt before COPY . .) | |
| Performance | --no-cache-dir on pip | |
| Reliability | HEALTHCHECK instruction | |
| Reliability | JSON array CMD (signal handling) | |
| Reliability | --start-period on health check for slow-starting apps | |
| Ops | EXPOSE documents the port | |
| Ops | Labels for metadata (LABEL maintainer=...) | |
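A few of these checks are mechanical enough to automate. The sketch below covers four of them (pinned base image, non-root user, HEALTHCHECK, JSON array CMD). It is a rough text-based audit under stated assumptions rather than a real Dockerfile parser: audit_dockerfile is a hypothetical name, and a final stage that starts from an earlier stage name (for example FROM builder) would be reported as unpinned.

```python
import re


def audit_dockerfile(text):
    """Check a few checklist items; returns {item_name: passed}."""
    # Join backslash-continued lines so each instruction is one logical line.
    joined = re.sub(r"\\[ \t]*\n", " ", text)
    lines = [ln.strip() for ln in joined.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]

    def args_of(keyword):
        """Arguments of every instruction with this keyword, in file order."""
        prefix = keyword + " "
        return [ln[len(prefix):].strip() for ln in lines if ln.upper().startswith(prefix)]

    def is_pinned(from_args):
        tokens = [t for t in from_args.split() if not t.startswith("--")]
        image = tokens[0] if tokens else ""
        name = image.rsplit("/", 1)[-1]  # ignore a registry host:port prefix
        tag = name.split(":", 1)[1] if ":" in name else ""
        return tag not in ("", "latest")

    froms, users, cmds = args_of("FROM"), args_of("USER"), args_of("CMD")
    return {
        "pinned_base_image": bool(froms) and all(is_pinned(f) for f in froms),
        "non_root_user": bool(users) and users[-1].split(":")[0] not in ("root", "0"),
        "healthcheck": any(h.upper() != "NONE" for h in args_of("HEALTHCHECK")),
        "exec_form_cmd": bool(cmds) and cmds[-1].startswith("["),
    }
```

Items such as secrets handling, .dockerignore contents, and layer ordering still need a human read. A passing result here only means the four mechanical checks hold.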
A complete production Dockerfile
Here is everything together in one file:
```dockerfile
# Stage 1: Builder
FROM python:3.12-slim AS builder
WORKDIR /app
RUN apt-get update && apt-get install -y \
    libpq-dev gcc \
    && rm -rf /var/lib/apt/lists/*
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime
FROM python:3.12-slim
WORKDIR /app
RUN apt-get update && apt-get install -y \
    libpq5 curl \
    && rm -rf /var/lib/apt/lists/*
RUN adduser --disabled-password --no-create-home appuser
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY --chown=appuser:appuser . .
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

This Dockerfile is production-ready. It is multi-stage (small image), runs as non-root (secure), has a health check (reliable), uses JSON array CMD (graceful shutdown), and contains no secrets. Compare this to what AI generates: the gap is usually 5-8 items from the checklist.