Your single-stage Dockerfile works, but the image is 800 MB. Most of that weight comes from things your application never uses at runtime: build tools, compilers, header files, and pip's download cache. Multi-stage builds let you install everything in a build stage, then copy only the finished output into a clean runtime stage.
Why image size matters
A bloated image is not just a storage problem. It affects every part of your workflow.
| Problem | Impact |
|---|---|
| Slow CI/CD pipelines | Pushing 1.2 GB per deploy wastes minutes |
| Slow container startup | Larger images take longer to pull from registries |
| Larger attack surface | Build tools (gcc, make) in production give attackers more to exploit |
| Higher cloud costs | Registry storage and bandwidth are billed by the GB |
The goal is to ship an image with exactly what your application needs to run, and nothing else.
The single-stage problem
Here is a typical Dockerfile for a Python API that depends on psycopg2 (the PostgreSQL driver, which has C extensions):
FROM python:3.12
WORKDIR /app
RUN apt-get update && apt-get install -y \
libpq-dev gcc \
&& rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
This image weighs about 1.1 GB. It contains gcc, libpq-dev, header files, and the full Debian toolchain. Your running application uses none of these; they were only needed at build time.
Multi-stage build pattern
A multi-stage build uses two (or more) FROM instructions. The first stage builds everything. The second stage starts fresh and copies only the results.
# Stage 1: Builder
FROM python:3.12-slim AS builder
WORKDIR /app
RUN apt-get update && apt-get install -y \
libpq-dev gcc \
&& rm -rf /var/lib/apt/lists/*
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Stage 2: Runtime
FROM python:3.12-slim
WORKDIR /app
RUN apt-get update && apt-get install -y \
libpq5 \
&& rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
The result: 180 MB instead of 1.1 GB. The builder stage is discarded entirely; it does not appear in the final image.
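One way to confirm that a runtime stage really runs from the copied environment is to ask the interpreter where it lives. The following is a minimal sketch, not part of the original Dockerfile; the helper names `in_virtualenv` and `site_packages_dir` are made up for illustration. In an image where /opt/venv was copied from the builder and placed first on PATH, the environment root would be expected to be /opt/venv.

```python
import sys
import sysconfig


def in_virtualenv() -> bool:
    """Return True when the interpreter runs from a venv.

    Inside a venv, sys.prefix points at the venv directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


def site_packages_dir() -> str:
    """Return the directory where pure-Python packages are installed."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("environment root: ", sys.prefix)
    print("site-packages:    ", site_packages_dir())
```

Saved under a hypothetical name such as check_env.py, this could be run once inside the container as a sanity check after changing the Dockerfile.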
Why virtual environments matter in Docker
You might have heard that virtual environments are pointless in containers because the container is already isolated. That is half true. You do not need venvs for isolation, but they are extremely useful for multi-stage builds.
Without a venv, pip install scatters packages across /usr/local/lib/python3.12/site-packages/, mixed in with system packages. Copying only your application's dependencies to the runtime stage becomes a surgical operation.
With a venv at /opt/venv, all your dependencies live in one directory. Copying them to the next stage is a single COPY --from=builder /opt/venv /opt/venv instruction.
# In the builder stage
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pip install now goes into /opt/venv
RUN pip install --no-cache-dir -r requirements.txt
Without a venv, you would have to copy the entire /usr/local/lib directory, dragging in system packages and partially defeating the purpose.
Slim vs Alpine base images
| Image | Size | C library | Compatibility |
|---|---|---|---|
| python:3.12-slim | ~130 MB | glibc | Excellent, same as standard Debian |
| python:3.12-alpine | ~50 MB | musl | Problematic, many Python packages fail |
Alpine uses musl libc instead of glibc. Most pre-compiled Python wheels are built against glibc, so on Alpine, pip has to compile them from source. This means you need to install build tools in Alpine too, erasing the size advantage. Worse, some packages have subtle runtime bugs when compiled against musl.
The recommendation: use python:3.12-slim unless you have a specific reason to use Alpine and have tested all your dependencies against musl.
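To check which C library a base image provides before committing to it, one option is the standard library's `platform.libc_ver()`. The sketch below is only a heuristic, and the helper name `uses_glibc` is an assumption made for this example: the function reports glibc when it can detect it, and any other result should be treated as "not confirmed glibc" rather than as proof of musl.

```python
import platform


def uses_glibc() -> bool:
    """Return True when the interpreter appears to be linked against glibc.

    platform.libc_ver() returns a (library, version) tuple of strings.
    On glibc-based systems the library name is "glibc".
    """
    lib, _version = platform.libc_ver()
    return lib == "glibc"


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    print("detected libc:", lib or "unknown", version)
    if not uses_glibc():
        print("glibc not detected; wheels built for glibc may not install here")
```

Running this in python:3.12-slim should report glibc. A different or empty result is a hint that pip may fall back to building packages from source.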
Layer caching in depth
Docker caches each layer and reuses it as long as the inputs have not changed. The caching rule is simple: if any layer changes, all subsequent layers are rebuilt.
Layer 1: FROM python:3.12-slim ← cached (base image)
Layer 2: COPY requirements.txt . ← cached (deps unchanged)
Layer 3: RUN pip install ... ← cached (deps unchanged)
Layer 4: COPY . . ← REBUILT (code changed)
Layer 5: CMD ... ← REBUILT (follows changed layer)
This is why COPY requirements.txt . comes before COPY . . in the Dockerfile. If you reverse the order:
Layer 1: FROM python:3.12-slim ← cached
Layer 2: COPY . . ← REBUILT (code changed)
Layer 3: RUN pip install ... ← REBUILT (follows changed layer)
Every code change triggers a full pip install. On a project with heavy dependencies like pandas or scikit-learn, that is 3-5 minutes of rebuilding that could have been cached.
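The invalidation rule can be modeled in a few lines of Python. This is a simplified illustration of the rule as stated here, not Docker's actual cache implementation; the `Layer` class and `rebuilt_layers` function are hypothetical names introduced for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    instruction: str
    inputs_changed: bool  # True if the files or command behind this layer changed


def rebuilt_layers(layers: list[Layer]) -> list[bool]:
    """Return, for each layer in order, whether it has to be rebuilt.

    A layer is rebuilt when its own inputs changed or when any
    earlier layer was rebuilt.
    """
    result = []
    invalidated = False
    for layer in layers:
        invalidated = invalidated or layer.inputs_changed
        result.append(invalidated)
    return result


# Only application code changed; requirements.txt is untouched.
good_order = [
    Layer("FROM python:3.12-slim", False),
    Layer("COPY requirements.txt .", False),
    Layer("RUN pip install ...", False),
    Layer("COPY . .", True),
]
bad_order = [
    Layer("FROM python:3.12-slim", False),
    Layer("COPY . .", True),
    Layer("RUN pip install ...", False),
]

if __name__ == "__main__":
    for name, layers in (("good order", good_order), ("bad order", bad_order)):
        print(name)
        for layer, rebuilt in zip(layers, rebuilt_layers(layers)):
            print(f"  {layer.instruction}: {'REBUILT' if rebuilt else 'cached'}")
```

In the bad order, the pip install layer is rebuilt even though its own inputs did not change, because it sits after the layer that did.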
Comparing image sizes
Here is a real comparison for a FastAPI app with fastapi, uvicorn, sqlalchemy, psycopg2, and alembic:
| Approach | Image size |
|---|---|
| python:3.12 (single stage, full image) | 1.2 GB |
| python:3.12-slim (single stage) | 350 MB |
| python:3.12-slim (multi-stage with venv) | 180 MB |
| python:3.12-alpine (multi-stage, if it compiles) | 120 MB |
The multi-stage slim build gives you an 85% size reduction compared with the full python:3.12 image, without the musl compatibility risk that comes with Alpine. That makes it the default choice for Python APIs.
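The 85% figure follows directly from the table. As a quick worked check, the sketch below recomputes the reduction for each row, assuming 1 GB is treated as 1000 MB; the function name `reduction_percent` is made up for this example.

```python
# Image sizes in MB, taken from the comparison table (1.2 GB treated as 1200 MB).
SIZES_MB = {
    "python:3.12 (single stage, full image)": 1200,
    "python:3.12-slim (single stage)": 350,
    "python:3.12-slim (multi-stage with venv)": 180,
    "python:3.12-alpine (multi-stage)": 120,
}


def reduction_percent(baseline_mb: float, new_mb: float) -> float:
    """Return how much smaller new_mb is than baseline_mb, as a percentage."""
    return (baseline_mb - new_mb) / baseline_mb * 100


if __name__ == "__main__":
    baseline = SIZES_MB["python:3.12 (single stage, full image)"]
    for name, size in SIZES_MB.items():
        saved = reduction_percent(baseline, size)
        print(f"{name}: {size} MB ({saved:.0f}% smaller than the full image)")
```

For the multi-stage slim build this works out to (1200 - 180) / 1200, which is 85%.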
Quick reference
| Concept | What to do |
|---|---|
| Multi-stage | Use FROM ... AS builder + final FROM |
| Venv in builder | RUN python -m venv /opt/venv for clean COPY |
| Runtime deps | Install only runtime libs (e.g., libpq5 not libpq-dev) |
| Base image | Use python:3.12-slim by default |
| Cache order | Copy requirements.txt before COPY . . |
# Multi-stage build for a FastAPI application
# Stage 1: Builder
FROM python:3.12-slim AS builder
WORKDIR /app
RUN apt-get update && apt-get install -y \
libpq-dev gcc \
&& rm -rf /var/lib/apt/lists/*
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Stage 2: Runtime (no build tools)
FROM python:3.12-slim
WORKDIR /app
RUN apt-get update && apt-get install -y \
libpq5 \
&& rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]