Your single-stage Dockerfile works, but the image is 800 MB. Most of that weight comes from build-time tooling (compilers, header files, pip's download cache) that your application never uses at runtime. Multi-stage builds let you install everything in a build stage, then copy only the finished output into a clean runtime stage.

Why image size matters

A bloated image is not just a storage problem. It affects every part of your workflow.

| Problem | Impact |
|---|---|
| Slow CI/CD pipelines | Pushing 1.2 GB per deploy wastes minutes |
| Slow container startup | Larger images take longer to pull from registries |
| Larger attack surface | Build tools (gcc, make) in production give attackers more to exploit |
| Higher cloud costs | Registry storage and bandwidth are billed by the GB |

The goal is to ship an image with exactly what your application needs to run, and nothing else.

The single-stage problem

Here is a typical Dockerfile for a Python API that depends on psycopg2 (the PostgreSQL driver, which has C extensions):

FROM python:3.12

WORKDIR /app

RUN apt-get update && apt-get install -y \
    libpq-dev gcc \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

This image weighs 1.1 GB. It contains gcc, libpq-dev, header files, and the full Debian toolchain. Your running application uses none of these; they were only needed at build time.

Multi-stage build pattern

A multi-stage build uses two (or more) FROM instructions. The first stage builds everything. The second stage starts fresh and copies only the results.

# Stage 1: Builder
FROM python:3.12-slim AS builder

WORKDIR /app

RUN apt-get update && apt-get install -y \
    libpq-dev gcc \
    && rm -rf /var/lib/apt/lists/*

RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime
FROM python:3.12-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    libpq5 \
    && rm -rf /var/lib/apt/lists/*

COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY . .

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

The result: 180 MB instead of 1.1 GB. The builder stage is discarded entirely: gcc, libpq-dev, and the header files never reach the final image, which contains only what the last FROM stage builds or copies in.
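To make the "only the last stage ships" idea concrete, here is a rough Python sketch that splits Dockerfile text into stages at each FROM line and reports which package names appear in the final stage. The helper names `split_stages` and `final_stage_mentions` are made up for this illustration, and the parsing is deliberately simplified (it does not handle FROM flags such as --platform); it is not how Docker itself processes the file.

```python
def split_stages(dockerfile_text):
    """Split Dockerfile text into stages; each stage starts at a FROM line."""
    stages = []
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.upper().startswith("FROM "):
            parts = line.split()
            # "FROM image AS name" gives a named stage; otherwise it is unnamed
            named = len(parts) >= 4 and parts[2].upper() == "AS"
            stages.append({"base": parts[1],
                           "name": parts[3] if named else None,
                           "lines": []})
        elif stages:
            stages[-1]["lines"].append(line)
    return stages


def final_stage_mentions(dockerfile_text, words):
    """Return the words that appear as tokens in the last stage's instructions."""
    final = split_stages(dockerfile_text)[-1]
    tokens = " ".join(final["lines"]).split()
    return [w for w in words if w in tokens]


# Abbreviated version of the multi-stage Dockerfile above
EXAMPLE = """\
FROM python:3.12-slim AS builder
RUN apt-get update && apt-get install -y \\
    libpq-dev gcc
RUN python -m venv /opt/venv

FROM python:3.12-slim
RUN apt-get update && apt-get install -y \\
    libpq5
COPY --from=builder /opt/venv /opt/venv
"""

if __name__ == "__main__":
    for stage in split_stages(EXAMPLE):
        print(stage["name"] or "(final)", "based on", stage["base"])
    print(final_stage_mentions(EXAMPLE, ["gcc", "libpq-dev", "libpq5"]))
```

Of the three package names checked, only libpq5 shows up in the final stage, which matches the intent of the pattern: build tools stay in the builder.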

Why virtual environments matter in Docker

You might have heard that virtual environments are pointless in containers because the container is already isolated. That is half true. You do not need venvs for isolation, but they are extremely useful for multi-stage builds.

Without a venv, pip install puts packages in /usr/local/lib/python3.12/site-packages/, alongside the packages that ship with the base image, and drops their console scripts in /usr/local/bin. Copying only your application's dependencies to the runtime stage becomes a surgical operation.

With a venv at /opt/venv, all your dependencies live in one directory. Copying them to the next stage is a single COPY --from=builder /opt/venv /opt/venv instruction.

# In the builder stage
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# pip install now goes into /opt/venv
RUN pip install --no-cache-dir -r requirements.txt
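One way to confirm that the interpreter in the runtime stage is really using the copied venv is to compare `sys.prefix` with `sys.base_prefix`, which differ inside a virtual environment. The sketch below is a small check you could run inside the container; the function name `in_virtualenv` is made up for this illustration.

```python
import sys


def in_virtualenv():
    """True when the running interpreter was started from a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("in venv:    ", in_virtualenv())
```

Run inside an image built from the multi-stage Dockerfile, the prefix should be /opt/venv, assuming the PATH line from the runtime stage is in place.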
AI pitfall
AI-generated Dockerfiles rarely use virtual environments. They install packages globally, which makes multi-stage builds much harder. If you ask an AI assistant for a multi-stage build, it often copies the entire /usr/local/lib directory, dragging in everything else installed there and partially defeating the purpose.

Slim vs alpine base images

| Image | Size | C library | Compatibility |
|---|---|---|---|
| python:3.12-slim | ~130 MB | glibc | Excellent, same as standard Debian |
| python:3.12-alpine | ~50 MB | musl | Problematic, many Python packages fail |

Alpine uses musl libc instead of glibc. Many pre-compiled Python wheels are built only against glibc, so when a package does not publish a musl-compatible wheel, pip on Alpine has to compile it from source. This means you need to install build tools in Alpine too, erasing the size advantage. Worse, some packages have subtle runtime bugs when compiled against musl.

The recommendation: use python:3.12-slim unless you have a specific reason to use Alpine and have tested all your dependencies against musl.
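If you are unsure which C library a given base image uses, the standard library can give a rough answer. The sketch below relies on `platform.libc_ver()`, which reports glibc and its version on glibc-based systems; on other systems (including musl-based images, as far as this sketch assumes) it does not report glibc. The function name `libc_family` is made up for this illustration.

```python
import platform


def libc_family():
    """Describe the C library the running interpreter appears to be linked against."""
    name, version = platform.libc_ver()
    if name == "glibc":
        return f"glibc {version}".strip()
    # Anything else: musl, or a platform where detection is not supported
    return "not glibc (possibly musl)"


if __name__ == "__main__":
    print(libc_family())
```

Treat the result as a hint rather than proof; testing your actual dependencies on the image remains the reliable check.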

Layer caching in depth

Docker caches each layer and reuses it as long as the inputs have not changed. The caching rule is simple: if any layer changes, all subsequent layers are rebuilt.

Layer 1: FROM python:3.12-slim           ← cached (base image)
Layer 2: COPY requirements.txt .         ← cached (deps unchanged)
Layer 3: RUN pip install ...             ← cached (deps unchanged)
Layer 4: COPY . .                        ← REBUILT (code changed)
Layer 5: CMD ...                         ← REBUILT (follows changed layer)

This is why COPY requirements.txt . comes before COPY . .. If you reverse the order:

Layer 1: FROM python:3.12-slim           ← cached
Layer 2: COPY . .                        ← REBUILT (code changed)
Layer 3: RUN pip install ...             ← REBUILT (follows changed layer)

Every code change triggers a full pip install. On a project with heavy dependencies like pandas or scikit-learn, that is 3-5 minutes of rebuilding that could have been cached.
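The caching rule can be modelled in a few lines of Python. This is a simplified sketch of the rule as stated above, not Docker's actual implementation: each layer is paired with the input it depends on (the labels "requirements.txt" and "source" are made up for this illustration), and once one layer's input has changed, that layer and everything after it is marked as rebuilt.

```python
def plan_build(layers, changed_inputs):
    """Return (instruction, status) pairs, with status 'cached' or 'rebuilt'.

    layers: list of (instruction, input_name) tuples; input_name is None
            when the instruction copies nothing from the build context.
    changed_inputs: set of input names that changed since the last build.
    """
    plan = []
    invalidated = False
    for instruction, input_name in layers:
        if input_name in changed_inputs:
            invalidated = True  # this layer and every later one must rebuild
        plan.append((instruction, "rebuilt" if invalidated else "cached"))
    return plan


GOOD_ORDER = [
    ("FROM python:3.12-slim", None),
    ("COPY requirements.txt .", "requirements.txt"),
    ("RUN pip install ...", None),
    ("COPY . .", "source"),
    ("CMD ...", None),
]

BAD_ORDER = [
    ("FROM python:3.12-slim", None),
    ("COPY . .", "source"),
    ("RUN pip install ...", None),
]

if __name__ == "__main__":
    # Simulate a build after editing application code only
    for name, order in (("good order", GOOD_ORDER), ("bad order", BAD_ORDER)):
        print(name)
        for instruction, status in plan_build(order, {"source"}):
            print(f"  {instruction:<28} {status}")
```

With only the source changed, the good ordering keeps the pip install layer cached, while the reversed ordering rebuilds it every time.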

Comparing image sizes

Here is a real comparison for a FastAPI app with fastapi, uvicorn, sqlalchemy, psycopg2, and alembic:

| Approach | Image size |
|---|---|
| python:3.12 (single stage, full image) | 1.2 GB |
| python:3.12-slim (single stage) | 350 MB |
| python:3.12-slim (multi-stage with venv) | 180 MB |
| python:3.12-alpine (multi-stage, if it compiles) | 120 MB |

The multi-stage slim build cuts the image by about 85% compared with the full single-stage image (1.2 GB down to 180 MB), without the musl compatibility risk that comes with Alpine. That is the default choice for Python APIs.
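The percentage follows directly from the table. As a quick check, the sketch below computes the reduction for each approach relative to the full single-stage image, assuming 1.2 GB is treated as 1200 MB; the dictionary keys and the function name `reduction_percent` are made up for this illustration.

```python
# Image sizes from the comparison table, in MB (1.2 GB taken as 1200 MB)
SIZES_MB = {
    "full single-stage": 1200,
    "slim single-stage": 350,
    "slim multi-stage": 180,
    "alpine multi-stage": 120,
}


def reduction_percent(baseline_mb, new_mb):
    """Size reduction relative to the baseline, rounded to a whole percent."""
    return round((1 - new_mb / baseline_mb) * 100)


if __name__ == "__main__":
    baseline = SIZES_MB["full single-stage"]
    for approach, size in SIZES_MB.items():
        print(f"{approach:<20} {size:>5} MB  {reduction_percent(baseline, size)}% smaller")
```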

Quick reference

| Concept | What to do |
|---|---|
| Multi-stage | Use FROM ... AS builder + final FROM |
| Venv in builder | RUN python -m venv /opt/venv for clean COPY |
| Runtime deps | Install only runtime libs (e.g., libpq5 not libpq-dev) |
| Base image | Use python:3.12-slim by default |
| Cache order | Copy requirements.txt before COPY . . |

The complete Dockerfile, for reference:

# Multi-stage build for a FastAPI application
# Stage 1: Builder
FROM python:3.12-slim AS builder

WORKDIR /app

RUN apt-get update && apt-get install -y \
    libpq-dev gcc \
    && rm -rf /var/lib/apt/lists/*

RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime (no build tools)
FROM python:3.12-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    libpq5 \
    && rm -rf /var/lib/apt/lists/*

COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY . .

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]