Shipping Python APIs - Testing in CI

The previous lesson covered how to set up a GitHub Actions workflow for Python. Now we go deeper into the testing layer itself. Running pytest in CI is a start, but a production-quality pipeline (the sequence of automated steps, such as install, lint, test, build, and deploy, that code passes through before reaching production) tests across multiple environments, enforces coverage standards, and includes checks that AI-generated workflows consistently skip.

Test matrix

A test matrix runs your entire test suite across multiple configurations in parallel. For Python projects, the two most useful matrix dimensions are Python version and operating system.

yaml

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
        os: [ubuntu-latest, macos-latest]
      fail-fast: false

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: "pip"
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - run: pytest

This creates six parallel jobs: three Python versions times two operating systems. Each job runs on a fresh VM with its own isolated environment.
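To see where the six jobs come from, the matrix expansion can be sketched as a cross product. This is only an illustration of the counting, not how GitHub Actions is implemented; the helper name `expand_matrix` is made up for this example.

```python
from itertools import product

# Matrix dimensions copied from the workflow above.
python_versions = ["3.11", "3.12", "3.13"]
operating_systems = ["ubuntu-latest", "macos-latest"]


def expand_matrix(versions, systems):
    """Return one {os, python-version} combination per job (hypothetical helper)."""
    return [
        {"os": os_name, "python-version": version}
        for os_name, version in product(systems, versions)
    ]


jobs = expand_matrix(python_versions, operating_systems)
for job in jobs:
    print(f"{job['os']} / Python {job['python-version']}")
print(f"{len(jobs)} jobs in total")
```

Adding a third operating system would raise the count to nine, which is worth keeping in mind when CI minutes are limited.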

Setting	What it does
`matrix.python-version`	Tests against multiple Python versions
`matrix.os`	Tests against multiple operating systems
`fail-fast: false`	Lets all jobs finish even if one fails

The fail-fast: false setting is important. By default, GitHub Actions cancels all remaining matrix jobs the moment one fails. With fail-fast: false, you see all failures at once instead of fixing them one at a time.

AI pitfall

AI hardcodes a single Python version (usually 3.11) and ubuntu-latest. If your library supports 3.11 through 3.13 or your users run macOS, you will not discover compatibility issues until they file bug reports.

Parallel jobs for speed

Beyond the matrix, you can split your CI into parallel jobs by responsibility. Linting, type checking, and testing are independent, so there is no reason to run them sequentially.

yaml

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip"
      - run: pip install ruff
      - run: ruff check .
      - run: ruff format --check .

  typecheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip"
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - run: mypy src/

  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: "pip"
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - run: pytest --cov=src --cov-report=xml

The three jobs start at the same time (the test job fans out further into one run per Python version). Linting typically finishes in seconds, type checking in roughly 10-30 seconds, and tests might take a minute or more. Total wall-clock time equals the slowest job, not the sum of all jobs.
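The difference between the two totals is simple arithmetic. The durations below are hypothetical placeholders chosen to match the rough ranges mentioned above; real numbers depend on the project and on runner availability.

```python
# Hypothetical job durations in seconds (assumed values, not measurements).
durations = {"lint": 8, "typecheck": 25, "test": 70}

sequential = sum(durations.values())  # jobs run one after another
parallel = max(durations.values())    # jobs run side by side

print(f"sequential: {sequential}s, parallel: {parallel}s")
```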

Coverage reports with pytest-cov

pytest-cov is a pytest plugin that measures code coverage: which lines of your source code are executed during tests and which are not. Coverage is reported as a percentage and can be measured by lines, branches, or functions.

# Install
pip install pytest-cov

# Run with coverage
pytest --cov=src --cov-report=term-missing

The --cov-report=term-missing flag shows exactly which lines are not covered:

Name                    Stmts   Miss  Cover   Missing
-----------------------------------------------------
src/auth.py                45      3    93%   67-69
src/routes/users.py        82     12    85%   44-48, 91-97
src/database.py            34      0   100%
-----------------------------------------------------
TOTAL                     161     15    91%
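The Cover column follows directly from Stmts and Miss. As a quick check, this sketch recomputes the percentages from the report above using plain rounding; the actual tool may round slightly differently in edge cases, and `percent_covered` is a name invented for this example.

```python
# (statements, missed) per file, copied from the report above.
report = {
    "src/auth.py": (45, 3),
    "src/routes/users.py": (82, 12),
    "src/database.py": (34, 0),
}


def percent_covered(statements, missed):
    """Covered statements as a whole-number percentage (hypothetical helper)."""
    return round(100 * (statements - missed) / statements)


for name, (statements, missed) in report.items():
    print(f"{name}: {percent_covered(statements, missed)}%")

total_statements = sum(s for s, _ in report.values())
total_missed = sum(m for _, m in report.values())
print(f"TOTAL: {percent_covered(total_statements, total_missed)}%")
```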

Enforcing a coverage threshold

You can make CI fail if coverage drops below a threshold:

yaml

- run: pytest --cov=src --cov-fail-under=80

This fails the job if overall coverage is below 80%. It prevents the slow erosion that happens when every new feature adds untested code.
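A rough sketch of that erosion, starting from the totals in the earlier report (161 statements, 15 missed) and assuming, purely for illustration, that each new feature adds 20 statements with no tests. The function names are hypothetical and only mimic the pass/fail decision, not pytest-cov's internals.

```python
FAIL_UNDER = 80  # same value as --cov-fail-under=80


def coverage_percent(statements, missed):
    """Share of statements executed by the tests, in percent."""
    return 100 * (statements - missed) / statements


def passes_threshold(statements, missed, fail_under=FAIL_UNDER):
    """Fail when total coverage is below the threshold (simplified model)."""
    return coverage_percent(statements, missed) >= fail_under


statements, missed = 161, 15
for feature in range(1, 4):
    # Assumed: every feature adds 20 statements, none of them tested.
    statements += 20
    missed += 20
    status = "pass" if passes_threshold(statements, missed) else "FAIL"
    print(f"feature {feature}: {coverage_percent(statements, missed):.1f}% -> {status}")
```

Under these assumptions the second untested feature already breaks the build, which is the point: the threshold turns a slow drift into an immediate, visible failure.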

Flag	Purpose
`--cov=src`	Measure coverage for the `src/` directory
`--cov-report=term-missing`	Show uncovered lines in terminal
`--cov-report=xml`	Generate XML report for upload to Codecov
`--cov-fail-under=80`	Fail if coverage drops below 80%

AI pitfall

AI-generated test pipelines rarely include coverage thresholds. Without a threshold, coverage silently drops from 90% to 60% to 30% over months. By the time anyone notices, there are hundreds of untested lines and no one remembers what they do.

Type checking with mypy

Mypy is a static type checker for Python. It reads your type annotations and verifies that function calls, return values, and variable assignments are consistent.

yaml

- name: Type check
  run: mypy src/ --strict

The --strict flag turns on a bundle of optional checks, including rejecting function definitions without type annotations, type-checking the bodies of untyped functions, and warning when a typed function returns Any. It is aggressive, but it catches the bugs that matter most in production: the ones where a function returns None when the caller expects a dict.
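A minimal sketch of the None-versus-dict bug, using a hypothetical in-memory `USERS` store and invented function names. The guarded version below runs cleanly; the comment marks the access that a type checker would object to if the guard were missing.

```python
from typing import Optional

# Hypothetical in-memory store standing in for a real database.
USERS = {1: {"name": "Ada", "role": "admin"}}


def find_user(user_id: int) -> Optional[dict[str, str]]:
    """Return the user record, or None when the id is unknown."""
    return USERS.get(user_id)


def user_name(user_id: int) -> str:
    user = find_user(user_id)
    # Without this check, user["name"] raises TypeError at runtime for
    # unknown ids, and mypy reports the unguarded access as an error.
    if user is None:
        return "<unknown>"
    return user["name"]


print(user_name(1))
print(user_name(2))
```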

Why mypy belongs in CI, not just your editor

Your editor's mypy plugin typically checks only the files you have open. CI runs mypy across the entire codebase in one pass. This catches cross-file issues: you change a function signature in auth.py, and mypy flags every caller in routes/, services/, and tests/ that passes the wrong arguments, as long as those directories are among the paths you pass to mypy.

src/routes/users.py:34: error: Argument "role" to "create_user" has incompatible type "str"; expected "UserRole"
src/services/email.py:12: error: Missing return statement
tests/test_auth.py:56: error: "None" has no attribute "id"
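The first error above could come from a situation like the following sketch. The names `create_user`, `role`, and `UserRole` are taken from the error message; the enum members and the function body are assumptions made for illustration.

```python
from enum import Enum


class UserRole(Enum):
    # Assumed members; the real enum is not shown in the lesson.
    ADMIN = "admin"
    MEMBER = "member"


def create_user(name: str, role: UserRole) -> dict[str, str]:
    """Build a user record; `role` must be a UserRole member."""
    return {"name": name, "role": role.value}


# Correct call: passes an enum member.
print(create_user("ada", role=UserRole.ADMIN))

# A caller that was not updated might still pass a plain string:
#     create_user("ada", role="admin")
# mypy rejects that call as an incompatible argument type, and at runtime
# it raises AttributeError because str has no `.value`.
```

The string call only fails at runtime when that code path is actually executed, whereas the type checker flags it on every CI run.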

AI pitfall

AI almost never includes mypy in CI workflows. It adds pytest and sometimes ruff, but type checking is consistently skipped. In a FastAPI project with Pydantic models, this means your CI cannot catch type mismatches between your API schemas and your database models, exactly the kind of bug that causes 500 errors in production.

Integration tests with service containers

Unit tests mock the database, replacing it with a fake that records how it was called. Integration tests use a real one. GitHub Actions lets you spin up service containers: Docker containers that run alongside your test job.

yaml

jobs:
  integration:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: test
          POSTGRES_PASSWORD: test
          POSTGRES_DB: testdb
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip"
      - run: pip install -r requirements.txt -r requirements-dev.txt
      - name: Run integration tests
        env:
          DATABASE_URL: postgresql://test:test@localhost:5432/testdb
        run: pytest tests/integration/ -v

The services block starts a PostgreSQL container before your steps run. The health-cmd option makes GitHub Actions wait until the database reports ready before the steps start. The DATABASE_URL environment variable, a value stored outside your code, tells your application where to connect.
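On the application side, reading that variable might look like the sketch below. It uses only the standard library to split the URL into parts; `database_settings` and `DEFAULT_URL` are hypothetical names, and a real project would usually hand the URL straight to its database driver or ORM.

```python
import os
from urllib.parse import urlsplit

# Fallback mirrors the value set in the workflow above; in CI the
# environment variable is already present.
DEFAULT_URL = "postgresql://test:test@localhost:5432/testdb"


def database_settings(url=None):
    """Split a DATABASE_URL into the pieces a database driver needs."""
    parts = urlsplit(url or os.environ.get("DATABASE_URL", DEFAULT_URL))
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "dbname": parts.path.lstrip("/"),
    }


print(database_settings(DEFAULT_URL))
```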

Service containers for Redis

The same pattern works for Redis, RabbitMQ, or any service available as a Docker image:

yaml

services:
  redis:
    image: redis:7
    ports:
      - 6379:6379
    options: >-
      --health-cmd "redis-cli ping"
      --health-interval 10s
      --health-timeout 5s
      --health-retries 5
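The health options in these service definitions behave roughly like a retry loop: run a probe, wait, and give up after a fixed number of failures. The sketch below models that idea in plain Python. It is an analogy, not Docker's implementation, and `wait_until_healthy` and `fake_probe` are invented for the example.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    interval: float = 10.0,
    retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it succeeds or `retries` attempts have failed."""
    for attempt in range(1, retries + 1):
        if check():
            return True
        if attempt < retries:
            sleep(interval)  # wait before the next probe
    return False


# Hypothetical probe that succeeds on the third call, standing in for
# `redis-cli ping` or `pg_isready`.
attempts = {"count": 0}


def fake_probe() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


# A no-op sleep keeps the demonstration instant.
print(wait_until_healthy(fake_probe, sleep=lambda seconds: None))
```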

What AI skips in test pipelines

Here is a direct comparison of what AI generates versus what a production pipeline needs:

What AI generates	What production needs
`pytest` (unit tests only)	Unit tests, integration tests, E2E tests
No coverage measurement	`pytest-cov` with `--cov-fail-under` threshold
No type checking	`mypy --strict` across the full codebase
No linting	`ruff check .` and `ruff format --check .`
Single Python version	Matrix across supported versions
No database in CI	Service containers for PostgreSQL, Redis
Sequential steps	Parallel jobs for lint, typecheck, test

Quick reference

Pattern	Purpose
`strategy.matrix`	Test across multiple Python versions/OS
`fail-fast: false`	See all failures, not just the first
`pytest --cov=src`	Measure test coverage
`--cov-fail-under=80`	Fail CI if coverage drops
`mypy src/ --strict`	Catch type errors across the codebase
`services: postgres:`	Spin up a real database for integration tests
Parallel jobs	Run lint, typecheck, test simultaneously
