Shipping Python APIs - PostgreSQL with Python

Create a free account to save your progress

Earn XP, track streaks, and sync your dashboard across devices.

Lesson

Your AI-generated FastAPI app works locally. It connects to PostgreSQL, runs queries, returns JSONWhat is json?A text format for exchanging data between systems. It uses key-value pairs and arrays, and every programming language can read and write it.. Then you deploy it, 30 users hit it simultaneously, and the database starts rejecting connections. The problem is not your queries, it is how the AI set up the connection. This lesson covers the drivers, connection strings, and pooling patterns you need to read and evaluate in AI-generated code.

The two PostgreSQL drivers

Python talks to PostgreSQL through driver libraries. There are two that matter.

psycopg2: the synchronous standard

psycopg2 has been the default Python PostgreSQL driver for over 20 years. It is synchronous, each database call blocks the thread until PostgreSQL responds. This is fine for scripts, CLIWhat is cli?Short for Command Line Interface. A tool you use by typing commands in the terminal instead of clicking buttons. tools, and WSGI applications (Flask, Django) where each request runs in its own thread.

import psycopg2

conn = psycopg2.connect("postgresql://user:pass@localhost:5432/mydb")
cursor = conn.cursor()
cursor.execute("SELECT id, name FROM users WHERE active = %s", (True,))
rows = cursor.fetchall()
cursor.close()
conn.close()

The %s placeholder is critical. It tells psycopg2 to parameterize the query, which prevents SQL injectionWhat is sql injection?An attack where user input is inserted directly into a database query, letting the attacker read, modify, or delete data. Parameterized queries prevent it.. Never use f-strings or string concatenation to build SQLWhat is sql?A language for querying and managing data in relational databases, letting you insert, read, update, and delete rows across tables. queries, ever.

AI pitfall

AI frequently generates queries with f-strings: cursor.execute(f"SELECT * FROM users WHERE id = {user_id}"). This is a SQL injection vulnerability. The AI does this because f-strings are the "modern" Python string formatting and the AI defaults to the most common pattern. Always look for parameterized queries with %s placeholders.

asyncpg: the async alternative

asyncpg is a pure-Python async driver built specifically for PostgreSQL. It speaks the PostgreSQL binaryWhat is binary?A ready-to-run file produced by the compiler. You can send it to any computer and it just works - no install needed. protocolWhat is protocol?An agreed-upon set of rules for how two systems communicate, defining the format of messages and the expected sequence of exchanges. directly (instead of going through libpq like psycopg2), making it 2-5x faster for most queries.

import asyncpg

conn = await asyncpg.connect("postgresql://user:pass@localhost:5432/mydb")
rows = await conn.fetch("SELECT id, name FROM users WHERE active = CODE_BLOCK", True)
await conn.close()

Two differences jump out. First, everything uses await, this is an async driver, so it works with FastAPI's event loopWhat is event loop?The mechanism that lets Node.js handle many operations on a single thread by delegating slow tasks and processing their results when ready. instead of blocking it. Second, it uses INLINE_CODE, $2 numbered placeholders instead of %s. This is PostgreSQL's native parameter syntax.

Which one to use

Factor	psycopg2	asyncpg
Sync/async	Synchronous (blocking)	Asynchronous (non-blocking)
Speed	Good	2-5x faster
Maturity	20+ years, massive ecosystem	Newer, but production-stable
Framework fit	Flask, Django	FastAPI, Starlette
SQLAlchemy support	Full (sync engine)	Full (async engine via `create_async_engine`)
Parameter style	`%s` positional	`INLINE_CODE` numbered

For FastAPI projects, use asyncpg (usually through SQLAlchemy's async engine). For Flask or Django projects, use psycopg2. If AI gives you psycopg2 in a FastAPI project, it will work but it blocks the event loop on every query, defeating the entire purpose of using an async framework.

Connection strings

Every PostgreSQL connection needs a connection string (also called a DSN). The format is a URI:

postgresql://username:password@hostname:port/database_name

Real examples:

# Local development (no password, default port)
"postgresql://postgres@localhost:5432/myapp_dev"

# Docker Compose (service name as hostname)
"postgresql://appuser:secret@db:5432/myapp"

# Production (managed database)
"postgresql://appuser:s3cur3p4ss@db-prod.us-east-1.rds.amazonaws.com:5432/myapp"

Connection strings contain credentials. They must come from environment variables, never from source code.

import os

DATABASE_URL = os.environ["DATABASE_URL"]

AI pitfall

AI almost always hardcodes the connection string directly in the code: engine = create_engine("postgresql://postgres:postgres@localhost/mydb"). This works locally but leaks credentials if the code is pushed to a public repository. It also means you cannot change the database without editing code.

Connection pooling

This is where AI-generated code causes the most production outages.

The problem: one connection per request

PostgreSQL has a hard limit on simultaneous connections (default: 100). Each connection consumes roughly 10 MB of memory on the server. Opening a connection takes 20-50ms because of the TCP handshakeWhat is handshake?The initial exchange between a client and server that establishes a connection and agrees on communication rules before data starts flowing. and PostgreSQL authenticationWhat is authentication?Verifying who a user is, typically through credentials like a password or token..

Without pooling, each incoming HTTPWhat is http?The protocol browsers and servers use to exchange web pages, API data, and other resources, defining how requests and responses are formatted. request opens a new connection, runs a query, and closes the connection. At 50 concurrent requests, that is 50 simultaneous connections. At 100, you hit the PostgreSQL limit. At 200, requests start failing with FATAL: too many connections for role.

# What AI generates - a new connection per request
@app.get("/users/{user_id}")
async def get_user(user_id: int):
    conn = await asyncpg.connect(DATABASE_URL)  # new connection every time
    user = await conn.fetchrow("SELECT * FROM users WHERE id = CODE_BLOCK", user_id)
    await conn.close()
    return user

This code works for one request. It collapses under load.

The solution: connection poolWhat is connection pool?A set of pre-opened database connections that your app reuses instead of opening and closing a new one for every request.

A connection pool maintains a set of reusable connections. When a request needs a database connection, it borrows one from the pool. When the request finishes, the connection goes back to the pool, it is not closed, just returned. The next request reuses it.

# asyncpg pool - created once at startup
pool = None

@app.on_event("startup")
async def startup():
    global pool
    pool = await asyncpg.create_pool(
        DATABASE_URL,
        min_size=5,    # keep 5 connections ready
        max_size=20    # never open more than 20
    )

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    async with pool.acquire() as conn:  # borrow from pool
        user = await conn.fetchrow("SELECT * FROM users WHERE id = CODE_BLOCK", user_id)
    return user  # connection returned to pool automatically

The async with pool.acquire() pattern is key. It borrows a connection, uses it, and returns it to the pool when the with block exits, even if an exception occurs. No connections leak.

SQLAlchemy engine pooling

If you use SQLAlchemy (and you should for most projects), the engine handles pooling internally:

from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    "postgresql+asyncpg://user:pass@localhost:5432/mydb",
    pool_size=5,       # persistent connections
    max_overflow=10,   # temporary connections under burst load
    pool_recycle=3600, # recycle connections after 1 hour
    pool_timeout=30    # wait 30s for a connection before raising an error
)

Parameter	What it does	Good default
`pool_size`	Number of connections kept permanently open	5
`max_overflow`	Extra connections allowed during burst traffic	10
`pool_recycle`	Seconds before a connection is replaced (avoids stale connections)	3600
`pool_timeout`	Seconds to wait for a free connection before raising an error	30

With pool_size=5 and max_overflow=10, your application uses at most 15 connections. That leaves room for other services, admin tools, and migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync. scripts to also connect to the same database.

Where AI creates connection leaks

The most common AI mistake is creating the engine inside the function:

# AI-generated - creates a NEW engine (and pool) per request
@app.get("/users")
async def get_users():
    engine = create_async_engine(DATABASE_URL)  # new pool every time!
    async with engine.begin() as conn:
        result = await conn.execute(text("SELECT * FROM users"))
    return result.fetchall()

Each call creates a new pool with 5 connections that are never cleaned up. After 20 requests, you have 100 orphaned connections. After 40, PostgreSQL refuses new connections. The fix is always the same: create the engine once at moduleWhat is module?A self-contained file of code with its own scope that explicitly exports values for other files to import, preventing name collisions. level, and pass it to endpointWhat is endpoint?A specific URL path on a server that handles a particular type of request, like GET /api/users. functions through dependencyWhat is dependency?A piece of code written by someone else that your project needs to work. Think of it as a building block you import instead of writing yourself. injection or a shared module.

AI pitfall

If you ask AI to "add a database query to this endpoint," it will often put the entire setup, engine creation, session factory, everything, inside the endpoint function. The AI optimizes for self-contained code snippets, not for application architecture. Always check that the engine is created exactly once.

Done

Complete & Next

Create a free account to save your progress

Essential to know

The two PostgreSQL drivers

psycopg2: the synchronous standard

asyncpg: the async alternative

Which one to use

Connection strings

Connection pooling

The problem: one connection per request

The solution: connection poolWhat is connection pool?A set of pre-opened database connections that your app reuses instead of opening and closing a new one for every request. Ask AI for more

SQLAlchemy engine pooling

Where AI creates connection leaks

The solution: connection poolWhat is connection pool?A set of pre-opened database connections that your app reuses instead of opening and closing a new one for every request.