You asked AI to generate your database models. It produced Python classes that look clean, have type annotations, and even include relationships. The code runs without errors. But the schemaWhat is schema?A formal definition of the structure your data must follow - which fields exist, what types they have, and which are required. it created will cause slow queries at 1,000 rows, data integrity violations at 10,000, and orphaned records at 100,000. This lesson teaches you to read SQLAlchemy models the way a database engineer does.
Declarative models
SQLAlchemy's ORMWhat is orm?Object-Relational Mapping - a library that lets you interact with a database using your programming language's objects instead of writing raw SQL. lets you define database tables as Python classes. The modern approach uses DeclarativeBase:
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
from sqlalchemy import String, Text, DateTime, Boolean, Integer, ForeignKey
from datetime import datetime
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False)
name: Mapped[str] = mapped_column(String(100), nullable=False)
bio: Mapped[str | None] = mapped_column(Text, nullable=True)
is_active: Mapped[bool] = mapped_column(Boolean, default=True)
created_at: Mapped[datetime] = mapped_column(
DateTime, default=datetime.utcnow
)Each class attribute maps to a database column. The Mapped[type] annotation tells both Python and SQLAlchemy what type the column holds. mapped_column() configures the database-level details: type, constraints, defaults.
Column types that matter
| SQLAlchemy type | PostgreSQL type | When to use |
|---|---|---|
Integer | INTEGER | IDs, counts, small numbers |
BigInteger | BIGINT | Large IDs, timestamps as integers |
String(n) | VARCHAR(n) | Short text with a known max length (names, emails) |
Text | TEXT | Long text with no practical limit (comments, articles) |
Boolean | BOOLEAN | True/false flags |
DateTime | TIMESTAMP | Dates and times |
Float | FLOAT | Approximate numbers (avoid for money) |
Numeric(10, 2) | NUMERIC(10,2) | Exact decimals (money, precise measurements) |
String(50) for fields that need Text, comment bodies, descriptions, bio fields. A 50-character limit on a comment field means users cannot write more than a tweet. AI picks short String lengths because its training data contains many tutorial examples with arbitrary limits. Always check that the column type matches the real-world data it will store.Relationships
Relationships define how tables connect. This is where AI-generated code gets the most things wrong.
One-to-many
A user has many posts. Each post belongs to one user.
from sqlalchemy.orm import relationship
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
email: Mapped[str] = mapped_column(String(255), unique=True, nullable=False)
name: Mapped[str] = mapped_column(String(100), nullable=False)
posts: Mapped[list["Post"]] = relationship(
back_populates="author",
cascade="all, delete-orphan"
)
class Post(Base):
__tablename__ = "posts"
id: Mapped[int] = mapped_column(primary_key=True)
title: Mapped[str] = mapped_column(String(200), nullable=False)
content: Mapped[str] = mapped_column(Text, nullable=False)
user_id: Mapped[int] = mapped_column(
ForeignKey("users.id"), nullable=False, index=True
)
author: Mapped["User"] = relationship(back_populates="posts")Three things to check in every relationship:
1. ForeignKey has an indexWhat is index?A data structure the database maintains alongside a table so it can find rows by specific columns quickly instead of scanning everything.. The user_id column has index=True. Without it, every query that joins users and posts does a full table scanWhat is full table scan?Reading every row in a database table because no suitable index exists for the query - becomes very slow on large tables. on the posts table. At 10,000 posts, that query takes seconds instead of milliseconds. PostgreSQL does not automatically index foreign keys (unlike MySQL).
2. nullable=False on required foreign keys. If user_id is nullable, you can create posts that belong to no user, orphan records. If your business logic requires every post to have an author, enforce it at the database level.
3. cascade="all, delete-orphan" on the parent side. This tells SQLAlchemy what happens when you delete a user. With delete-orphan, deleting a user also deletes all their posts. Without it, deleting a user leaves orphan posts with a user_id pointing to a user that no longer exists.
index=True on foreign key columns. It also often omits cascade entirely, which means "save-update, merge" by default, deleting a parent leaves orphan children in the database. If you see relationship() without cascade on the parent side, it is almost certainly a bug.Many-to-many
Posts can have many tags. Tags can belong to many posts. This requires an association table.
from sqlalchemy import Table, Column
post_tags = Table(
"post_tags",
Base.metadata,
Column("post_id", ForeignKey("posts.id"), primary_key=True),
Column("tag_id", ForeignKey("tags.id"), primary_key=True),
)
class Tag(Base):
__tablename__ = "tags"
id: Mapped[int] = mapped_column(primary_key=True)
name: Mapped[str] = mapped_column(String(50), unique=True, nullable=False)
posts: Mapped[list["Post"]] = relationship(
secondary=post_tags, back_populates="tags"
)
class Post(Base):
__tablename__ = "posts"
# ... other columns ...
tags: Mapped[list["Tag"]] = relationship(
secondary=post_tags, back_populates="posts"
)The secondary=post_tags parameter tells SQLAlchemy to use the association table for the joinWhat is join?A SQL operation that combines rows from two or more tables based on a shared column, letting you query related data in one request.. Both sides of the relationship reference the same table, and back_populates links them together.
Evaluating AI-generated schemas
When AI generates a set of models, run through this checklist:
The model review checklist
| Check | What to look for | Why it matters |
|---|---|---|
| Foreign key indexes | index=True on every ForeignKey column | Without it, joins do full table scans |
| Nullable constraints | nullable=False on required fields | Prevents garbage data from entering the database |
| String lengths | String(n) appropriate for the data | Too short truncates user input, too long wastes storage |
| Cascade behavior | cascade="all, delete-orphan" on parent relationships | Prevents orphan records when parents are deleted |
| Unique constraints | unique=True on fields that must be unique (email, username, slug) | Prevents duplicate entries |
| Default values | default= or server_default= for fields with sensible defaults | Reduces required fields in API calls |
| Timestamps | created_at and updated_at on every table | Essential for debugging, auditing, and sorting |
Common AI mistakes in models
Here is a model AI might generate for a blog application:
class Comment(Base):
__tablename__ = "comments"
id: Mapped[int] = mapped_column(primary_key=True)
content: Mapped[str] = mapped_column(String(50))
post_id: Mapped[int] = mapped_column(ForeignKey("posts.id"))
user_id: Mapped[int] = mapped_column(ForeignKey("users.id"))Four issues in four columns:
String(50)for comment content. Comments needText, not a 50-character varchar. This will silently truncate comments or raise errors depending on the database engine.- No
index=Trueonpost_idoruser_id. Listing comments for a post (WHERE post_id = ?) will scan every row in the comments table. - No
nullable=False. Bothpost_idanduser_iddefault to nullable, meaning you can create comments attached to nothing. - No timestamps. When was this comment created? There is no way to sort comments chronologically.
The fixed version:
class Comment(Base):
__tablename__ = "comments"
id: Mapped[int] = mapped_column(primary_key=True)
content: Mapped[str] = mapped_column(Text, nullable=False)
post_id: Mapped[int] = mapped_column(
ForeignKey("posts.id"), nullable=False, index=True
)
user_id: Mapped[int] = mapped_column(
ForeignKey("users.id"), nullable=False, index=True
)
created_at: Mapped[datetime] = mapped_column(
DateTime, default=datetime.utcnow, nullable=False
)Same structure, but now the database enforces data integrity instead of hoping the application layer handles it.
SessionWhat is session?A server-side record that tracks a logged-in user. The browser holds only a session ID in a cookie, and the server looks up the full data on each request. management
SQLAlchemy uses sessions to manage database operations. A session is a workspace that tracks changes to your objects and flushes them to the database in a single transactionWhat is transaction?A group of database operations that either all succeed together or all fail together, preventing partial updates..
from sqlalchemy.orm import Session
from sqlalchemy import select
def get_user_posts(db: Session, user_id: int):
stmt = select(Post).where(Post.user_id == user_id)
return db.scalars(stmt).all()
def create_post(db: Session, user_id: int, title: str, content: str):
post = Post(user_id=user_id, title=title, content=content)
db.add(post)
db.commit()
db.refresh(post) # reload from DB to get generated id, timestamps
return postIn FastAPI, sessions are typically managed through dependencyWhat is dependency?A piece of code written by someone else that your project needs to work. Think of it as a building block you import instead of writing yourself. injection:
from sqlalchemy.orm import sessionmaker
SessionLocal = sessionmaker(bind=engine)
def get_db():
db = SessionLocal()
try:
yield db
finally:
db.close()
@app.get("/posts")
def list_posts(db: Session = Depends(get_db)):
return db.scalars(select(Post)).all()The yield and finally pattern guarantees the session closes even if the endpointWhat is endpoint?A specific URL path on a server that handles a particular type of request, like GET /api/users. raises an exception. Without it, sessions leak and eventually exhaust the connection poolWhat is connection pool?A set of pre-opened database connections that your app reuses instead of opening and closing a new one for every request..
db = SessionLocal() inside endpoint functions without a finally: db.close() block. The session works for successful requests but leaks on errors. The dependency injection pattern with yield is the correct approach, it handles cleanup automatically.