Pydantic models validate data coming in. But they also control data going out. When your API returns a response, you don't want to send the user's password hash, internal database IDs, or admin flags. Serialization, converting models to dicts and JSON, is how you control exactly what your API exposes.
This is one of the most common security gaps in AI-generated FastAPI code. AI tends to create one model for everything (input, database, and output) and return it directly. The result: your API leaks every field in the model, including ones that should never leave the server.
Converting models to data
Pydantic v2 provides two methods for serialization:
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    password_hash: str
    is_admin: bool = False

user = User(
    id=1, name="Alice", email="alice@example.com",
    password_hash="$2b$12$abc...", is_admin=False
)

# To a Python dict
user.model_dump()
# {'id': 1, 'name': 'Alice', 'email': 'alice@example.com',
#  'password_hash': '$2b$12$abc...', 'is_admin': False}

# To a JSON string
user.model_dump_json()
# '{"id":1,"name":"Alice","email":"alice@example.com",
#   "password_hash":"$2b$12$abc...","is_admin":false}'

Both methods accept parameters to filter the output:
# Exclude specific fields
user.model_dump(exclude={"password_hash", "is_admin"})
# {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}

# Include only specific fields
user.model_dump(include={"id", "name"})
# {'id': 1, 'name': 'Alice'}

# Exclude fields with None values
# (this user has no None values, so nothing is dropped)
user.model_dump(exclude_none=True)

# Exclude fields that still have their default value
# (is_admin is omitted here because it equals its default, False)
user.model_dump(exclude_defaults=True)

Pydantic v1 used .dict() and .json(). These are deprecated in v2. The modern equivalents are .model_dump() and .model_dump_json(). If you see the old names in AI-generated code, update them.

Field aliasing
Sometimes your Python field names don't match the JSON keys your API needs to send or receive. Pydantic handles this with aliases.
from pydantic import BaseModel, Field

class MongoDocument(BaseModel):
    id: str = Field(alias="_id")
    name: str
    created_at: str = Field(alias="createdAt")

# Create from aliased keys (like a MongoDB document)
doc = MongoDocument(_id="abc123", name="Test", createdAt="2024-01-01")

# Serialize with Python names (default)
doc.model_dump()
# {'id': 'abc123', 'name': 'Test', 'created_at': '2024-01-01'}

# Serialize with aliased names (for API response)
doc.model_dump(by_alias=True)
# {'_id': 'abc123', 'name': 'Test', 'createdAt': '2024-01-01'}

This is essential when your API needs camelCase JSON responses but your Python code uses snake_case. You can set this globally:
from pydantic import BaseModel, ConfigDict

class CamelModel(BaseModel):
    model_config = ConfigDict(
        # Convert snake_case field names to camelCase aliases
        alias_generator=lambda s: "".join(
            word.capitalize() if i else word
            for i, word in enumerate(s.split("_"))
        ),
        populate_by_name=True  # Allow both alias and field name
    )

class UserResponse(CamelModel):
    user_name: str
    email_address: str
    created_at: str

user = UserResponse(user_name="alice", email_address="a@b.com", created_at="2024-01-01")
user.model_dump(by_alias=True)
# {'userName': 'alice', 'emailAddress': 'a@b.com', 'createdAt': '2024-01-01'}

Separate input and output models
This is the most important pattern in this lesson. Never use the same model for input and output. Here's why:
# BAD - one model for everything
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class User(BaseModel):
    id: int
    name: str
    email: str
    password_hash: str
    is_admin: bool
    created_at: str

@app.post("/users")
async def create_user(user: User):
    # Problem 1: client must send id, password_hash, is_admin, created_at
    # Problem 2: response includes password_hash
    return user

The fix: separate models for separate purposes.
from pydantic import BaseModel, Field

# Input model - what the client sends
class UserCreate(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    email: str
    password: str = Field(min_length=8)

# Database model - internal representation
class UserDB(BaseModel):
    id: int
    name: str
    email: str
    password_hash: str
    is_admin: bool = False
    created_at: str

# Output model - what the API returns
class UserOut(BaseModel):
    id: int
    name: str
    created_at: str

Three models, three purposes. The input model has validation constraints. The database model has all fields. The output model only includes what the client should see: no password hash, no admin flag, no email (unless you want to expose it).
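As a rough sketch of how the three models might hand data to each other, the snippet below builds a UserDB from a UserCreate and then narrows it to UserOut. The to_db helper, the hard-coded id and timestamp, and the SHA-256 stand-in for password hashing are illustrative assumptions, not part of this lesson's code; a real application would use a dedicated password hasher such as bcrypt.

```python
import hashlib

from pydantic import BaseModel, Field


class UserCreate(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    email: str
    password: str = Field(min_length=8)


class UserDB(BaseModel):
    id: int
    name: str
    email: str
    password_hash: str
    is_admin: bool = False
    created_at: str


class UserOut(BaseModel):
    id: int
    name: str
    created_at: str


def to_db(user: UserCreate, new_id: int, created_at: str) -> UserDB:
    # Hypothetical helper. SHA-256 is only a placeholder here,
    # it is NOT a safe way to hash passwords.
    digest = hashlib.sha256(user.password.encode()).hexdigest()
    return UserDB(
        id=new_id,
        name=user.name,
        email=user.email,
        password_hash=digest,
        created_at=created_at,
    )


incoming = UserCreate(name="Alice", email="alice@example.com", password="s3cretpass")
db_user = to_db(incoming, new_id=1, created_at="2024-01-01T00:00:00")

# Pydantic ignores unknown keys by default, so validating the full dump
# against UserOut keeps only id, name, and created_at.
public = UserOut.model_validate(db_user.model_dump())
print(public.model_dump())
# {'id': 1, 'name': 'Alice', 'created_at': '2024-01-01T00:00:00'}
```

The plain-text password only ever exists on UserCreate, and the hash only on UserDB, so neither can reach the client through UserOut.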
Response models in FastAPI
FastAPI's response_model parameter on route decorators is the enforcement mechanism. It filters the response automatically:
@app.post("/users", response_model=UserOut)
async def create_user(user: UserCreate):
    # Hash password, save to DB, get back a UserDB...
    db_user = UserDB(
        id=1, name=user.name, email=user.email,
        password_hash="$2b$12$...", is_admin=False,
        created_at="2024-01-01T00:00:00"
    )
    # Return the full DB object - FastAPI strips it down to UserOut
    return db_user

Even though the function returns a UserDB (which has password_hash and is_admin), FastAPI filters the response through UserOut. The client only receives id, name, and created_at.
How this works internally:
- Your function returns data (a dict, a model, or any object)
- FastAPI validates that data against response_model
- Any fields not in response_model are stripped
- The filtered data is serialized to JSON and sent
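The steps above can be approximated in plain Pydantic. This is a simplified sketch, not FastAPI's actual implementation; filter_response is a hypothetical helper name, and the two models are trimmed-down versions of the ones in this lesson.

```python
from typing import Any

from pydantic import BaseModel


class UserDB(BaseModel):
    id: int
    name: str
    password_hash: str
    is_admin: bool = False


class UserOut(BaseModel):
    id: int
    name: str


def filter_response(data: Any, response_model: type[BaseModel]) -> str:
    # Validate the returned data against the response model.
    # from_attributes=True accepts model instances and plain objects as well as dicts.
    validated = response_model.model_validate(data, from_attributes=True)
    # Only the response model's fields survive validation; serialize them to JSON.
    return validated.model_dump_json()


db_user = UserDB(id=1, name="Alice", password_hash="not-a-real-hash", is_admin=True)

print(filter_response(db_user, UserOut))
# {"id":1,"name":"Alice"}
print(filter_response({"id": 2, "name": "Bob", "password_hash": "x"}, UserOut))
# {"id":2,"name":"Bob"}
```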
AI-generated code often skips response_model entirely and returns the database model directly. The endpoint works, the tests pass, but the API leaks password_hash, is_admin, and every internal field to anyone who calls it. Always check: does the endpoint have a response_model, and does that model exclude sensitive fields?

Response model gotchas
There are two subtle issues you'll hit with response_model:
Extra fields that leak
If your return data has a field with the same name as a response_model field but a different meaning, the wrong data leaks through:
class UserOut(BaseModel):
    id: int
    name: str
    email: str  # Intentionally included

class UserDB(BaseModel):
    id: int
    name: str
    email: str
    password_hash: str
    internal_notes: str  # Stripped by response_model -- good

This works fine. But what if the email field on UserOut is meant to show a masked version? The raw email from UserDB leaks through because response_model just does field matching, not transformation.
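One way to get a transformation rather than plain field matching is to put it on the output model itself. The sketch below uses Pydantic v2's field_serializer decorator; the masking rule (first character plus ***) is an arbitrary choice for illustration, not something this lesson prescribes.

```python
from pydantic import BaseModel, field_serializer


class UserOut(BaseModel):
    id: int
    name: str
    email: str

    @field_serializer("email")
    def mask_email(self, value: str) -> str:
        # Keep the first character of the local part and the full domain.
        local, _, domain = value.partition("@")
        return f"{local[:1]}***@{domain}"


out = UserOut(id=1, name="Alice", email="alice@example.com")

print(out.model_dump())
# {'id': 1, 'name': 'Alice', 'email': 'a***@example.com'}
print(out.email)
# alice@example.com
```

The serializer only changes what is dumped. The attribute on the instance still holds the raw address, so the masking belongs on the output model, not the database model.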
Return type annotation vs response_model
In modern FastAPI, you can use the return type annotation instead of response_model:
# Using response_model (explicit)
@app.get("/users/{id}", response_model=UserOut)
async def get_user(id: int):
    ...

# Using return type (modern style)
@app.get("/users/{id}")
async def get_user(id: int) -> UserOut:
    ...

Both work. But AI sometimes uses both at the same time, and they can conflict if they specify different models. When both are present, response_model wins. Prefer one or the other, not both.
Excluding fields with model config
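Instead of separate models, you can use Pydantic's built-in exclusion, for example by marking a field with Field(exclude=True). This is less common but useful for simple cases. Note that the model_config below does not exclude anything on its own; json_schema_extra only attaches an example payload to the generated schema: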
from pydantic import BaseModel, ConfigDict

class UserOut(BaseModel):
    model_config = ConfigDict(
        json_schema_extra={
            "examples": [{"id": 1, "name": "Alice", "created_at": "2024-01-01"}]
        }
    )
    id: int
    name: str
    created_at: str

For most APIs, separate input/output models are the cleanest approach. They're explicit, easy to audit, and make it obvious what data flows where.
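For the simple cases where a single model is enough, field-level exclusion could look like the sketch below, which uses Pydantic's Field(exclude=True). The Account model is a hypothetical example, not one of this lesson's models.

```python
from pydantic import BaseModel, Field


class Account(BaseModel):
    id: int
    name: str
    # exclude=True keeps the field on the model but drops it
    # from model_dump() and model_dump_json().
    password_hash: str = Field(exclude=True)


account = Account(id=1, name="Alice", password_hash="not-a-real-hash")

print(account.model_dump())
# {'id': 1, 'name': 'Alice'}
print(account.model_dump_json())
# {"id":1,"name":"Alice"}
```

The trade-off is that the field is dropped every time the model is dumped, including when you serialize it for storage, which is one reason separate models tend to hold up better as an API grows.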
FastAPI's response_model_exclude and response_model_include parameters let you fine-tune filtering per endpoint. But they're fragile: if you rename a field in the model, the string in response_model_exclude doesn't update. Separate models are safer for long-term maintenance.
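The same kind of fragility is easy to reproduce at the Pydantic level with model_dump(exclude=...), which also identifies fields by string name. In this sketch the field has been renamed from password_hash to the hypothetical pw_hash, while the exclude set still uses the old name.

```python
from pydantic import BaseModel


class UserDB(BaseModel):
    id: int
    name: str
    pw_hash: str  # renamed; it used to be password_hash


user = UserDB(id=1, name="Alice", pw_hash="not-a-real-hash")

# The stale name matches no field, and no error is raised,
# so the renamed field stays in the output.
leaked = user.model_dump(exclude={"password_hash"})
print("pw_hash" in leaked)
# True
```

Nothing fails loudly here, so a rename like this can go unnoticed unless a test asserts on the exact set of keys in the response.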