Writing a migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync. is straightforward. Running one safely in production, on a live database with real users, is where things get serious. This lesson covers the habits and patterns that separate careful engineers from ones who cause outages.
The golden rule: never edit a merged migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync.
Once a migration has run in any shared environment, including a teammate's laptop, not just production, treat it as immutable. If you need to change something, create a new migration.
# Bad workflow
git commit migrations/001_create_users.sql
git push
# Teammates run it
# You realise you forgot a column
# You edit 001_create_users.sql and push again
# Teammates: migration already ran, tool won't re-run it
# Result: your schema and theirs are now different
# Good workflow
git commit migrations/001_create_users.sql
git push
# Teammates run it
# You realise you forgot a column
# You create 002_add_missing_column.sql
# Teammates run the new migration
# Result: everyone has the same schemaSafe schemaWhat is schema?A formal definition of the structure your data must follow - which fields exist, what types they have, and which are required. changes in production
The most common cause of migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync.-related downtime is a "breaking changeWhat is breaking change?A modification to an API that causes existing code using it to stop working, such as renaming a field or changing a response format.": removing or renaming something while the running application still depends on it.
Why breaking changes are dangerous
-- You run this migration while the app is live
ALTER TABLE users DROP COLUMN old_field;
-- The deployed app code still reads user.old_field
-- Every request that touches users now crashesThere is a small but real window between when the migration runs and when new code is deployed where your app can fail. For high-traffic apps, that window can cause hundreds of errors.
The expand-contract pattern
Think of it like replacing a load-bearing wall: you put up the new wall first, transfer the load, then remove the old one. Applied to databases:
Expand: add the new structure alongside the old:
-- Migration 1: add the new column, copy existing data
ALTER TABLE users ADD COLUMN email_address TEXT;
UPDATE users SET email_address = email;Overlap: deploy code that writes to both columns and reads from the new one:
// Transitional app code: write both, read from new column
await db.query(
'INSERT INTO users (email, email_address) VALUES (?, ?)',
[value, value]
);
const user = await db.query('SELECT email_address FROM users WHERE id = ?', [id]);Contract: once the old column is unused, drop it in a new migration:
-- Migration 2: safe to drop now
ALTER TABLE users DROP COLUMN email;Multi-step deployment checklist
| Step | Action | Safe to do? |
|---|---|---|
| 1 | Add new column / table (additive) | Yes, old code ignores it |
| 2 | Deploy code that uses the new structure | Yes, new structure exists |
| 3 | Stop writing to the old structure | Yes, new code is live |
| 4 | Drop old column / table | Yes, nothing uses it anymore |
Handling large tables
On a table with millions of rows, a simple ALTER TABLE can lock the entire table for minutes, making your app unresponsive.
Add a column without a default
-- Slow: rewrites every row to fill in the default value
ALTER TABLE huge_table ADD COLUMN status TEXT DEFAULT 'active' NOT NULL;
-- Fast: adds the column as NULL, then backfill in batches
ALTER TABLE huge_table ADD COLUMN status TEXT;
UPDATE huge_table SET status = 'active' WHERE id BETWEEN 1 AND 10000;
UPDATE huge_table SET status = 'active' WHERE id BETWEEN 10001 AND 20000;
-- Continue until all rows are backfilledCreate indexes concurrently (PostgreSQL)
-- Locks the table while building the index
CREATE INDEX idx_users_email ON users(email);
-- Builds the index without locking - takes longer but app stays up
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);The production migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync. checklist
Before you run any migration in production, work through this list:
# 1. Back up the database
pg_dump myapp > backup-$(date +%Y%m%d-%H%M%S).sql
# 2. Test the migration on a staging environment seeded with production data
pg_restore backup.sql -d staging
npm run migrate # on staging
# 3. Review the exact SQL that will run
cat migrations/latest.sql
# Look for: DROP statements, UPDATE without WHERE, large table changes
# 4. Know your rollback plan before you start
# Option A: npm run migrate:rollback
# Option B: psql myapp < backup.sql
# 5. Estimate impact - check table size for slow operations
SELECT pg_size_pretty(pg_total_relation_size('users'));During the migration:
# Run the migration
npm run migrate
# Verify it recorded correctly
psql -d myapp -c "SELECT name, applied_at FROM migrations ORDER BY id DESC LIMIT 3;"
# Smoke-test the app
curl https://myapp.com/healthAfter the migration, monitor your error logs for at least 10 minutes before declaring success.
Handling migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync. conflicts on a team
When two developers create migrations at the same time, you can end up with duplicate timestamps:
Developer A: 20240115120000_add_avatar.sql
Developer B: 20240115120000_add_bio.sqlThe fix is simple, whoever merges second renames their file:
# Developer B, after pulling Developer A's work:
mv migrations/20240115120000_add_bio.sql \
migrations/20240115120001_add_bio.sql
git add migrations/
git commit --amend # or a new commitCommon mistakes to avoid
-- Mistake 1: Adding a foreign key without an index
-- Bad
ALTER TABLE posts ADD COLUMN user_id INTEGER REFERENCES users(id);
-- Every query filtered by user_id is now a full table scan
-- Good: add the index in the same migration
ALTER TABLE posts ADD COLUMN user_id INTEGER REFERENCES users(id);
CREATE INDEX idx_posts_user_id ON posts(user_id);
-- Mistake 2: Dropping a table without a safety net
-- Bad
DROP TABLE old_data; -- gone forever
-- Good: rename it first, drop it a week later after confirming nothing broke
ALTER TABLE old_data RENAME TO old_data_backup_20240115;Quick reference
| Situation | Recommended approach |
|---|---|
| Forgot a column in a deployed migration | Create a new migration to add it |
| Renaming a column in production | Use expand-contract (add, copy, deploy, drop) |
| Adding a column to a huge table | Add nullable, then backfill in batches |
| Creating an index on a live PostgreSQL DB | CREATE INDEX CONCURRENTLY |
| Timestamp conflict with a teammate | Rename your file with a later timestamp |
| Something goes wrong post-deploy | npm run migrate:rollback or restore from backup |