Migration Best Practices: Production Safety

Create a free account to save your progress

Earn XP, track streaks, and sync your dashboard across devices.

Lesson

Writing a migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync. is straightforward. Running one safely in production, on a live database with real users, is where things get serious. This lesson covers the habits and patterns that separate careful engineers from ones who cause outages.

The golden rule: never edit a merged migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync.

Once a migration has run in any shared environment, including a teammate's laptop, not just production, treat it as immutable. If you need to change something, create a new migration.

# Bad workflow
git commit migrations/001_create_users.sql
git push
# Teammates run it
# You realise you forgot a column
# You edit 001_create_users.sql and push again
# Teammates: migration already ran, tool won't re-run it
# Result: your schema and theirs are now different

# Good workflow
git commit migrations/001_create_users.sql
git push
# Teammates run it
# You realise you forgot a column
# You create 002_add_missing_column.sql
# Teammates run the new migration
# Result: everyone has the same schema

The migration tool decides whether to run a migration based on its filename (or a checksum). If you edit the content after it has already been recorded as "done", the tool will not notice, but your schema will quietly diverge from everyone else's.

Safe schemaWhat is schema?A formal definition of the structure your data must follow - which fields exist, what types they have, and which are required. changes in production

The most common cause of migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync.-related downtime is a "breaking changeWhat is breaking change?A modification to an API that causes existing code using it to stop working, such as renaming a field or changing a response format.": removing or renaming something while the running application still depends on it.

Why breaking changes are dangerous

-- You run this migration while the app is live
ALTER TABLE users DROP COLUMN old_field;
-- The deployed app code still reads user.old_field
-- Every request that touches users now crashes

There is a small but real window between when the migration runs and when new code is deployed where your app can fail. For high-traffic apps, that window can cause hundreds of errors.

The expand-contract pattern

Think of it like replacing a load-bearing wall: you put up the new wall first, transfer the load, then remove the old one. Applied to databases:

Expand: add the new structure alongside the old:

-- Migration 1: add the new column, copy existing data
ALTER TABLE users ADD COLUMN email_address TEXT;
UPDATE users SET email_address = email;

Overlap: deploy code that writes to both columns and reads from the new one:

// Transitional app code: write both, read from new column
await db.query(
  'INSERT INTO users (email, email_address) VALUES (?, ?)',
  [value, value]
);
const user = await db.query('SELECT email_address FROM users WHERE id = ?', [id]);

Contract: once the old column is unused, drop it in a new migration:

-- Migration 2: safe to drop now
ALTER TABLE users DROP COLUMN email;

Multi-step deployment checklist

Step	Action	Safe to do?
1	Add new column / table (additive)	Yes, old code ignores it
2	Deploy code that uses the new structure	Yes, new structure exists
3	Stop writing to the old structure	Yes, new code is live
4	Drop old column / table	Yes, nothing uses it anymore

Handling large tables

On a table with millions of rows, a simple ALTER TABLE can lock the entire table for minutes, making your app unresponsive.

Add a column without a default

-- Slow: rewrites every row to fill in the default value
ALTER TABLE huge_table ADD COLUMN status TEXT DEFAULT 'active' NOT NULL;

-- Fast: adds the column as NULL, then backfill in batches
ALTER TABLE huge_table ADD COLUMN status TEXT;

UPDATE huge_table SET status = 'active' WHERE id BETWEEN 1 AND 10000;
UPDATE huge_table SET status = 'active' WHERE id BETWEEN 10001 AND 20000;
-- Continue until all rows are backfilled

In PostgreSQL 11+, adding a column with a constant default no longer rewrites the table. But for older versions and other databases, this batching approach is the safe path.

Create indexes concurrently (PostgreSQL)

-- Locks the table while building the index
CREATE INDEX idx_users_email ON users(email);

-- Builds the index without locking - takes longer but app stays up
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

The production migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync. checklist

Before you run any migration in production, work through this list:

# 1. Back up the database
pg_dump myapp > backup-$(date +%Y%m%d-%H%M%S).sql

# 2. Test the migration on a staging environment seeded with production data
pg_restore backup.sql -d staging
npm run migrate  # on staging

# 3. Review the exact SQL that will run
cat migrations/latest.sql
# Look for: DROP statements, UPDATE without WHERE, large table changes

# 4. Know your rollback plan before you start
# Option A: npm run migrate:rollback
# Option B: psql myapp < backup.sql

# 5. Estimate impact - check table size for slow operations
SELECT pg_size_pretty(pg_total_relation_size('users'));

During the migration:

# Run the migration
npm run migrate

# Verify it recorded correctly
psql -d myapp -c "SELECT name, applied_at FROM migrations ORDER BY id DESC LIMIT 3;"

# Smoke-test the app
curl https://myapp.com/health

After the migration, monitor your error logs for at least 10 minutes before declaring success.

Handling migrationWhat is migration?A versioned script that changes your database structure (add a column, create a table) so every developer and server stays in sync. conflicts on a team

When two developers create migrations at the same time, you can end up with duplicate timestamps:

Developer A: 20240115120000_add_avatar.sql
Developer B: 20240115120000_add_bio.sql

The fix is simple, whoever merges second renames their file:

# Developer B, after pulling Developer A's work:
mv migrations/20240115120000_add_bio.sql \
   migrations/20240115120001_add_bio.sql

git add migrations/
git commit --amend  # or a new commit

Common mistakes to avoid

-- Mistake 1: Adding a foreign key without an index
-- Bad
ALTER TABLE posts ADD COLUMN user_id INTEGER REFERENCES users(id);
-- Every query filtered by user_id is now a full table scan

-- Good: add the index in the same migration
ALTER TABLE posts ADD COLUMN user_id INTEGER REFERENCES users(id);
CREATE INDEX idx_posts_user_id ON posts(user_id);

-- Mistake 2: Dropping a table without a safety net
-- Bad
DROP TABLE old_data;  -- gone forever

-- Good: rename it first, drop it a week later after confirming nothing broke
ALTER TABLE old_data RENAME TO old_data_backup_20240115;

A renamed table feels embarrassingly cautious until the one time it saves you. Keep the backup table around for at least a sprint cycle before dropping it.

Quick reference

Situation	Recommended approach
Forgot a column in a deployed migration	Create a new migration to add it
Renaming a column in production	Use expand-contract (add, copy, deploy, drop)
Adding a column to a huge table	Add nullable, then backfill in batches
Creating an index on a live PostgreSQL DB	`CREATE INDEX CONCURRENTLY`
Timestamp conflict with a teammate	Rename your file with a later timestamp
Something goes wrong post-deploy	`npm run migrate:rollback` or restore from backup