The migration was almost done when someone noticed the missing field. One new column. That was all it would take to save hours of patching later.
Adding a new column to a production database sounds simple. It isn’t. The wrong approach can lock tables, block writes, or drop queries into a timeout death spiral. The right approach is fast, safe, and forward-compatible.
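Start with the schema. Define the new column with an explicit type, and choose nullability and any default value deliberately: make the column nullable or give it a default only when there’s a clear use case. In systems with strict uptime requirements, run an online schema change so the alteration does not block reads and writes. Tools like pt-online-schema-change, or the native online ALTER TABLE algorithms in modern databases (MySQL’s ALGORITHM=INPLACE and ALGORITHM=INSTANT, for example), can handle this with minimal risk.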
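As a rough illustration of the native route, the sketch below builds a MySQL-style ALTER TABLE statement that explicitly requests an in-place, non-locking change. The helper name build_add_column_ddl, the table orders, and the column shipping_region are all hypothetical, and the statement is only printed, not executed. The point of spelling out ALGORITHM and LOCK is that MySQL then fails with an error if it cannot honor them, instead of quietly falling back to a more blocking mode.

```python
def build_add_column_ddl(table: str, column: str, column_type: str,
                         algorithm: str = "INPLACE", lock: str = "NONE") -> str:
    """Build a MySQL-style ADD COLUMN statement with explicit ALGORITHM/LOCK.

    Hypothetical helper for illustration; table and column names are assumptions.
    """
    # Identifiers cannot be bound as query parameters, so validate them here.
    for ident in (table, column):
        if not ident.isidentifier():
            raise ValueError(f"unsafe identifier: {ident!r}")
    return (
        f"ALTER TABLE `{table}` "
        f"ADD COLUMN `{column}` {column_type}, "
        f"ALGORITHM={algorithm}, LOCK={lock}"
    )


# Explicit type and explicit nullability, as recommended above.
ddl = build_add_column_ddl("orders", "shipping_region", "VARCHAR(64) NULL")
print(ddl)
```

Whether a given change qualifies for an in-place or instant algorithm depends on the database version and storage engine, so it is worth trying the statement against a staging copy first.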
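Next, deploy in stages. Add the new column first and leave it empty. If the column replaces an existing one, update the application to write to both the old and new columns during the migration. Backfill existing rows in small batches to limit lock contention. For large datasets, make the backfill resumable and idempotent, so an interrupted run can pick up where it stopped and a repeated run changes nothing.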
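A batched backfill along these lines might look like the following minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database purely so the example runs anywhere; the users table, the legacy_name and display_name columns, and the backfill function are all hypothetical. The batches walk the primary key in order, the returned id acts as a checkpoint for resuming, and the `display_name IS NULL` guard makes reruns harmless.

```python
import sqlite3


def backfill(conn: sqlite3.Connection, batch_size: int = 2,
             start_after_id: int = 0) -> int:
    """Copy legacy_name into display_name in small keyed batches.

    Returns the last id processed, which can be stored as a checkpoint
    and passed back as start_after_id to resume an interrupted run.
    """
    last_id = start_after_id
    while True:
        rows = conn.execute(
            "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return last_id
        lo, hi = rows[0][0], rows[-1][0]
        with conn:  # one short transaction per batch, committed on exit
            conn.execute(
                "UPDATE users SET display_name = legacy_name "
                "WHERE id BETWEEN ? AND ? AND display_name IS NULL",  # idempotent guard
                (lo, hi),
            )
        last_id = hi  # checkpoint: persist this value in a real job


# Demo setup (all data is made up for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, legacy_name TEXT NOT NULL)")
conn.executemany("INSERT INTO users (legacy_name) VALUES (?)",
                 [("ann",), ("bob",), ("cy",), ("dee",), ("eve",)])
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")  # stage 1: empty column
conn.commit()

checkpoint = backfill(conn, batch_size=2)  # stage 2: backfill in batches
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE display_name IS NULL"
).fetchone()[0]
```

In a production job the checkpoint would be written to durable storage after each batch, and a short pause between batches is a common way to leave headroom for regular traffic.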