The migration broke at 2:13 a.m., right after the deploy. All tests passed, but the database choked when it hit the new column. The logs told the story: missing defaults, invalid null constraints, and a lock that froze writes.
Adding a new column sounds simple. It isn’t. At scale, the wrong approach can take down live systems. The right strategy starts with understanding the schema, load, and query patterns.
First, choose the correct data type. Mismatches cause costly type casts that slow queries. Use constraints and indexes with caution—adding them in one migration can lock the table longer than expected. When possible, create the column without constraints, backfill in small batches, then add indexes and foreign keys in separate steps.
Second, protect the write path. In PostgreSQL and MySQL, adding a new column with a default can rewrite the whole table in one transaction. For large tables, that’s downtime. Use a null column first, update rows in chunks, then set the default.