The migration script failed at 2 a.m., and the logs pointed at one thing: a missing new column.
Adding a new column is one of the most common changes in database schema evolution. It looks simple, but done wrong, it can stall deployments, block writes, or take hours on large tables. The key is understanding when and how to perform the change with zero downtime.
When you add a new column, the database must update its metadata and often rewrite data pages to store the new field. On small tables, this is fast. On large, high-traffic tables, it can lock reads and writes. The fix is to use an additive, non-blocking operation. MySQL and PostgreSQL both allow adding nullable columns without heavy locks. For PostgreSQL, use ALTER TABLE ... ADD COLUMN without NOT NULL or with a default set in a separate step to avoid table rewrites. In MySQL with InnoDB, online DDL can add columns without locking if you specify ALGORITHM=INPLACE and LOCK=NONE.
Another key point: many teams add new columns with default values in one step, forcing the database to backfill millions of rows immediately. Instead, first add the column as nullable with no default. Then backfill in small batches using application-level code or background jobs. When complete, add constraints and defaults in a later migration.