The migration was almost done when the problem exploded: the schema needed a new column.
Adding a new column should be simple. Most engineers know it isn’t. On a local machine, it’s a single statement. In production, it can be high-stakes. You have to think about locks, null defaults, data backfills, and zero-downtime deployment. Every decision comes with trade-offs that impact performance and stability.
In PostgreSQL, ALTER TABLE ADD COLUMN is fast when adding a nullable column without a default. But adding a column with a non-null default rewrites the whole table, locking writes until it finishes. MySQL behaves differently but can still stall under heavy load. Even small changes can ripple across read replicas and break application code if not synchronized.
To handle this safely, use migrations that split the change into stages. First, add the nullable column. Then update application code to handle it. Populate the data in batches. Finally, enforce constraints and add indexes when the backfill is complete. For large datasets, batch size and throttle rate can determine whether the system stays responsive.