The database migration went wrong. A table needed a new column, but the deployment froze, blocking every write. Downtime climbed by the second. You knew the fix, but the schema change pipeline was slow.
Adding a new column in production is simple in theory. In practice, it is a common source of performance regressions, lock contention, and schema drift between environments. Schema changes touch live data. A careless ALTER TABLE can force a full table rewrite, spiking CPU and I/O and blocking reads and writes on the table until it finishes. On large tables, that means outages.
The safest approach is to plan the new column before it’s needed in application code. First, add the column as nullable with no default. On many database engines this is a metadata-only change, so the lock is held only briefly. Second, deploy code that writes to the new column and reads from it, falling back to the old schema logic for rows that have not been populated yet. Third, backfill the missing data in controlled batches so that no single transaction holds locks for long. Only after the backfill completes should you enforce constraints or make the column non-nullable with a default.
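The first and third steps can be sketched in a few lines. This is a minimal illustration, not a production migration tool: it uses Python's built-in sqlite3 module so it runs anywhere, and the table and column names (users, full_name, display_name) as well as the helper functions are hypothetical. Locking behavior and batch syntax differ between database engines, so treat the SQL as a pattern to adapt.

```python
import sqlite3


def add_nullable_column(conn):
    # Step one: nullable column, no default.
    # "users" and "display_name" are hypothetical names for illustration.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.commit()


def backfill_in_batches(conn, batch_size=1000):
    """Copy full_name into display_name, at most batch_size rows per transaction.

    Returns the total number of rows updated.
    """
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE users
               SET display_name = full_name
             WHERE id IN (
                   SELECT id
                     FROM users
                    WHERE display_name IS NULL
                      AND full_name IS NOT NULL
                    LIMIT ?)
            """,
            (batch_size,),
        )
        # Commit after every batch so each transaction stays short.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Calling add_nullable_column(conn) and then backfill_in_batches(conn, batch_size=1000) on a connection whose users table has id and full_name columns fills the new column in bounded chunks. The `full_name IS NOT NULL` filter keeps the loop from repeatedly selecting rows it cannot fill. Constraints such as NOT NULL would be added only after the function reports that no rows remain.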