The migration broke at exactly 02:13. The cause: a missing new column.
Schema changes are routine until they aren’t. Adding a new column to a production database seems simple. One DDL statement. But in live systems with terabytes of data and constant writes, a naive ALTER can lock tables, stall requests, and trigger cascading failures.
To add a new column without downtime, plan the operation like any other high-risk deployment. Start by confirming the exact column definition—name, data type, default value, constraints. Verify it against staging data. Ensure backward compatibility with existing queries and services. This step prevents mismatches between code and schema.
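One way to make the verification step concrete is to compare the column definition you planned against what the database actually reports. The sketch below uses Python's built-in sqlite3 module with an in-memory database standing in for staging. The table name users, the column name nickname, and the helper column_definition are all hypothetical; on another engine you would query its own catalog (for example information_schema.columns) instead of PRAGMA table_info.

```python
import sqlite3


def column_definition(conn, table, column):
    """Return (type, not_null, default) for a column, or None if it is missing."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})")
    for _cid, name, col_type, notnull, default, _pk in rows:
        if name == column:
            return (col_type, bool(notnull), default)
    return None


# In-memory database standing in for a staging copy (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")

# The definition we intend to ship: nullable TEXT column, no default.
expected = ("TEXT", False, None)

before = column_definition(conn, "users", "nickname")  # column not added yet
conn.execute("ALTER TABLE users ADD COLUMN nickname TEXT")
after = column_definition(conn, "users", "nickname")

print("before migration:", before)
print("matches plan after migration:", after == expected)
```

Running a check like this in the deploy pipeline, before the new application version starts, would catch the kind of missing-column failure described at the top.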
In most relational databases, adding a nullable new column without a default is fast because the database updates metadata only. But if the new column has a default value, older engines may rewrite the entire table. PostgreSQL 11+ avoids this rewrite when the default is non-volatile (such as a constant), and MySQL 8.0 can avoid it through instant DDL for supported column changes. Know your engine’s capabilities before execution.
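As a rough illustration of engine-aware DDL, the sketch below builds the ALTER statement as a string. The helper add_column_ddl and the table and column names are hypothetical. For MySQL it appends ALGORITHM=INSTANT (available for ADD COLUMN from MySQL 8.0.12), which makes the statement fail with an error instead of silently falling back to a table rewrite when an instant change is not possible. Identifiers and defaults are interpolated directly, so this is only suitable for trusted, hard-coded migration inputs.

```python
def add_column_ddl(engine, table, column, col_type, default=None):
    """Build an ADD COLUMN statement; `default` is a ready-made SQL literal."""
    stmt = f"ALTER TABLE {table} ADD COLUMN {column} {col_type}"
    if default is not None:
        stmt += f" DEFAULT {default}"
    if engine == "mysql":
        # Ask MySQL to refuse the change rather than rebuild the table.
        stmt += ", ALGORITHM=INSTANT"
    return stmt + ";"


pg_stmt = add_column_ddl("postgresql", "orders", "status", "text", "'pending'")
mysql_stmt = add_column_ddl("mysql", "orders", "status", "VARCHAR(16)")

print(pg_stmt)
print(mysql_stmt)
```

Whether a given statement is truly metadata-only still depends on the engine version and column type, so it is worth timing it against a production-sized copy first.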
For non-nullable columns, use a phased migration. Add the column as nullable, backfill in controlled batches, then apply the NOT NULL constraint once the backfill is complete. This avoids long locks and supports rolling deploys where old and new versions of your application run together. On some engines the final constraint step still scans the table to validate it, so measure that step on staging as well.
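The phased approach can be sketched end to end with Python's built-in sqlite3 module. This is a minimal illustration, not a production tool: the orders table, the status column, the 'pending' value, the batch size, and the helper backfill_in_batches are all assumptions. SQLite is used only because it runs anywhere; it cannot add NOT NULL to an existing column through ALTER TABLE, so the final phase appears as a comment showing the PostgreSQL form.

```python
import sqlite3


def backfill_in_batches(conn, table, column, value, batch_size=1000):
    """Fill NULLs in `column` with `value`, committing after each batch."""
    total = 0
    while True:
        cur = conn.execute(
            f"UPDATE {table} SET {column} = ? "
            f"WHERE rowid IN (SELECT rowid FROM {table} "
            f"WHERE {column} IS NULL LIMIT ?)",
            (value, batch_size),
        )
        conn.commit()  # short transactions keep locks brief
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders (amount) VALUES (?)", [(i,) for i in range(2500)]
    )
    conn.commit()

    # Phase 1: add the column as nullable (metadata-only on most engines).
    conn.execute("ALTER TABLE orders ADD COLUMN status TEXT")

    # Phase 2: backfill existing rows in controlled batches.
    updated = backfill_in_batches(conn, "orders", "status", "pending", 1000)

    # Phase 3: confirm nothing is left NULL before enforcing the constraint.
    remaining = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status IS NULL"
    ).fetchone()[0]
    # On PostgreSQL the constraint would then be applied with:
    #   ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
    return updated, remaining


updated, remaining = demo()
print(f"backfilled {updated} rows, {remaining} still NULL")
```

In a live system the application should already be writing the new column before the backfill starts, otherwise rows inserted during the backfill can arrive with NULLs and the final constraint will fail. A pause between batches is also common to limit replication lag.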