The query ran, and nothing happened. Then the logs revealed the truth—there was no new column.
Adding a new column in a live database is simple in theory, but in production it’s a point of failure. Schema changes can lock tables. They can block writes. They can trigger rollbacks that bring down an application. The difference between safe and dangerous migrations comes down to how the new column is created, deployed, and rolled out.
The first step is to define the column with a default that avoids a full table rewrite. In PostgreSQL, adding a column with a constant default in older versions rewrites the entire table. On large datasets, this can take hours. In MySQL, adding a column with NOT NULL and a default may also lock the table. Always check the engine’s behavior before running migrations.
For zero-downtime changes, add the column as nullable or without a default, then backfill in small batches. Write scripts that update rows incrementally. Use transactions and rate limits. Monitor replication lag if you run a cluster.