The migration broke at 2:14 a.m. The logs showed a missing column. Not a missing value. A missing schema field the app needed to stay alive.
A new column sounds simple. It isn't. Adding one in production means touching the database, the application code, and the deployment flow without breaking any of them. Done wrong, it causes downtime, data corruption, or locked queries that bring the system to a halt.
The first rule: understand the lifecycle of the column. Start with schema design in your migration script. Define the column with the correct data type, nullability, and default. If filling the column for existing rows is expensive, add it as nullable first and backfill in batches outside the migration, so that no single statement holds locks on the whole table.
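The batching idea can be sketched in a few lines. This is a minimal illustration, not a production migration: it uses Python's built-in sqlite3 module so it runs anywhere, and the users table, the display_name column, and the upper(email) backfill expression are all hypothetical stand-ins. The loop walks the primary key in fixed-size ranges and commits after each one, which is the part that carries over to other databases.

```python
import sqlite3


def make_demo_db(n_rows):
    """Build a throwaway in-memory table (hypothetical schema) with n_rows users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [(f"user{i}@example.com",) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def add_column_and_backfill(conn, batch_size=1000):
    """Add a nullable column, then fill it in short, separately committed batches."""
    # Step 1: the schema change itself. Nullable, no default, so nothing is rewritten.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.commit()

    # Step 2: backfill by walking the primary key in ranges (keyset pagination).
    total_updated = 0
    last_id = 0
    while True:
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            break
        cur = conn.execute(
            "UPDATE users SET display_name = upper(email) "
            "WHERE id BETWEEN ? AND ? AND display_name IS NULL",
            (ids[0], ids[-1]),
        )
        conn.commit()  # one short transaction per batch
        total_updated += cur.rowcount
        last_id = ids[-1]
    return total_updated
```

Walking the key range, rather than repeatedly selecting rows where the new column is still null, means the loop terminates even if the backfill expression itself yields null for some rows.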
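For relational databases, use transactional DDL when available, so a failed migration rolls back cleanly. PostgreSQL supports it; MySQL commits DDL statements implicitly. In PostgreSQL, adding a nullable column with no default is a fast metadata change, and since version 11 so is adding a column with a constant default. A volatile default, such as a random value or a per-row timestamp, still forces a full table rewrite, which can be costly. In MySQL, behavior depends on version and storage engine: InnoDB in 8.0.12 and later can add a column instantly, while older versions or other engines may copy or lock the table. In distributed databases, schema changes may need rolling updates across nodes.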
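To make the transactional DDL point concrete, here is a small sketch using SQLite, which also treats schema changes as transactional. The run_migration helper and the table names are hypothetical, made up for this illustration. The second migration statement fails on purpose, and the column added by the first statement disappears with the rollback.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def run_migration(conn, statements):
    """Apply all statements in one transaction; roll everything back if any fails."""
    conn.execute("BEGIN")
    try:
        for sql in statements:
            conn.execute(sql)
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


def demo_transactional_ddl():
    # isolation_level=None turns off the module's implicit transaction handling,
    # so the explicit BEGIN / COMMIT / ROLLBACK above are the only ones issued.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")

    failed_ok = run_migration(conn, [
        "ALTER TABLE users ADD COLUMN display_name TEXT",
        "ALTER TABLE no_such_table ADD COLUMN x TEXT",  # fails: table does not exist
    ])
    columns_after_failure = column_names(conn, "users")

    retry_ok = run_migration(conn, ["ALTER TABLE users ADD COLUMN display_name TEXT"])
    return failed_ok, columns_after_failure, retry_ok, column_names(conn, "users")
```

On a database without transactional DDL, the first ALTER would have stuck, and the retry would fail with a duplicate column error.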
The second rule: deploy in phases. Add the new column, but don't write to it or read from it immediately. Let the schema propagate. Then ship the code that writes data into the field. Later, switch reads to the new column. This prevents mismatches between old and new deployments in a rolling update.
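One way to picture the phases is with two feature flags, one gating writes and one gating reads. The sketch below is an assumption-heavy illustration rather than a recommended framework: the Flags class, the users table, the display_name column, and the legacy fallback of deriving a name from the email address are all invented for the example, and it again uses sqlite3 so it runs on its own.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Flags:
    """Hypothetical rollout switches, flipped one deploy at a time."""
    write_display_name: bool = False
    read_display_name: bool = False


def save_user(conn, flags, email, name):
    if flags.write_display_name:
        conn.execute(
            "INSERT INTO users (email, display_name) VALUES (?, ?)", (email, name)
        )
    else:
        # Old code path: unaware of the new column, which simply stays NULL.
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    conn.commit()


def get_display_name(conn, flags, email):
    if flags.read_display_name:
        row = conn.execute(
            "SELECT display_name FROM users WHERE email = ?", (email,)
        ).fetchone()
        if row and row[0] is not None:
            return row[0]
    # Legacy behavior, also the fallback for rows that were never backfilled.
    return email.split("@")[0]


def demo_rollout():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    flags = Flags()

    # Phase 1: schema only. Running code neither writes nor reads the column.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    save_user(conn, flags, "ada@example.com", "Ada L.")

    # Phase 2: start writing. Reads still use the legacy path.
    flags.write_display_name = True
    save_user(conn, flags, "grace@example.com", "Grace H.")
    before_switch = get_display_name(conn, flags, "grace@example.com")

    # Phase 3: switch reads, falling back where the column is still NULL.
    flags.read_display_name = True
    after_switch = get_display_name(conn, flags, "grace@example.com")
    old_row = get_display_name(conn, flags, "ada@example.com")
    return before_switch, after_switch, old_row
```

Because each phase is safe on its own, an old and a new version of the application can run side by side during a rolling update, and any single step can be reverted without touching the schema.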