The migration was done. The tests passed. But the missing piece stared back from the schema: a new column that had to go live without breaking production.
Adding a new column seems simple. It rarely is. Schema changes are high‑risk operations, especially in systems that serve live traffic at scale. A botched migration can lock tables, slow queries, or cascade failures across services. Getting it right means understanding the database, the application code, and the deployment pipeline as a single system.
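Before you add a new column, decide its type and nullability, and choose a default deliberately. In many relational databases, adding a nullable column with no default is fast because it updates metadata instead of rewriting rows. Adding a non‑nullable column with a default can be far more expensive: older releases (PostgreSQL before 11, MySQL before 8.0's instant ADD COLUMN) rewrite the whole table, and newer releases still do so in some cases, such as a volatile default in PostgreSQL. On a large table, that rewrite can block other operations for as long as it runs. On MySQL and PostgreSQL, this difference is often the line between a zero‑downtime deploy and a major outage.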
If the application depends on the column immediately, deploy in phases. First, add the column as nullable. Deploy code that writes to it without reading from it. Backfill data in controlled batches to avoid I/O spikes. Only after backfill finishes should you enforce NOT NULL constraints and update read paths. This phased rollout lets you detect and fix edge cases while the system stays responsive.
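The phases above can be sketched in a few lines of Python. This is a minimal illustration, not a production migration tool: it uses the standard-library sqlite3 module with an in-memory database so it runs anywhere, and the table name users, the column name plan, the default value "free", and the helper names are all hypothetical. The final step of enforcing NOT NULL is engine-specific (PostgreSQL, for example, uses ALTER TABLE ... ALTER COLUMN ... SET NOT NULL), so the sketch stops at verifying that no NULLs remain.

```python
import sqlite3


def add_nullable_column(conn, table, column, col_type):
    """Phase 1: add the column as nullable so existing rows need no value."""
    # Identifiers are interpolated directly; they are assumed to be trusted
    # constants from the migration script, never user input.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
    conn.commit()


def backfill_in_batches(conn, table, column, value, batch_size=1000):
    """Phase 3: fill NULLs one batch at a time, committing after each batch.

    Returns the total number of rows updated.
    """
    total = 0
    while True:
        cur = conn.execute(
            f"UPDATE {table} SET {column} = ? "
            f"WHERE rowid IN (SELECT rowid FROM {table} "
            f"WHERE {column} IS NULL LIMIT ?)",
            (value, batch_size),
        )
        conn.commit()  # short transactions keep locks brief
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


def count_nulls(conn, table, column):
    """Pre-check before enforcing NOT NULL: how many rows still lack a value?"""
    row = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    ).fetchone()
    return row[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)",
        [(f"user{i}",) for i in range(2500)],
    )
    conn.commit()

    add_nullable_column(conn, "users", "plan", "TEXT")

    # Phase 2 stand-in: new application code writes the column on insert,
    # but nothing reads it yet.
    conn.execute(
        "INSERT INTO users (name, plan) VALUES (?, ?)", ("new_user", "pro")
    )
    conn.commit()

    updated = backfill_in_batches(conn, "users", "plan", "free", batch_size=1000)
    remaining = count_nulls(conn, "users", "plan")
    conn.close()
    return updated, remaining
```

Calling demo() returns (2500, 0): the 2,500 pre-existing rows are backfilled in three batches, the row written by the new code keeps its own value, and no NULLs remain. On a real system the batch size and any pause between batches would be tuned against observed load, and the NOT NULL constraint would be added only once the NULL count is zero.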