The migration ran clean until the schema froze. You needed a new column, but the database was in production and downtime was not an option. Every decision from there was a trade-off between speed, safety, and clarity.
Adding a new column is deceptively simple. In SQL, it’s just ALTER TABLE ADD COLUMN. But the real work begins when that command touches millions of rows under live traffic. Schema changes impact query plans, cache layers, indexes, and application logic. A new column can cascade changes across the stack—APIs, ETL pipelines, background jobs, and user interfaces.
The first choice is whether the new column is nullable. A nullable column lets you deploy the schema before the code that writes to it, enabling a safer, staged rollout. NOT NULL with a default sounds cleaner, but on large tables it can force a locking table rewrite, spiking load and risking timeouts. The details are engine-specific: PostgreSQL 11 and later store a constant default as catalog metadata and add the column instantly, while older versions rewrite the table; MySQL 8.0 can add a column with ALGORITHM=INSTANT, but earlier versions copy the data.
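The staged rollout can be sketched in a few lines. This is a minimal illustration using SQLite in Python as a stand-in for a production database; the users table and plan column are hypothetical examples, not from the original text. The point is that adding the column as nullable leaves existing write paths untouched:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# Step 1: add the column as nullable -- no default, no table rewrite.
conn.execute("ALTER TABLE users ADD COLUMN plan TEXT")

# Old write paths that don't know about `plan` keep working unchanged.
conn.execute("INSERT INTO users (email) VALUES ('b@example.com')")

# Existing and new rows simply hold NULL until the new code ships.
rows = conn.execute("SELECT email, plan FROM users ORDER BY id").fetchall()
print(rows)  # [('a@example.com', None), ('b@example.com', None)]
```

Only after this deploy is verified does the application code that writes the column ship, followed by the backfill.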
Next comes backfilling data. Backfill in small, batched updates keyed on the primary key to limit replication lag and lock contention. Monitor replication delay and query latency closely during this phase, and pause the backfill if either spikes. Ship read paths that handle both old and new values so the system stays consistent throughout the transition.
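A batched backfill might look like the following sketch, again using SQLite in Python as a stand-in; the users/plan schema, the 'free' default value, and the batch size are illustrative assumptions. Each batch claims rows by primary key and commits independently so replicas can keep pace:

```python
import sqlite3

BATCH = 500  # small batches keep lock time and replication lag bounded

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, plan TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"u{i}@example.com",) for i in range(1200)])

backfilled = 0
while True:
    # Claim the next batch of unfilled rows by primary key.
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM users WHERE plan IS NULL ORDER BY id LIMIT ?", (BATCH,))]
    if not ids:
        break
    placeholders = ",".join("?" * len(ids))
    conn.execute(f"UPDATE users SET plan = 'free' WHERE id IN ({placeholders})", ids)
    conn.commit()  # commit per batch so replicas can keep up
    backfilled += len(ids)

print(backfilled)  # 1200
```

While the backfill runs, a read path such as COALESCE(plan, 'free') returns the same answer for filled and unfilled rows, which is one way to keep both old and new values consistent during the transition.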