The migration failed at 2:03 a.m. The log showed a missing column. The database stopped, the service fell over, and no one could deploy until it was fixed.
Adding a new column should be simple. In practice, it can break production if not done with care. Schema changes in modern systems carry operational risk. Every query, API, and background job that touches the table depends on those columns being stable.
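A new column in SQL starts with ALTER TABLE. In some databases this takes a lock that blocks reads and writes for the duration of the change. For high-traffic applications, that means downtime. PostgreSQL adds a nullable column as a metadata-only change, and since version 11 a constant default is also stored without rewriting the table. A volatile default, such as a random or per-row generated value, still forces a full rewrite, which is expensive on large tables. MySQL may rebuild the table, depending on the version, the storage engine, and which algorithm the change qualifies for. Cloud-managed services often hide some complexity but not the performance hit.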
To add a new column without disruption, plan the rollout. First, add it as nullable with no default. Deploy code that writes to the new column only for new rows. Backfill in small batches to avoid overload. Finally, set constraints or defaults when the data is in place.
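The rollout above can be sketched in a few lines. This is a minimal illustration, not a production migration: it uses Python's built-in sqlite3 module as a stand-in for a real server database, and the users table, the display_name column, the username source column, and the batch size are all hypothetical names chosen for the example. It assumes an integer primary key called id.

```python
import sqlite3

BATCH_SIZE = 1000  # hypothetical value; tune it to what your database tolerates


def add_nullable_column(conn):
    # Step 1: nullable, no default, so existing rows are not rewritten.
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.commit()


def backfill_in_batches(conn, batch_size=BATCH_SIZE):
    """Fill display_name for existing rows, one short transaction per batch.

    Walks the primary key in ascending order so each batch touches a bounded
    id range and the loop always terminates. Returns the number of rows updated.
    """
    last_id = 0
    updated = 0
    while True:
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            return updated
        cur = conn.execute(
            "UPDATE users SET display_name = username "
            "WHERE id BETWEEN ? AND ? AND display_name IS NULL",
            (ids[0], ids[-1]),
        )
        conn.commit()  # commit per batch so locks are held only briefly
        updated += cur.rowcount
        last_id = ids[-1]
```

The IS NULL filter keeps the backfill from overwriting rows that newer application code has already written, and it makes the job safe to rerun after an interruption. Adding the final constraint is left out here because the statement for it differs between databases.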
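In distributed environments, schema and application updates must be compatible across versions, because old and new code run side by side during a rollout. Deploy the column first, then the code. If you reverse the order, the new code will try to write to a column that does not exist yet, and it will fail. Older code is generally unaffected by an added nullable column, as long as its queries name their columns explicitly.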
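One way to make the ordering less fragile is to have the new code check for the column before writing to it. The sketch below is an assumption-laden illustration rather than a recommended pattern for every system: it again uses sqlite3 as a stand-in, the users table and display_name column are hypothetical, and the PRAGMA table_info lookup is specific to SQLite. On PostgreSQL or MySQL the equivalent check would query information_schema.columns.

```python
import sqlite3


def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 holds the name.
    # The table name is interpolated, so pass only trusted constants here.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def insert_user(conn, username, display_name):
    """Write display_name only when the schema already has the column."""
    if column_exists(conn, "users", "display_name"):
        conn.execute(
            "INSERT INTO users (username, display_name) VALUES (?, ?)",
            (username, display_name),
        )
    else:
        # Schema not migrated yet: fall back to the old column list.
        conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
    conn.commit()
```

Checking on every write costs an extra query. A real service would more likely cache the result at startup or put the new write path behind a feature flag.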