Adding a new column sounds simple, but at scale it can stall production traffic, cause timeouts, or corrupt writes. The risk comes from how databases handle schema changes. Some engines can add a column instantly if it’s nullable or has a default. Others require a full table rewrite. Understanding the exact behavior of your database is the difference between a safe change and a midnight rollback.
When you add a new column in PostgreSQL, certain operations are metadata-only, but others—especially when setting a non-null constraint with default values—rewrite the table. In MySQL, ALTER TABLE often copies data into a new structure, holding locks that block reads and writes. In distributed systems, these locks can cascade into outages.
For zero-downtime schema changes, break the process into stages. First, add the new column as nullable with no default. Deploy the application code that starts writing to it. Backfill data in small batches, monitoring replication lag. Finally, enforce constraints or defaults in a later migration. This staged approach limits downtime and allows fast rollback.