The table stopped growing. Every query was slow. Logs showed nothing unusual. The culprit was simple: the schema was missing a new column.
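Adding a new column sounds trivial, but it is a frequent cause of downtime in production databases. The impact depends on scale, engine, engine version, and workload. In PostgreSQL before version 11, adding a column with a default value rewrites the entire table. From version 11 on, a constant default is stored as metadata and applied on read, but a volatile default (for example, one that calls clock_timestamp()) still forces a full rewrite. In MySQL, the statement may copy the table or block writes, depending on the version, the storage engine, and the ALTER algorithm in use. In distributed stores, schema changes must propagate to every node. The cost can be seconds or hours.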
The safest approach is to add the column as nullable with no default, backfill existing rows in small batches, and only then apply constraints or defaults. This keeps each lock short and the application responsive. Always coordinate schema changes with code releases, so that no deployed version of the application reads or writes a column that does not exist yet. Decide early whether the new column needs an index. Building an index on a large table after the fact can be as risky as adding the column itself.
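The add-then-backfill sequence can be sketched as follows. This is a minimal illustration, not a production migration: it uses Python's built-in sqlite3 module as a stand-in for a real server, and the table `users`, the column `status`, the value `"active"`, and the helper names are all hypothetical. On PostgreSQL or MySQL the batching query and the pause between batches would need tuning for the engine and workload.

```python
import sqlite3


def add_column_nullable(conn, table, column, col_type):
    """Add a nullable column with no default (usually the cheapest form)."""
    # Identifiers are interpolated directly, so they must come from trusted code.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
    conn.commit()


def backfill_in_batches(conn, table, column, value, batch_size=1000):
    """Fill NULLs in `column` one batch at a time; return the batches that wrote rows."""
    batches = 0
    while True:
        cur = conn.execute(
            f"UPDATE {table} SET {column} = ? "
            f"WHERE id IN (SELECT id FROM {table} WHERE {column} IS NULL LIMIT ?)",
            (value, batch_size),
        )
        conn.commit()  # one short transaction per batch, so locks are released often
        if cur.rowcount == 0:
            break  # nothing left to backfill
        batches += 1
    return batches


# Demo on an in-memory database with 2500 hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [(f"user{i}",) for i in range(2500)],
)
conn.commit()

add_column_nullable(conn, "users", "status", "TEXT")
batches = backfill_in_batches(conn, "users", "status", "active", batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE status IS NULL"
).fetchone()[0]
```

Constraints such as NOT NULL would be applied in a separate step once `remaining` reaches zero. That step is omitted here because the syntax and locking behavior differ by engine.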
In many systems, feature flags can hide the column from application code until it is populated. This allows staged rollouts and instant fallback. Schema migration tools such as Liquibase and Flyway, or the migration frameworks built into ORMs, can automate deployment, but they do not make a migration safe by themselves. The migrations must still be written and configured for zero-downtime operation.
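One way to picture the feature-flag gate is a read path that only selects the new column when a flag is on. The sketch below is an assumption-laden illustration: the `FLAGS` dictionary stands in for whatever flag service a real system uses, and the `users` table, `status` column, and `fetch_user` helper are hypothetical names.

```python
import sqlite3

# Hypothetical in-process flag store; a real deployment would query a flag service.
FLAGS = {"users.status_column": False}


def fetch_user(conn, user_id):
    """Return a user as a dict, exposing `status` only while the flag is on."""
    if FLAGS["users.status_column"]:
        row = conn.execute(
            "SELECT id, name, status FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return {"id": row[0], "name": row[1], "status": row[2]}
    # Flag off: the query never touches the new column.
    row = conn.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return {"id": row[0], "name": row[1]}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.execute("INSERT INTO users (id, name, status) VALUES (1, 'ada', 'active')")
conn.commit()

before = fetch_user(conn, 1)          # flag off: no status field
FLAGS["users.status_column"] = True   # staged rollout: turn the column on
after = fetch_user(conn, 1)
FLAGS["users.status_column"] = False  # instant fallback: turn it off again
rolled_back = fetch_user(conn, 1)
```

Because the flag-off path does not reference the column at all, the same application build can run both before and after the schema change, and the fallback needs no redeploy.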