The schema was locked, the service hot, and a new column had to go in now. No downtime. No broken queries. No failed deploys.
Adding a new column sounds simple, but many systems break here. A direct ALTER TABLE on a large dataset can block writes, trigger long migrations, and tip your cluster over the edge. The right approach depends on the database engine, the workload, and the live constraints you face.
In PostgreSQL, adding a nullable column with no default is instant. Setting a default on it is not. In MySQL, the cost depends on row format and the ALTER algorithm you use. Modern versions offer ALGORITHM=INPLACE or ALGORITHM=INSTANT, but they have limits. Know them before you run them in prod.
For high-traffic systems, staged rollouts work best. Add the new column as nullable and default-free. Deploy application code that writes to both the old and new columns, or materializes derived data. Backfill values with a controlled batch job, throttled to match available I/O. Once complete, shift reads to the new column. Then enforce constraints and update indexes.