The migration stalled at 3:17 a.m. The logs were clean, but the report was wrong. The fix was a single command: add a new column.
Creating a new column in a database sounds simple—until you account for schema evolution, live traffic, and deployment windows. The wrong approach can lock tables, block writes, or crash services at scale. The right approach makes the change invisible to customers while keeping data consistent.
A new column can be added with ALTER TABLE, but in production, it’s never just that. You must decide on the column type, default values, nullability, and indexing. For large tables, adding a column synchronously can trigger full table rewrites. That means higher CPU usage, slower queries, and replication lag.
Use online DDL tools when possible. MySQL has ALGORITHM=INPLACE to avoid full table copies. PostgreSQL can add certain column types instantly if they have no default values. In distributed systems, consider rolling migrations—deploying a schema change in stages, then updating application code to write to and read from the new column only after it exists everywhere.