The schema broke at midnight. Deploy logs said nothing. The only clue was a failed migration flagged by a single line: add new column.
A new column is simple until it isn’t. It can crash an API, corrupt a production dataset, or lock a table under heavy write load. In systems at scale, altering schema is a live‑fire operation. The change must be atomic, safe, and reversible.
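When adding a new column in PostgreSQL or MySQL, the safest path is an ALTER TABLE that adds the column as nullable, with defaults handled in application code rather than in the migration. Adding a NOT NULL column with a default in one step can rewrite the entire table. In PostgreSQL before version 11, and in later versions when the default is volatile, the rewrite runs under an exclusive lock that blocks reads and writes. MySQL may likewise rebuild the table when the change cannot use the INSTANT algorithm. In high-traffic systems, this is unacceptable. Instead, create the column as nullable, backfill data in batches, then apply constraints in a separate operation.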
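The three-step sequence can be sketched in Python. This is a minimal illustration, not a migration tool: `plan_add_column` and `batch_ranges` are hypothetical helpers, the SQL uses PostgreSQL syntax, the `%s` placeholders assume a psycopg-style driver, and the table is assumed to have an integer primary key named `id`.

```python
from typing import Iterator, List, Tuple


def batch_ranges(min_id: int, max_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) id ranges covering min_id..max_id inclusive."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size, max_id + 1)
        yield start, end
        start = end


def plan_add_column(table: str, column: str, col_type: str) -> List[str]:
    """Return the three statements of the migration, in order.

    Identifiers are interpolated directly, so they must come from trusted
    migration code, never from user input.
    """
    return [
        # Step 1: nullable column, no default, so no table rewrite.
        f"ALTER TABLE {table} ADD COLUMN {column} {col_type}",
        # Step 2: run once per id range (value, start, end are bound parameters).
        f"UPDATE {table} SET {column} = %s "
        f"WHERE id >= %s AND id < %s AND {column} IS NULL",
        # Step 3: enforce the constraint only after the backfill is complete.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL",
    ]


if __name__ == "__main__":
    add_sql, backfill_sql, constrain_sql = plan_add_column("users", "locale", "text")
    print(add_sql)
    for start, end in batch_ranges(1, 10, 4):
        # A real migration would execute and commit each batch separately.
        print(backfill_sql, ("en", start, end))
    print(constrain_sql)
```

Committing each batch in its own transaction keeps locks short and gives replicas time to catch up between batches. The `IS NULL` guard makes the backfill safe to re-run after an interruption.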
In distributed systems, think about replicas. Schema changes must roll out without pushing replication lag past its thresholds. Plan for backward-compatible reads during deployment, because old and new application versions will read the same tables while the rollout is in progress. If you use an ORM, confirm it can tolerate unknown columns until the change is complete.
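One way to picture a backward-compatible read is a row mapper that ignores columns it does not know and supplies a default for columns that have not arrived yet. The sketch below is an assumption-laden illustration: the `User` model, its `locale` column, and `user_from_row` are hypothetical names, and a real ORM would handle this through its own configuration.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass
class User:
    id: int
    email: str
    # New column: optional until the backfill and constraint are complete.
    locale: Optional[str] = None


def user_from_row(row: Mapping[str, Any]) -> User:
    """Build a User from a row, dropping any columns the model does not define."""
    known = {f.name for f in fields(User)}
    return User(**{k: v for k, v in row.items() if k in known})


if __name__ == "__main__":
    # Row read before the column exists: locale falls back to None.
    print(user_from_row({"id": 1, "email": "a@example.com"}))
    # Row with a column this code version has never heard of: it is ignored.
    print(user_from_row({"id": 2, "email": "b@example.com", "locale": "en", "plan": "pro"}))
```

The same code then works against both the old and the new schema, which is what lets the application and the migration deploy in either order.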