Adding a new column should be simple. In SQL, it’s a single ALTER TABLE statement. In production, it’s a minefield. The wrong step locks rows, blocks writes, or crashes critical queries. The schema change that seems easy in a test environment can take down the live database if you don’t plan it with precision.
A new column changes storage patterns, index behavior, and query plans. On large tables, the alter can escalate into a full table rewrite. This means high I/O, longer locks, and potential replication lag. Engineers must know the impact on downstream systems, especially OLAP pipelines, caches, and services reading from replicas.
For MySQL, ALTER TABLE ... ADD COLUMN often copies the entire table unless you use an online DDL tool. PostgreSQL can add a column without a full rewrite when it’s nullable with a default of NULL. But adding a default value that's not NULL can rewrite every row. Understanding these rules can cut downtime from hours to seconds.
Backfills are the next trap. After adding the column, you might need historical data. A naive backfill in one transaction will lock tables and bloat WAL or binlogs. Instead, batch updates. Limit the impact with small commit sizes and adaptive throttling. Always monitor replication delay and query throughput during the backfill.