Adding a new column should be simple. In production, it can be dangerous. Schema changes affect performance, locking, and downstream systems. One mistake can block writes, create replication lag, or break queries in services that depend on that table.
The process starts with understanding the table's size and usage. Millions of rows? Heavy writes? Each factor changes the approach. On engines and versions where adding a column with a default rewrites the whole table (PostgreSQL before version 11, for example), add the new column as nullable with no default so the change touches only metadata. Then backfill the data in small, controlled batches, each in its own short transaction. This keeps locks brief and throughput steady.
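The nullable-column-then-backfill approach can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not a production recipe. The `users` table, the `email_domain` column, and the helper names `make_demo_db` and `backfill_email_domain` are all hypothetical, chosen only for the example. The batch size is an assumption you would tune against your own workload.

```python
import sqlite3


def make_demo_db(row_count):
    """Build a throwaway in-memory table (hypothetical schema) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [(f"user{i}@example.com",) for i in range(row_count)],
    )
    conn.commit()
    # Step 1: add the column as nullable with no default.
    conn.execute("ALTER TABLE users ADD COLUMN email_domain TEXT")
    return conn


def backfill_email_domain(conn, batch_size=1000):
    """Step 2: fill the new column in id-ordered batches, one transaction each."""
    last_id = 0
    updated = 0
    while True:
        # Find the next slice of primary keys after the last one processed.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            break
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "UPDATE users "
                "SET email_domain = substr(email, instr(email, '@') + 1) "
                "WHERE id BETWEEN ? AND ? AND email_domain IS NULL",
                (ids[0], ids[-1]),
            )
        updated += cur.rowcount
        last_id = ids[-1]
    return updated
```

Walking the primary key in ranges, rather than repeatedly selecting rows where the new column is NULL, means the loop always advances even if some rows legitimately stay NULL. The `email_domain IS NULL` filter makes the job safe to rerun after an interruption. On a busy system you would likely also pause between batches and watch replication lag.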
In systems with strong availability requirements, you run schema migrations online. For MySQL, tools like gh-ost or pt-online-schema-change create a shadow table with the new schema, copy existing rows into it while keeping it in sync with ongoing writes, and cut over with minimal downtime. On cloud platforms, managed migration features can help, but you still need to monitor closely for query performance regressions during and after the change.
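To make the shadow-table idea concrete, here is a deliberately simplified sketch of the copy-and-swap pattern using Python's sqlite3 module. It is not how gh-ost or pt-online-schema-change are implemented. In particular, it omits the hard part those tools handle, which is capturing writes that arrive while the copy is running. The `orders` table, the `currency` column, and the function name `copy_and_swap` are hypothetical.

```python
import sqlite3


def copy_and_swap(conn, chunk_size=1000):
    """Illustrative shadow-table migration: build, copy in chunks, then swap names.

    Assumes no concurrent writers; real online tools also replay changes
    made to the original table during the copy.
    """
    # 1. Create the shadow table with the new schema (extra nullable column).
    conn.execute(
        "CREATE TABLE orders_new ("
        "id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL, currency TEXT)"
    )

    # 2. Copy existing rows in primary-key order, one short transaction per chunk.
    last_id = 0
    while True:
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM orders WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, chunk_size),
            )
        ]
        if not ids:
            break
        with conn:
            conn.execute(
                "INSERT INTO orders_new (id, total_cents) "
                "SELECT id, total_cents FROM orders WHERE id > ? AND id <= ?",
                (last_id, ids[-1]),
            )
        last_id = ids[-1]

    # 3. Cut over: swap the table names inside a single explicit transaction.
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE orders RENAME TO orders_old")
        conn.execute("ALTER TABLE orders_new RENAME TO orders")
        conn.commit()
    except Exception:
        conn.rollback()
        raise
```

Keeping the old table around as `orders_old` gives you a rollback path until you are confident in the new one. In a real migration, the time spent in step 3 is the window where writes are blocked, which is why the tools work to keep it as short as possible.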