The query had been slow all week, but the real problem started when the schema changed. You needed a new column. Not later. Now.
Adding a new column sounds simple. In production, it is not. Data size, downtime risk, and migration speed all hinge on how you execute the schema change. On small tables, ALTER TABLE ADD COLUMN finishes in milliseconds. On massive, heavily written tables, the same statement can hold a lock for the duration of a full table rewrite, blocking both reads and writes; how painful that is depends on engine and version (MySQL 8.0's INSTANT algorithm and PostgreSQL 11's metadata-only defaults made the common case much cheaper). That's when a bad migration can take your service offline.
The first step is choosing the column's type and default. Avoid a default that forces the database to rewrite or backfill every row in one pass; on older engines, ALTER TABLE ... ADD COLUMN ... DEFAULT triggers exactly that kind of full rewrite under an exclusive lock. Instead, add the column as nullable with no default, backfill it in small batches, and add any indexes only after the data is populated. This keeps individual lock times short and reduces contention.
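The nullable-then-backfill pattern can be sketched with SQLite standing in for the production database. Table and column names here (`users`, `status`) are illustrative, not from the original; the point is that each UPDATE touches only a small batch, so no single statement holds locks for long:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"u{i}",) for i in range(1000)])

# Step 1: add the column as nullable, with no default.
# This is a cheap metadata change; no rows are rewritten yet.
conn.execute("ALTER TABLE users ADD COLUMN status TEXT")

# Step 2: backfill in small batches so each transaction is short.
BATCH = 100
while True:
    cur = conn.execute(
        "UPDATE users SET status = 'active' "
        "WHERE id IN (SELECT id FROM users WHERE status IS NULL LIMIT ?)",
        (BATCH,),
    )
    conn.commit()
    if cur.rowcount == 0:  # nothing left to backfill
        break

remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE status IS NULL").fetchone()[0]
print(remaining)  # 0
```

In a real system you would also sleep briefly between batches and key the loop on an indexed column range rather than a NULL scan, but the shape of the loop is the same.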
For PostgreSQL, tools like pg-online-schema-change (pg-osc) or a logical-replication-based cutover can add a column with near-zero downtime. In MySQL, pt-online-schema-change from Percona Toolkit is the standard choice. These tools create a shadow table with the new schema, copy rows across while capturing live writes (via triggers or the replication stream), then swap the tables with minimal interruption. Always rehearse the migration in staging against production-scale data to catch surprises.
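The shadow-table-and-swap mechanism those tools automate can be shown in miniature, again with SQLite and made-up names (`orders`, `currency`). This sketch omits the hard part, replaying writes that land during the copy, which the real tools handle with triggers or the binlog:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)",
                 [(i * 1.5,) for i in range(10)])

# 1. Create a shadow table with the new schema (extra 'currency' column).
conn.execute(
    "CREATE TABLE orders_new (id INTEGER PRIMARY KEY, total REAL, currency TEXT)")

# 2. Copy existing rows across. Production tools do this in chunks
#    while also replaying concurrent writes into the shadow table.
conn.execute("INSERT INTO orders_new (id, total) SELECT id, total FROM orders")

# 3. Swap: rename the old table out of the way, promote the shadow table.
conn.execute("ALTER TABLE orders RENAME TO orders_old")
conn.execute("ALTER TABLE orders_new RENAME TO orders")
conn.execute("DROP TABLE orders_old")
conn.commit()

cols = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
print(cols)  # ['id', 'total', 'currency']
```

The swap step is why these tools achieve near-zero downtime: readers and writers only ever see a fully built table, and the rename itself is a brief metadata operation.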