Adding a new column sounds simple. It is not. In high-traffic systems, schema changes can lock tables, block writes, and trigger cascading failures. A careless migration can halt critical services for minutes or hours. Downtime is not an option.
The right approach is to plan the schema change as a staged migration. First, add the new column in a non-blocking way: make it nullable, with no default, and use the engine's online DDL option where one exists (for example, ALGORITHM=INSTANT in MySQL 8.0). Avoid adding a NOT NULL constraint in the initial step. On many engines this avoids a full table rewrite while still preparing the schema for use.
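As a minimal sketch of that first step, the Python below adds a nullable column with no default and no NOT NULL constraint. It uses the standard-library sqlite3 module only so that the example runs anywhere; the function name add_nullable_column and the users table in the usage note are hypothetical, and locking behavior on a production engine will differ from SQLite's.

```python
import sqlite3


def add_nullable_column(conn: sqlite3.Connection, table: str, column: str, col_type: str) -> bool:
    """Add `column` to `table` as nullable with no default.

    Returns False (and changes nothing) if the column already exists, so the
    migration step can be re-run safely. `table`, `column`, and `col_type` are
    interpolated into the SQL, so they must come from trusted migration code,
    never from user input.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    # No DEFAULT and no NOT NULL: existing rows simply read as NULL.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
    conn.commit()
    return True
```

Called as add_nullable_column(conn, "users", "email_normalized", "TEXT"), it leaves existing rows untouched; the NOT NULL constraint, if one is wanted, belongs in a later step once every row has a value.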
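Once the column exists, backfill its values in small batches. Small batches keep each transaction short, so locks are held briefly and concurrent queries stay fast. Use an id-based cursor (keyset pagination) rather than OFFSET to page through rows, so each batch is a narrow index range scan instead of a rescan of rows already processed. Monitor query times and system metrics while the backfill runs, and tune the batch size from what you measure. Do not guess.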
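A batched backfill with an id-based cursor might look like the sketch below. It assumes a hypothetical users table with a positive integer primary key id, a source column email, and a new column email_normalized; the batch size and pause are placeholders to tune against real measurements. SQLite stands in for the production database so the example is runnable.

```python
import sqlite3
import time


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 1000,
                        pause_seconds: float = 0.0) -> int:
    """Fill users.email_normalized from users.email, one id range at a time.

    Returns the number of rows updated. Assumes ids are positive integers.
    """
    last_id = 0  # cursor: the highest id handled so far
    total = 0
    while True:
        # Keyset pagination: seek past the last id instead of using OFFSET.
        rows = conn.execute(
            "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        first_id, last_id = rows[0][0], rows[-1][0]
        # Only touch rows that are still empty, so re-running is harmless.
        cur = conn.execute(
            "UPDATE users SET email_normalized = lower(email) "
            "WHERE id BETWEEN ? AND ? AND email_normalized IS NULL",
            (first_id, last_id),
        )
        conn.commit()  # one short transaction per batch
        total += cur.rowcount
        time.sleep(pause_seconds)  # optional breathing room between batches
    return total
```

Committing after every batch is the design choice that matters here: a failure loses at most one batch of work, and because the UPDATE skips rows that already have a value, the job can simply be restarted.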
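After the backfill completes, update the application code to read from and write to the new column. Deploy this change behind a feature flag, or start with dark writes, where the application writes the new column but still serves reads from the old data. This lets you verify correctness before switching traffic fully.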
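One way to stage that rollout is with two independent flags, one for writes and one for reads, as in the sketch below. The Flags class, the helper functions, and the users schema are all hypothetical; a real system would read flags from its own flag service rather than from a dataclass.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Flags:
    write_new_column: bool = False  # dark writes: populate the column, do not read it yet
    read_new_column: bool = False   # flip only after the written data is verified


def save_email(conn: sqlite3.Connection, flags: Flags, user_id: int, email: str) -> None:
    """Write the old column always, and the new column when the flag is on."""
    if flags.write_new_column:
        conn.execute(
            "UPDATE users SET email = ?, email_normalized = ? WHERE id = ?",
            (email, email.lower(), user_id),
        )
    else:
        conn.execute("UPDATE users SET email = ? WHERE id = ?", (email, user_id))
    conn.commit()


def load_email(conn: sqlite3.Connection, flags: Flags, user_id: int) -> str:
    """Return the normalized email, preferring the new column when enabled."""
    row = conn.execute(
        "SELECT email, email_normalized FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(user_id)
    email, normalized = row
    if flags.read_new_column and normalized is not None:
        return normalized
    return email.lower()  # old path: derive the value on every read


def count_mismatches(conn: sqlite3.Connection) -> int:
    """Count rows where the new column disagrees with the value derived from the old one."""
    return conn.execute(
        "SELECT COUNT(*) FROM users WHERE email_normalized IS NOT lower(email)"
    ).fetchone()[0]
```

With write_new_column on and read_new_column off, the new column fills in without affecting what users see. A check like count_mismatches returning zero is one signal that it is safe to enable reads.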