Adding a new column sounds simple. In practice, it can block deploys, slow queries, and cause downtime if you get it wrong. The goal is to add the column, keep the system responsive, and avoid locking large tables.
First, choose the data type and default with care. On some database engines and versions, adding a column with a default value rewrites every row of the table. On a massive table, that rewrite can hold a lock for minutes or hours. A safer pattern is to create the new column as nullable, then backfill it in small batches. This reduces lock contention and keeps your service live.
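As a rough illustration of the batching idea, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` table, the `name` and `display_name` columns, and the `backfill_in_batches` helper are hypothetical names chosen for the example. A production backfill would target your own schema and database driver, but the shape is usually similar: pick a small range of rows, update it in its own short transaction, then move on.

```python
import sqlite3
import time


def backfill_in_batches(conn, batch_size=1000, pause_seconds=0.0):
    """Copy users.name into users.display_name, one small batch per transaction.

    Table and column names are hypothetical. Returns the number of rows filled.
    """
    last_id = 0
    total = 0
    while True:
        # Find the next batch of rows that still need a value, in id order.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM users "
                "WHERE id > ? AND display_name IS NULL "
                "ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            return total

        # One short transaction per batch keeps each lock brief.
        with conn:
            conn.execute(
                "UPDATE users SET display_name = name "
                "WHERE id BETWEEN ? AND ? AND display_name IS NULL",
                (ids[0], ids[-1]),
            )

        total += len(ids)
        last_id = ids[-1]  # always advances, so the loop terminates
        time.sleep(pause_seconds)  # optional throttle between batches
```

Walking forward by `id` instead of repeatedly asking for "any row that is still NULL" means the loop finishes even if some rows legitimately stay NULL after the update.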
Second, consider the order of operations. A safe sequence is:
- Deploy the schema change with the new column as nullable.
- Deploy application code that can read and write the new column but does not require it.
- Run background jobs to populate the column in small, controlled increments.
- Once the backfill completes and you have verified that no rows were missed, apply constraints (such as NOT NULL) or defaults if needed.
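The application-side steps in this sequence can be sketched as follows. This is an assumption-laden example, not a prescribed implementation: the `users` table, its `name` and `display_name` columns, and all three helper functions are hypothetical. It shows code that writes the new column when a value is available, tolerates NULL on read by falling back to an existing column, and checks that the backfill has finished before any constraint is applied.

```python
import sqlite3


def create_user(conn, name, display_name=None):
    """Write the new column when a value is supplied, but do not require it."""
    with conn:
        cur = conn.execute(
            "INSERT INTO users (name, display_name) VALUES (?, ?)",
            (name, display_name),
        )
    return cur.lastrowid


def read_display_name(conn, user_id):
    """Read the new column, falling back to the old one for rows not yet backfilled."""
    row = conn.execute(
        "SELECT name, display_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None  # no such user
    name, display_name = row
    return display_name if display_name is not None else name


def backfill_complete(conn):
    """Return True only when no row is left with a NULL in the new column."""
    remaining = conn.execute(
        "SELECT COUNT(*) FROM users WHERE display_name IS NULL"
    ).fetchone()[0]
    return remaining == 0
```

Gating the final constraint on a check like `backfill_complete` avoids a failed `NOT NULL` migration caused by rows that were inserted or skipped while the backfill was running.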
Third, use migrations that are reversible. If a new column causes unexpected load or errors, you must be able to roll back quickly. Version your database changes alongside your application code, and test against a copy of production data before running the real migration.
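A reversible migration can be as simple as a forward statement paired with its inverse. The sketch below again uses Python's `sqlite3` module with a hypothetical `users.display_name` column. The `migrate_up`, `migrate_down`, and `column_exists` helpers are illustrative names, not part of any migration framework. Note that `ALTER TABLE ... DROP COLUMN` requires SQLite 3.35 or newer, and that rolling back this way discards any data already backfilled into the column.

```python
import sqlite3

# Hypothetical forward and reverse statements for one migration.
UP = "ALTER TABLE users ADD COLUMN display_name TEXT"
DOWN = "ALTER TABLE users DROP COLUMN display_name"  # needs SQLite 3.35+


def column_exists(conn, table, column):
    """Check the table's schema; row[1] of PRAGMA table_info is the column name."""
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def migrate_up(conn):
    """Add the column. Safe to run twice: a second call is a no-op."""
    if not column_exists(conn, "users", "display_name"):
        with conn:
            conn.execute(UP)


def migrate_down(conn):
    """Remove the column. Safe to run twice: a second call is a no-op."""
    if column_exists(conn, "users", "display_name"):
        with conn:
            conn.execute(DOWN)
```

Making both directions idempotent means a migration that was interrupted, or run twice during a hurried rollback, does not fail on the second attempt.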