Adding a new column sounds simple. In practice, it can break production, stall deploys, or lock a table for minutes under load. The difference between smooth execution and downtime is in how you plan the change.
For relational databases like PostgreSQL, MySQL, and MariaDB, adding a column with a default value can require a full table rewrite, depending on the engine version and the kind of default. PostgreSQL 11 and later, for instance, skip the rewrite for constant defaults but still rewrite the table for volatile ones. On large datasets, a rewrite means high I/O, long locks, and potential timeouts. The safest path is often to add the column as nullable first, backfill data in controlled batches, and then add constraints in a separate migration. This avoids table-wide locks on busy systems.
Use migrations that are idempotent and can run without blocking requests. For example, when adding a created_at TIMESTAMP column to a table handling thousands of writes per second:
- Add the column without a default.
- Deploy code that can handle NULL values.
- Backfill in small chunks with UPDATE ... LIMIT ... patterns (PostgreSQL has no UPDATE ... LIMIT, so bound each batch with a keyed subquery there).
- When complete, set the default and mark it NOT NULL in a final migration.
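The first and third steps above can be sketched roughly as follows. This is a minimal illustration, not a production migration: it uses Python's built-in sqlite3 module so it can run anywhere, and the events table, its id primary key, the helper names, and the use of CURRENT_TIMESTAMP as the backfill value are all assumptions made for the example. The batch is bounded with a keyed subquery, which is portable to engines that lack UPDATE ... LIMIT.

```python
import sqlite3
import time


def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def add_nullable_column(conn, table="events", column="created_at"):
    """Add the column with no default. Safe to re-run (idempotent)."""
    if not column_exists(conn, table, column):
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} TIMESTAMP")
        conn.commit()


def backfill_in_batches(conn, batch_size=1000, pause_seconds=0.0):
    """Fill NULL rows one batch at a time, committing between batches.

    Returns the total number of rows updated.
    """
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE events
               SET created_at = CURRENT_TIMESTAMP  -- placeholder backfill value
             WHERE id IN (
                   SELECT id FROM events
                    WHERE created_at IS NULL
                    ORDER BY id
                    LIMIT ?
             )
            """,
            (batch_size,),
        )
        conn.commit()  # short transactions keep lock times low
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
        time.sleep(pause_seconds)  # optional throttle between batches
```

The final step is engine-specific and not shown in the sketch. In PostgreSQL it would typically be ALTER TABLE ... ALTER COLUMN ... SET DEFAULT followed by SET NOT NULL, run only once the backfill reports no remaining NULL rows.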
In distributed environments, align schema changes with feature flags. Roll out the application changes first to ensure compatibility with both the old and new schema. Only then perform the migration. This order prevents read/write failures in services that haven't yet deployed the updated code.
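One way to picture application code that works against both schemas is a flag-guarded write path and a tolerant read path. The sketch below is only illustrative: the events table, the write_created_at flag, and both function names are hypothetical, and it uses sqlite3 with sqlite3.Row as the row factory purely to stay runnable.

```python
import sqlite3


def insert_event(conn, payload, write_created_at):
    """Insert a row, referencing the new column only when the flag is on."""
    if write_created_at:
        conn.execute(
            "INSERT INTO events (payload, created_at) "
            "VALUES (?, CURRENT_TIMESTAMP)",
            (payload,),
        )
    else:
        # Old-schema-compatible statement: never names created_at.
        conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
    conn.commit()


def read_created_at(row):
    """Read path that tolerates a missing column and un-backfilled rows.

    Expects a sqlite3.Row; returns None if the column is absent or NULL.
    """
    return row["created_at"] if "created_at" in row.keys() else None
```

With the flag off, this code runs unchanged before and after the column exists, so the migration can land at any point. The flag is switched on only once every service instance has the new code and the column is in place.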