Adding a new column should be simple. In practice, it drives many of the most painful deployment issues. A single misstep can create downtime, data loss, or constraint errors that ripple through dependent services. Teams need patterns that keep schema changes safe, atomic, and reversible.
The first step is defining the new column with the right type and default. Avoid a NOT NULL constraint at creation unless the table is small or the column has a safe constant default. On engines or versions that must rewrite the table to add such a column (PostgreSQL before version 11, for example), the change holds a lock that blocks writes for the duration of the rewrite, which on a large table can be far too long.
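For PostgreSQL, use ADD COLUMN without constraints, then backfill in small batches. Apply constraints last, after verifying data integrity. For MySQL, test on a staging replica with production-size data to measure lock time. If the engine and version support it, request an online algorithm explicitly, such as ALGORITHM=INPLACE with LOCK=NONE, or ALGORITHM=INSTANT on MySQL 8.0.12 and later, so the statement fails rather than silently falling back to a blocking table copy.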
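The PostgreSQL ordering described above (add the column bare, backfill, constrain last) can be sketched as a small statement builder. This is a minimal illustration, not a migration tool: the function name `expand_then_constrain` and the `orders`/`region` names in the usage note are hypothetical, and the helper only produces SQL text without executing it. The constraint phase uses one commonly cited pattern that goes beyond the text above: add a CHECK constraint as NOT VALID, validate it separately, then set NOT NULL. It assumes PostgreSQL 12 or later, where SET NOT NULL can rely on an already validated CHECK constraint instead of scanning the table again.

```python
import re

# Conservative identifier check so names can be interpolated into SQL text.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def expand_then_constrain(table: str, column: str, col_type: str) -> dict:
    """Build PostgreSQL statements for a phased column addition.

    Hypothetical helper: returns SQL strings grouped by phase and runs nothing.
    The backfill happens between the "expand" and "constrain" phases.
    `col_type` is trusted input (for example "text") and is not validated.
    """
    for name in (table, column):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    check = f"{table}_{column}_not_null"
    return {
        # Phase 1: nullable column, no default, no constraint.
        "expand": [
            f"ALTER TABLE {table} ADD COLUMN {column} {col_type};",
        ],
        # Phase 2 (after the backfill): prove there are no NULLs, then constrain.
        "constrain": [
            f"ALTER TABLE {table} ADD CONSTRAINT {check} "
            f"CHECK ({column} IS NOT NULL) NOT VALID;",
            f"ALTER TABLE {table} VALIDATE CONSTRAINT {check};",
            f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL;",
            f"ALTER TABLE {table} DROP CONSTRAINT {check};",
        ],
    }
```

Calling `expand_then_constrain("orders", "region", "text")` yields one statement for the expand phase and four for the constrain phase. Keeping the phases separate makes it easier to run each one as its own deployment step and to stop if the backfill has not finished.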
Backfilling is where performance risks are highest. Update a bounded number of rows per batch, commit each batch, and pause before starting the next. Monitor replication lag if read replicas are in place, and slow the backfill down when lag grows. Any background job that writes to the table during this process must handle both states: rows where the new column is still empty and rows where it is already populated.
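A batched backfill along these lines might look like the following sketch. It uses Python's built-in `sqlite3` module purely so the example is self-contained; a production backfill would run against the real engine, and the exact batching SQL differs between databases. The `orders` table, its `legacy_region` and `region` columns, the `'unknown'` fallback value, and the function name `backfill_in_batches` are all assumptions made for illustration.

```python
import sqlite3
import time


def backfill_in_batches(conn, batch_size=1000, pause_seconds=0.0):
    """Fill orders.region from orders.legacy_region in small committed batches.

    Returns the total number of rows updated. Table and column names are
    hypothetical. COALESCE guarantees every touched row ends up non-NULL,
    so the loop always makes progress and terminates.
    """
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE orders
               SET region = COALESCE(legacy_region, 'unknown')
             WHERE id IN (
                   SELECT id FROM orders
                    WHERE region IS NULL
                    ORDER BY id
                    LIMIT ?
             )
            """,
            (batch_size,),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            return total  # nothing left to backfill
        total += cur.rowcount
        time.sleep(pause_seconds)  # pause so replicas and other writers can keep up
```

Selecting only rows where the new column is still NULL makes the job safe to stop and restart, and it picks up rows inserted by older application code while the backfill is running. A fuller version would likely check replication lag before each batch and lengthen the pause when lag is high.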