The migration broke at 2:03 a.m., right after the new column went live. Logs exploded with errors. The fix was simple, but the impact was brutal.
Adding a new column to a production database is not just a schema change. It is a change in behavior, performance, and trust. Done right, it unlocks new features and faster code. Done wrong, it stalls deploys, triggers rollbacks, and keeps you awake at night.
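Before you add a new column, define its type and nullability with precision. Avoid default values that mask bad data. Check existing rows against any constraint before you create it, especially on large tables, because validating a constraint scans the whole table. The cost of the change itself depends on the engine and version. In PostgreSQL, adding a NOT NULL column without a default fails outright on a table that already has rows, and before version 11 adding a column with a default rewrote the entire table. In MySQL, an ADD COLUMN that cannot use the INSTANT algorithm (introduced in 8.0.12) rebuilds the table. A rewrite or rebuild can block or slow writes for minutes or hours on massive datasets. Add the column as nullable and backfill it instead to avoid downtime.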
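A minimal sketch of the nullable-first approach, using Python's built-in `sqlite3` module as a stand-in so the example runs anywhere. The `users` table, the `plan` column, and the `add_nullable_column` helper are hypothetical names chosen for illustration. SQLite's behavior is not identical to PostgreSQL or MySQL; the point is only that a NOT NULL column with no default has nothing to put in existing rows, while a nullable column can be added and filled in later.

```python
import sqlite3

def add_nullable_column(conn, table, column, sql_type):
    """Add a nullable column; existing rows read back as NULL.

    Identifiers are interpolated directly, so only pass trusted names.
    """
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)
conn.commit()

# SQLite refuses a NOT NULL column that has no default value.
try:
    conn.execute("ALTER TABLE users ADD COLUMN plan TEXT NOT NULL")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The nullable version succeeds and leaves existing rows as NULL,
# ready for a later backfill.
add_nullable_column(conn, "users", "plan", "TEXT")
null_count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE plan IS NULL"
).fetchone()[0]
```

Once every row has a value, the NOT NULL constraint can be added in a separate, later migration.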
Always stage the change. Add the column in one migration, populate it in batches, and then add constraints or indexes in later steps. This keeps each lock short, so reads and writes continue while the migration runs. Monitor replication lag if you run read replicas, since schema changes and large backfills can push replicas behind the primary.
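The staged sequence above could be sketched roughly as follows, again with Python's `sqlite3` as a stand-in engine. The `orders` table, the `status` column, the `'legacy'` backfill value, the index name, and the batch size are all assumptions made for the example. A production backfill would also need throttling and retry handling, which this sketch leaves out.

```python
import sqlite3

def backfill_in_batches(conn, batch_size=1000):
    """Fill NULL status values a batch at a time, committing after each batch."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE orders SET status = 'legacy' "
            "WHERE id IN (SELECT id FROM orders WHERE status IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # release the write lock between batches
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(i,) for i in range(2500)])
conn.commit()

# Step 1: add the column as nullable.
conn.execute("ALTER TABLE orders ADD COLUMN status TEXT")

# Step 2: backfill in small batches.
updated = backfill_in_batches(conn, batch_size=1000)

# Step 3: add the index only after the data is in place.
conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
conn.commit()

remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status IS NULL"
).fetchone()[0]
```

Committing after each batch is the design choice that matters here: each transaction stays short, so concurrent writers and replicas only ever wait on one batch at a time.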