The migration failed at exactly 02:14. Logs showed nothing but a broken query. The culprit: adding a new column.
Creating a new column in a live database should be simple. In practice, it can lock tables, block writes, or trigger downtime. At scale, schema changes need precision. How you add your new column determines whether your application stays online.
Start by defining the exact data type and constraints. Be careful with defaults on large tables: on some engines and versions, adding a column with a default forces a full table rewrite. The conservative path is to add the column as nullable, then backfill in controlled batches. For massive datasets, run the backfill as an offline job or via a worker queue to avoid load spikes.
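As a rough illustration of the nullable-then-backfill approach, here is a minimal sketch using Python's built-in sqlite3 module against an in-memory database. The `orders` table, the `status` column, the `'pending'` value, and the batch size are all hypothetical, and a production backfill against PostgreSQL or MySQL would also need throttling, retries, and monitoring.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Fill NULL values of orders.status in small, separately committed batches."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE orders SET status = 'pending' "
            "WHERE id IN (SELECT id FROM orders WHERE status IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # commit per batch so each transaction stays short
        if cur.rowcount == 0:
            break  # nothing left to backfill
        total += cur.rowcount
    return total


# Hypothetical demo data: 2500 rows created before the new column exists.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(i,) for i in range(2500)])

# Step 1: add the column as nullable, with no default.
conn.execute("ALTER TABLE orders ADD COLUMN status TEXT")
conn.commit()

# Step 2: backfill in controlled batches.
updated = backfill_in_batches(conn, batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE status IS NULL"
).fetchone()[0]
```

Committing after every batch keeps each transaction short, so row locks are released quickly and concurrent writers are not stalled for the duration of the whole backfill.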
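If your database supports it, use ALTER TABLE ... ADD COLUMN with non-locking options. PostgreSQL adds a nullable column as a metadata-only change; since version 11 the same holds for a constant default, while older versions and volatile defaults rewrite the whole table. The statement still needs a brief ACCESS EXCLUSIVE lock, so it can queue behind long-running transactions. MySQL behaves differently depending on storage engine and version: InnoDB on 8.0.12 and later supports ALGORITHM=INSTANT for many ADD COLUMN cases, while older versions fall back to in-place or copying alters. Understand your database engine’s execution path before you run the migration.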
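To make the engine differences concrete, the sketch below builds the statements one might run on each engine. The helper `add_column_statements`, the table and column names, and the five-second timeout are illustrative assumptions, not a recommended standard. It relies on PostgreSQL's `lock_timeout` setting and on MySQL 8.0.12+ accepting `ALGORITHM=INSTANT`; on MySQL, requesting that algorithm makes the statement fail with an error if an instant change is not possible, rather than silently falling back to a slower path.

```python
def add_column_statements(engine, table, column, sql_type):
    """Return DDL statements that add a nullable column on the given engine.

    Identifiers are interpolated directly, so they must come from trusted
    migration code, never from user input.
    """
    if engine == "postgresql":
        return [
            # Fail fast instead of queuing behind long transactions
            # while waiting for the table lock.
            "SET lock_timeout = '5s'",
            f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}",
        ]
    if engine == "mysql":
        return [
            # Assumes InnoDB on MySQL 8.0.12 or later.
            f"ALTER TABLE {table} ADD COLUMN {column} {sql_type} NULL, "
            "ALGORITHM=INSTANT",
        ]
    raise ValueError(f"unsupported engine: {engine}")


pg_statements = add_column_statements("postgresql", "orders", "status", "text")
mysql_statements = add_column_statements("mysql", "orders", "status", "VARCHAR(32)")
```

Keeping the statements as data makes it easy to review the exact DDL for each engine before it touches production.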