The migration broke at 02:13. The logs said nothing. The schema had shifted, and the missing piece was a new column.
Adding a new column sounds simple, but an unplanned change in production can block writes or take services down. A schema change touches several layers: database structure, ORM mappings, API contracts, caches, and downstream consumers. A mismatch between any two of them can corrupt data or break services.
When adding a new column in SQL, the first decision is nullability. Adding a nullable column with no default is usually a metadata-only change and completes almost instantly. Adding a column with a default value forces a full table rewrite in some systems (PostgreSQL before version 11, for example), which can block writes for the duration. For large tables, use an online schema change tool. Migrate in phases: first deploy the non-blocking schema change, then deploy code that writes the new column, then backfill existing rows in batches, and finally add the NOT NULL constraint once every row is populated.
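The schema and backfill phases can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for a production database. The table `users` and the columns `name` and `display_name` are hypothetical names chosen for the example, and the backfill rule (copy `name` into `display_name`) is an assumption.

```python
import sqlite3


def add_nullable_column(conn):
    """Schema phase: add the column as nullable so existing writers keep working.

    The table and column names are hypothetical.
    """
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.commit()


def backfill_in_batches(conn, batch_size=1000):
    """Backfill phase: populate NULL rows in small transactions.

    Walks the primary key in ascending order so each batch touches a
    bounded id range. Returns the total number of rows updated.
    """
    last_id = 0
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM users "
            "WHERE id > ? AND display_name IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        first_id, last_id = rows[0][0], rows[-1][0]
        cur = conn.execute(
            "UPDATE users SET display_name = name "
            "WHERE id BETWEEN ? AND ? AND display_name IS NULL",
            (first_id, last_id),
        )
        conn.commit()  # one short transaction per batch
        total += cur.rowcount
    return total
```

Advancing by primary key rather than re-querying for any remaining NULL row means the loop always terminates, even if the copied value is itself NULL for some rows. On a real system the batch size and any pause between batches would need tuning against replication lag and lock contention.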
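Maintain backward compatibility: code written against the old schema must keep working while the new column exists, so keep the column nullable until every writer populates it. Write migrations to be idempotent so that rerunning one after a partial failure is safe. In PostgreSQL, add the column with ALTER TABLE ... ADD COLUMN, then backfill in small transactions. In MySQL, use pt-online-schema-change or the built-in ALGORITHM=INPLACE where the server version and storage engine support it. Test on a staging database with a realistic data size.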
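One way to make the column addition idempotent is to check for the column before altering the table. The sketch below again uses Python's sqlite3 module as a stand-in. The helper names `column_exists` and `add_column_if_missing` are hypothetical, as are the table and column names passed to them. In PostgreSQL the same effect is available directly with ADD COLUMN IF NOT EXISTS.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `column` is already present on `table`.

    PRAGMA table_info yields one row per column; index 1 is the column name.
    Identifiers are interpolated into the SQL, so they must come from
    trusted migration code, never from user input.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def add_column_if_missing(conn, table, column, sql_type):
    """Add the column only if it is absent.

    Returns True if the column was added, False if it already existed,
    so rerunning the migration is a no-op.
    """
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")
    conn.commit()
    return True
```

Calling `add_column_if_missing(conn, "users", "display_name", "TEXT")` twice adds the column on the first call and does nothing on the second, which is the property a rerun after a partial failure depends on.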