The error was small, but its impact cut through the entire release pipeline: a new column that existed in dev was missing in production.
Adding a new column to a database should be the simplest kind of migration, yet it is one of the most common sources of runtime breakage. Teams handle schema changes differently. Some apply them before the code deploys. Others hide them behind backward-compatible queries. Without a clear process, rollouts stall or fail under load.
Adding a new column requires more than an ALTER TABLE. It means considering type safety, nullability and defaults, indexing, and query performance. Depending on the database engine and version, adding an integer column with a default value can trigger a full table rewrite. A text column without constraints can become a source of inconsistent data. Creating indexes in the same migration can lock tables unexpectedly.
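As a rough illustration of keeping the column change and the index change separate, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` table, the `login_count` column, the index name, and the helper functions are all hypothetical, and locking behavior differs between database engines, so treat this as the shape of the idea rather than a production migration.

```python
import sqlite3


def column_exists(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    # The table name is interpolated directly, so it must come from trusted code.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def add_column_if_missing(conn, table, column, decl):
    """Add a column only when it is absent. Returns True if it was added."""
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# Step 1: add the column as nullable with no default, the cheapest form of the change.
added = add_column_if_missing(conn, "users", "login_count", "INTEGER")

# Running the same step again is a no-op, so the migration is safe to retry.
added_again = add_column_if_missing(conn, "users", "login_count", "INTEGER")

# Step 2 (a separate migration in practice): create the index on its own.
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_users_login_count ON users (login_count)"
)
conn.commit()
```

Checking for the column before altering the table makes the step idempotent, which helps when dev and production have drifted and the same migration must run safely against both.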
Downtime comes from locking operations. To avoid it, break large migrations into smaller steps. First, add the new column in a non-blocking way if the database supports it. Then backfill existing rows in batches. Only after the data is consistent should application code read from or write to the new column.
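The batched backfill step might look something like the following sketch, again using Python's `sqlite3` module. The `users` table, the `login_count` column, the fill value of `0`, and the batch size are assumptions chosen for illustration. A real backfill would likely also pause between batches and handle retries.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Fill NULL login_count values a batch at a time. Returns rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE users
            SET login_count = 0
            WHERE id IN (
                SELECT id FROM users
                WHERE login_count IS NULL
                ORDER BY id
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        # Commit after each batch so each transaction stays short.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, login_count INTEGER)"
)
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(2500)],
)
conn.commit()

updated = backfill_in_batches(conn, batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE login_count IS NULL"
).fetchone()[0]
```

Because each batch selects only rows that are still NULL, the loop can be interrupted and restarted without redoing finished work.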