The root cause was simple: a new column had to be added without breaking production.
Adding a new column to a database table sounds trivial. It rarely is. Even a single change to a table can ripple through queries, indexes, and application code. The goal is zero downtime, zero data loss, and no surprises for downstream systems.
First, define the column with the right type and constraints. Resist the urge to guess; inspect current usage and expected growth. If the column will be written to frequently, confirm that any index on it will not degrade write performance. Create new indexes in a separate statement from the initial ALTER TABLE, so the schema change holds its locks for as short a time as possible during deployment.
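The ordering described above can be sketched as follows. This is a minimal illustration rather than a production migration: it uses Python's built-in `sqlite3` module so it runs anywhere, and the table `orders`, the column `region`, and the index name `idx_orders_region` are all hypothetical. Locking behavior differs between database engines, so treat this as a demonstration of the sequence only and check your own engine's documentation for the lock each statement takes.

```python
import sqlite3


def add_column_then_index(conn):
    """Add a nullable column first, then build its index as a separate step."""
    # Step 1: add the column as nullable with no default.
    # In many engines this is a cheap schema change (assumption: verify for yours).
    conn.execute("ALTER TABLE orders ADD COLUMN region TEXT")
    conn.commit()

    # Step 2: create the index in its own statement and transaction,
    # so the ALTER above is not held up by index construction.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_orders_region ON orders (region)"
    )
    conn.commit()


# Demo against an in-memory database with a hypothetical `orders` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)", [(9.5,), (20.0,)])
conn.commit()

add_column_then_index(conn)

# Inspect the resulting schema.
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
indexes = [row[1] for row in conn.execute("PRAGMA index_list(orders)")]
```

Keeping the two statements apart also means either step can be rolled back or retried on its own if the deployment is interrupted.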
Next, plan for backfilling. A newly added column is empty for historical rows, and those rows often need a default value. Write the values in small batches rather than in one large update, to avoid load spikes. Verify each batch before moving to the next, and monitor replication lag if your database uses read replicas.
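A batched backfill along these lines might look like the sketch below. It again uses `sqlite3` with a hypothetical `orders` table and `region` column; the function name `backfill_in_batches`, the batch size, and the pause between batches are illustrative assumptions. A real deployment would put its replication-lag check where the comment indicates, using whatever metric the database exposes.

```python
import sqlite3
import time


def backfill_in_batches(conn, default_value, batch_size=500, pause_seconds=0.0):
    """Fill NULL values of orders.region batch by batch; return rows updated."""
    total = 0
    while True:
        # Pick the next batch of rows that still lack a value.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM orders WHERE region IS NULL ORDER BY id LIMIT ?",
                (batch_size,),
            )
        ]
        if not ids:
            break

        placeholders = ",".join("?" * len(ids))
        cur = conn.execute(
            f"UPDATE orders SET region = ? "
            f"WHERE id IN ({placeholders}) AND region IS NULL",
            [default_value, *ids],
        )
        conn.commit()  # one short transaction per batch
        total += cur.rowcount

        # Verify the batch before moving on: none of its rows may still be NULL.
        still_null = conn.execute(
            f"SELECT COUNT(*) FROM orders "
            f"WHERE id IN ({placeholders}) AND region IS NULL",
            ids,
        ).fetchone()[0]
        if still_null:
            raise RuntimeError(f"{still_null} rows in batch were not backfilled")

        # A replication-lag check would go here (engine-specific, not shown).
        time.sleep(pause_seconds)
    return total


# Demo: 25 historical rows, backfilled 10 at a time.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, region TEXT)"
)
conn.executemany(
    "INSERT INTO orders (total) VALUES (?)", [(float(i),) for i in range(25)]
)
conn.commit()

updated = backfill_in_batches(conn, "unknown", batch_size=10)
remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE region IS NULL"
).fetchone()[0]
```

Committing after every batch keeps each transaction short, and because the loop selects only rows that are still NULL, it can be stopped and restarted without redoing finished work.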