The culprit was a schema change no one had mapped: a new column in a production table.
A new column is not just another field in a database. It changes the shape of the data, the way that data flows between systems, and the assumptions code makes about it. One misstep and downstream services process bad data, APIs fail type checks, and reports drift from reality. Treat every schema change as a live system event, not a static update.
Adding a new column starts with its definition. Pick a clear name that will not need to change later. Avoid overloading existing semantics: do not reuse a name or meaning that other tables or services already rely on. Decide the exact type and constraints. Will it allow nulls? Will it have a default? Each choice affects migrations and integrations.
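To make the null and default questions concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table `orders` and the columns `currency`, `status`, and `region` are hypothetical names chosen for illustration, and other database engines may behave differently from SQLite here.

```python
import sqlite3

# Hypothetical table; an in-memory SQLite database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO orders (total_cents) VALUES (1250)")

# Choice A: nullable, no default. Rows that already exist read back as NULL,
# so every reader has to handle a missing value.
conn.execute("ALTER TABLE orders ADD COLUMN currency TEXT")

# Choice B: NOT NULL with a default. Rows that already exist read back as the default.
conn.execute("ALTER TABLE orders ADD COLUMN status TEXT NOT NULL DEFAULT 'pending'")

# Choice C: NOT NULL without a default. SQLite refuses this on ADD COLUMN,
# because it has no value to give the rows that already exist.
try:
    conn.execute("ALTER TABLE orders ADD COLUMN region TEXT NOT NULL")
    rejected = False
except sqlite3.Error:
    rejected = True

existing_row = conn.execute(
    "SELECT currency, status FROM orders WHERE id = 1"
).fetchone()
print(existing_row, rejected)
```

The row inserted before the change comes back with a NULL `currency` and a `status` of 'pending', which is exactly the difference downstream consumers will see.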
Run migrations in controlled steps. Backfill data where needed. Test with both old and new code running against the updated schema. Feature-flag reads and writes so that either version of the code stays compatible. Document the change in your schema history. In systems with replication or event-driven pipelines, consider versioned events, published both with and without the new column, until all consumers are upgraded.
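The steps above can be sketched end to end with sqlite3. This is one possible shape under stated assumptions, not a prescribed implementation: the `users` table, the `display_name` column, the batch size, and the boolean flags `write_new_column` and `read_new_column` are all hypothetical stand-ins for whatever your schema and feature-flag system provide.

```python
import sqlite3

BATCH_SIZE = 2  # deliberately tiny so the batching is visible; an assumption


def expand(conn):
    """Step 1: add the column as nullable so existing writers keep working."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def backfill(conn, batch_size=BATCH_SIZE):
    """Step 2: fill old rows in small batches so each transaction stays short.

    Assumes username is NOT NULL; otherwise a row could never leave the
    'display_name IS NULL' set and the loop would not terminate.
    """
    total = 0
    while True:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "UPDATE users SET display_name = username "
                "WHERE id IN (SELECT id FROM users "
                "WHERE display_name IS NULL LIMIT ?)",
                (batch_size,),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


def old_writer(conn, username):
    """Old code path: has never heard of display_name."""
    with conn:
        conn.execute("INSERT INTO users (username) VALUES (?)", (username,))


def new_writer(conn, username, display_name, write_new_column):
    """New code path: writes the new column only when the flag is on."""
    with conn:
        if write_new_column:
            conn.execute(
                "INSERT INTO users (username, display_name) VALUES (?, ?)",
                (username, display_name),
            )
        else:
            conn.execute("INSERT INTO users (username) VALUES (?)", (username,))


def read_display_name(conn, user_id, read_new_column):
    """Readers fall back to username when the flag is off or the value is missing."""
    username, display_name = conn.execute(
        "SELECT username, display_name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if read_new_column and display_name is not None:
        return display_name
    return username


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)", [("ada",), ("grace",), ("linus",)]
)

expand(conn)
old_writer(conn, "ken")  # old code still works against the updated schema
new_writer(conn, "dennis", "Dennis R.", write_new_column=True)
filled = backfill(conn)  # fills the four rows written without display_name
```

Adding the column as nullable first is what lets old and new code share the table. A stricter constraint can be introduced in a later, separate step once the backfill is complete and every writer populates the column.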