Adding a new column sounds simple. In production, it can be risky and costly if done wrong. Tables hold millions of rows. A schema change can touch every one of them on disk. Locks can block writes. Cached query plans and prepared statements can break. To avoid downtime, you need a plan.
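First, define the new column with a clear type and a default that does not force a table rewrite or a long lock. In PostgreSQL before version 11, adding a column with a non-null default rewrites the whole table. From version 11 on, a constant default is stored as metadata only, but a volatile default such as clock_timestamp() still triggers a rewrite. When in doubt, add the column as nullable with no default, then backfill in small batches.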
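As a minimal sketch of the nullable-column step, the snippet below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a production system. The users table and the signup_source column are hypothetical names chosen for illustration, and lock behavior in a real PostgreSQL or MySQL deployment will differ from SQLite.

```python
import sqlite3

# In-memory SQLite stands in for the production database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

# Add the column as nullable with no default, so existing rows are not rewritten.
conn.execute("ALTER TABLE users ADD COLUMN signup_source TEXT")
conn.commit()

# Existing rows read NULL until a backfill populates them.
null_rows = conn.execute(
    "SELECT COUNT(*) FROM users WHERE signup_source IS NULL"
).fetchone()[0]
print(null_rows)
```

Every pre-existing row reports NULL for the new column, which is the state the backfill then works through.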
Second, manage data backfills with minimal load. Schedule them off-peak. If the backfill joins against other tables, make sure the join columns are indexed. Keep batches small and pause between them so replicas can keep up.
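One way to structure such a backfill is a loop that updates a bounded number of rows, commits, and pauses before the next batch. The sketch below again uses sqlite3 as a stand-in. The function name backfill_in_batches, the users table, the signup_source column, the 'unknown' fill value, and the batch size are all assumptions for illustration, not recommendations for any particular system.

```python
import sqlite3
import time


def backfill_in_batches(conn, batch_size=1000, pause_seconds=0.0):
    """Fill NULL signup_source values in small batches; return rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE users SET signup_source = 'unknown'
            WHERE id IN (
                SELECT id FROM users
                WHERE signup_source IS NULL
                ORDER BY id
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()  # one short transaction per batch keeps locks brief
        if cur.rowcount == 0:
            break  # nothing left to backfill
        total += cur.rowcount
        time.sleep(pause_seconds)  # throttle so replicas can catch up
    return total


# Hypothetical data set: 2500 rows that still need a value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, signup_source TEXT)")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 2501)])
conn.commit()

updated = backfill_in_batches(conn, batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE signup_source IS NULL"
).fetchone()[0]
print(updated, remaining)
```

Because each batch selects only rows that are still NULL, the loop can be stopped and restarted without redoing finished work. In a replicated setup, pause_seconds would be tuned against observed replication lag.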
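Third, review ORM migrations. Auto-generated SQL that runs instantly against a small test database can take heavy locks or rewrite a large table in production. Inspect the SQL your ORM generates and, where it is unsafe, replace it with explicit, hand-reviewed statements. Put the code that reads and writes the new column behind a feature flag where possible, so code paths can switch over cleanly once the column is ready.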
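The feature-flag idea can be sketched as a read path that ignores the new column until the flag is turned on. Everything named here is hypothetical: the FLAGS dictionary stands in for whatever flag service or config you use, and get_signup_source, users, and signup_source are illustrative names only.

```python
import sqlite3

# Hypothetical in-process flag store; a real system would query a flag service.
FLAGS = {"use_signup_source": False}


def get_signup_source(conn, user_id):
    """Return the signup source, reading the new column only when the flag is on."""
    if not FLAGS["use_signup_source"]:
        return "unknown"  # legacy behavior: the new column is ignored
    row = conn.execute(
        "SELECT signup_source FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    # Fall back if the row is missing or the backfill has not reached it yet.
    if row is None or row[0] is None:
        return "unknown"
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, signup_source TEXT)")
conn.execute("INSERT INTO users (id, signup_source) VALUES (1, 'ads')")
conn.commit()

before = get_signup_source(conn, 1)  # flag off: legacy path
FLAGS["use_signup_source"] = True    # flip once the column is backfilled
after = get_signup_source(conn, 1)   # flag on: new column is read
print(before, after)
```

Keeping the fallback inside the flagged path means the flag can be enabled before the backfill has fully finished, and turned off again without a deploy if something goes wrong.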