The schema was perfect until the product team asked for one more field. The request was simple: add a new column. The impact was not.
Adding a new column in production is never just an ALTER TABLE away. Depending on the database and the column definition, it can take an exclusive lock on the table, block writes, trigger a full table rewrite, and in the wrong environment take your system down for hours. The operation feels trivial in dev, but at scale a careless migration can cause cascading latency and break core services.
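Plan the migration. In PostgreSQL versions before 11, adding a column with a default value forces a full table rewrite that can touch millions of rows; from version 11 on, a constant default is stored as metadata, and only a volatile default (such as clock_timestamp()) still forces the rewrite. In MySQL, the cost depends on the storage engine, the server version, and the column definition: MySQL 8.0 with InnoDB can often add a column with ALGORITHM=INSTANT, while older versions or other engines may rebuild or copy the whole table. Always benchmark the operation on a true copy of production data. Check replication lag, available storage, and index rebuild times before you deploy.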
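As a rough illustration of what planning can look like, the sketch below generates the SQL for a phased rollout in PostgreSQL: add the column with no default, set the default for future rows, then backfill existing rows in small batches so no single statement holds locks for long. The function names (`id_batches`, `plan_add_column`), the integer `id` primary key, and the 5-second `lock_timeout` are assumptions made for this example, not a prescribed procedure.

```python
from typing import Iterator, List, Tuple


def id_batches(min_id: int, max_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [lo, hi) id ranges that cover min_id..max_id inclusive."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    lo = min_id
    while lo <= max_id:
        hi = min(lo + batch_size, max_id + 1)
        yield lo, hi
        lo = hi


def plan_add_column(
    table: str,
    column: str,
    col_type: str,
    default_sql: str,
    min_id: int,
    max_id: int,
    batch_size: int,
) -> List[str]:
    """Return the ordered SQL statements for a phased column addition.

    Identifiers and the default expression are interpolated directly, so they
    must come from trusted migration code, never from user input.
    Assumes the table has an integer primary key named `id` (hypothetical).
    """
    steps = [
        # Fail fast instead of queueing behind long-running transactions.
        "SET lock_timeout = '5s';",
        # Nullable column with no default: no rewrite of existing rows.
        f"ALTER TABLE {table} ADD COLUMN {column} {col_type};",
        # Applies to rows inserted from now on; existing rows stay NULL.
        f"ALTER TABLE {table} ALTER COLUMN {column} SET DEFAULT {default_sql};",
    ]
    # Backfill existing rows in small ranges, one statement per batch.
    for lo, hi in id_batches(min_id, max_id, batch_size):
        steps.append(
            f"UPDATE {table} SET {column} = {default_sql} "
            f"WHERE id >= {lo} AND id < {hi} AND {column} IS NULL;"
        )
    return steps
```

Calling `plan_add_column("orders", "priority", "integer", "0", 1, 2500, 1000)` returns the three setup statements followed by one `UPDATE` per id range. Run each `UPDATE` in its own transaction and pause between batches if replication lag starts to climb.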
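Use online schema change tools when possible; for MySQL, pt-online-schema-change and gh-ost are common choices. They create a shadow table with the new schema, copy data into it in chunks while keeping it in sync with ongoing writes, and swap it in for the original with minimal downtime. Roll out during maintenance windows or behind progressive rollout gates to catch issues early.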
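To make the shadow-table idea concrete, here is a deliberately simplified sketch of the copy-and-swap pattern. It uses SQLite only so that it runs anywhere, and the table names (`orders`, `orders_shadow`, `orders_old`), the `priority` column, and the function name are hypothetical. It is not how pt-online-schema-change or gh-ost are implemented: those tools also propagate writes that arrive during the copy (through triggers and the binary log, respectively), which this example leaves out entirely.

```python
import sqlite3


def shadow_table_add_column(conn: sqlite3.Connection, chunk_size: int = 1000) -> int:
    """Toy shadow-table migration that adds `priority` to `orders`.

    Assumes no concurrent writers; returns the number of rows copied.
    """
    cur = conn.cursor()
    # 1. Create the shadow table with the new schema.
    cur.execute(
        "CREATE TABLE orders_shadow ("
        "id INTEGER PRIMARY KEY, item TEXT, priority INTEGER DEFAULT 0)"
    )
    # 2. Copy rows in primary-key order, one chunk per transaction.
    last_id = 0
    copied = 0
    while True:
        rows = cur.execute(
            "SELECT id, item FROM orders WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            break
        cur.executemany("INSERT INTO orders_shadow (id, item) VALUES (?, ?)", rows)
        conn.commit()
        last_id = rows[-1][0]
        copied += len(rows)
    # 3. Swap: keep the old table around under a new name for rollback.
    cur.execute("ALTER TABLE orders RENAME TO orders_old")
    cur.execute("ALTER TABLE orders_shadow RENAME TO orders")
    conn.commit()
    return copied
```

Keeping `orders_old` after the swap is a design choice: it gives you a cheap rollback path until you are confident the new table is healthy, at the cost of temporarily doubling storage.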