Adding a new column to a production database is simple in syntax but complex in consequence. Schema changes can lock tables, trigger replication lag, or break downstream systems. ALTER TABLE itself is fast when the change is metadata-only, but it can stall for hours when it forces a full rewrite of a table with billions of rows. Choosing the right approach depends on engine internals, data volume, and uptime requirements.
For relational databases like PostgreSQL and MySQL, adding a nullable column with a default value is a common first step. Avoid rewriting every existing row if possible; prefer metadata-only changes where supported. In PostgreSQL 11 and later, ALTER TABLE ... ADD COLUMN with a constant default is instant because the default is stored in the catalog rather than written to each row. MySQL's behavior depends on storage engine, version, and row format; InnoDB in MySQL 8.0+ supports ALGORITHM=INSTANT for many ADD COLUMN operations.
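A minimal sketch of the metadata-only pattern, using an in-memory SQLite database as a stand-in (SQLite serves a constant DEFAULT to preexisting rows without rewriting them, which mirrors the catalog-only behavior of PostgreSQL 11+ for constant defaults; table and column names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")

# Metadata-only change: a nullable column with a constant default.
# Existing rows are not rewritten; the default comes from the schema.
conn.execute("ALTER TABLE users ADD COLUMN status TEXT DEFAULT 'active'")

rows = conn.execute("SELECT name, status FROM users ORDER BY id").fetchall()
print(rows)  # [('alice', 'active'), ('bob', 'active')]
```

The key observation is that no UPDATE ran, yet every preexisting row reports the default.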
If you need to populate the new column with computed or migrated data, split the work into phases: deploy the schema change first, then run background jobs that fill the data in bounded batches. Monitor replication lag and transaction log growth to catch performance regressions early.
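The phased approach above can be sketched as follows, again using SQLite in memory as a stand-in. The batch size, pause interval, and the backfill value are illustrative assumptions; committing between batches keeps each transaction short so locks are never held across the whole table:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [(f"user{i}",) for i in range(2500)])

# Phase 1: schema change only (nullable, no backfill, no table rewrite).
conn.execute("ALTER TABLE users ADD COLUMN status TEXT")

# Phase 2: backfill in small batches, committing after each one.
def backfill(conn, batch_size=1000, pause=0.0):
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET status = 'active' "
            "WHERE id IN (SELECT id FROM users WHERE status IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
        time.sleep(pause)  # throttle between batches to protect replicas

filled = backfill(conn)
print(filled)  # 2500
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE status IS NULL").fetchone()[0]
print(remaining)  # 0
```

In production you would also track progress (e.g. the last updated id) so the job can resume after interruption, rather than rescanning for NULLs each time.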
For distributed databases and column stores, adding a new column often means updating schema metadata without rewriting the dataset. Systems like BigQuery or Snowflake can handle this instantly, but watch for downstream code paths that assume the new column is always non-null.
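One common guard until a backfill completes is to coalesce NULLs at read time, so downstream consumers never see a missing value. A small sketch, again with SQLite and illustrative names (the 'unknown' sentinel is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO events (payload) VALUES ('a')")

# New nullable column, not yet backfilled: existing rows read as NULL.
conn.execute("ALTER TABLE events ADD COLUMN region TEXT")

# Downstream reads substitute a sentinel until the backfill finishes.
row = conn.execute(
    "SELECT payload, COALESCE(region, 'unknown') FROM events").fetchone()
print(row)  # ('a', 'unknown')
```

The same COALESCE pattern works in BigQuery and Snowflake SQL, which makes it a low-risk way to decouple the schema rollout from consumer deployments.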