Deployment strategy is a function of three things: how bad is a bad deploy, how fast can you detect it, and how cheap is rollback. Pick the simplest strategy whose risk profile matches the workload.
The strategies, from riskiest to safest
Recreate (a.k.a. naive / big-bang)
Stop the old, start the new. There’s a downtime window. Don’t bother with anything more sophisticated when:
- It’s a batch job, not a service.
- It’s an internal tool with a tolerated maintenance window.
- It’s a stateful service that genuinely can’t run two versions at once (some database migrations).
Anything user-facing in 2026 should not be doing recreate.
Rolling
Update instances in batches (one at a time, or 25% at a time). At any moment some instances run the old version and some run the new.
- Cost: low — no extra infra.
- Risk: medium — bad version reaches some users before you notice.
- Constraint: services must support N and N+1 versions simultaneously. This is mostly a schema-and-API-versioning discipline, not a deployment one.
- Rollback: re-roll the old version. Slow.
The default for stateless services on Kubernetes; the failure mode is when “support both versions” gets violated by an in-flight migration.
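A minimal sketch of the batch loop, with `replace_instance` and `healthy` as hypothetical stand-ins for whatever your orchestrator actually provides:

```python
import time

BATCH_SIZE = 2          # e.g. 25% of an 8-instance fleet
HEALTH_TIMEOUT_S = 120  # how long a new batch gets to pass health checks

def rolling_update(instances, new_version, replace_instance, healthy):
    """Replace instances in batches, halting at the first unhealthy batch."""
    for i in range(0, len(instances), BATCH_SIZE):
        batch = instances[i:i + BATCH_SIZE]
        for inst in batch:
            replace_instance(inst, new_version)  # drain, stop old, start new
        deadline = time.monotonic() + HEALTH_TIMEOUT_S
        while not all(healthy(inst) for inst in batch):
            if time.monotonic() > deadline:
                # Halt here: the rest of the fleet still runs the old
                # version, which is exactly why N and N+1 must coexist.
                raise RuntimeError(f"batch {batch} failed health checks")
            time.sleep(5)
```

Note that halting leaves the fleet mixed; rolling back means running the same loop again with the old version, which is why rollback is slow.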
Blue-green
Two full environments. The new version deploys to “green” while “blue” keeps serving the old one. Cut traffic over (at the load balancer or via DNS) when ready.
- Cost: 2Ă— infra during the window.
- Risk: low — you’ve tested green with synthetic / canary traffic before flipping.
- Rollback: instant — flip the LB back.
- Constraint: stateful resources (databases, caches) must be backwards-compatible across the cut. This is where blue-green gets hard.
The right call when rollback speed matters more than infrastructure cost — payments, auth, anything where 5 minutes of bad traffic is unacceptable.
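A sketch of the cutover, assuming a hypothetical load-balancer client with a `set_backend_pool` call and a `check_green` smoke test; the point is that the cut and the rollback are the same single operation:

```python
def blue_green_cutover(lb, check_green):
    """Flip traffic from blue to green after green passes its checks."""
    if not check_green():            # synthetic / canary traffic against green
        raise RuntimeError("green failed pre-cut checks; blue untouched")
    lb.set_backend_pool("green")     # the cut: one atomic routing change

def blue_green_rollback(lb):
    lb.set_backend_pool("blue")      # instant, because blue is still warm
```

With DNS-based cutover the idea is identical, but propagation delay means neither the cut nor the rollback is truly instant.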
Canary
Gradually shift traffic to the new version: 1% → 5% → 25% → 100%, with bake time and metric checks at each step.
- Cost: low (just routing).
- Risk: very low — a bad version only hits a small fraction of traffic before the metrics catch it.
- Rollback: fast — shift traffic back to 0%.
- Constraint: needs real metrics (error rates, latency, business KPIs) hooked into the rollout decision. A canary you don’t measure is just a slow rolling deploy.
The right call for high-traffic services where 1% is statistically meaningful and 5 minutes of 100% bad traffic would be catastrophic.
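A sketch of the ramp loop, with `router.set_canary_weight` and `metrics_ok` as hypothetical hooks into your traffic layer and observability stack:

```python
import time

RAMP_PCTS = [1, 5, 25, 100]   # traffic share at each step
BAKE_S = 600                  # bake time before judging each step

def canary_rollout(router, metrics_ok):
    """Shift traffic stepwise, gating every step on real metrics."""
    for pct in RAMP_PCTS:
        router.set_canary_weight(pct)
        time.sleep(BAKE_S)               # let enough traffic accumulate
        if not metrics_ok():             # error rates, latency, business KPIs
            router.set_canary_weight(0)  # rollback: shift traffic back
            raise RuntimeError(f"canary failed at {pct}%; traffic reverted")
    # At 100%, promote the canary to stable and retire the old version.
```

If `metrics_ok` always returns True because nothing is wired up, this degrades into exactly the slow rolling deploy described above.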
Feature flags / dark launches
Decoupled from deploy. Code is deployed dark; the flag controls who sees it. Flip on for 1% of users, ramp up. The deploy is no longer the risk surface — the flag flip is.
- Cost: a flag system (LaunchDarkly, GrowthBook, homemade).
- Risk: lowest of all — you can ramp on a key user-segment basis, not just a traffic-percentage basis.
- Rollback: instant — flip the flag off.
- Cost to maintain: stale flags accumulate; they need a removal discipline.
The right call for risky features behind already-deployed code. Pair with canary deploys for full coverage.
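A sketch of the check a flag SDK typically performs; the names are illustrative, not any vendor's API. Hashing the (flag, user) pair makes the decision sticky, so a user stays in or out of the rollout as the percentage ramps instead of flickering between versions:

```python
import hashlib

def flag_enabled(flag_name, user_id, rollout_pct=0,
                 allow_segments=(), user_segment=None):
    """Deterministic per-user rollout with optional segment targeting."""
    if user_segment in allow_segments:    # e.g. employees always see it
        return True
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_pct

# Ramping means editing config, not deploying:
# flag_enabled("new-checkout", user.id, rollout_pct=1,
#              allow_segments=("employees",), user_segment=user.segment)
```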
A/B testing — adjacent but different
A/B is for measurement, not deployment. Two versions run permanently as an experiment, traffic split by user, metric difference measured. Often piggybacks on the same flag infrastructure as feature flags. The deploy strategy is independent — A/B can sit on top of any of the above.
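For illustration, variant assignment can reuse the same hashing trick as the flag sketch above; the difference is intent, since the split stays fixed for the life of the experiment so that metric differences are attributable to the variant:

```python
import hashlib

def assign_variant(experiment, user_id, variants=("control", "treatment")):
    """Deterministic, permanent assignment of a user to one variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return variants[int.from_bytes(digest[:4], "big") % len(variants)]
```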
How they compose
The mature setup is canary deploy + feature flags + observability hooked into both. The deploy is gradual and metric-gated; risky behaviors are dark-launched behind flags; flags are owned, ramped, and removed on a schedule.
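One way to enforce that removal schedule, sketched as a hypothetical in-repo flag registry checked in CI:

```python
from datetime import date

# Hypothetical registry: every flag carries an owner and a removal date.
FLAGS = {
    "new-checkout": {"owner": "payments-team", "remove_by": date(2026, 9, 1)},
}

def stale_flags(today=None):
    """Return flags that have outlived their scheduled removal date."""
    today = today or date.today()
    return [name for name, meta in FLAGS.items() if meta["remove_by"] < today]

# Run in CI: the build fails until someone deletes or re-justifies the flag.
overdue = stale_flags()
assert not overdue, f"flags past their removal date: {overdue}"
```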
What you should never do: ship a recreate deploy of a stateful service with no flags, no canary, no rollback rehearsal, and no SLO dashboard. That’s the deploy that ends the quarter.