Blue/green deployment

Zero-downtime upgrades for Olympus

Blue/green deployment runs two complete environments. Blue is live; green runs the new version. Cut traffic over atomically; if anything goes wrong, cut back.

Topology

                  ┌─ Load balancer (Caddy / Cloudflare LB) ─┐
                  │                                          │
            ┌─────▼─────┐                              ┌─────▼─────┐
            │   BLUE    │                              │   GREEN   │
            │  v1.4.2   │ ← all traffic                │  v1.5.0   │ ← idle, warm
            │           │                              │           │
            │ Hera Hydra│                              │ Hera Hydra│
            │ Kratos    │                              │ Kratos    │
            │ Athena    │                              │ Athena    │
            └─────┬─────┘                              └─────┬─────┘
                  │                                          │
                  └──────────► Postgres ◄────────────────────┘
                            (shared, primary)

Both environments share the same database. The database is upgraded in advance (backward-compatible migrations).

Process

1. Migrations first

# On primary
podman exec ciam-kratos kratos migrate sql up
podman exec ciam-hydra hydra migrate sql up postgres://...

Migrations must be backward-compatible with the currently-running version. Specifically:

  • Add columns nullable.
  • Add tables (no harm to readers).
  • Don't drop columns yet.
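The rules above can be checked mechanically before a migration runs. The following is a minimal sketch for SQL migration files you author yourself, not something Olympus ships: `check_migration` is a hypothetical helper, and it is a crude line-by-line text match rather than a SQL parser, so statements split across lines will slip through.

```shell
# Hypothetical pre-flight lint: flags statements that would break the
# version that is still running (blue). Text match only, not a SQL parser.
check_migration() (
  file=$1
  # Dropping a column or table removes something blue may still read.
  if grep -Eiq 'DROP[[:space:]]+(COLUMN|TABLE)' "$file"; then
    echo "unsafe: destructive change in $file"
    return 1
  fi
  # A new NOT NULL column without a DEFAULT breaks blue's INSERTs.
  if grep -Ei 'ADD[[:space:]]+COLUMN' "$file" | grep -i 'NOT NULL' | grep -iqv 'DEFAULT'; then
    echo "unsafe: NOT NULL column without DEFAULT in $file"
    return 1
  fi
  echo "ok: $file"
)

# Usage (path is illustrative):
#   check_migration migrations/0042_add_nickname.sql || exit 1
```

Running this in CI keeps the "drop columns later" rule from depending on reviewer attention alone.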

2. Deploy green

# Green host
git pull
# pin to new version
sed -i 's/HERA_TAG=.*/HERA_TAG=v1.5.0/' .env
podman-compose pull
podman-compose up -d
# Smoke test green directly
curl https://green.your-domain/healthz
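A single `curl` can race the containers while they are still starting. As a sketch, the smoke test can poll until green answers; `wait_healthy` is a hypothetical helper, and the attempt count and delay are illustrative defaults rather than recommended values.

```shell
# Poll a health endpoint until it answers with a non-error status
# or the attempts run out.
wait_healthy() (
  url=$1; attempts=${2:-30}; delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP >= 400; -sS: quiet but still print errors;
    # --max-time: bound each individual probe.
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
)

# Usage:
#   wait_healthy https://green.your-domain/healthz || exit 1
```

A non-zero exit lets a deploy script stop before any traffic is cut over.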

3. Cut traffic

In your load balancer:

# Caddyfile
ciam.your-domain.com {
  reverse_proxy {
    to green-host:443
    # (was blue-host:443)
  }
}

# Apply the change
caddy reload

Cloudflare LB: change the priority / weights via API.
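For Caddy, the cutover can be scripted so that the same command serves both cutover and rollback. This is a sketch that assumes the Caddyfile uses the `to <color>-host:443` form shown above; `switch_upstream` and the `/etc/caddy/Caddyfile` path are illustrative, not part of Olympus.

```shell
# Rewrite the upstream line in a Caddyfile. The edit goes to a temp file
# first, so a failed sed never leaves a half-written config behind.
switch_upstream() (
  caddyfile=$1; target=$2   # target: "blue" or "green"
  case $target in
    blue|green) ;;
    *) echo "target must be blue or green" >&2; return 2 ;;
  esac
  tmp=$(mktemp) || return 1
  # Only lines that start with "to" are touched; comments stay as they are.
  sed -E "s/^([[:space:]]*to[[:space:]]+)(blue|green)-host:443/\1${target}-host:443/" \
    "$caddyfile" > "$tmp" && cat "$tmp" > "$caddyfile"
  status=$?
  rm -f "$tmp"
  return "$status"
)

# Usage: validate before reloading, so a bad edit never reaches the proxy.
#   switch_upstream /etc/caddy/Caddyfile green &&
#     caddy validate --config /etc/caddy/Caddyfile &&
#     caddy reload --config /etc/caddy/Caddyfile
```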

4. Watch

Tail metrics, errors, logs. Look for:

  • Error rate spikes.
  • Latency regressions.
  • Auth failures specific to green.
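"Error rate spike" is easier to act on with a number attached. The sketch below computes the share of 5xx responses from a stream of status codes and compares it to a threshold. It assumes one status code per line on stdin, for example extracted with `jq` from Caddy's JSON access log; the log path and the 1% threshold are illustrative.

```shell
# Read one HTTP status code per line; print the 5xx share as a percentage.
error_rate() {
  awk '
    /^[0-9]+$/ { total++; if ($1 >= 500) errors++ }
    END { if (total == 0) print "0.00"; else printf "%.2f\n", 100 * errors / total }
  '
}

# Exit 0 when the rate read from stdin is above the given limit.
exceeds_threshold() {
  awk -v limit="$1" 'NR == 1 { over = ($1 > limit) } END { exit !over }'
}

# Usage (path and threshold are assumptions):
#   rate=$(tail -n 1000 /var/log/caddy/access.log | jq -r '.status' | error_rate)
#   echo "$rate" | exceeds_threshold 1 && echo "5xx rate ${rate}%: consider rollback"
```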

5a. Rollback

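If anything looks wrong, point the load balancer back at blue:

# Caddyfile: point the upstream back at blue
reverse_proxy {
  to blue-host:443
}

# Apply the change
caddy reload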

Done in seconds.

5b. Cleanup

If green is happy after an hour:

  • Stop blue.
  • Run any forward-only migrations (drop old columns, etc.).
  • Mark green as new blue.

Sessions across blue/green

Both environments share the database, so a Kratos session created on blue is valid on green. Users don't have to log in again.

OAuth2 tokens are signed with the same Hydra keys (stored in the shared database), so tokens issued by blue verify on green.
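Session continuity can be verified before cutover by replaying a session cookie obtained on blue against green. The sketch below makes several assumptions: that each environment exposes Kratos's public API at the given base URL, that the endpoint is Kratos's `/sessions/whoami`, and that the cookie uses Kratos's default name `ory_kratos_session`. Adjust all three if your Olympus deployment routes or names things differently; `session_status` is a hypothetical helper.

```shell
# Ask an environment whether it accepts a session cookie.
# Prints the HTTP status code; 200 means the session is valid there.
session_status() (
  base_url=$1; cookie=$2
  # -w '%{http_code}' prints only the status; the body is discarded.
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    -H "Cookie: ory_kratos_session=$cookie" \
    "$base_url/sessions/whoami"
)

# Usage (cookie value copied from a browser session on blue):
#   [ "$(session_status https://green.your-domain "$COOKIE")" = "200" ] ||
#     echo "session from blue was not accepted by green" >&2
```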

Database upgrades

If you need a non-backward-compatible schema change:

  1. Plan a maintenance window (or accept brief unavailability).
  2. Stop blue.
  3. Run migration.
  4. Start green.

OR use the expand/contract pattern:

  1. Expand: add new column (nullable). Deploy v1.5 that writes to both old and new. Blue/green deploy.
  2. Backfill data: populate new column for existing rows.
  3. Contract: deploy v1.6 that reads only new column. Drop old column.

Three steps instead of one, but zero downtime.
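The backfill step is where expand/contract most often causes trouble, because a single large UPDATE can hold locks for a long time. A minimal sketch of a batched backfill follows. The table `users` and the columns `email` and `email_normalized` are hypothetical, and `backfill` is an illustrative helper, not an Olympus command.

```shell
# Backfill a new column in small batches until no rows remain.
backfill() (
  dsn=$1; batch=${2:-1000}; total=0
  while :; do
    # psql prints a command tag such as "UPDATE 1000"; keep the row count.
    n=$(psql "$dsn" -X -v ON_ERROR_STOP=1 -c "
      UPDATE users SET email_normalized = lower(email)
      WHERE id IN (SELECT id FROM users
                   WHERE email_normalized IS NULL AND email IS NOT NULL
                   LIMIT $batch)" | awk '$1 == "UPDATE" { print $2 }')
    # No command tag means psql failed; stop instead of looping forever.
    [ -n "$n" ] || { echo "backfill failed" >&2; return 1; }
    total=$((total + n))
    [ "$n" -gt 0 ] || break
  done
  echo "backfilled $total rows"
)

# Usage (DSN is a placeholder):
#   backfill "$DATABASE_URL" 500
```

The `email IS NOT NULL` filter matters: without it, rows whose source value is NULL would be selected on every pass and the loop would never finish.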

Caveats

  • Shared DB load: both environments query the same DB. Spin up extra DB capacity before deploy.
  • Cache invalidation: if either environment caches DB results, deploys can show stale data briefly. Keep cache TTLs short.
  • WebSocket connections: persistent connections to blue don't transfer to green. This is acceptable: clients reconnect.
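For the shared DB load caveat, a rough pre-deploy check is whether Postgres can accept a second environment's worth of connections. The sketch below assumes green will open about as many connections as blue currently holds, which is an approximation; `has_headroom` and the reserve of 10 are illustrative.

```shell
# Succeeds when doubling the current connection count, plus a reserve
# for admin sessions, still fits under max_connections.
has_headroom() (
  current=$1; max=$2; reserve=${3:-10}
  [ $((current * 2 + reserve)) -le "$max" ]
)

# Usage (DSN is a placeholder; run while blue is serving normal traffic):
#   current=$(psql "$DATABASE_URL" -X -A -t -c "SELECT count(*) FROM pg_stat_activity")
#   max=$(psql "$DATABASE_URL" -X -A -t -c "SHOW max_connections")
#   has_headroom "$current" "$max" || echo "raise max_connections or add a pooler first" >&2
```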

Compare to rolling deploys

Aspect                   Blue/Green          Rolling
Resource cost            2x (briefly)        1x
Rollback                 Instant (LB cut)    Slower (rollback rolls)
Database migrations      Same                Same
Operational simplicity   Higher              Lower

For Olympus's typical single-host deployments, blue/green is easy. For multi-host, rolling is more common.
