Production notes
The Installation compose stack is fine for personal use and small teams. Production deployments (anything with a real domain, real users, or compliance obligations) need a few more things on top. This page covers them.
TLS via Caddy
Caddy auto-provisions Let’s Encrypt certificates for any real
hostname. In the compose file’s caddyfile config, swap the local
:80 block for your domain:

    brittle.example.com {
      reverse_proxy brittle:3100
    }

Then expose ports 80 and 443 on the host (instead of just 3100):

    caddy:
      ports:
        - '80:80'
        - '443:443'

Caddy fetches a cert on first start. Ports 80 and 443 need to be reachable from the public internet.
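Once the stack is up, one way to confirm that Caddy actually obtained a certificate is to look at what the server presents on port 443. The sketch below uses the standard openssl CLI. The helper name cert_info is made up for illustration, and brittle.example.com is the placeholder domain from the Caddyfile above.

```shell
# Print the issuer and validity window of the certificate a host serves.
# cert_info is a hypothetical helper name, not part of Brittle or Caddy.
cert_info() {
  host="$1"
  port="${2:-443}"
  # s_client performs the TLS handshake; x509 parses the leaf certificate.
  # If the handshake fails, x509 gets no input and the function returns non-zero.
  openssl s_client -connect "${host}:${port}" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer -dates 2>/dev/null
}

# Usage (the issuer line should name Let's Encrypt once provisioning succeeded):
#   cert_info brittle.example.com
```

A non-zero exit usually means port 443 is not reachable or no certificate has been issued yet, which is worth checking before pointing real users at the domain.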
S3 artifact storage
The local volume is fine for a single VM. For anything multi-replica, or for setups where artifact bytes need to outlive the Hub container, switch to S3:

    artifacts:
      store: s3://brittle-prod-artifacts/main
      s3:
        region: us-east-1
        accessKeyId: ${AWS_ACCESS_KEY_ID}
        secretAccessKey: ${AWS_SECRET_ACCESS_KEY}

For non-AWS providers:
- Cloudflare R2: set endpoint, forcePathStyle: true.
- MinIO: set endpoint, forcePathStyle: true.
- GCS in S3-compat mode: set endpoint, forcePathStyle: false.
Plan egress capacity on your S3 provider. The Hub doesn’t proxy artifact bytes; reporters and the dashboard hit the bucket directly.
Secrets
The Hub needs three secrets:
| Secret | Generate with | Rotation |
|---|---|---|
| JWT_SECRET | openssl rand -hex 32 | Swap value + restart Hub. Existing session cookies invalidate. |
| BRITTLE_AI_SECRET_KEY | openssl rand -hex 32 | Don’t rotate. Per-org AI keys in the DB are encrypted with this. Re-encryption tooling is not yet shipped. |
| DATABASE_URL password | Postgres ALTER USER ... | Swap value + restart Hub. Active connections drop and re-establish. |
Project tokens (the ones reporters use) rotate from the dashboard. To revoke a token, mint a new one and delete the old one.
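Because BRITTLE_AI_SECRET_KEY must not change once data is encrypted with it, a bootstrap script should generate secrets only when they are missing. The following is a minimal sketch of that idea. The ensure_secret helper and the use of a .env file next to the compose file are assumptions for illustration, not something the Hub requires.

```shell
# Append a generated secret to an env file unless the key is already set.
# ensure_secret and the .env layout are assumptions made for this sketch.
ensure_secret() {
  key="$1"
  file="${2:-.env}"
  touch "$file"
  chmod 600 "$file"   # secrets should not be world-readable
  if grep -q "^${key}=" "$file"; then
    echo "${key} already set in ${file}, leaving it untouched" >&2
    return 0
  fi
  printf '%s=%s\n' "$key" "$(openssl rand -hex 32)" >> "$file"
}

# Usage:
#   ensure_secret JWT_SECRET
#   ensure_secret BRITTLE_AI_SECRET_KEY
```

Running it twice is safe: the second call sees the existing key and leaves the value alone, so an accidental re-run cannot rotate the AI secret.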
Backups
Two things to back up:
- Postgres. Standard pg_dump works fine; restore with pg_restore. Note that pg_restore only reads archive-format dumps (for example pg_dump -Fc); a plain SQL dump is restored with psql instead.
- Artifact bytes. If you’re on the local volume, snapshot the volume. If you’re on S3, enable bucket versioning plus a lifecycle policy for cold storage. The DB references artifacts by path / key, so restoring the DB without the matching artifacts gives you a dashboard with broken video/trace links.
A reasonable starting cadence is daily Postgres dumps, weekly artifact snapshots, retain for 30 days. Adjust based on how much session history you want to keep recoverable.
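The daily-dump, 30-day-retention cadence could be scripted roughly as follows. This is a sketch under stated assumptions: BACKUP_DIR, the file naming scheme, and both function names are invented here, and DATABASE_URL is expected to hold the same connection string the Hub uses.

```shell
# Daily Postgres backup sketch: custom-format dump plus 30-day pruning.
# BACKUP_DIR, the file naming, and both function names are assumptions.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/brittle}"

dump_postgres() {
  # -Fc writes the custom archive format that pg_restore can read.
  pg_dump -Fc -f "${BACKUP_DIR}/brittle-$(date +%Y%m%d).dump" "$DATABASE_URL"
}

prune_old_dumps() {
  # Delete dump files last modified more than 30 days ago.
  find "${1:-$BACKUP_DIR}" -name 'brittle-*.dump' -type f -mtime +30 -delete
}

# Usage, e.g. from a daily cron job:
#   dump_postgres && prune_old_dumps
# Restore into an empty database:
#   pg_restore --no-owner -d "$DATABASE_URL" brittle-YYYYMMDD.dump
```

Pruning only runs after a successful dump, so a failing backup job never deletes the last good copies.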
Upgrades
The Hub image is published at:
- ghcr.io/brittlehq/brittle:0.1.2 is a pinned exact version.
- ghcr.io/brittlehq/brittle:0.1 tracks the latest patch of 0.1.x.
- ghcr.io/brittlehq/brittle:latest moves with every release.
For production, pin to a specific version (0.1.2 etc.). The
:latest tag is fine for the local stack but a bad idea for a
production deployment, because surprise upgrades happen on every
docker compose pull.
To upgrade:

    docker compose pull brittle
    docker compose up -d

Migrations are backwards-compatible within a minor version; you can roll back if needed.
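To keep a floating tag from slipping into production, a pre-deploy step can refuse anything that is not an exact x.y.z version. The check below is a sketch of that idea. The is_pinned helper is hypothetical, and the usage lines assume the image is declared in the compose file of the current directory.

```shell
# Succeed only for an exact x.y.z tag, e.g. ghcr.io/brittlehq/brittle:0.1.2.
# is_pinned is a hypothetical helper for a CI or pre-deploy check.
is_pinned() {
  printf '%s\n' "$1" | grep -Eq ':[0-9]+\.[0-9]+\.[0-9]+$'
}

# Usage: abort when the compose file resolves to a floating tag.
#   image=$(docker compose config --images | grep brittlehq/brittle)
#   is_pinned "$image" || { echo "refusing to deploy floating tag: $image" >&2; exit 1; }
```

Both :0.1 and :latest fail this check, since either can change underneath you on the next pull.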
Read the release notes before upgrading across a minor. They’re at github.com/brittlehq/brittle/releases.
Monitoring
The Hub exposes one internal endpoint useful for monitoring:
GET /health. Returns 200 if Postgres is reachable and migrations are caught up. No auth required.
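A probe for this endpoint can be as small as a single curl call. The sketch below assumes the Hub is reachable on localhost at port 3100 (the port used in the reverse_proxy example); check_health is a made-up helper name, so adjust the URL to match your deployment.

```shell
# Exit non-zero unless the health endpoint answers with a success status.
# check_health and the default URL (localhost, port 3100) are assumptions.
check_health() {
  url="${1:-http://localhost:3100/health}"
  # -f makes HTTP error statuses fail; --max-time bounds a hung Hub.
  curl -fsS --max-time 5 -o /dev/null "$url"
}

# Usage, e.g. from cron or an uptime probe:
#   check_health || echo "Brittle Hub is unhealthy" >&2
```

Because the endpoint needs no auth, the same call works from an external uptime monitor as long as the URL is reachable from there.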
For deeper visibility (queue depth, AI job status), the
/api/internal/* endpoints are bearer-token-gated. Set
internal.token in the config to enable them; leave it unset to keep
them returning 503.
Logs go to stdout in JSON (pino format). Aggregate them with whatever you already have: Loki, Datadog, CloudWatch.
Next steps
- Hub configuration lists every field and env var.
- Self-host overview covers the deployment shape.