Skip to main content

Runbooks

Service Not Ready

  1. docker compose ps
  2. docker compose logs -f <service>
  3. ./scripts/dev/wait-for-services.sh
  4. Verify dependent endpoints (etcd, minio, storage-engine).

Gateway Command Failures

  1. Check gateway logs for command + auth context.
  2. Confirm command exists in gateway dispatch (handler/mod.rs).
  3. Check coordinator/storage-engine downstream health.

Wrong Hot/Cold/Federated Result Behavior

  1. Verify collection policy command state.
  2. Re-run targeted matrix/E2E scenario.
  3. Compare with compatibility matrix expectations.

Backup/Restore Incident

  1. List backups and inspect backup status.
  2. Run restore in dry-run/staging first.
  3. Validate data correctness before production traffic cutback.

Pre-Release Checklist

make fmt
make lint
make test
make test-integration
make test-compatibility
make e2e-all-local
make mql-exhaustive
make mql-coverage-audit
make test-architecture
make check-guardrails

Source of Truth

  • services/gateway/src/proxy/handler/mod.rs
  • services/coordinator/internal/server/server.go
  • services/controller/internal/grpc/backup_service.go
  • scripts/dev/wait-for-services.sh