ADR-0017: Guests are first-class identities with daemon-managed credentials

Status: Accepted
Date: 2026-05-25
Deciders: @karasu
Supersedes: —
Superseded by: —

Context

The manifesto and docs/01-vision.md describe kaged as "for one operator (or a small trusted group) per deployment." That "small trusted group" hedge has, until now, been unspecified. It needs a concrete mechanism.

The motivating use case is narrow but real: an operator wants to expose a slice of a project's capabilities to a non-operator such as a client, a collaborator, or a contractor. The canonical example is the testimonial workflow on a static-site project: the photographer uploads a photo, supplies a name and a quote, and the project's primary agent runs the operator-authored workflow that turns the upload into an HTML edit, a build, and a deploy. The photographer is not a kaged operator. They should not be able to read the DSL, edit prompts, see other projects, or run shell commands. They should be able to do exactly the thing the operator gave them.

ADR-0007 covers operator authentication via a sidecar (or the loopback nonce in per-user mode, or --insecure). It is sufficient for operators. It is not sufficient for guests, for two reasons:

Operators cannot reasonably add every guest to their IdP. Telling a wedding photographer they need to be added to your Google Workspace before they can leave a testimonial is absurd. The sidecar pattern assumes principals already belong to an identity provider the operator administers; a casual third party does not.
Sidecar auth doesn't work in all deployment modes. ADR-0010 makes per-user and --insecure first-class. Guests should be reachable in all of them; tying guest auth to the sidecar means guests only work when the operator has stood up a sidecar.

The question is therefore not whether there's a second identity tier — there has to be — but where it lives, who manages it, and how credentials are issued.

The relevant constraint set:

No telemetry, no phone-home. The daemon is sealed. Anything that requires the daemon to send email, SMS, or webhooks re-litigates ADR-0007 Alternative E, which was rejected explicitly for that reason. That rejection still stands.
Works in all auth modes. Sidecar, loopback, and insecure must all be able to host guests. Guest auth is an additional, daemon-owned gate that does not weaken in --insecure.
Multi-project guests are the norm. A guest who works with the operator on three projects shouldn't have three logins. One credential, many project memberships. Per-project credentials are a UX disaster.
The daemon distinguishes operator from guest cheaply. Every audit log entry, every request handler, every policy check needs to know which class a principal is in. The cheapest way is a stable prefix on the user_id.
The trust tier is honestly lower. Guests run operator-authored workflows in a constrained UI. They do not get debug checkpoints, shell tasks, DSL visibility, or session prompts. The credential security tier can be commensurate — password-with-hash is fine; we are not protecting state secrets.

Decision

kaged maintains a daemon-owned guest accounts table. Guest user_ids carry the guest: prefix. Initial credential provisioning happens via operator-distributed one-time invite URLs (no SMTP, SMS, or other outbound channels in the daemon). Guest sessions use a kaged_guest_session cookie, scoped and validated independently of the operator session cookie. Guest auth is a separate gate that operates in all three operator auth modes.

Concretely:

Three SQLite tables: guests (one row per human, system-level), guest_invites (one row per outstanding invite token), and guest_sessions (guest session records, used for per-guest revocation).
Credential format: handle (operator-set, unique, 3–32 chars, [a-z0-9_-]) + password (argon2id, 64 MB memory cost, t=3, p=1; via Bun.password.hash with explicit algorithm).
Invite flow: operator creates a guest record → daemon mints a one-time setup token → daemon returns a URL ({ui_origin}/g/setup?token={token}) → operator copies and distributes the URL via whatever channel they choose → guest visits the URL, sets a password, token is consumed and deleted → guest can log in at /g/login.
Session cookie: kaged_guest_session, HttpOnly, SameSite=Lax, Secure when served over HTTPS. Distinct from kaged_session. Validated against a separate per-guest session record in guest_sessions (so revocation is per-guest, not per-startup).
User-id format: guest:<ulid>. Stable across renames of the handle. Used in audit logs, ACLs, and every API context. The prefix is normative — code paths may switch on it.
--insecure interaction: the operator's path is wide open under --insecure (per ADR-0007 amendment); the guest path is still authenticated. A daemon running --insecure happily checks guest passwords. The two gates are independent.
No SMTP, no SMS, no webhooks in the daemon. Ever. Invite delivery is the operator's job, by hand, every time. A future plugin may bridge to an outbound channel for operators who want it; the core daemon does not.
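
The invite flow above can be sketched roughly as follows. This is a minimal in-memory illustration under stated assumptions, not the daemon's implementation: the InviteStore class, the 32-byte token length, and the base64url encoding are hypothetical. Only the URL shape, the single-use consumption, and the 7-day default TTL come from this ADR.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical in-memory stand-in for the guest_invites table.
interface Invite {
  userId: string;    // guest:<ulid>
  expiresAt: number; // epoch milliseconds
}

const DEFAULT_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7-day default from this ADR

class InviteStore {
  private invites = new Map<string, Invite>();

  // Mint a one-time setup token and return the URL the operator distributes by hand.
  mint(uiOrigin: string, userId: string, now: number, ttlMs: number = DEFAULT_TTL_MS): string {
    const token = randomBytes(32).toString("base64url"); // length and encoding are assumptions
    this.invites.set(token, { userId, expiresAt: now + ttlMs });
    return `${uiOrigin}/g/setup?token=${token}`;
  }

  // Consume a token. Returns the guest's user_id at most once per token;
  // the token is deleted on first use whether or not it has expired.
  consume(token: string, now: number): string | null {
    const invite = this.invites.get(token);
    if (invite === undefined) return null;
    this.invites.delete(token);
    return now < invite.expiresAt ? invite.userId : null;
  }
}
```

The daemon never sends the URL anywhere; mint only returns it to the operator, who delivers it over a channel of their choosing.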

Audit semantics

Every guest action is audited with the guest's user_id. New audit event types:

guest.created — operator created the guest record.
guest.invited — invite token minted (carries the token's prefix for correlation, not the full token).
guest.activated — guest set their initial password; status moved from pending to active.
guest.login, guest.login_failure — successful and failed logins.
guest.locked — account locked due to repeated failures.
guest.password_changed — guest changed their own password.
guest.deactivated — operator deactivated the guest (status disabled).

As specified in daemon.md, the audit log already records user_id; the guest: prefix distinguishes guest events from operator events naturally.
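
One way to picture these events is as a closed union of names plus a small record. The event names below are the ones listed above; the AuditRecord shape, the inviteAudit helper, and the 8-character token prefix length are assumptions made for illustration and are not taken from daemon.md.

```typescript
// Event names from this ADR.
type GuestAuditEvent =
  | "guest.created"
  | "guest.invited"
  | "guest.activated"
  | "guest.login"
  | "guest.login_failure"
  | "guest.locked"
  | "guest.password_changed"
  | "guest.deactivated";

// Hypothetical record shape; the real audit schema lives elsewhere.
interface AuditRecord {
  event: GuestAuditEvent;
  userId: string;       // the guest the event concerns, guest:<ulid>
  at: string;           // ISO 8601 timestamp
  tokenPrefix?: string; // only set for guest.invited
}

const TOKEN_PREFIX_LENGTH = 8; // assumption: the ADR does not fix a length

// Build a guest.invited record that carries only a prefix of the token,
// so the log can be correlated without storing a usable credential.
function inviteAudit(guestUserId: string, token: string, at: Date): AuditRecord {
  return {
    event: "guest.invited",
    userId: guestUserId,
    at: at.toISOString(),
    tokenPrefix: token.slice(0, TOKEN_PREFIX_LENGTH),
  };
}

// The prefix check that separates guest events from operator events.
function isGuestRecord(record: { userId: string }): boolean {
  return record.userId.startsWith("guest:");
}
```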

Rate limiting and lockout

Per-account login attempts: 5 failures in 15 minutes → account locked for 30 minutes. Operator can unlock immediately via POST /api/v1/guests/:user_id/unlock.
Per-IP login attempts (defense in depth): 30 failures in 15 minutes from one IP → temporary IP-level 429 for 5 minutes. IP-level blocks are logged, but the counters are not persisted across daemon restarts.
Invite token TTL: 7 days by default, configurable per-mint, single-use. Expired tokens are reaped daily.
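
The per-account rule can be sketched as a sliding window over failure timestamps. This is an illustrative in-memory model only: the LoginLimiter class and its method names are hypothetical, and whether a successful login clears the failure count is left unspecified here because this ADR does not say. The thresholds (5 failures, 15 minutes, 30-minute lock) are the ones stated above.

```typescript
const WINDOW_MS = 15 * 60 * 1000; // failures are counted over 15 minutes
const MAX_FAILURES = 5;           // the fifth failure in the window locks the account
const LOCK_MS = 30 * 60 * 1000;   // lock duration

interface AccountState {
  failures: number[];  // epoch-millisecond timestamps of recent failures
  lockedUntil: number; // 0 when not locked
}

class LoginLimiter {
  private accounts = new Map<string, AccountState>();

  isLocked(userId: string, now: number): boolean {
    const state = this.accounts.get(userId);
    return state !== undefined && now < state.lockedUntil;
  }

  // Record a failed login. Returns true if this failure locked the account.
  recordFailure(userId: string, now: number): boolean {
    const state = this.accounts.get(userId) ?? { failures: [], lockedUntil: 0 };
    // Drop failures that have aged out of the window, then add this one.
    state.failures = state.failures.filter((t) => now - t < WINDOW_MS);
    state.failures.push(now);
    let locked = false;
    if (state.failures.length >= MAX_FAILURES) {
      state.lockedUntil = now + LOCK_MS;
      state.failures = [];
      locked = true;
    }
    this.accounts.set(userId, state);
    return locked;
  }

  // Operator-initiated unlock, as behind POST /api/v1/guests/:user_id/unlock.
  unlock(userId: string): void {
    this.accounts.delete(userId);
  }
}
```

The per-IP limit could reuse the same structure keyed by address with its own thresholds (30 failures, 5-minute block).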

Password reset

There is no self-service password reset (it would require an outbound channel). If a guest loses their password, the operator regenerates a setup token via POST /api/v1/guests/:user_id/reinvite. The guest's existing sessions are invalidated when this happens. This is the same mechanism as initial invite, used for recovery.

A logged-in guest can change their own password at /g/account via POST /api/v1/g/account/password (old + new).

Consequences

What this commits us to

A guests table, a guest_invites table, and a guest_sessions table in SQLite (per ADR-0005). Schema sketched in specs/guests.md (new spec).
An argon2id dependency. Bun ships this via Bun.password.hash(password, { algorithm: "argon2id" }); no new third-party dep required.
A /g/setup, /g/login, /g/logout route family and a /api/v1/g/* endpoint family scoped to guest auth (separate from /api/v1/auth/* for operators).
A normative guest: prefix that downstream code can switch on.
An invite-token mechanic that, like the operator launch token in loopback mode, is generated by the daemon and distributed by the operator out-of-band.
A separate cookie name and a separate session validation path.
Audit log events for the guest lifecycle.
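
As an illustration of what "a prefix downstream code can switch on" might look like, here is a small sketch of identifier and handle checks. The function names are hypothetical. Treating every user_id without the guest: prefix as an operator is an assumption, since operator user_id formats are defined elsewhere. The handle rule (3 to 32 characters from [a-z0-9_-]) is from this ADR, and the 26-character Crockford base32 pattern is the standard ULID text form.

```typescript
type PrincipalClass = "guest" | "operator";

const GUEST_PREFIX = "guest:";

// Assumption: anything without the guest: prefix is an operator identity.
function classify(userId: string): PrincipalClass {
  return userId.startsWith(GUEST_PREFIX) ? "guest" : "operator";
}

// guest:<ulid>, where a ULID is 26 Crockford base32 characters (no I, L, O, U).
const GUEST_ID_RE = /^guest:[0-9A-HJKMNP-TV-Z]{26}$/;

function isWellFormedGuestId(userId: string): boolean {
  return GUEST_ID_RE.test(userId);
}

// Handle rule from this ADR: 3 to 32 characters, lowercase letters, digits, underscore, hyphen.
const HANDLE_RE = /^[a-z0-9_-]{3,32}$/;

function isValidHandle(handle: string): boolean {
  return HANDLE_RE.test(handle);
}
```

Because the user_id is stable across handle renames, ACLs and audit entries would key on the user_id and never on the handle.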

What this forecloses

No SMTP/SMS/webhooks in the daemon. Ever. This is non-negotiable in the same way as ADR-0007's rejection of Alternative E. If outbound delivery is ever wanted, it comes via a plugin under ADR-0008, never the daemon binary.
No self-service password reset. A guest who loses their password contacts the operator. The operator regenerates an invite. This is documented as an explicit limitation.
No per-project credentials. A guest's credential is system-level. Scoping to projects happens via grants (ADR-0018), not via separate logins.
No federated guest identity in v1. Guests are local to one kaged deployment. A guest on operator A's kaged is not the same guest on operator B's kaged. Cross-deployment guest identity is out of scope.
No "guest can also be operator" in the same session. The two cookies are distinct; a browser holding both is treated as two principals by the daemon. Operators who want to test guest surfaces create a real guest account in their own deployment.
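
The "two cookies, two principals" rule can be sketched as below. The cookie names are the ones this ADR specifies; the SessionLookups interface and resolvePrincipals function are hypothetical stand-ins for the daemon's real session stores, and the cookie parser is deliberately minimal.

```typescript
// Minimal Cookie header parser: "a=1; b=2" becomes a name-to-value map.
function parseCookies(header: string): Map<string, string> {
  const out = new Map<string, string>();
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    out.set(part.slice(0, eq).trim(), part.slice(eq + 1).trim());
  }
  return out;
}

// Hypothetical lookups: each maps a session id to a user_id, or null if invalid.
interface SessionLookups {
  operator: (sessionId: string) => string | null;
  guest: (sessionId: string) => string | null;
}

interface Principals {
  operator: string | null;
  guest: string | null;
}

// Each cookie is validated only against its own store, so a browser holding
// both cookies yields two independent principals rather than one merged identity.
function resolvePrincipals(cookieHeader: string, lookups: SessionLookups): Principals {
  const cookies = parseCookies(cookieHeader);
  const operatorSession = cookies.get("kaged_session");
  const guestSession = cookies.get("kaged_guest_session");
  return {
    operator: operatorSession !== undefined ? lookups.operator(operatorSession) : null,
    guest: guestSession !== undefined ? lookups.guest(guestSession) : null,
  };
}
```

A guest session id presented in the operator cookie (or the reverse) resolves to nothing, because the two stores never consult each other.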

What becomes easier

Onboarding a client to a workflow: operator clicks Invite, copies the URL, sends it via their own channel; client clicks, sets a password, is in.
Multi-project guests: one login, then /g lists every project they have a grant in.
Audit: every guest action is unambiguously tagged with guest:<ulid>. No confusion with operator events.
Working under --insecure: the operator can keep their own access frictionless on a trusted LAN while still gating guest access with real passwords.
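
The independence of the two gates under --insecure can be illustrated with a small decision function. Everything here is a sketch under assumptions: the RequestAuth shape, the isGuestRoute and isAllowed helpers, and the choice of /g/login and /g/setup as the only unauthenticated guest pages are hypothetical (the ADR does not enumerate public API endpoints). The point it shows is that the guest branch never reads the operator auth mode.

```typescript
type OperatorAuthMode = "sidecar" | "loopback" | "insecure";

interface RequestAuth {
  path: string;
  operatorAuthenticated: boolean; // outcome of the ADR-0007 operator check
  guestUserId: string | null;     // outcome of validating kaged_guest_session
}

// Assumption: guest surfaces live under /g and /api/v1/g/.
function isGuestRoute(path: string): boolean {
  return path === "/g" || path.startsWith("/g/") || path.startsWith("/api/v1/g/");
}

// Assumption: only the login and setup pages are reachable without a guest session.
const PUBLIC_GUEST_PATHS = new Set<string>(["/g/login", "/g/setup"]);

function isAllowed(mode: OperatorAuthMode, req: RequestAuth): boolean {
  if (isGuestRoute(req.path)) {
    // Guest gate: independent of the operator auth mode, including --insecure.
    return PUBLIC_GUEST_PATHS.has(req.path) || req.guestUserId !== null;
  }
  // Operator gate: wide open under --insecure, otherwise requires operator auth.
  return mode === "insecure" || req.operatorAuthenticated;
}
```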

What becomes harder

Daemon now owns credential storage and the rotation policy. argon2id parameters need documenting and revisiting periodically. The daemon has to handle password change UX, rate limiting, lockout, and unlock — none of which it owned before.
Two cookie names increase the test surface and the auth-middleware complexity. Both paths must be exercised in CI.
The operator carries the support burden for password recovery. Documented as a deliberate trade.
There is now a small admin surface (/projects/:id/settings gains a guest-management section, plus a daemon-wide /config/guests for the global list — see specs/ui/README.md amendments).

Alternatives considered

Alternative A — Reuse the OAuth sidecar for guests

Why tempting: Single auth pathway. Zero new credential storage. Sidecar already does OIDC well.

Why rejected: Forces guests into the operator's IdP. The casual use case (a photographer leaving a testimonial) cannot tolerate that friction. The operator would have to administer accounts in their IdP for every external collaborator. Worse, it only works in sidecar auth mode — --insecure and loopback have no IdP to delegate to. The whole point of a separate guest tier is that it sits alongside operator auth, not inside it.

Alternative B — Per-project credentials

Why tempting: Smaller blast radius per credential. Easier mental model ("this login is for this project").

Why rejected: A guest who collaborates on three projects ends up with three logins. Password reuse is then almost guaranteed (humans, you know how they are), making the per-project boundary an illusion. The unified system-level guest account with per-project grants (per ADR-0018) is the better split: one credential, scoped access.

Alternative C — Magic-link login (no password)

Why tempting: No password to forget, no password reset flow. Each session starts with a fresh one-time URL.

Why rejected: Either the daemon sends the link (re-litigates ADR-0007 Alternative E — telemetry-shaped, rejected) or the operator distributes a link before every login (intolerable UX). Passwords are the lesser evil. The invite mechanism — operator-distributed one-time URL — is borrowed from magic-link semantics for the initial setup moment, where the friction is acceptable because it happens once.

Alternative D — API tokens / personal access tokens

Why tempting: Stateless. Standard pattern for machine clients.

Why rejected: Guests use browsers on phones. PATs are for headless machine access; humans need session cookies. A future API tier for programmatic guest access is plausible but is out of scope for v1.

Alternative E — Defer guests entirely to a separate "kaged-guests" project

Why tempting: Keeps the core daemon manifesto pure. Guests are a different product.

Why rejected: The operator's workflow is one workflow. Splitting the system at the deployment boundary creates two operational surfaces, two audit logs, two sets of credentials to manage. The guest tier is a feature of kaged, not a separate thing.

Open questions

Invite URL TTL default. 7 days is a guess. Operators with intermittent contact patterns may want longer; security-conscious ones may want shorter. The per-mint override exists; the default may need tuning post-deployment.
Handle uniqueness scope. Currently global within a deployment. Should it be per-project? Almost certainly not (the same cara is cara everywhere), but the call is recorded here as a deliberate one.
Display name vs handle. Handle is [a-z0-9_-]. Should there be a separate display name field for the UI (e.g., "Cara McGee")? Lean yes; mark as optional on the schema, sketched in the spec.
Multi-device sessions. Should a guest be allowed multiple concurrent active sessions (phone + laptop)? Lean yes, with a /g/account page that shows active sessions and a revoke button per session.
Global brute-force ceiling. The per-account and per-IP limits don't compose well against a distributed attack. A daemon-global "guest auth attempts per minute" circuit breaker may be worth adding. Open.

References

ADR-0005 — SQLite for the new tables
ADR-0007 — the operator auth contract these tables live alongside
ADR-0008 — the route for outbound delivery if anyone ever wants it
ADR-0010 — the deployment modes guest auth must operate in
ADR-0018 — how guests are scoped to projects
docs/01-vision.md — the "small trusted group" phrasing this ADR operationalises
docs/specs/guests.md — the implementation spec (to be written)
argon2: https://github.com/P-H-C/phc-winner-argon2
Bun password hashing: https://bun.sh/docs/api/hashing#bun-password
Original discussion: design conversation with colleagues, 2026-05-25