ADR-0022: Agents are recursive; tools and cage are per-agent
- Status: Accepted
- Date: 2026-05-26
- Deciders: @karasu
- Supersedes: —
- Superseded by: —
Context
The DSL today has five structural problems that have been worked around but never fixed, and the workarounds are starting to compound.
Problem 1: subagents are half-agents. A Subagent in docs/specs/project-dsl.md has model, system_prompt, cage, can_be_called_by, and parameters. It has no tools field. A subagent's tool surface is whatever the project's top-level tools: block enables, narrowed by the subagent's cage. Two consequences fall out of this: a subagent cannot be a specialist with its own curated toolset, and a subagent cannot itself have subagents. The "subagent" type is poorer than the "primary" type for no defensible reason.
Problem 2: cage and tools are project-level constructs masquerading as per-agent ones. Cage lives in two places (subagents.<name>.cage and cage_defaults at the top level) but never on the primary. Tools live exclusively at the project level. The primary "runs uncaged" by convention encoded only in docs/specs/agent.md prose, not in the schema. There is no path in the current DSL to express "this primary is a coordinator that has only issue and workflow tools" — the primary always inherits whatever the project enables.
Problem 3: project-reference subagents (added 2026-05-26) inline a nested project's primary as a callable subagent of the parent. The compiled tree carries a wrapped _compiled field alongside the original path / name / description / overrides reference fields. This is a different shape than a regular subagent — downstream consumers (the harness, the synthesized endpoint, the UI DSL viewer) have to branch on which shape they're holding. The recent federated-config.md amendment un-deferred this work but kept the wrapper shape as a temporary measure.
Problem 4: interconnect overlaps can_be_called_by. interconnect declares event-routed dispatch (scraper -> deployer: on_event(found_release)); can_be_called_by declares the capability to call. In practice the LLM running in the primary already decides what to dispatch to whom based on its reasoning. Declarative event routes add a parallel mechanism that no one is using and that complicates the call graph.
Problem 5: there is no "project-management" tool namespace. Issues (ADR-0020) and workflows (ADR-0019) are first-class concepts, but the agent-facing tool surface for them does not exist as a namespace. Currently issues are operator-UI-only and workflows are invoked via dedicated endpoints; an agent that wants to file or update an issue, or trigger a workflow as part of its reasoning, has no tool to call.
These problems point at one underlying simplification: the primary and a subagent should be the same kind of thing, and that kind of thing owns its own tools and cage. Project references, when flattened, are also the same kind of thing. The DSL's content reorganizes around a single recursive shape.
This ADR makes that reorganization the new ground floor.
Decision
The DSL collapses PrimaryAgent and Subagent into a single recursive AgentSpec. AgentSpec carries its own cage and its own tools. A project's primary field is an AgentSpec; a subagent of any agent is an AgentSpec; an agent's subagents field is Record<string, AgentSpec>, recursively. Project-reference subagents flatten into AgentSpec subtrees at compile time. can_be_called_by, interconnect, and cage_defaults are removed: the tree structure is the call graph, and there are no agent-level defaults below the project root.
The specific changes:
1. One agent shape. AgentSpec = { model, system_prompt, description?, parameters?, cage, tools?, subagents? }. Used identically for ProjectDsl.primary and for every entry under any subagents map at any depth.
2. Recursive nesting. AgentSpec.subagents is Record<string, AgentSpec> and may itself contain agents with their own subagents. Depth is bounded (16 levels, same as the existing project-reference depth limit). The tree is the call graph: a parent agent can call its direct children; sibling and cross-tree calls do not exist.
3. cage moves to the agent. AgentSpec.cage is required on every agent including the primary. cage_defaults is removed from the project schema. Each agent declares its own cage; there is no inheritance between parent and child cages because cages are per-process and a child runs in its own sandbox context.
4. tools moves to the agent. AgentSpec.tools is an optional per-agent override map with the existing ToolOverride shape. The project-level tools: block is removed. Tool resolution per agent: built-in defaults → role-based defaults (see below) → agent's own tools: block → cage filter at dispatch.
5. Role-based defaults are positional. The agent reachable via ProjectDsl.primary (the "root agent") gets kaged.issue.* and kaged.workflow.* enabled by default. Every other agent in the tree starts with an empty tool set; the operator opts in per agent. There is no schema-level "primary vs subagent" distinction — the default is a property of position.
6. can_be_called_by is removed. Sibling-to-sibling and grandparent-to-grandchild dispatch are no longer expressible. If two agents need to be called by a third, they become children of that third agent.
7. interconnect is removed. Reasoning-driven dispatch via the synthetic agent-{key} tools is the only dispatch mechanism. The LLM in the parent decides when to call which child.
8. Project references flatten into AgentSpec subtrees. A subagents.<name>: { path: project:/..., name?, description?, overrides? } resolves at compile time to the nested project's primary becoming this entry, with the nested project's subagents becoming this entry's subagents, recursively. After flattening there is no residual path: or _compiled wrapper — only AgentSpec.
9. New tool namespaces. kaged.issue.* (create, update, comment, transition, list, get) and kaged.workflow.* (trigger, list, status). These tools are registered in the built-in tool registry per docs/specs/agent-tooling.md. They have a principal-scope tag: the schema rejects them on non-root agents.
10. Issue bubble-up is the only subagent-issue pattern. Subagents do not have kaged.issue.* access. Issue context that a subagent needs to act on is part of the delegation message the parent sends. Subagent return values bubble back to the parent, which decides whether to update the issue. This keeps subagents domain-blind and the audit trail single-rooted.
11. The synthesized endpoint is total truth. GET /api/v1/projects/:id/dsl/synthesized returns the fully flattened, resolved DSL: project references expanded to AgentSpec trees, all tools: overrides applied, all defaults materialized, no path: or wrapper fields remaining. This is the shape the harness consumes.
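The unified shape described in the changes above can be sketched as a TypeScript type. This is a minimal illustration, not the schema in @kaged/dsl: the Cage and ToolOverride definitions are simplified assumptions, and treeDepth and assertDepth are hypothetical helper names. Counting the root agent as level 1 is also an assumption; the ADR states only the 16-level bound.

```typescript
// Simplified stand-ins for the real cage and tool-override shapes (assumptions).
type Cage =
  | "disabled"
  | { fs: { mode: string; path: string }[]; net: { allow: string[] }; state: string };
type ToolOverride = { enabled: boolean } | null;

// One recursive shape for the primary and for every subagent at any depth.
interface AgentSpec {
  model: string;
  system_prompt: string;
  description?: string;
  parameters?: Record<string, unknown>;
  cage: Cage;
  tools?: Record<string, ToolOverride>;
  subagents?: Record<string, AgentSpec>;
}

const MAX_DEPTH = 16; // depth bound stated in this ADR

// Depth of the deepest agent, counting the given agent as level 1 (assumption).
function treeDepth(agent: AgentSpec): number {
  const children = Object.values(agent.subagents ?? {});
  return 1 + Math.max(0, ...children.map((child) => treeDepth(child)));
}

// Throws when the tree is deeper than the bound.
function assertDepth(root: AgentSpec): void {
  const depth = treeDepth(root);
  if (depth > MAX_DEPTH) {
    throw new Error(`agent tree depth ${depth} exceeds ${MAX_DEPTH}`);
  }
}
```

Because the type refers to itself only through subagents, any consumer can walk the tree with a single recursive function, as treeDepth does here.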
The unified shape
Before (current v1, with the new agent-level tools that this ADR adds — illustrating only the structural change):
version: 1
project: example
primary:
  model: smart-generalist
  system_prompt: project:/prompts/primary.md
cage_defaults:
  fs: []
  net: { allow: [] }
  state: ephemeral
subagents:
  scraper:
    model: low-cost-fast
    system_prompt: project:/prompts/scraper.md
    cage:
      fs: [{ mode: ro, path: data }]
      net: { allow: ["*.example.com"] }
      state: ephemeral
    can_be_called_by: [primary]
  deployer:
    model: smart-careful
    system_prompt: project:/prompts/deployer.md
    cage: disabled
    can_be_called_by: [primary, scraper]
  child_project:
    path: project:/sub/builder
    overrides:
      primary:
        model: smart-careful
interconnect:
  release_pipeline:
    from: scraper
    to: deployer
    on: found_release
tools:
  "dap.*": null
After:
version: 1
project: example
primary:
  model: smart-generalist
  system_prompt: project:/prompts/primary.md
  cage: disabled # interim; see § Interim state
  # tools: implicit — root agent gets kaged.issue.* and kaged.workflow.*
  subagents:
    scraper:
      model: low-cost-fast
      system_prompt: project:/prompts/scraper.md
      cage:
        fs: [{ mode: ro, path: data }]
        net: { allow: ["*.example.com"] }
        state: ephemeral
      tools:
        "file.read": { enabled: true }
        "search.grep": { enabled: true }
    deployer:
      model: smart-careful
      system_prompt: project:/prompts/deployer.md
      cage: disabled
      tools:
        "file.*": { enabled: true }
        "kaged.workflow.trigger": { enabled: true } # PARSE ERROR: only root may have kaged.*
    child_project:
      path: project:/sub/builder # flattened at compile time
      overrides:
        model: smart-careful
After flattening, the synthesized endpoint returns child_project as a plain AgentSpec: it carries whatever the nested project's primary declared (with overrides applied), its own subagents map is inlined, and no path: field remains. The shape is uniform top-to-bottom.
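The flattening step can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the compiler in packages/dsl: ProjectRef, RawAgent, LoadPrimary, and flatten are hypothetical names, the loader is injected rather than reading the filesystem, overrides are applied as a shallow merge, and the name and description reference fields, cycle detection, and the ADR-0015 merge semantics are left out.

```typescript
// Hypothetical input shapes: an entry is either an inline agent or a project reference.
interface ProjectRef {
  path: string;
  overrides?: { model?: string };
}
interface RawAgent {
  model: string;
  system_prompt: string;
  cage: unknown;
  subagents?: Record<string, RawAgent | ProjectRef>;
}
// Output shape: no path or wrapper fields anywhere in the tree.
interface AgentSpec {
  model: string;
  system_prompt: string;
  cage: unknown;
  subagents?: Record<string, AgentSpec>;
}

// Assumed loader: returns the primary of the project at the referenced path.
type LoadPrimary = (path: string) => RawAgent;

function flatten(entry: RawAgent | ProjectRef, load: LoadPrimary, depth = 1): AgentSpec {
  if (depth > 16) throw new Error("agent tree exceeds the 16-level depth limit");
  // A reference resolves to the nested primary with overrides applied on top.
  const raw: RawAgent =
    "path" in entry ? { ...load(entry.path), ...entry.overrides } : entry;
  const out: AgentSpec = {
    model: raw.model,
    system_prompt: raw.system_prompt,
    cage: raw.cage,
  };
  if (raw.subagents) {
    const children: Record<string, AgentSpec> = {};
    for (const [key, child] of Object.entries(raw.subagents)) {
      children[key] = flatten(child, load, depth + 1);
    }
    out.subagents = children;
  }
  return out;
}
```

Building the output field by field, instead of spreading the input, is what guarantees that reference-only fields such as path never leak into the flattened tree.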
Consequences
What this commits us to
- A single AgentSpec type in @kaged/dsl, used for every position in the agent tree. The schema mirror (Zod + JSON Schema) gets a recursive type alias.
- A documented "root agent" concept in docs/specs/project-dsl.md — the agent at ProjectDsl.primary — with explicit default-tool rules attached to that position.
- Two new tool namespaces in docs/specs/agent-tooling.md: kaged.issue.* and kaged.workflow.*. Each tool definition declares a principal-scope (root-only initially); the registry rejects registration on non-root agents.
- A synthesized endpoint that returns a fully inlined tree. Implementation work in packages/dsl (compiler) and packages/daemon (handler) per docs/specs/http-api.md.
- A clear, single rule for the call graph: parent → direct children, period. Documented in docs/specs/agent.md and reflected in the Mastra supervisor wiring.
- A documented issue bubble-up pattern in docs/specs/issues.md — issues live with the root agent; subagents receive issue context as delegation framing; subagent returns are interpreted by the parent.
- A regenerated set of example DSL files in docs/dsl/examples/ demonstrating the new shape. Existing examples are rewritten in place; the file list does not change.
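The root-only rule for the two new namespaces could be checked with a tree walk like the one below. It is a sketch under assumptions: rootOnlyViolations and ROOT_ONLY_PREFIXES are hypothetical names, the error text is invented for illustration, and the check matches literal tool keys only, so a broader wildcard key on a non-root agent would need pattern expansion that this sketch does not attempt.

```typescript
interface AgentSpec {
  tools?: Record<string, unknown>;
  subagents?: Record<string, AgentSpec>;
}

// Namespaces this ADR tags as root-only.
const ROOT_ONLY_PREFIXES = ["kaged.issue.", "kaged.workflow."];

// Collects one error per root-only tool key found on a non-root agent.
function rootOnlyViolations(agent: AgentSpec, path = "primary", isRoot = true): string[] {
  const errors: string[] = [];
  if (!isRoot) {
    for (const tool of Object.keys(agent.tools ?? {})) {
      if (ROOT_ONLY_PREFIXES.some((prefix) => tool.startsWith(prefix))) {
        errors.push(`${path}: ${tool} is root-only`);
      }
    }
  }
  for (const [key, child] of Object.entries(agent.subagents ?? {})) {
    errors.push(...rootOnlyViolations(child, `${path}.subagents.${key}`, false));
  }
  return errors;
}
```

Run against the "After" example, this walk would report the kaged.workflow.trigger entry under deployer and nothing else.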
What this forecloses
- No sibling-to-sibling dispatch. Two top-level subagents cannot call each other directly. If the operator wants that pattern, they nest one under the other or move both under a shared parent.
- No grandparent-to-grandchild dispatch. A root agent cannot reach across a child to invoke a grandchild. Each level of the tree mediates the level below it.
- No declarative event-routed dispatch. interconnect is gone. If reasoning-driven dispatch is insufficient for a workflow, the workflow's prompt orchestrates the steps; the DSL no longer carries event hooks.
- No project-level cage defaults. Each agent declares its own cage. The repetition this introduces in projects with many similarly-caged subagents is real and accepted as the cost of explicitness; a project-level helper macro is plausible v1.x.
- No project-level tool overrides. Each agent declares its own tool surface. A project-wide "disable DAP" requires touching each agent (or, more typically, having only one or two agents to which DAP would apply).
- kaged.issue.* and kaged.workflow.* on non-root agents. Schema-level rejection. A subagent cannot file or transition issues. This is enforced; it is not "lean strongly against."
- The _compiled wrapper shape from the 2026-05-26 project-reference amendment. It existed only as long as project references had a different shape from regular subagents. With unification, the wrapper has no remaining purpose.
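The dispatch rules foreclosed above reduce to one predicate: a call is legal only when the callee is a direct child of the caller. A minimal sketch, assuming agents are addressed by the list of subagent keys from the root (an addressing scheme chosen here for illustration; agentAt and canCall are hypothetical names):

```typescript
interface AgentSpec {
  subagents?: Record<string, AgentSpec>;
}

// Finds the agent at a path of subagent keys; the empty path is the root agent.
function agentAt(root: AgentSpec, path: string[]): AgentSpec | undefined {
  let node: AgentSpec | undefined = root;
  for (const key of path) {
    node = node?.subagents?.[key];
  }
  return node;
}

// True only for parent -> direct child; siblings and grandchildren are rejected.
function canCall(root: AgentSpec, caller: string[], callee: string[]): boolean {
  const isDirectChild =
    callee.length === caller.length + 1 &&
    caller.every((key, i) => callee[i] === key);
  return isDirectChild && agentAt(root, callee) !== undefined;
}
```

No side table of permissions is consulted: the tree alone answers the question, which is the point of removing can_be_called_by.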
What becomes easier
- Reading a DSL file. One shape, recursive, no special cases for "primary" vs "subagent" vs "project-reference." The reader's mental model is a single tree.
- Writing a flavor-B project (primary coordinates; subagents work). The operator declares primary with no tools: override (gets PM tools by default), then declares primary.subagents.<worker> with the work tools each needs. There is no schema flag or mode toggle to remember.
- Writing a flavor-A project (primary does work directly). The operator overrides primary.tools to add file.*, lsp.*, etc. Same shape, different content.
- Reasoning about cage scope. Each agent is a node with its own cage. There is no inheritance to chase, no default to remember.
- Building the synthesized endpoint, the DSL viewer, the audit log, and any other consumer of compiled DSL. They all walk one shape.
- Adding new agent positions in the future. A "scheduled agent" or a "system agent" is structurally just another AgentSpec.
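The flavor-A and flavor-B cases differ only in what the agent's tools map contains, which a small resolution sketch makes concrete. Several things here are assumptions rather than the spec: resolveTools and ROOT_DEFAULTS are hypothetical names, the built-in default layer is treated as empty, a null override is read as "remove this entry", and the final cage filter at dispatch is not modeled.

```typescript
type ToolOverride = { enabled: boolean } | null;
type ToolMap = Record<string, ToolOverride>;

// Positional defaults: only the root agent starts with the PM namespaces.
const ROOT_DEFAULTS: ToolMap = {
  "kaged.issue.*": { enabled: true },
  "kaged.workflow.*": { enabled: true },
};

// Layers the agent's own overrides over its positional defaults and
// returns the enabled tool patterns in sorted order.
function resolveTools(isRoot: boolean, overrides: ToolMap = {}): string[] {
  const merged: ToolMap = { ...(isRoot ? ROOT_DEFAULTS : {}), ...overrides };
  return Object.entries(merged)
    .filter(([, value]) => value !== null && value.enabled)
    .map(([pattern]) => pattern)
    .sort();
}

// Flavor B: coordinator primary, no overrides.
const coordinator = resolveTools(true);
// Flavor A: primary that also does work directly.
const worker = resolveTools(true, {
  "file.*": { enabled: true },
  "lsp.*": { enabled: true },
});
```

Both calls go through the same function with the same shape of input, so no mode flag is needed to tell the two flavors apart.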
What becomes harder
- Repeating boilerplate. A project with eight subagents that all need fs: [{ mode: ro, path: data }] writes that line eight times. Documented tradeoff; addressed later with macros or templates if it becomes painful.
- Migrating any in-repo DSL that uses sibling dispatch via can_be_called_by. The dogfood .kaged/project.yaml and every example in docs/dsl/examples/ must be rewritten. Pre-alpha; no external operators to migrate.
- Designing prompts for nested-subagent chains. A grandchild's prompt must assume the parent's framing without seeing the grandparent's framing — same constraint as today, but now possible at arbitrary depth. Prompt-authoring guidance updated in docs/dsl/.
- Reasoning about token cost in deep trees. Each delegation forwards filtered messages; deep trees forward more times. Existing audit infrastructure already records per-agent tokens; no new mechanism needed, but operators should be aware.
Spec amendments required
The decision above is the contract. The following spec amendments implement it. Each lands in its own PR per ADR-0003; each cites this ADR in its ## Amendments section.
| # | File | Change |
|---|---|---|
| 1 | docs/specs/project-dsl.md | Define AgentSpec as the single recursive shape. Rewrite the primary and subagents sections to reference it. Add an ## AgentSpec section. Remove cage_defaults section. Remove interconnect section. Remove top-level tools: section (move to AgentSpec.tools). Rewrite the project-reference flattening section to describe AgentSpec subtree output. Update JSON Schema in Appendix A. Update top-level shape example. |
| 2 | docs/specs/agent-tooling.md | Add kaged.issue.* and kaged.workflow.* namespaces with full tool definitions. Add a principal_scope: "root-only" \| "any" field to ToolDefinition. Document tool resolution per agent (built-in → role-default → agent override → cage filter). Update namespace table. |
| 3 | docs/specs/agent.md | Update the Mastra supervisor section: the harness walks the recursive AgentSpec tree, registering each child as a synthetic agent-{key} tool on its parent. No can_be_called_by checks. No event routing. Per-agent tool resolution. |
| 4 | docs/specs/federated-config.md | Update the project-reference flattening section: output is an AgentSpec subtree, not a wrapped _compiled shape. Cycle detection unchanged. Depth limit unchanged. |
| 5 | docs/specs/http-api.md | Update the GET /api/v1/projects/:id/dsl/synthesized section: response is the fully flattened DSL with no path: entries and all overrides applied. Document the shape contract. |
| 6 | docs/specs/issues.md | Add an ## Issue bubble-up section: subagents do not access issues directly; delegation framing carries issue context; subagent returns feed primary's decision to update. |
| 7 | docs/specs/workflows.md | Update the tool intersection logic: workflows compose against the root agent's tool surface, not the (removed) project-level tools: block. |
| 8 | docs/dsl/examples/ | Rewrite every example file: single-subagent.yaml, defaults.yaml, insecure.yaml, portable.yaml, and any others present. Remove the defaults.yaml if cage_defaults was its only reason to exist; replace with a nested.yaml showing recursive subagents. |
| 9 | docs/dsl/README.md | Update operator-facing prose to describe the recursive shape. Update the example index. |
| 10 | .kaged/project.yaml (dogfood) | Update to the new shape. |
| 11 | JSON Schema at kaged.dev/schema/v1.json | Republish with the recursive AgentSpec shape. See § Schema version below. |
After spec PRs land:
- Tests. Each amended spec triggers failing tests per ADR-0003: parser tests (@kaged/dsl), tool registry tests (@kaged/agent-tooling), harness tests (@kaged/harness), daemon endpoint tests (@kaged/daemon), example-validation tests (CI walk of docs/dsl/examples/).
- Code. Implementation follows tests. Approximate package surface: @kaged/dsl (schema + compiler), @kaged/agent-tooling (new tool registrations + principal-scope check), @kaged/harness (recursive Mastra wiring), @kaged/daemon (synthesized endpoint shape, issue-tool dispatch).
- STATUS.md sync. Per the AGENTS.md hard sync rule, code changes land with matching STATUS.md entries in the same diff.
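For the harness piece of that package surface, the recursive wiring amounts to giving every agent one synthetic agent-{key} tool per direct child. The sketch below computes only the tool names and deliberately uses no Mastra API; syntheticTools is a hypothetical helper, and the slash-separated agent paths used as map keys are an assumption for illustration.

```typescript
interface AgentSpec {
  subagents?: Record<string, AgentSpec>;
}

// Maps each agent path to the synthetic tool names exposed for its direct children.
function syntheticTools(
  agent: AgentSpec,
  path = "primary",
  out: Record<string, string[]> = {},
): Record<string, string[]> {
  const children = agent.subagents ?? {};
  out[path] = Object.keys(children).map((key) => `agent-${key}`);
  for (const [key, child] of Object.entries(children)) {
    syntheticTools(child, `${path}/${key}`, out);
  }
  return out;
}
```

A leaf agent maps to an empty list, which matches the rule that an agent with no subagents has nothing to dispatch to.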
Existing ADRs amended
This ADR does not supersede any existing ADR. It amends the following:
- ADR-0006 — DSL format unchanged (still YAML 1.2, still validated, still strict-mode). DSL content gets the recursive AgentSpec shape. An amendment block is added to ADR-0006 noting the content reorganization.
- ADR-0009 — Sandbox mechanism unchanged. Cage location in the DSL moves from subagents.<name>.cage + cage_defaults to per-agent AgentSpec.cage. The primary now has a cage field; supervisor work to actually cage the primary process is scheduled here as a follow-up (see § Interim state). An amendment block is added to ADR-0009.
- ADR-0015 — Merge semantics unchanged. Project-reference flattening output changes from wrapped _compiled to direct AgentSpec subtree. Section 7 ("Compiled Contextualization") is amended in the federated-config spec.
- ADR-0019 — Workflow model unchanged. Tool intersection logic now operates against the root agent's tool surface instead of the removed project-level tools: block. Amendment block added to ADR-0019 and to workflows.md.
Interim state
Two pieces of this ADR cannot ship runtime support in the same chain as the spec amendments:
Primary cage runtime. The schema requires cage on every AgentSpec including the primary. The supervisor today does not cage the primary process — primary runs as the daemon's UID. Until the supervisor is extended to spawn the primary in its own bwrap context, the only legal value for the root agent's cage is disabled. The DSL parser emits a parse-time error for any other value. The full cage block is accepted on every non-root agent. A follow-up ADR (or amendment to ADR-0009) schedules the supervisor work; the eventual default for the root agent is "locked to project root, no network, ephemeral state."
Schema version. This is a breaking change to the DSL's content shape (cage_defaults removed, can_be_called_by removed, interconnect removed, project-level tools removed, recursive subagents added). The repo is pre-alpha and the established pattern from the named-object-map rewrite (2026-05-25) is "no migration support; new format enforced directly" at version: 1. We continue that pattern here: schema stays at version: 1, no migration tooling, dogfood and examples are rewritten in the same PR chain.
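The interim rule for the root agent's cage is small enough to state as a check. This is an illustrative sketch, not the parser in @kaged/dsl: interimCageErrors is a hypothetical name, the Cage type is simplified, and the error wording is invented.

```typescript
// Simplified cage shape (assumption); the real schema lives in the DSL spec.
type Cage = "disabled" | { fs: unknown[]; net: { allow: string[] }; state: string };

interface AgentSpec {
  cage: Cage;
  subagents?: Record<string, AgentSpec>;
}

// Returns parse errors for the interim rule; an empty array means the tree passes.
// Only the root is restricted: non-root agents may carry a full cage block.
function interimCageErrors(primary: AgentSpec): string[] {
  return primary.cage === "disabled"
    ? []
    : ['primary.cage: only "disabled" is accepted until the supervisor can cage the primary'];
}
```

Keeping the rule in one function makes it easy to delete once the supervisor work lands and the root agent accepts a full cage block.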
Alternatives considered
Alternative A — Keep primary and Subagent as separate types; add tools and recursive subagents to both
Why tempting: smaller schema change. Preserves the conceptual distinction between "the entry point" and "a worker."
Why rejected: the distinction is only meaningful by position. Two types that are structurally identical except for which fields are technically optional are a smell, not a design. Maintaining the dual-type system means every consumer (parser, harness, viewer, audit) branches on type when it could walk one shape. The position-based default-tool rule is cleaner than a type-based one.
Alternative B — Explicit kind: discriminator on each agent (simple vs coordinator)
Why tempting: a glance at the file tells you the operator's intent. Schema can validate "coordinator may not have work tools."
Why rejected: invents categories the operator didn't ask for and doesn't need. The shape of tools: already says everything: an agent with file tools is doing work; an agent with only kaged.* tools is coordinating. A kind: field is metadata duplicating the data. Per the discussion preceding this ADR: "we can invent C, D, E ideas without having to label them."
Alternative C — Keep can_be_called_by; allow sibling dispatch
Why tempting: existing examples use it. Real workloads sometimes look like "scraper finds X; deployer ships X" without needing a coordinator agent in between.
Why rejected: the tree's hierarchy is the call graph in every other recursive-agent system, and adding sibling capability requires every consumer to track a side-channel call graph in addition to the tree. The "scraper → deployer" pattern is well-modeled by a coordinator parent with both as children; the LLM in the coordinator does the routing the can_be_called_by edge used to do. The simplification is worth the small refactor cost in existing examples.
Alternative D — Keep interconnect for event-routed dispatch
Why tempting: deterministic event routes are easier to reason about than LLM-driven dispatch.
Why rejected: kaged is built on LLM reasoning. If a workload genuinely needs deterministic step-graph execution, that is a workflow (ADR-0019) — a separate, intentional construct. The DSL's agent graph is for reasoning-driven dispatch. Two parallel dispatch mechanisms split the audit story and the operator's mental model. Pick one; we pick reasoning.
Alternative E — Per-agent tools but keep project-level tools as defaults
Why tempting: avoids per-agent boilerplate when many agents share a tool surface.
Why rejected: the role-based defaults (root gets PM tools; subagents start empty) already handle the common cases. The cases that remain are genuinely heterogeneous — a scraper, a deployer, and a builder rarely want the same tool surface. The "project-level defaults that some agents inherit and others don't" model would put us back to needing inheritance rules, which is the complexity we're escaping.
Alternative F — Defer the recursive change; only do cage/tools per-agent in this ADR
Why tempting: smaller blast radius. Easier review.
Why rejected: the recursive change is what makes the per-agent shape coherent. Project references already need to flatten into something; the choice is "flatten into a wrapper" or "flatten into AgentSpec." If we ship per-agent tools/cage without the recursive shape, project references stay weird and the next ADR has to do the unification anyway. One coherent change beats two half-changes.
Open questions
These are not blockers for accepting the ADR, but each needs a decision during the spec-amendment phase:
1. Guest-facing agent topology. The current workflow spec assumes the workflow itself is the agent the guest interacts with. The conversation preceding this ADR raised whether there should be a dedicated guest-facing agent (a "concierge") in the project DSL that triages guest requests, picks workflows, and gathers inputs in natural language. This is genuinely a separate decision and belongs in a follow-up ADR (probably 0023). It does not change the shape decided here — a concierge would simply be another agent position the operator can declare. Left open.
2. kaged.subagent.invoke as an explicit tool. Mastra exposes child agents as synthetic agent-{key} tools today. Should the synthetic tools also be visible in the tool registry under a kaged.subagent.* namespace for audit-log uniformity and configurability? Lean yes — it makes the operator-facing tool list complete — but defer to the agent-tooling spec amendment.
3. Per-agent prompt-file naming convention. With recursive nesting, prompt files for deeply-nested agents proliferate. A naming convention (prompts/<agent-path>.md?) would help. Operator preference, not a schema concern. Documented in docs/dsl/ during amendment.
4. Lint-time warning for child cage broader than parent cage. Recommended; not blocking. The parser can emit a warning when an agent's cage grants more than its parent's cage grants (e.g., parent is fs: [{mode: ro, path: src}] and child is cage: disabled). Defer the precise warning text and conditions to the project-dsl.md amendment.
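As a starting point for the lint question, the one case the ADR names explicitly (a caged parent with an uncaged child) can be detected with a short walk. Everything beyond that case is left open here, as it is in the ADR: cageWarnings is a hypothetical name, the warning text is a placeholder, and comparing two non-disabled cages for breadth is not attempted.

```typescript
type Cage = "disabled" | { fs: { mode: string; path: string }[] };

interface AgentSpec {
  cage: Cage;
  subagents?: Record<string, AgentSpec>;
}

// Warns when a child is uncaged although its direct parent is caged.
function cageWarnings(agent: AgentSpec, path = "primary"): string[] {
  const warnings: string[] = [];
  for (const [key, child] of Object.entries(agent.subagents ?? {})) {
    const childPath = `${path}.subagents.${key}`;
    if (agent.cage !== "disabled" && child.cage === "disabled") {
      warnings.push(`${childPath}: cage is disabled but parent ${path} is caged`);
    }
    warnings.push(...cageWarnings(child, childPath));
  }
  return warnings;
}
```

Since cages are not inherited, this stays a warning and never an error: the configuration is legal, only surprising.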
References
- ADR-0003 — doc-first then TDD; the process this ADR feeds into
- ADR-0006 — DSL format; amended by this ADR (content, not format)
- ADR-0009 — sandbox mechanism; amended by this ADR (cage location + primary cage scheduled)
- ADR-0011 — portability; preserved (paths still project-relative; aliases unchanged)
- ADR-0012 — Mastra supervisor pattern; the harness wiring this ADR adjusts
- ADR-0015 — federated config; amended by this ADR (project-reference flattening shape)
- ADR-0019 — workflows; amended by this ADR (tool intersection target)
- ADR-0020 — issues; this ADR adds the kaged.issue.* tool namespace and the bubble-up pattern
- docs/specs/project-dsl.md — the spec carrying most of the change
- docs/specs/agent-tooling.md — the spec gaining new namespaces
- docs/specs/federated-config.md — the spec amended for flattening
- Original discussion: design conversation with colleagues, 2026-05-26