kaged — @kaged/harness

✓ snapshotFromMessages > creates snapshot from messages and system messages [0.120ms]

✓ suspend > returns suspended result with payload [0.150ms]

✓ suspend > returns serialized state as JSON string [0.080ms]

✓ suspend > preserves model-requested reason [0.030ms]

✓ resume > returns resumed result without prompt edits [0.100ms]

✓ resume > detects prompt edits and reports changed agents [0.070ms]

✓ resume > invalid serialized state returns deserialize error [0.090ms]

✓ serializeSnapshot > produces valid JSON string [0.050ms]

✓ serializeSnapshot > round-trips through deserializeSnapshot [0.100ms]

✓ deserializeSnapshot > returns error for invalid JSON [0.050ms]

✓ deserializeSnapshot > returns error for valid JSON but wrong shape [0.030ms]

✓ harnessToLlmMessage > converts system message [0.130ms]

✓ harnessToLlmMessage > converts user message [0.030ms]

✓ harnessToLlmMessage > converts assistant message with text only [0.090ms]

✓ harnessToLlmMessage > converts assistant message with empty content [0.020ms]

✓ harnessToLlmMessage > converts assistant message with tool calls [0.060ms]

✓ harnessToLlmMessage > converts tool result message [0.040ms]

✓ estimateMessages > returns an EstimateResult with tiktoken algorithm for OpenAI model [170.89ms]

✓ estimateMessages > uses maxOutputTokens from modelMeta as reservedOutputTokens [0.110ms]

✓ estimateMessages > defaults to 4096 reservedOutputTokens when modelMeta is null [0.050ms]

✓ estimateMessages > empty messages list still counts system prompt [0.050ms]

✓ checkCompactionTrigger > not triggered when well under threshold [0.140ms]

✓ checkCompactionTrigger > triggered when over threshold [0.450ms]

✓ checkCompactionTrigger > triggered at exact threshold boundary [0.110ms]

✓ buildAlwaysKeepSet > rule 1: keeps system messages [0.090ms]

✓ buildAlwaysKeepSet > rule 2: keeps first user message only [0.040ms]

✓ buildAlwaysKeepSet > rule 3: keeps messages with metadata.always_keep = true [0.050ms]

✓ buildAlwaysKeepSet > rule 3: ignores metadata.always_keep = false [0.050ms]

✓ buildAlwaysKeepSet > rule 4: keeps messages matching operator-configured predicates [0.060ms]

✓ buildAlwaysKeepSet > rule 4: does not match falsy metadata values [0.040ms]

✓ buildAlwaysKeepSet > all four rules combined [0.060ms]

✓ buildAlwaysKeepSet > empty message list returns empty set [0.020ms]

✓ buildAlwaysKeepSet > messages without metadata are unaffected by rule 3 and 4 [0.030ms]

✓ coupleToolPairs > standalone messages produce single-message groups [0.170ms]

✓ coupleToolPairs > assistant with tool calls couples with subsequent tool results [0.090ms]

✓ coupleToolPairs > tool pair group stops at non-matching message [0.050ms]

✓ coupleToolPairs > assistant without tool calls is standalone [0.030ms]

✓ coupleToolPairs > assistant with empty toolCalls array is standalone [0.030ms]

✓ coupleToolPairs > multiple consecutive tool pairs are separate groups [0.050ms]

✓ coupleToolPairs > tool result for unknown call ID is standalone [0.030ms]

✓ coupleToolPairs > empty message list produces empty groups [0.020ms]

✓ coupleToolPairs > preserves chronological order [0.160ms]

✓ executeDrop > drops oldest non-keep messages until below lower threshold [1.63ms]

✓ executeDrop > never drops always-keep messages [0.700ms]

✓ executeDrop > drops tool-pair atomically (call + result together) [1.09ms]

✓ executeDrop > does not drop anything when already below threshold [0.240ms]

✓ executeDrop > preserves message order in output [0.580ms]

✓ executeDrop > drops oldest compactable groups first [0.720ms]

✓ validateToolPairIntegrity > valid when no tool calls present [0.150ms]

✓ validateToolPairIntegrity > valid when all tool pairs are intact [0.040ms]

✓ validateToolPairIntegrity > invalid when tool call has no matching result [0.030ms]

✓ validateToolPairIntegrity > invalid when tool result has no matching call [0.030ms]

✓ validateToolPairIntegrity > valid with multiple tool pairs [0.050ms]

✓ validateToolPairIntegrity > invalid when one of multiple results is missing [0.030ms]

✓ validateToolPairIntegrity > empty list is valid [0.010ms]

✓ executeSummarize > replaces compactable messages with summary system message [0.770ms]

✓ executeSummarize > preserves always-keep messages [0.210ms]

✓ executeSummarize > respects preserve_recent [0.300ms]

✓ executeSummarize > returns unchanged when all messages are always-keep [0.180ms]

✓ executeSummarize > passes window_messages cap to summarizeFn [0.300ms]

✓ executeSummarize > summary message is inserted at position of first compacted group [0.300ms]

✓ executeSummarize > tracks plugin cost from summarizeFn output [0.200ms]

✓ executeDelegate > returns strategy result on valid plugin response [0.540ms]

✓ executeDelegate > returns error when plugin throws [0.240ms]

✓ executeDelegate > returns validation_failed when plugin drops always-keep message [0.230ms]

✓ executeDelegate > returns validation_failed when plugin splits tool pair [0.200ms]

✓ executeDelegate > passes correct input to delegateFn [0.240ms]

✓ runCompactionPipeline > no compaction when below upper threshold [0.870ms]

✓ runCompactionPipeline > compacts when above upper threshold with drop strategy [1.99ms]

✓ runCompactionPipeline > always-keep messages survive drop [1.76ms]

✓ runCompactionPipeline > forced trigger (operator_manual) compacts even when below threshold [0.320ms]

✓ runCompactionPipeline > forced trigger (provider_overflow_retry) compacts even when below threshold [0.330ms]

✓ runCompactionPipeline > summarize strategy produces summary message [0.670ms]

✓ runCompactionPipeline > summarize falls back to drop when summarizeFn throws [0.950ms]

✓ runCompactionPipeline > summarize falls back to drop when summarizeFn not provided [0.610ms]

✓ runCompactionPipeline > summarize falls back to drop when summarize config missing [0.570ms]

✓ runCompactionPipeline > delegate strategy with valid plugin result [0.530ms]

✓ runCompactionPipeline > delegate falls back to drop when plugin throws [0.650ms]

✓ runCompactionPipeline > delegate falls back to summarize when configured [0.470ms]

✓ runCompactionPipeline > delegate falls back to drop when delegateFn not provided [0.590ms]

✓ runCompactionPipeline > delegate falls back to drop when delegate config missing [0.730ms]

✓ runCompactionPipeline > checkpoint strategy returns checkpointRequested with proposed record [1.10ms]

✓ runCompactionPipeline > checkpoint with summarize fallback proposes summarize record [0.430ms]

✓ runCompactionPipeline > checkpoint falls back to drop proposal when summarize fails [0.630ms]

✓ runCompactionPipeline > audit log receives triggered and completed events [0.620ms]

✓ runCompactionPipeline > no audit entries when not triggered [0.150ms]

✓ runCompactionPipeline > record contains correct window thresholds [0.560ms]

✓ runCompactionPipeline > record thresholdEstimate and afterEstimate are fractions [0.580ms]

✓ runCompactionPipeline > record pluginsFired is empty for drop strategy [0.530ms]

✓ runCompactionPipeline > tool pairs are atomically dropped in pipeline [1.20ms]

✓ runCompactionPipeline > pipeline with null modelMeta uses fallback estimator [0.200ms]

✓ runCompactionPipeline > delegate → summarize → drop triple fallback chain [0.640ms]

✓ runCompactionPipeline dry-run > dry-run always triggers even when below threshold [0.190ms]

✓ runCompactionPipeline dry-run > dry-run sets compacted = false [1.42ms]

✓ runCompactionPipeline dry-run > dry-run produces a valid CompactionRecordDraft [1.69ms]

✓ runCompactionPipeline dry-run > dry-run with summarize strategy executes the strategy [0.530ms]

✓ runCompactionPipeline dry-run > dry-run with checkpoint strategy propagates dryRun flag [0.550ms]

✓ runCompactionPipeline dry-run > non-dry-run does not set dryRun flag [0.550ms]

✓ runCompactionPipeline dry-run > dry-run returns proposed messages without modifying originals [1.51ms]

✓ isContextLengthError > returns false for null/undefined [0.150ms]

✓ isContextLengthError > detects Error with context_length message [0.070ms]

✓ isContextLengthError > detects Error with prompt-too-long message [0.040ms]

✓ isContextLengthError > detects Error with token limit message [0.060ms]

✓ isContextLengthError > detects Error with input-too-large message [0.030ms]

✓ isContextLengthError > detects Error with content-too-large message [0.040ms]

✓ isContextLengthError > detects Google RESOURCE_EXHAUSTED [0.030ms]

✓ isContextLengthError > detects exceeds model limit [0.040ms]

✓ isContextLengthError > detects HTTP 413 via errorStatus [0.020ms]

✓ isContextLengthError > detects HTTP 413 via status field [0.020ms]

✓ isContextLengthError > detects LLM AssistantMessage-shaped error object [0.020ms]

✓ isContextLengthError > detects plain string error [0.010ms]

✓ isContextLengthError > returns false for unrelated errors [0.040ms]

✓ isContextLengthError > returns false for empty Error [0.020ms]

✓ isContextLengthError > returns false for non-error primitives [0.020ms]

✓ drop strategy auto-summarize-at-threshold > drop strategy upgrades to summarize when summarizeFn and config present [0.620ms]

✓ drop strategy auto-summarize-at-threshold > drop strategy falls back to pure drop when summarize throws [1.08ms]

✓ drop strategy auto-summarize-at-threshold > drop strategy uses pure drop when summarizeFn not provided [0.550ms]

✓ drop strategy auto-summarize-at-threshold > drop strategy uses pure drop when summarize config missing [0.540ms]

✓ buildDelegationConfig > onDelegationStart writes audit entry with cage kind [0.270ms]

✓ buildDelegationConfig > onDelegationStart uses 'none' cage kind for unknown subagent [0.100ms]

✓ buildDelegationConfig > onDelegationComplete writes audit entry with success + duration [0.160ms]

✓ buildDelegationConfig > onDelegationComplete records error message on failure [0.100ms]

✓ buildDelegationConfig > works without auditLog (no-op logging) [0.070ms]

✓ buildDelegationConfig > forwards to user hooks and passes cage policy [0.150ms]

✓ buildDelegationConfig > messageFilter is undefined when no hook provided [0.030ms]

✓ buildDelegationConfig > messageFilter forwards to hook with cage policy when provided [0.160ms]

✓ createExporter: langfuse mode > returns langfuse-mode exporter when enabled [0.060ms]

✓ createExporter: structured_log fallback > returns structured_log-mode exporter when disabled [0.020ms]

✓ createExporter: span collection > langfuse exporter accepts spans without throwing [0.070ms]

✓ createExporter: span collection > structured_log exporter accepts spans without throwing [0.050ms]

✓ createAuditLog > creates an empty audit log [0.050ms]

✓ createAuditLog > records written entries [0.040ms]

✓ createAuditLog > preserves entry order [0.030ms]

✓ createAuditProcessor > logs pre-generation event on processInput [0.200ms]

✓ createObservabilityProcessor > exports span when processing input [0.110ms]

✓ audit processor is always registered (non-optional) > audit processor works independently of observability config [0.050ms]

✓ routeModel: single alias > resolves alias to provider and model [0.190ms]

✓ routeModel: single alias > credentials included in result [0.030ms]

✓ routeModel: single alias > unknown alias returns alias_not_found [0.030ms]

✓ routeModel: single alias > alias pointing to missing provider returns provider_not_found [0.030ms]

✓ routeModel: direct provider:model > accepts direct provider:model identifier [0.030ms]

✓ routeModel: direct provider:model > direct identifier with missing provider returns error [0.020ms]

✓ routeModel: provider credentials > base_url passed through when configured [0.030ms]

✓ routeWithFallback: fallback chains > uses first provider when healthy [0.120ms]

✓ routeWithFallback: fallback chains > falls back to second when first is down [0.060ms]

✓ routeWithFallback: fallback chains > all providers down returns all_providers_failed [0.050ms]

✓ routeWithFallback: fallback chains > single-target alias with fallback works like routeModel [0.040ms]

✓ isRouted: type guard > returns true for RouteResult [0.020ms]

✓ isRouted: type guard > returns false for RouteError [0.020ms]

✓ toMastraMessages > converts plain user message [0.130ms]

✓ toMastraMessages > converts plain assistant message without tool calls [0.030ms]

✓ toMastraMessages > converts system message [0.020ms]

✓ toMastraMessages > converts assistant message with tool calls to multipart content [0.070ms]

✓ toMastraMessages > converts assistant with tool calls but empty text content [0.040ms]

✓ toMastraMessages > converts toolResult message to tool role [0.040ms]

✓ toMastraMessages > converts full conversation with tool calls and results [0.060ms]

✓ toMastraMessages > handles multiple tool calls in one assistant message [0.040ms]

✓ toMastraMessages > assistant with empty toolCalls array is treated as plain message [0.020ms]

✓ jsonSchemaToZod > string schema produces z.string() [1.31ms]

✓ jsonSchemaToZod > number schema produces z.number() [0.290ms]

✓ jsonSchemaToZod > integer schema produces z.number() [0.080ms]

✓ jsonSchemaToZod > boolean schema produces z.boolean() [0.140ms]

✓ jsonSchemaToZod > string enum produces z.enum() [0.270ms]

✓ jsonSchemaToZod > array schema with items [0.420ms]

✓ jsonSchemaToZod > array schema without items allows anything [0.140ms]

✓ jsonSchemaToZod > object schema with required and optional properties [0.700ms]

✓ jsonSchemaToZod > empty object schema produces z.record() [0.230ms]

✓ jsonSchemaToZod > unknown type falls through to z.unknown() [0.060ms]

✓ jsonSchemaToZod > description is preserved on the zod schema [0.080ms]

✓ jsonSchemaToZod > nested object schemas [0.190ms]

✓ toolDefinitionToMastra > creates a tool with matching id and description [0.360ms]

✓ toolDefinitionToMastra > execute delegates to dispatch and returns data on success [1.32ms]

✓ toolDefinitionToMastra > execute throws on dispatch failure [0.240ms]

✓ toolDefinitionToMastra > execute throws default error when dispatch fails without error detail [0.210ms]

✓ toolDefinitionToMastra > execute returns { success: true } when dispatch succeeds with no data [0.360ms]

✓ toolDefinitionsToRecord > converts array of defs to a record keyed by name [0.300ms]

✓ toolDefinitionsToRecord > empty array produces empty record [0.050ms]

✓ compile: minimal project (primary only) > returns a compiled project with primary agent [0.170ms]

✓ compile: minimal project (primary only) > primary agent has resolved model identifier [0.030ms]

✓ compile: minimal project (primary only) > primary agent has instructions from prompt content [0.030ms]

✓ compile: minimal project (primary only) > primary agent has tools from input [0.030ms]

✓ compile: minimal project (primary only) > primary has empty subagents when none declared [0.020ms]

✓ compile: minimal project (primary only) > primary carries its cage policy [0.020ms]

✓ compile: minimal project (primary only) > prompt files are passed through [0.020ms]

✓ compile: project with subagents > subagents keyed by name on primary [0.070ms]

✓ compile: project with subagents > subagent has resolved model [0.030ms]

✓ compile: project with subagents > subagent has instructions from prompt [0.030ms]

✓ compile: project with subagents > subagent description is set when provided [0.030ms]

✓ compile: project with subagents > subagent cage policy is carried on the agent config [0.030ms]

✓ compile: project with subagents > disabled cage policy preserved [0.040ms]

✓ compile: project with subagents > multiple subagents compiled [0.040ms]

✓ compile: recursive nesting > nested subagents are compiled recursively [0.080ms]

✓ compile: recursive nesting > three-level nesting works [0.060ms]

✓ compile: primary parameters > parameters passed through when provided [0.030ms]

✓ compile: primary parameters > parameters undefined when not provided [0.030ms]

✓ compile: primary description > description passed through when provided [0.030ms]

✓ compile: plugins pass-through > plugins passed through on primary [0.070ms]

✓ compile: plugins pass-through > multiple plugins passed through [0.050ms]

✓ compile: plugins pass-through > plugins undefined when not provided [0.030ms]

✓ compile: plugins pass-through > plugins passed through on subagents [0.060ms]

✓ compile: plugins pass-through > plugins on nested subagents passed through recursively [0.080ms]

✓ compile: compaction pass-through > compaction passed through on primary [0.080ms]

✓ compile: compaction pass-through > compaction undefined when not provided [0.040ms]

✓ compile: compaction pass-through > compaction passed through on subagents [0.060ms]

✓ compile: compaction pass-through > delegate compaction passed through [0.040ms]

✓ compile: compaction pass-through > checkpoint compaction passed through [0.050ms]

✓ compile: compaction pass-through > plugins and compaction together on same agent [0.050ms]

✓ runPipeline: basic flow > returns output with no processors [0.060ms]

✓ runPipeline: basic flow > passthrough processor preserves messages [0.060ms]

✓ runPipeline: processor ordering > processors execute in registration order [0.090ms]

✓ runPipeline: message modification > processor can modify messages [0.050ms]

✓ runPipeline: message modification > processor can modify system messages [0.040ms]

✓ runPipeline: message modification > modifications chain through pipeline [0.080ms]

✓ runPipeline: abort > processor can abort the pipeline [0.060ms]

✓ runPipeline: abort > abort stops subsequent processors [0.050ms]

✓ runPipelineStep: step-aware processing > calls processInputStep when defined [0.150ms]

✓ runPipelineStep: step-aware processing > falls back to processInput when processInputStep not defined [0.050ms]

✓ runPipelineStep: step-aware processing > step abort works like pipeline abort [0.050ms]

✓ pruneToolOutputs > returns unchanged messages when nothing qualifies for pruning [0.240ms]

✓ pruneToolOutputs > returns unchanged when total savings below minimumSavings [0.110ms]

✓ pruneToolOutputs > prunes old large tool results beyond protection window [0.200ms]

✓ pruneToolOutputs > never prunes protected tools [0.210ms]

✓ pruneToolOutputs > preserves most recent tool outputs within protectTokens window [0.150ms]

✓ pruneToolOutputs > sets pruned metadata on pruned messages [0.100ms]

✓ pruneToolOutputs > skips tool results too small to produce savings [0.070ms]

✓ pruneToolOutputs > DEFAULT_PRUNE_CONFIG has expected values [0.030ms]

✓ isCoveredBy > wildcard covers everything [0.090ms]

✓ isCoveredBy > exact match [0.020ms]

✓ isCoveredBy > namespace glob covers namespace member [0.040ms]

✓ isCoveredBy > namespace glob does not cover different namespace [0.030ms]

✓ isCoveredBy > narrow glob covered by broader glob [0.020ms]

✓ isCoveredBy > exact name not covered by different exact [0.010ms]

✓ isCoveredBy > glob not covered by exact [0.020ms]

✓ isCoveredBy > non-glob prefix does not match [0.010ms]

✓ intersectTools > basic intersection: allow narrows parent [0.150ms]

✓ intersectTools > deny removes from intersection [0.100ms]

✓ intersectTools > unknown pattern emits diagnostic [0.050ms]

✓ intersectTools > step-level diagnostic includes stepId [0.050ms]

✓ intersectTools > exact tool in allow matched against exact parent [0.030ms]

✓ intersectTools > deny with glob removes matching tools [0.030ms]

✓ intersectTools > allow glob against parent glob [0.030ms]

✓ composeSystemPrompt > no step: root instructions only (no workflow prefix per ADR-0038) [0.040ms]

✓ composeSystemPrompt > with step: root + step section, no workflow section [0.020ms]

✓ composeSystemPrompt > step delimiter is byte-exact and no workflow delimiter is emitted [0.020ms]

✓ escapeBlockValue > escapes backslashes [0.080ms]

✓ escapeBlockValue > escapes double quotes [0.020ms]

✓ escapeBlockValue > escapes newlines [0.020ms]

✓ escapeBlockValue > escapes angle brackets as entities [0.020ms]

✓ escapeBlockValue > combined escaping [0.020ms]

✓ escapeBlockValue > plain string unchanged [0.010ms]

✓ renderInputLine > string value: quoted with escaping [0.080ms]

✓ renderInputLine > string with special chars [0.020ms]

✓ renderInputLine > url value: quoted [0.020ms]

✓ renderInputLine > integer value: bare [0.020ms]

✓ renderInputLine > number value: bare [0.060ms]

✓ renderInputLine > boolean value: bare [0.030ms]

✓ renderInputLine > file value: staging path with annotation [0.050ms]

✓ renderInputLine > file value: KB size [0.020ms]

✓ renderInputLine > file value: bytes size [0.020ms]

✓ renderInputLine > omitted value [0.020ms]

✓ renderInputLine > default string value [0.020ms]

✓ renderInputLine > default number value [0.020ms]

✓ renderInputLine > default boolean value [0.010ms]

✓ renderWorkflowInputsMessage > matches spec example structure [0.090ms]

✓ renderWorkflowInputsMessage > single input [0.040ms]

✓ renderWorkflowInputsMessage > mixed types [0.050ms]

✓ renderStepKickoffMessage > matches spec example structure [0.080ms]

✓ renderStepKickoffMessage > multiple bindings [0.040ms]

✓ compileWorkflowRun: stepless workflow > no top-level composed system prompt (removed per ADR-0038) [0.200ms]

✓ compileWorkflowRun: stepless workflow > effective tools are root ∩ allow [0.050ms]

✓ compileWorkflowRun: stepless workflow > steps array is empty for stepless [0.030ms]

✓ compileWorkflowRun: stepless workflow > no diagnostics for clean intersection [0.030ms]

✓ compileWorkflowRun: tool intersection > deny removes tools from effective set [0.060ms]

✓ compileWorkflowRun: tool intersection > empty intersection throws workflow_tools_empty [0.160ms]

✓ compileWorkflowRun: tool intersection > unknown allow pattern produces diagnostic but proceeds [0.060ms]

✓ compileWorkflowRun: tool intersection > kaged.issue.* and kaged.workflow.* available when in root tools [0.040ms]

✓ compileWorkflowRun: with steps > agent step inherits workflow tools when no step tools [0.140ms]

✓ compileWorkflowRun: with steps > kaged.step.complete injected into agent step tools [0.040ms]

✓ compileWorkflowRun: with steps > kaged.step.complete not duplicated if already in set [0.060ms]

✓ compileWorkflowRun: with steps > agent step with tools: narrower intersection [0.070ms]

✓ compileWorkflowRun: with steps > agent step with step-level deny [0.060ms]

✓ compileWorkflowRun: with steps > empty step-level intersection throws [0.080ms]

✓ compileWorkflowRun: with steps > agent step composed system prompt includes step section [0.060ms]

✓ compileWorkflowRun: with steps > confirm step: no tools, no composed prompt [0.050ms]

✓ compileWorkflowRun: with steps > task step: no tools, no composed prompt [0.060ms]

✓ compileWorkflowRun: with steps > multiple steps compiled in order [0.110ms]

✓ compileWorkflowRun: with steps > step diagnostics include stepId [0.060ms]

✓ compileWorkflowRun: three-level narrowing > root ⊇ workflow ⊇ step — each level narrows [0.080ms]

✓ compileWorkflowRun: subagent tools untouched > intersection applies to root agent only [0.070ms]

✓ validateStepOutputs — empty schema > empty args + empty schema → ok [0.160ms]

✓ validateStepOutputs — empty schema > unknown arg → error [0.070ms]

✓ validateStepOutputs — string type > valid string passes [0.220ms]

✓ validateStepOutputs — string type > non-string fails type check [0.030ms]

✓ validateStepOutputs — string type > missing required string fails [0.030ms]

✓ validateStepOutputs — string type > optional string can be omitted [0.030ms]

✓ validateStepOutputs — string type > max_length enforced [0.030ms]

✓ validateStepOutputs — string type > min_length enforced [0.030ms]

✓ validateStepOutputs — string type > pattern enforced [0.050ms]

✓ validateStepOutputs — string type > pattern match passes [0.030ms]

✓ validateStepOutputs — string type > enum enforced [0.030ms]

✓ validateStepOutputs — string type > enum match passes [0.030ms]

✓ validateStepOutputs — integer type > valid integer passes [0.030ms]

✓ validateStepOutputs — integer type > float fails [0.020ms]

✓ validateStepOutputs — integer type > non-number fails [0.020ms]

✓ validateStepOutputs — integer type > min enforced [0.030ms]

✓ validateStepOutputs — integer type > max enforced [0.030ms]

✓ validateStepOutputs — integer type > NaN fails [0.020ms]

✓ validateStepOutputs — integer type > Infinity fails [0.030ms]

✓ validateStepOutputs — number type > valid float passes [0.030ms]

✓ validateStepOutputs — number type > integer value also passes as number [0.020ms]

✓ validateStepOutputs — number type > string fails [0.020ms]

✓ validateStepOutputs — number type > min/max enforced [0.030ms]

✓ validateStepOutputs — boolean type > true passes [0.030ms]

✓ validateStepOutputs — boolean type > false passes [0.020ms]

✓ validateStepOutputs — boolean type > string 'true' fails [0.030ms]

✓ validateStepOutputs — url type > valid https URL passes [0.050ms]

✓ validateStepOutputs — url type > valid http URL passes [0.030ms]

✓ validateStepOutputs — url type > non-string fails [0.030ms]

✓ validateStepOutputs — url type > invalid URL fails [0.140ms]

✓ validateStepOutputs — url type > ftp scheme fails [0.030ms]

✓ validateStepOutputs — url type > URL with pattern enforced [0.060ms]

✓ validateStepOutputs — url type > URL exceeding max_length fails [0.060ms]

✓ validateStepOutputs — multiple outputs > all required present → ok [0.050ms]

✓ validateStepOutputs — multiple outputs > all fields present → ok [0.050ms]

✓ validateStepOutputs — multiple outputs > missing one required → error [0.040ms]

✓ validateStepOutputs — multiple outputs > unknown + valid → short-circuits on unknown [0.030ms]

✓ validateStepOutputs — multiple outputs > multiple type errors reported [0.030ms]

✓ validateStepOutputs — null values > null for required → missing_output [0.030ms]

✓ validateStepOutputs — null values > null for optional → skipped [0.030ms]

✓ resolveTemplate — input refs > string input resolved [0.200ms]

✓ resolveTemplate — input refs > integer input resolved to string [0.030ms]

✓ resolveTemplate — input refs > number input resolved [0.030ms]

✓ resolveTemplate — input refs > boolean input resolved [0.030ms]

✓ resolveTemplate — input refs > url input resolved [0.030ms]

✓ resolveTemplate — input refs > file input resolved to staging path [0.030ms]

✓ resolveTemplate — input refs > omitted input resolved to (omitted) [0.030ms]

✓ resolveTemplate — input refs > default input resolved to value [0.020ms]

✓ resolveTemplate — input refs > default numeric input resolved [0.030ms]

✓ resolveTemplate — input refs > missing input resolved to (unset) [0.020ms]

✓ resolveTemplate — step output refs > valid step output resolved [0.060ms]

✓ resolveTemplate — step output refs > numeric step output resolved to string [0.030ms]

✓ resolveTemplate — step output refs > missing step → (unset) [0.020ms]

✓ resolveTemplate — step output refs > missing output key → (unset) [0.020ms]

✓ resolveTemplate — step output refs > null output value → (unset) [0.020ms]

✓ resolveTemplate — mixed and edge cases > multiple refs in one template [0.030ms]

✓ resolveTemplate — mixed and edge cases > no refs → passthrough [0.020ms]

✓ resolveTemplate — mixed and edge cases > empty template → empty string [0.020ms]

✓ resolveTemplate — mixed and edge cases > whitespace inside braces tolerated [0.020ms]

✓ resolveTemplate — mixed and edge cases > unresolvable ref passes through unchanged [0.020ms]

✓ resolveTemplate — mixed and edge cases > adjacent refs [0.030ms]

✓ resolveStepBindings > resolves with entries into StepBindingValue array [0.120ms]

✓ resolveStepBindings > multiple entries resolved [0.040ms]

✓ resolveStepBindings > empty with entries → empty array [0.020ms]

✓ resolveStepBindings > rendered values are quoted [0.060ms]

✓ resolveStepBindings > step output refs in bindings [0.050ms]

✓ resolveStepBindings > literal text without refs rendered as-is [0.030ms]

✓ resolveTemplate — run_count refs (ADR-0048) > steps.<id>.run_count resolves to the named step's count [0.030ms]

✓ resolveTemplate — run_count refs (ADR-0048) > self.run_count resolves to the current step's count [0.020ms]

✓ resolveTemplate — run_count refs (ADR-0048) > run_count of an unentered step resolves to 0 [0.020ms]

✓ runPrimary — basic streaming > publishes message.start before any deltas [81.64ms]

✓ runPrimary — basic streaming > publishes message.delta for text and message.end for finish [7.83ms]

✓ runPrimary — basic streaming > concatenated deltas equal the streamed text [7.56ms]

✓ runPrimary — basic streaming > returns an AssistantMessage with combined text content and usage [5.94ms]

✓ runPrimary — finish reasons > length-stop maps to finishReason 'length' [5.38ms]

✓ runPrimary — finish reasons > error event in the stream produces stopReason 'error' and an errorMessage [6.23ms]

✓ runPrimary — abort > aborting before stream starts surfaces stopReason 'aborted' [4.49ms]

✓ runPrimary — event ordering invariants > first event is always message.start; last event is always message.end [4.81ms]

✓ runPrimary — event ordering invariants > every event after message.start carries the same messageId [4.99ms]

✓ runPrimary — stats enrichment > message.end carries stats with ttft, duration, tps for normal completion [4.57ms]

✓ runPrimary — stats enrichment > stats.cost is null for unknown model (no catalog entry) [4.18ms]

✓ runPrimary — stats enrichment > stats absent when error occurs before any content delta [4.58ms]

✓ runPrimary — stats enrichment > stats absent when abort fires before stream opens [5.06ms]

✓ runPrimary — stats enrichment > AssistantMessage gets ttft and duration populated [4.20ms]

✓ runPrimary — checkpoint tool signal > checkpointRequested populated when model calls kaged.checkpoint [28.82ms]

✓ runPrimary — checkpoint tool signal > checkpointRequested absent on normal completion [15.04ms]

✓ runPrimary — checkpoint tool signal > checkpointRequested with no reason has undefined detail [8.19ms]

✓ runPrimary — checkpoint tool signal > checkpoint tool call publishes tool_call and tool_result events [13.44ms]

✓ runPrimary — interaction tool signals > kaged.ask sets interactionRequested with kind 'ask' [6.73ms]

✓ runPrimary — interaction tool signals > kaged.form sets interactionRequested with kind 'form' [12.84ms]

✓ runPrimary — interaction tool signals > interactionRequested absent on normal completion [3.83ms]

✓ runPrimary — interaction tool signals > interaction and checkpoint are independent signals [6.25ms]

✓ runPrimary — interaction tool signals > kaged.ask tool call publishes tool_call and tool_result events [13.80ms]

✓ agentFromCompiledPrimary — no topology > creates agent with primary name and instructions [0.200ms]

✓ agentFromCompiledPrimary — no topology > creates agent without tools when no topology provided [0.190ms]

✓ agentFromCompiledPrimary — no topology > creates agent without subagents when primary has no subagents [0.140ms]

✓ agentFromCompiledPrimary — with tool topology > resolves primary tools via ToolRegistry [0.710ms]

✓ agentFromCompiledPrimary — with tool topology > glob pattern resolves multiple tools [0.330ms]

✓ agentFromCompiledPrimary — with tool topology > empty tools array produces no tools on agent [0.170ms]

✓ agentFromCompiledPrimary — internal tool overrides > override keeps the registry-resolved rich schema (not passthrough) [0.640ms]

✓ agentFromCompiledPrimary — internal tool overrides > override keeps the registry-resolved description over its fallback [0.430ms]

✓ agentFromCompiledPrimary — internal tool overrides > override falls back to its own schema when the registry has no entry [0.160ms]

✓ agentFromCompiledPrimary — internal tool overrides > override is exposed even when not present in the primary's tool list [0.260ms]

✓ agentFromCompiledPrimary — subagent topology > builds subagent agents map when topology is complete [0.650ms]

✓ agentFromCompiledPrimary — subagent topology > subagent gets its own tools resolved [0.430ms]

✓ agentFromCompiledPrimary — subagent topology > subagent without matching route is skipped [0.280ms]

✓ agentFromCompiledPrimary — subagent topology > no subagents built when topology lacks toolRegistry [0.180ms]

✓ agentFromCompiledPrimary — subagent topology > no subagents built when topology lacks toolDispatch [0.140ms]

✓ agentFromCompiledPrimary — subagent topology > multiple subagents each get independent tool sets [0.440ms]

✓ agentFromCompiledPrimary — recursive subagent nesting > nested subagents are built recursively when routes exist [0.400ms]

Mentioned in

Type	Document
adr	ADR-0012: Agentic substrate is Mastra v1.x
adr	ADR-0013: Observability substrate is Langfuse, self-hosted, optional
adr	ADR-0014: All LLM providers route through @kaged/llm; Mastra integrates via a LanguageModelV2 shim
adr	ADR-0016: Streaming-first UI — live data and operator abort are non-negotiable
adr	ADR-0022: Agents are recursive; tools and cage are per-agent
adr	ADR-0023: Project-plugin lifecycle hooks, per-agent declaration, isolation as a core principle
adr	ADR-0024: Context compaction is kaged-owned, layered, observable, and operator-tunable
adr	ADR-0029: Structured operational logging
adr	ADR-0031: An assistant turn is an ordered transcript of parts, not a flattened bubble
spec	Spec: Agent Tooling
spec	Spec: Agent Harness
spec	Spec: LLM Provider Interface
spec	Release pipeline
spec	Spec: Workflows

Test results: 399