@kaged/harness

DSL compiler, processor pipeline, provider router, Mastra adapter, compaction engine, checkpoint bridge, and runPrimary runtime entry point

14 source files
12 test files
~8.7k lines
✓ 266 tests pass
typecheck: pass
lint: clean

Test results 266

snapshotFromMessages > creates snapshot from messages and system messages [0.920ms]
suspend > returns suspended result with payload [0.140ms]
suspend > returns serialized state as JSON string [0.060ms]
suspend > preserves model-requested reason [0.040ms]
resume > returns resumed result without prompt edits [0.100ms]
resume > detects prompt edits and reports changed agents [0.050ms]
resume > invalid serialized state returns deserialize error [0.070ms]
serializeSnapshot > produces valid JSON string [0.060ms]
serializeSnapshot > round-trips through deserializeSnapshot [0.080ms]
deserializeSnapshot > returns error for invalid JSON [0.040ms]
deserializeSnapshot > returns error for valid JSON but wrong shape [0.030ms]
harnessToLlmMessage > converts system message [0.130ms]
harnessToLlmMessage > converts user message [0.040ms]
harnessToLlmMessage > converts assistant message with text only [0.080ms]
harnessToLlmMessage > converts assistant message with empty content [0.030ms]
harnessToLlmMessage > converts assistant message with tool calls [0.060ms]
harnessToLlmMessage > converts tool result message [0.050ms]
estimateMessages > returns an EstimateResult with tiktoken algorithm for OpenAI model [163.65ms]
estimateMessages > uses maxOutputTokens from modelMeta as reservedOutputTokens [0.120ms]
estimateMessages > defaults to 4096 reservedOutputTokens when modelMeta is null [0.070ms]
estimateMessages > empty messages list still counts system prompt [0.050ms]
checkCompactionTrigger > not triggered when well under threshold [0.120ms]
checkCompactionTrigger > triggered when over threshold [0.460ms]
checkCompactionTrigger > triggered at exact threshold boundary [0.090ms]
buildAlwaysKeepSet > rule 1: keeps system messages [0.100ms]
buildAlwaysKeepSet > rule 2: keeps first user message only [0.040ms]
buildAlwaysKeepSet > rule 3: keeps messages with metadata.always_keep = true [0.050ms]
buildAlwaysKeepSet > rule 3: ignores metadata.always_keep = false [0.030ms]
buildAlwaysKeepSet > rule 4: keeps messages matching operator-configured predicates [0.090ms]
buildAlwaysKeepSet > rule 4: does not match falsy metadata values [0.050ms]
buildAlwaysKeepSet > all four rules combined [0.100ms]
buildAlwaysKeepSet > empty message list returns empty set [0.020ms]
buildAlwaysKeepSet > messages without metadata are unaffected by rule 3 and 4 [0.030ms]
coupleToolPairs > standalone messages produce single-message groups [0.200ms]
coupleToolPairs > assistant with tool calls couples with subsequent tool results [0.100ms]
coupleToolPairs > tool pair group stops at non-matching message [0.050ms]
coupleToolPairs > assistant without tool calls is standalone [0.030ms]
coupleToolPairs > assistant with empty toolCalls array is standalone [0.020ms]
coupleToolPairs > multiple consecutive tool pairs are separate groups [0.050ms]
coupleToolPairs > tool result for unknown call ID is standalone [0.030ms]
coupleToolPairs > empty message list produces empty groups [0.010ms]
coupleToolPairs > preserves chronological order [0.180ms]
executeDrop > drops oldest non-keep messages until below lower threshold [1.66ms]
executeDrop > never drops always-keep messages [0.740ms]
executeDrop > drops tool-pair atomically (call + result together) [0.770ms]
executeDrop > does not drop anything when already below threshold [0.140ms]
executeDrop > preserves message order in output [0.510ms]
executeDrop > drops oldest compactable groups first [0.710ms]
validateToolPairIntegrity > valid when no tool calls present [0.110ms]
validateToolPairIntegrity > valid when all tool pairs are intact [0.040ms]
validateToolPairIntegrity > invalid when tool call has no matching result [0.030ms]
validateToolPairIntegrity > invalid when tool result has no matching call [0.030ms]
validateToolPairIntegrity > valid with multiple tool pairs [0.040ms]
validateToolPairIntegrity > invalid when one of multiple results is missing [0.040ms]
validateToolPairIntegrity > empty list is valid [0.020ms]
executeSummarize > replaces compactable messages with summary system message [0.740ms]
executeSummarize > preserves always-keep messages [0.200ms]
executeSummarize > respects preserve_recent [0.300ms]
executeSummarize > returns unchanged when all messages are always-keep [0.230ms]
executeSummarize > passes window_messages cap to summarizeFn [0.280ms]
executeSummarize > summary message is inserted at position of first compacted group [0.290ms]
executeSummarize > tracks plugin cost from summarizeFn output [0.210ms]
executeDelegate > returns strategy result on valid plugin response [0.510ms]
executeDelegate > returns error when plugin throws [0.240ms]
executeDelegate > returns validation_failed when plugin drops always-keep message [0.230ms]
executeDelegate > returns validation_failed when plugin splits tool pair [0.200ms]
executeDelegate > passes correct input to delegateFn [0.240ms]
runCompactionPipeline > no compaction when below upper threshold [0.920ms]
runCompactionPipeline > compacts when above upper threshold with drop strategy [2.01ms]
runCompactionPipeline > always-keep messages survive drop [1.80ms]
runCompactionPipeline > forced trigger (operator_manual) compacts even when below threshold [0.310ms]
runCompactionPipeline > forced trigger (provider_overflow_retry) compacts even when below threshold [0.340ms]
runCompactionPipeline > summarize strategy produces summary message [0.690ms]
runCompactionPipeline > summarize falls back to drop when summarizeFn throws [0.980ms]
runCompactionPipeline > summarize falls back to drop when summarizeFn not provided [0.830ms]
runCompactionPipeline > summarize falls back to drop when summarize config missing [0.660ms]
runCompactionPipeline > delegate strategy with valid plugin result [0.500ms]
runCompactionPipeline > delegate falls back to drop when plugin throws [0.660ms]
runCompactionPipeline > delegate falls back to summarize when configured [0.460ms]
runCompactionPipeline > delegate falls back to drop when delegateFn not provided [0.600ms]
runCompactionPipeline > delegate falls back to drop when delegate config missing [0.640ms]
runCompactionPipeline > checkpoint strategy returns checkpointRequested with proposed record [1.11ms]
runCompactionPipeline > checkpoint with summarize fallback proposes summarize record [0.440ms]
runCompactionPipeline > checkpoint falls back to drop proposal when summarize fails [0.640ms]
runCompactionPipeline > audit log receives triggered and completed events [0.620ms]
runCompactionPipeline > no audit entries when not triggered [0.160ms]
runCompactionPipeline > record contains correct window thresholds [0.560ms]
runCompactionPipeline > record thresholdEstimate and afterEstimate are token counts [0.590ms]
runCompactionPipeline > record pluginsFired is empty for drop strategy [0.610ms]
runCompactionPipeline > tool pairs are atomically dropped in pipeline [1.07ms]
runCompactionPipeline > pipeline with null modelMeta uses fallback estimator [0.180ms]
runCompactionPipeline > delegate → summarize → drop triple fallback chain [0.680ms]
runCompactionPipeline dry-run > dry-run always triggers even when below threshold [0.220ms]
runCompactionPipeline dry-run > dry-run sets compacted = false [1.48ms]
runCompactionPipeline dry-run > dry-run produces a valid CompactionRecordDraft [1.50ms]
runCompactionPipeline dry-run > dry-run with summarize strategy executes the strategy [0.460ms]
runCompactionPipeline dry-run > dry-run with checkpoint strategy propagates dryRun flag [0.560ms]
runCompactionPipeline dry-run > non-dry-run does not set dryRun flag [0.570ms]
runCompactionPipeline dry-run > dry-run returns proposed messages without modifying originals [1.56ms]
isContextLengthError > returns false for null/undefined [0.100ms]
isContextLengthError > detects Error with context_length message [0.070ms]
isContextLengthError > detects Error with prompt-too-long message [0.050ms]
isContextLengthError > detects Error with token limit message [0.060ms]
isContextLengthError > detects Error with input-too-large message [0.030ms]
isContextLengthError > detects Error with content-too-large message [0.040ms]
isContextLengthError > detects Google RESOURCE_EXHAUSTED [0.030ms]
isContextLengthError > detects exceeds model limit [0.030ms]
isContextLengthError > detects HTTP 413 via errorStatus [0.020ms]
isContextLengthError > detects HTTP 413 via status field [0.020ms]
isContextLengthError > detects LLM AssistantMessage-shaped error object [0.020ms]
isContextLengthError > detects plain string error [0.010ms]
isContextLengthError > returns false for unrelated errors [0.040ms]
isContextLengthError > returns false for empty Error [0.020ms]
isContextLengthError > returns false for non-error primitives [0.030ms]
drop strategy auto-summarize-at-threshold > drop strategy upgrades to summarize when summarizeFn and config present [0.620ms]
drop strategy auto-summarize-at-threshold > drop strategy falls back to pure drop when summarize throws [1.10ms]
drop strategy auto-summarize-at-threshold > drop strategy uses pure drop when summarizeFn not provided [0.560ms]
drop strategy auto-summarize-at-threshold > drop strategy uses pure drop when summarize config missing [0.550ms]
buildDelegationConfig > onDelegationStart writes audit entry with cage kind [0.270ms]
buildDelegationConfig > onDelegationStart uses 'none' cage kind for unknown subagent [0.090ms]
buildDelegationConfig > onDelegationComplete writes audit entry with success + duration [0.150ms]
buildDelegationConfig > onDelegationComplete records error message on failure [0.140ms]
buildDelegationConfig > works without auditLog (no-op logging) [0.090ms]
buildDelegationConfig > forwards to user hooks and passes cage policy [0.150ms]
buildDelegationConfig > messageFilter is undefined when no hook provided [0.030ms]
buildDelegationConfig > messageFilter forwards to hook with cage policy when provided [0.150ms]
createExporter: langfuse mode > returns langfuse-mode exporter when enabled [0.060ms]
createExporter: structured_log fallback > returns structured_log-mode exporter when disabled [0.020ms]
createExporter: span collection > langfuse exporter accepts spans without throwing [0.070ms]
createExporter: span collection > structured_log exporter accepts spans without throwing [0.050ms]
createAuditLog > creates an empty audit log [0.040ms]
createAuditLog > records written entries [0.040ms]
createAuditLog > preserves entry order [0.030ms]
createAuditProcessor > logs pre-generation event on processInput [0.200ms]
createObservabilityProcessor > exports span when processing input [0.100ms]
audit processor is always registered (non-optional) > audit processor works independently of observability config [0.040ms]
routeModel: single alias > resolves alias to provider and model [0.180ms]
routeModel: single alias > credentials included in result [0.030ms]
routeModel: single alias > unknown alias returns alias_not_found [0.030ms]
routeModel: single alias > alias pointing to missing provider returns provider_not_found [0.050ms]
routeModel: direct provider:model > accepts direct provider:model identifier [0.030ms]
routeModel: direct provider:model > direct identifier with missing provider returns error [0.030ms]
routeModel: provider credentials > base_url passed through when configured [0.050ms]
routeWithFallback: fallback chains > uses first provider when healthy [0.130ms]
routeWithFallback: fallback chains > falls back to second when first is down [0.060ms]
routeWithFallback: fallback chains > all providers down returns all_providers_failed [0.060ms]
routeWithFallback: fallback chains > single-target alias with fallback works like routeModel [0.040ms]
isRouted: type guard > returns true for RouteResult [0.020ms]
isRouted: type guard > returns false for RouteError [0.020ms]
runPrimary — basic streaming > publishes message.start before any deltas [74.91ms]
runPrimary — basic streaming > publishes message.delta for text and message.end for finish [7.29ms]
runPrimary — basic streaming > concatenated deltas equal the streamed text [6.46ms]
runPrimary — basic streaming > returns an AssistantMessage with combined text content and usage [5.80ms]
runPrimary — finish reasons > length-stop maps to finishReason 'length' [5.00ms]
runPrimary — finish reasons > error event in the stream produces stopReason 'error' and an errorMessage [6.38ms]
runPrimary — abort > aborting before stream starts surfaces stopReason 'aborted' [4.21ms]
runPrimary — event ordering invariants > first event is always message.start; last event is always message.end [4.75ms]
runPrimary — event ordering invariants > every event after message.start carries the same messageId [5.17ms]
runPrimary — stats enrichment > message.end carries stats with ttft, duration, tps for normal completion [4.46ms]
runPrimary — stats enrichment > stats.cost is null for unknown model (no catalog entry) [4.23ms]
runPrimary — stats enrichment > stats absent when error occurs before any content delta [4.54ms]
runPrimary — stats enrichment > stats absent when abort fires before stream opens [3.61ms]
runPrimary — stats enrichment > AssistantMessage gets ttft and duration populated [4.05ms]
runPrimary — checkpoint tool signal > checkpointRequested populated when model calls kaged.checkpoint [16.67ms]
runPrimary — checkpoint tool signal > checkpointRequested absent on normal completion [4.53ms]
runPrimary — checkpoint tool signal > checkpointRequested with no reason has undefined detail [9.69ms]
runPrimary — checkpoint tool signal > checkpoint tool call publishes tool_call and tool_result events [6.95ms]
runPrimary — interaction tool signals > kaged.ask sets interactionRequested with kind 'ask' [6.33ms]
runPrimary — interaction tool signals > kaged.form sets interactionRequested with kind 'form' [5.76ms]
runPrimary — interaction tool signals > interactionRequested absent on normal completion [3.41ms]
runPrimary — interaction tool signals > interaction and checkpoint are independent signals [5.52ms]
runPrimary — interaction tool signals > kaged.ask tool call publishes tool_call and tool_result events [5.57ms]
toMastraMessages > converts plain user message [0.040ms]
toMastraMessages > converts plain assistant message without tool calls [0.020ms]
toMastraMessages > converts system message [0.030ms]
toMastraMessages > converts assistant message with tool calls to multipart content [0.080ms]
toMastraMessages > converts assistant with tool calls but empty text content [0.050ms]
toMastraMessages > converts toolResult message to tool role [0.040ms]
toMastraMessages > converts full conversation with tool calls and results [0.140ms]
toMastraMessages > handles multiple tool calls in one assistant message [0.070ms]
toMastraMessages > assistant with empty toolCalls array is treated as plain message [0.040ms]
jsonSchemaToZod > string schema produces z.string() [0.710ms]
jsonSchemaToZod > number schema produces z.number() [0.320ms]
jsonSchemaToZod > integer schema produces z.number() [0.090ms]
jsonSchemaToZod > boolean schema produces z.boolean() [0.130ms]
jsonSchemaToZod > string enum produces z.enum() [0.250ms]
jsonSchemaToZod > array schema with items [0.430ms]
jsonSchemaToZod > array schema without items allows anything [0.130ms]
jsonSchemaToZod > object schema with required and optional properties [0.280ms]
jsonSchemaToZod > empty object schema produces z.record() [0.220ms]
jsonSchemaToZod > unknown type falls through to z.unknown() [0.060ms]
jsonSchemaToZod > description is preserved on the zod schema [0.090ms]
jsonSchemaToZod > nested object schemas [0.200ms]
toolDefinitionToMastra > creates a tool with matching id and description [0.190ms]
toolDefinitionToMastra > execute delegates to dispatch and returns data on success [0.400ms]
toolDefinitionToMastra > execute throws on dispatch failure [0.230ms]
toolDefinitionToMastra > execute throws default error when dispatch fails without error detail [0.210ms]
toolDefinitionToMastra > execute returns { success: true } when dispatch succeeds with no data [0.200ms]
toolDefinitionsToRecord > converts array of defs to a record keyed by name [0.180ms]
toolDefinitionsToRecord > empty array produces empty record [0.040ms]
agentFromCompiledPrimary — no topology > creates agent with primary name and instructions [0.120ms]
agentFromCompiledPrimary — no topology > creates agent without tools when no topology provided [0.140ms]
agentFromCompiledPrimary — no topology > creates agent without subagents when primary has no subagents [0.090ms]
agentFromCompiledPrimary — with tool topology > resolves primary tools via ToolRegistry [0.510ms]
agentFromCompiledPrimary — with tool topology > glob pattern resolves multiple tools [0.260ms]
agentFromCompiledPrimary — with tool topology > empty tools array produces no tools on agent [0.090ms]
agentFromCompiledPrimary — internal tool overrides > override keeps the registry-resolved rich schema (not passthrough) [0.550ms]
agentFromCompiledPrimary — internal tool overrides > override keeps the registry-resolved description over its fallback [0.340ms]
agentFromCompiledPrimary — internal tool overrides > override falls back to its own schema when the registry has no entry [0.100ms]
agentFromCompiledPrimary — internal tool overrides > override is exposed even when not present in the primary's tool list [0.200ms]
agentFromCompiledPrimary — subagent topology > builds subagent agents map when topology is complete [0.340ms]
agentFromCompiledPrimary — subagent topology > subagent gets its own tools resolved [0.350ms]
agentFromCompiledPrimary — subagent topology > subagent without matching route is skipped [0.220ms]
agentFromCompiledPrimary — subagent topology > no subagents built when topology lacks toolRegistry [0.100ms]
agentFromCompiledPrimary — subagent topology > no subagents built when topology lacks toolDispatch [0.070ms]
agentFromCompiledPrimary — subagent topology > multiple subagents each get independent tool sets [0.450ms]
agentFromCompiledPrimary — recursive subagent nesting > nested subagents are built recursively when routes exist [0.270ms]
compile: minimal project (primary only) > returns a compiled project with primary agent [0.160ms]
compile: minimal project (primary only) > primary agent has resolved model identifier [0.040ms]
compile: minimal project (primary only) > primary agent has instructions from prompt content [0.030ms]
compile: minimal project (primary only) > primary agent has tools from input [0.030ms]
compile: minimal project (primary only) > primary has empty subagents when none declared [0.030ms]
compile: minimal project (primary only) > primary carries its cage policy [0.020ms]
compile: minimal project (primary only) > prompt files are passed through [0.020ms]
compile: project with subagents > subagents keyed by name on primary [0.060ms]
compile: project with subagents > subagent has resolved model [0.040ms]
compile: project with subagents > subagent has instructions from prompt [0.030ms]
compile: project with subagents > subagent description is set when provided [0.040ms]
compile: project with subagents > subagent cage policy is carried on the agent config [0.030ms]
compile: project with subagents > disabled cage policy preserved [0.030ms]
compile: project with subagents > multiple subagents compiled [0.050ms]
compile: recursive nesting > nested subagents are compiled recursively [0.070ms]
compile: recursive nesting > three-level nesting works [0.060ms]
compile: primary parameters > parameters passed through when provided [0.040ms]
compile: primary parameters > parameters undefined when not provided [0.030ms]
compile: primary description > description passed through when provided [0.030ms]
compile: plugins pass-through > plugins passed through on primary [0.070ms]
compile: plugins pass-through > multiple plugins passed through [0.050ms]
compile: plugins pass-through > plugins undefined when not provided [0.040ms]
compile: plugins pass-through > plugins passed through on subagents [0.060ms]
compile: plugins pass-through > plugins on nested subagents passed through recursively [0.060ms]
compile: compaction pass-through > compaction passed through on primary [0.060ms]
compile: compaction pass-through > compaction undefined when not provided [0.020ms]
compile: compaction pass-through > compaction passed through on subagents [0.040ms]
compile: compaction pass-through > delegate compaction passed through [0.040ms]
compile: compaction pass-through > checkpoint compaction passed through [0.030ms]
compile: compaction pass-through > plugins and compaction together on same agent [0.060ms]
runPipeline: basic flow > returns output with no processors [0.100ms]
runPipeline: basic flow > passthrough processor preserves messages [0.070ms]
runPipeline: processor ordering > processors execute in registration order [0.090ms]
runPipeline: message modification > processor can modify messages [0.060ms]
runPipeline: message modification > processor can modify system messages [0.040ms]
runPipeline: message modification > modifications chain through pipeline [0.080ms]
runPipeline: abort > processor can abort the pipeline [0.060ms]
runPipeline: abort > abort stops subsequent processors [0.050ms]
runPipelineStep: step-aware processing > calls processInputStep when defined [0.160ms]
runPipelineStep: step-aware processing > falls back to processInput when processInputStep not defined [0.060ms]
runPipelineStep: step-aware processing > step abort works like pipeline abort [0.060ms]
pruneToolOutputs > returns unchanged messages when nothing qualifies for pruning [0.260ms]
pruneToolOutputs > returns unchanged when total savings below minimumSavings [0.120ms]
pruneToolOutputs > prunes old large tool results beyond protection window [0.170ms]
pruneToolOutputs > never prunes protected tools [0.230ms]
pruneToolOutputs > preserves most recent tool outputs within protectTokens window [0.140ms]
pruneToolOutputs > sets pruned metadata on pruned messages [0.070ms]
pruneToolOutputs > skips tool results too small to produce savings [0.060ms]
pruneToolOutputs > DEFAULT_PRUNE_CONFIG has expected values [0.030ms]
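
The compaction test names above (buildAlwaysKeepSet, coupleToolPairs, executeDrop) suggest a drop strategy that protects certain messages and removes a tool call together with its results. The following is a minimal sketch of that idea, not the package's implementation. The `Msg` shape, the function names `alwaysKeepIds`, `groupToolPairs`, and `dropOldest`, the pre-computed `tokens` field, and the "stop at or below target" rule are all assumptions made for illustration. Operator-configured predicates (rule 4) are left out.

```typescript
// Hypothetical message shape; the real harness types are not shown in this report.
type Msg = {
  id: string;
  role: "system" | "user" | "assistant" | "tool";
  toolCallIds?: string[]; // on assistant messages that call tools
  toolCallId?: string;    // on tool-result messages
  alwaysKeep?: boolean;   // stands in for metadata.always_keep
  tokens: number;         // assumed pre-computed token estimate
};

// Rules 1-3 from the test names: system messages, the first user message,
// and messages explicitly flagged always-keep.
function alwaysKeepIds(msgs: Msg[]): Set<string> {
  const firstUser = msgs.find((m) => m.role === "user");
  const keep = new Set<string>();
  for (const m of msgs) {
    if (m.role === "system" || m === firstUser || m.alwaysKeep === true) {
      keep.add(m.id);
    }
  }
  return keep;
}

// Group an assistant tool call with the tool results that directly follow it
// and match one of its call IDs. Everything else is a single-message group.
function groupToolPairs(msgs: Msg[]): Msg[][] {
  const groups: Msg[][] = [];
  let i = 0;
  while (i < msgs.length) {
    const head = msgs[i]!;
    const group: Msg[] = [head];
    i++;
    if (head.role === "assistant" && head.toolCallIds && head.toolCallIds.length > 0) {
      const ids = new Set(head.toolCallIds);
      while (i < msgs.length) {
        const next = msgs[i]!;
        if (next.role !== "tool" || next.toolCallId === undefined || !ids.has(next.toolCallId)) break;
        group.push(next);
        i++;
      }
    }
    groups.push(group);
  }
  return groups;
}

// Drop the oldest compactable groups until the total is at or below the target.
// A group containing any always-keep message is skipped as a whole.
function dropOldest(msgs: Msg[], targetTokens: number): Msg[] {
  const keep = alwaysKeepIds(msgs);
  let total = msgs.reduce((sum, m) => sum + m.tokens, 0);
  const dropped = new Set<string>();
  for (const group of groupToolPairs(msgs)) {
    if (total <= targetTokens) break;
    if (group.some((m) => keep.has(m.id))) continue;
    for (const m of group) {
      dropped.add(m.id);
      total -= m.tokens;
    }
  }
  // filter preserves the original chronological order
  return msgs.filter((m) => !dropped.has(m.id));
}

const sample: Msg[] = [
  { id: "s1", role: "system", tokens: 10 },
  { id: "u1", role: "user", tokens: 10 },
  { id: "a1", role: "assistant", toolCallIds: ["c1"], tokens: 20 },
  { id: "t1", role: "tool", toolCallId: "c1", tokens: 50 },
  { id: "u2", role: "user", tokens: 10 },
  { id: "a2", role: "assistant", tokens: 10 },
];

const compacted = dropOldest(sample, 60);
console.log(compacted.map((m) => m.id).join(",")); // prints "s1,u1,u2,a2"
```

In this sketch the tool call `a1` and its result `t1` leave together, so the output never contains a call without its result or a result without its call, which is the property the validateToolPairIntegrity tests appear to check.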

Mentioned in

Type Document
adr ADR-0012: Agentic substrate is Mastra v1.x
adr ADR-0013: Observability substrate is Langfuse, self-hosted, optional
adr ADR-0014: All LLM providers route through @kaged/llm; Mastra integrates via a LanguageModelV2 shim
adr ADR-0016: Streaming-first UI — live data and operator abort are non-negotiable
adr ADR-0022: Agents are recursive; tools and cage are per-agent
adr ADR-0023: Project-plugin lifecycle hooks, per-agent declaration, isolation as a core principle
adr ADR-0024: Context compaction is kaged-owned, layered, observable, and operator-tunable
adr ADR-0029: Structured operational logging
adr ADR-0031: An assistant turn is an ordered transcript of parts, not a flattened bubble
spec Spec: Agent Tooling
spec Spec: Agent Harness
spec Spec: LLM Provider Interface
spec Spec: Workflows